Template-Type: ReDIF-Article 1.0
Author-Name: Sangit Chatterjee
Author-X-Name-First: Sangit
Author-X-Name-Last: Chatterjee
Author-Name: Matthew Laudato
Author-X-Name-First: Matthew
Author-X-Name-Last: Laudato
Title: Gender and performance of world-class athletes
Abstract:
The athletic performances of men and women are compared based on
world-record times for various distance events in swimming, running and
skating. The ratio of the times of women to those of men against years is
modelled through a modified exponential distribution. The rate of
improvement is found to be higher for women in the three sports. Law-like
relationships are observed for world-record times against distance.
Although men's absolute performance is generally superior, the disparity
diminishes with increasing distance.
Journal: Journal of Applied Statistics
Pages: 3-10
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723846
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:3-10
Template-Type: ReDIF-Article 1.0
Author-Name: P. L. H. Yu
Author-X-Name-First: P. L. H.
Author-X-Name-Last: Yu
Author-Name: K. Lam
Author-X-Name-First: K.
Author-X-Name-Last: Lam
Title: How to predict election winners from a poll
Abstract:
Suppose that we have k candidates in an election and that the top m
winners will be elected. Assume that the voters can select up to m (
Journal: Journal of Applied Statistics
Pages: 11-24
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723855
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723855
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:11-24
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Hans Franses
Author-X-Name-First: Philip Hans
Author-X-Name-Last: Franses
Author-Name: Bart Hobijn
Author-X-Name-First: Bart
Author-X-Name-Last: Hobijn
Title: Critical values for unit root tests in seasonal time series
Abstract:
In this paper, we present tables with critical values for a variety of
tests for seasonal and non-seasonal unit roots in seasonal time series. We
consider (extensions of) the Hylleberg et al. and Osborn et al. test
procedures. These extensions concern time series with increasing seasonal
variation and time series with structural breaks in the seasonal means.
For each case, we give the appropriate auxiliary test regression, the test
statistics, and the corresponding critical values for a selected set of
sample sizes. We also illustrate the practical use of the auxiliary
regressions for quarterly new car sales in the Netherlands. Supplementary
to this paper, we provide Gauss programs with which one can generate
critical values for particular seasonal frequencies and sample sizes.
Journal: Journal of Applied Statistics
Pages: 25-48
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723864
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723864
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:25-48
Template-Type: ReDIF-Article 1.0
Author-Name: C. Raju
Author-X-Name-First: C.
Author-X-Name-Last: Raju
Author-Name: J. Jothikumar
Author-X-Name-First: J.
Author-X-Name-Last: Jothikumar
Title: Procedures and tables for the construction and selection of chain sampling plans ChSP-4A(c1, c2)r - Part 5
Abstract:
This paper presents a design procedure for ChSP-4A(c1,c2)r plans based on
Kullback-Leibler information and the minimum sum of risks. A table that
gives the values of the parameters n and k indexed by the acceptable
quality level (AQL) and limiting quality level (LQL) is presented, from
which one can select a plan which gives a desired AQL and LQL when the
producer's risk alpha = 0.05 and the consumer's risk beta = 0.10. A
concluding remark for this series of five papers is given at the end of
this paper.
Journal: Journal of Applied Statistics
Pages: 49-76
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723873
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723873
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:49-76
Template-Type: ReDIF-Article 1.0
Author-Name: Ken Hung
Author-X-Name-First: Ken
Author-X-Name-Last: Hung
Title: A comparison of two large sample confidence intervals for a proportion: A Monte Carlo simulation
Abstract:
Two pairs of confidence intervals for a proportion, similar to those of
Larson, are compared. It can be shown through computer simulation
experiments that, for certain values of p, the confidence interval
obtained by the approximation is superior.
Journal: Journal of Applied Statistics
Pages: 77-84
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723882
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723882
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:77-84
Template-Type: ReDIF-Article 1.0
Author-Name: Dhaifalla Al-Mutairi
Author-X-Name-First: Dhaifalla
Author-X-Name-Last: Al-Mutairi
Author-Name: Satish Agarwal
Author-X-Name-First: Satish
Author-X-Name-Last: Agarwal
Title: Distributions of the lifetimes of system components operating under an unknown common environment
Abstract:
Families of joint distributions for describing the lifetimes of a system
of components that operate under an unknown environment, when the
environment follows a Weibull distribution, are derived. The reliability
function for this system is calculated and several properties of the
aforementioned joint distributions are investigated.
Journal: Journal of Applied Statistics
Pages: 85-96
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723891
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723891
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:85-96
Template-Type: ReDIF-Article 1.0
Author-Name: M. C. Agrawal
Author-X-Name-First: M. C.
Author-X-Name-Last: Agrawal
Author-Name: A. B. Sthapit
Author-X-Name-First: A. B.
Author-X-Name-Last: Sthapit
Title: Hierarchic predictive ratio-based and product-based estimators and their efficiency
Abstract:
Invoking the predictive approach with a fixed population set-up, and
employing initially the customary ratio and product estimators as
potential predictors for the non-surveyed part of the population, we have
generated sequences of ratio-based and product-based estimators. The
proposed ratio-based and product-based estimators of order k are, under
some practical conditions, found to be more efficient than the customary
ratio and product estimators and the usual simple mean when k is chosen
optimally. Under the optimal value of k, the kth-order ratio-based and
product-based estimators are found to be as efficient as the linear
regression estimator. We have used real population data to illustrate the
efficacy of the proposed ratio-based and product-based estimators relative
to the usual simple mean and the customary ratio and product estimators.
Journal: Journal of Applied Statistics
Pages: 97-104
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723909
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723909
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:97-104
Template-Type: ReDIF-Article 1.0
Author-Name: Sergio Munoz
Author-X-Name-First: Sergio
Author-X-Name-Last: Munoz
Author-Name: Shrikant Bangdiwala
Author-X-Name-First: Shrikant
Author-X-Name-Last: Bangdiwala
Title: Interpretation of Kappa and B statistics measures of agreement
Abstract:
The Kappa statistic proposed by Cohen and the B statistic proposed by
Bangdiwala are used to quantify the agreement between two observers,
independently classifying the same n units into the same k categories.
Both statistics correct for the agreement expected to result from chance
alone, but the Kappa statistic is a measure that adjusts the observed
proportion of agreement and ranges from -pc/(1 - pc) to 1, where pc is the
expected agreement that results from chance, and the B statistic is a
measure that adjusts the observed area of agreement with that expected to
result from chance, and ranges from 0 to 1. Statistical guidelines for the
interpretation of either statistic are not available. For the Kappa
statistic, the suggested arbitrary interpretation given by Landis and Koch
is commonly quoted. This paper compares the behavior of the Kappa
statistic and the B statistic in 3 x 3 and 4 x 4 contingency tables, under
different agreement patterns. Based on simulation results, non-arbitrary
guidelines for the interpretation of both statistics are provided.
Journal: Journal of Applied Statistics
Pages: 105-112
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723918
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723918
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:105-112
Template-Type: ReDIF-Article 1.0
Author-Name: Warren Gilchrist
Author-X-Name-First: Warren
Author-X-Name-Last: Gilchrist
Title: Modelling with quantile distribution functions
Abstract:
The definition and construction of distributions are explored using
parametric forms of the quantile distribution function. Short reviews are
given of the identification and construction of such distribution
functions, and of methods for estimation and testing.
Journal: Journal of Applied Statistics
Pages: 113-122
Issue: 1
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723927
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723927
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:1:p:113-122
Template-Type: ReDIF-Article 1.0
Author-Name: L. I. Pettit
Author-X-Name-First: L. I.
Author-X-Name-Last: Pettit
Author-Name: J. L. Palmer
Author-X-Name-First: J. L.
Author-X-Name-Last: Palmer
Title: Seasonal patterns of fertility measures: A Bayesian approach
Abstract:
Becker (1981) presents some theory about related measures of fertility.
He compares his theoretical predictions with observed
relationships found in a set of data collected in Bangladesh. In general,
he finds good agreement. In this paper, we reanalyse the data using
Bayesian methods. In particular, we use Gibbs sampling to fit
trigonometric regression models with autocorrelated errors. The results
are generally in agreement with Becker's. However, evidence from one of
the autocorrelation parameters and a residual analysis casts some doubt on
whether the basic cosine model which is assumed fits the data well.
Journal: Journal of Applied Statistics
Pages: 139-146
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723756
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723756
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:139-146
Template-Type: ReDIF-Article 1.0
Author-Name: P. Prescott
Author-X-Name-First: P.
Author-X-Name-Last: Prescott
Author-Name: N. R. Draper
Author-X-Name-First: N. R.
Author-X-Name-Last: Draper
Author-Name: S. M. Lewis
Author-X-Name-First: S. M.
Author-X-Name-Last: Lewis
Author-Name: A. M. Dean
Author-X-Name-First: A. M.
Author-X-Name-Last: Dean
Title: Further properties of mixture designs for five components in orthogonal blocks
Abstract:
Orthogonally blocked experimental designs for mixtures of five
ingredients, formed from Latin squares, were previously discussed by
Prescott et al. Here, we extend this development by studying the
properties of three classes of possible designs, with recommendations on
their practical application. Restrictions on the design classes are
explored and D-optimal (within the classes) versions are identified.
Remarks on general D-optimality conclude the paper.
Journal: Journal of Applied Statistics
Pages: 147-156
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723765
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723765
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:147-156
Template-Type: ReDIF-Article 1.0
Author-Name: Tapio Nummi
Author-X-Name-First: Tapio
Author-X-Name-Last: Nummi
Title: Estimation in a random effects growth curve model
Abstract:
This paper considers estimation under the growth curve model of Potthoff
and Roy (1964) with random effects. Estimation under a multivariate model
is also considered. Estimation under incomplete data and estimation of
random effects are also discussed. A numerical example of data on bulls is
presented to illustrate these techniques.
Journal: Journal of Applied Statistics
Pages: 157-168
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723774
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723774
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:157-168
Template-Type: ReDIF-Article 1.0
Author-Name: K. E. Basford
Author-X-Name-First: K. E.
Author-X-Name-Last: Basford
Author-Name: G. J. Mclachlan
Author-X-Name-First: G. J.
Author-X-Name-Last: Mclachlan
Author-Name: M. G. York
Author-X-Name-First: M. G.
Author-X-Name-Last: York
Title: Modelling the distribution of stamp paper thickness via finite normal mixtures: The 1872 Hidalgo stamp issue of Mexico revisited
Abstract:
Izenman and Sommer (1988) used a non-parametric kernel density estimation
technique to fit a seven-component model to the paper thickness of the
1872 Hidalgo stamp issue of Mexico. They observed an apparent conflict
when fitting a normal mixture model with three components with unequal
variances. This conflict is examined further by investigating the most
appropriate number of components when fitting a normal mixture of
components with equal variances.
Journal: Journal of Applied Statistics
Pages: 169-180
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723783
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:169-180
Template-Type: ReDIF-Article 1.0
Author-Name: Y. Eric Shao
Author-X-Name-First: Y. Eric
Author-X-Name-Last: Shao
Title: Multiple intervention analysis with application to sales promotion data
Abstract:
The sales promotion data resulting from multiple marketing strategies are
usually autocorrelated. Consequently, the characteristics of those data
sets can be analyzed using time-series and/or intervention analysis.
Traditional time-series intervention analysis focuses on the effects of
single or few interventions, and forecasts may be obtained as long as the
future interventions can be assured. This study is different from
traditional approaches, and considers the cases in which multiple
interventions and the uncertainty of future interventions exist in the
system. In addition, this study utilizes a set of real sales promotion
data to demonstrate the effectiveness of the proposed approach.
Journal: Journal of Applied Statistics
Pages: 181-192
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723792
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:181-192
Template-Type: ReDIF-Article 1.0
Author-Name: Martin Schader
Author-X-Name-First: Martin
Author-X-Name-Last: Schader
Author-Name: Friedrich Schmid
Author-X-Name-First: Friedrich
Author-X-Name-Last: Schmid
Title: Power of tests for uniformity when limits are unknown
Abstract:
Power of modifications of the Kolmogorov, Cramer-von Mises, Watson and
Anderson-Darling tests for testing uniformity when limits are unknown is
compared. Power is computed by Monte Carlo simulation within one-parameter
families of alternative distributions containing the uniform distribution
as a special case. A table of mostly unpublished quantiles is given and
continuous power curves are plotted.
Journal: Journal of Applied Statistics
Pages: 193-206
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723800
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723800
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:193-206
Template-Type: ReDIF-Article 1.0
Author-Name: V. Soundararajan
Author-X-Name-First: V.
Author-X-Name-Last: Soundararajan
Author-Name: A. L. Christina
Author-X-Name-First: A. L.
Author-X-Name-Last: Christina
Title: Selection of single sampling variables plans based on the minimum angle
Abstract:
This paper provides single sampling variables plans for given values of
n, the acceptable quality level and limiting quality level. Tables are
constructed using the minimum angle technique.
Journal: Journal of Applied Statistics
Pages: 207-218
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723819
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723819
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:207-218
Template-Type: ReDIF-Article 1.0
Author-Name: M. Mahibbur Rahman
Author-X-Name-First: M. Mahibbur
Author-X-Name-Last: Rahman
Author-Name: Z. Govindarajulu
Author-X-Name-First: Z.
Author-X-Name-Last: Govindarajulu
Title: A modification of the test of Shapiro and Wilk for normality
Abstract:
The W statistic of Shapiro and Wilk provides the best omnibus test of
normality, but its application is limited to sample sizes up to n = 50.
This study modifies W, such that it can be extended to all sample sizes.
The critical values of the modified statistic are given for n up to 5000.
The empirical moments show that the null distribution of the modified
statistic is skewed to the left and is consistent for all sample sizes.
The empirical powers of the modified statistic are also comparable with
those of W.
Journal: Journal of Applied Statistics
Pages: 219-236
Issue: 2
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723828
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723828
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:2:p:219-236
Template-Type: ReDIF-Article 1.0
Author-Name: L. H. Kao
Author-X-Name-First: L. H.
Author-X-Name-Last: Kao
Author-Name: S. Chakraborti
Author-X-Name-First: S.
Author-X-Name-Last: Chakraborti
Title: One-sided sign-type non-parametric procedures for comparing treatments with a control in a randomized complete block design
Abstract:
Non-parametric procedures are presented for comparing several treatments
with a control when the data are collected in a randomized complete block
design with no interaction. The procedures are generalizations of some
well-known sign-type tests, and include both overall tests and multiple
comparisons procedures. A numerical example is used to motivate the
problem and illustrate the proposed methods. Some concluding remarks are
offered.
Journal: Journal of Applied Statistics
Pages: 251-264
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723666
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723666
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:251-264
Template-Type: ReDIF-Article 1.0
Author-Name: H. H. Chen
Author-X-Name-First: H. H.
Author-X-Name-Last: Chen
Author-Name: S. W. Duffy
Author-X-Name-First: S. W.
Author-X-Name-Last: Duffy
Author-Name: L. Tabar
Author-X-Name-First: L.
Author-X-Name-Last: Tabar
Title: A mover-stayer mixture of Markov chain models for the assessment of dedifferentiation and tumour progression in breast cancer
Abstract:
Malignancy grade is a histological measure of attributes related to a
breast tumour's aggressive potential. It is not established whether the
grade is an innate characteristic which remains unchanged throughout the
tumour's development or whether it evolves as the tumour grows. It is
likely that a proportion of tumours have the potential to evolve, and so a
statistical method was required to assess this hypothesis and, if
possible, to estimate the proportion with the potential for evolution.
Therefore, a mover-stayer mixture of Markov chain models was developed,
with the complication that 'movers' were unobservable because tumours were
excised on diagnosis. A quasi-likelihood method was used for estimation.
The methods are demonstrated using data from the Swedish two-county trial
of breast-cancer screening.
Journal: Journal of Applied Statistics
Pages: 265-278
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723675
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:265-278
Template-Type: ReDIF-Article 1.0
Author-Name: C. V. Rao
Author-X-Name-First: C. V.
Author-X-Name-Last: Rao
Author-Name: S. Hari Krishna
Author-X-Name-First: S. Hari
Author-X-Name-Last: Krishna
Title: A graphical method for testing the equality of several variances
Abstract:
The problem of testing the equality of several variances arises in many
areas. For testing the equality of variances, several tests are available
in the literature which demonstrate only the statistical significance of
the variances. In this paper, a graphical method is presented for testing
the equality of variances. This method simultaneously demonstrates the
statistical and engineering significance. Two examples are given to
illustrate the proposed graphical method, and the conclusions obtained are
compared with the existing tests.
Journal: Journal of Applied Statistics
Pages: 279-288
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723684
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:279-288
Template-Type: ReDIF-Article 1.0
Author-Name: Anita Ghatak
Author-X-Name-First: Anita
Author-X-Name-Last: Ghatak
Title: Unit roots and structural breaks: The case of India 1900-1988
Abstract:
This paper tests the hypothesis of difference stationarity of
macro-economic time series against the alternative of trend stationarity,
with and without allowing for possible structural breaks. The
methodologies used are that of Dickey and Fuller familiarized by Nelson
and Plosser, and that of dummy variables familiarized by Perron, including
the Zivot and Andrews extension of Perron's tests. We have chosen 12
macro-economic variables in the Indian economy during the period 1900-1988
for this study. A study of this nature has not previously been undertaken
for the Indian economy. The conventional Dickey-Fuller methodology without
allowing for structural breaks cannot reject the unit root hypothesis
(URH) for any series. Allowing for exogenous breaks in level and rate of
growth in the years 1914, 1939 and 1951, Perron's tests reject the URH for
three series after 1951, i.e. the year of introduction of economic
planning in India. The Zivot and Andrews tests for endogenous breaks
confirm the Perron tests and lead to the rejection of the URH for three
more series.
Journal: Journal of Applied Statistics
Pages: 289-300
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723693
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723693
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:289-300
Template-Type: ReDIF-Article 1.0
Author-Name: H. J. Khamis
Author-X-Name-First: H. J.
Author-X-Name-Last: Khamis
Title: The delta-corrected Kolmogorov-Smirnov test for the two-parameter Weibull distribution
Abstract:
Monte Carlo simulation techniques are used to create tables of critical
values for the delta-corrected Kolmogorov-Smirnov statistic (a modification
of the classical Kolmogorov-Smirnov statistic) for the Weibull distribution
with known location parameter and unknown shape and scale parameters. The
power of the proposed test is investigated relative to values of delta in
the unit interval and relative to a wide variety of alternative
distributions. The results indicate that using the delta-correction can
lead to as many as 8.4 percentage points more power than can be achieved
with the classical Kolmogorov-Smirnov test, with no change in the size of
the test. Furthermore, carrying out the delta-corrected test involves no
more steps or calculations than for the classical Kolmogorov-Smirnov test.
In general, it is shown that a slight modification, or correction, in
definition of the empirical distribution function of the
Kolmogorov-Smirnov test can lead to power enhancement without changing the
type I error rate of the test. Two examples clearly show the effectiveness
of the delta-corrected test. The delta-corrected Kolmogorov-Smirnov test
is recommended for testing the goodness of fit to the two-parameter Weibull
distribution.
Journal: Journal of Applied Statistics
Pages: 301-318
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723701
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723701
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:301-318
Template-Type: ReDIF-Article 1.0
Author-Name: Herbert Buning
Author-X-Name-First: Herbert
Author-X-Name-Last: Buning
Title: Robust analysis of variance
Abstract:
For the c-sample location problem with equal and unequal variances, we
compare the classical F-test and its robustified version, the Welch
test, with some non-parametric counterparts defined for two-sided and
one-sided ordered alternatives, such as trend and umbrella alternatives. A
new rank test for long-tailed distributions is proposed. The comparison
refers to the level alpha and power beta of the tests, and is carried out
via Monte Carlo simulation, assuming short-, medium- and long-tailed as
well as asymmetric distributions. It turns out that the Welch test is the
best in the case of unequal variances, but in the case of equal
variances, special non-parametric tests are preferable.
Journal: Journal of Applied Statistics
Pages: 319-332
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723710
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723710
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:319-332
Template-Type: ReDIF-Article 1.0
Author-Name: J. Munoz-Garcia
Author-X-Name-First: J.
Author-X-Name-Last: Munoz-Garcia
Author-Name: R. Pino-Mejias
Author-X-Name-First: R.
Author-X-Name-Last: Pino-Mejias
Author-Name: J. M. Munoz-Pichardo
Author-X-Name-First: J. M.
Author-X-Name-Last: Munoz-Pichardo
Author-Name: M. D. Cubiles-De-La-Vega
Author-X-Name-First: M. D.
Author-X-Name-Last: Cubiles-De-La-Vega
Title: Identification of outlier bootstrap samples
Abstract:
We define a variation of Efron's method II based on the outlier bootstrap
sample concept. A criterion for the identification of such samples is
given, with which a variation in the bootstrap sample generation algorithm
is introduced. The results of several simulations are analyzed in which,
in comparison with Efron's method II, a higher degree of closeness to the
estimated quantities can be observed.
Journal: Journal of Applied Statistics
Pages: 333-342
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723729
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723729
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:333-342
Template-Type: ReDIF-Article 1.0
Author-Name: Jill Johnes
Author-X-Name-First: Jill
Author-X-Name-Last: Johnes
Title: Inter-university variations in undergraduate non-completion rates: A statistical analysis by subject of study
Abstract:
Non-completion of higher education degree courses is a considerable
problem, incurring costs on the taxpayer, higher education institutions
and the students who fail to complete. Closer examination of the data
reveals that non-completion rates in higher education vary substantially
across institutions and by subject of degree. The purpose of this paper is
to investigate, within each of 13 broad subject categories, the potential
determinants of inter-university variations in non-completion rates.
Published data are used to compute university non-completion rates over
four time periods and to construct corresponding explanatory variables
which could potentially be related to non-completion rates. The
explanatory variables measure the characteristics (both academic and
socioeconomic) of students recruited by universities and the
characteristics of the institutions themselves. The significance of the
relationship between the possible explanatory variables and non-completion
rates within each given subject is assessed using both weighted
least-squares and weighted logit analysis. The conclusions drawn from the
results of each technique are identical, and, therefore, for
interpretation reasons, only the results of the weighted least-squares
analysis are reported. As expected, the academic quality of student
entrants is an important determinant of non-completion rates in the
majority of subjects, although the magnitude of the effect varies
according to subject. Variables reflecting the age and gender mix of
university entrants are generally not significantly related to
non-completion rates. The characteristics of institutions which are
significantly related to non-completion rates in specific subjects include
the staff-student ratio and the length of the degree course.
Journal: Journal of Applied Statistics
Pages: 343-362
Issue: 3
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723738
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723738
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:3:p:343-362
Template-Type: ReDIF-Article 1.0
Author-Name: Charles Katholi
Author-X-Name-First: Charles
Author-X-Name-Last: Katholi
Author-Name: Anthony Merriweather
Author-X-Name-First: Anthony
Author-X-Name-Last: Merriweather
Author-Name: Thomas Unnasch
Author-X-Name-First: Thomas
Author-X-Name-Last: Unnasch
Title: An analysis of variance type test for comparing clusters of DNA sequences based on randomization test methodologies
Abstract:
A method for comparing groupings of DNA sequences is presented, which
utilizes randomization test methods to assign significance levels to a
test statistic defined in terms of the Hamming distance between two
sequences. The method, which is intuitively motivated by the analysis of
variance procedure, partitions the variation caused by differences between
clusters from the variation attributable to differences at random base
pair locations within clusters. Implementation issues are discussed, and an
example of the application of the method is provided.
Journal: Journal of Applied Statistics
Pages: 371-382
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723585
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723585
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:371-382
Template-Type: ReDIF-Article 1.0
Author-Name: Shragga Irmay
Author-X-Name-First: Shragga
Author-X-Name-Last: Irmay
Title: The relationship between Zipf's law and the distribution of first digits
Abstract:
Zipf's experimental law states that, for a given large piece of text,
the product of the relative frequency of a word and its order in
descending frequency order is a constant, shown to be equal to 1 divided
by the natural logarithm of the number of different words. It is shown to
be approximately equal to Benford's logarithmic distribution of first
significant digits in tables of numbers. Eleven samples allow comparison
of observed and theoretical frequencies.
Journal: Journal of Applied Statistics
Pages: 383-394
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723594
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723594
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:383-394
Template-Type: ReDIF-Article 1.0
Author-Name: Ravindra Khattree
Author-X-Name-First: Ravindra
Author-X-Name-Last: Khattree
Author-Name: Dayanand Naik
Author-X-Name-First: Dayanand
Author-X-Name-Last: Naik
Author-Name: Robert Mason
Author-X-Name-First: Robert
Author-X-Name-Last: Mason
Title: Estimation of variance components in staggered nested designs
Abstract:
Variance components are estimated by two different methods for a general
p stage random-effects staggered nested design. In addition to estimation
from an analysis of variance, a new approach is introduced. The main
features of this new technique are its simplicity and its ability to yield
non-negative estimates of the variance components. The performances of the
two procedures are compared using simulation and the mean-squared-error
criterion.
Journal: Journal of Applied Statistics
Pages: 395-408
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723602
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723602
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:395-408
Template-Type: ReDIF-Article 1.0
Author-Name: Eric Schoen
Author-X-Name-First: Eric
Author-X-Name-Last: Schoen
Author-Name: Kirsten Wolff
Author-X-Name-First: Kirsten
Author-X-Name-Last: Wolff
Title: Design and analysis of a fractional 4^1 3^1 2^5 split-plot experiment
Abstract:
This paper is a case study on two aspects of constructing mixed factorial
experiments: (1) three equally sized fractions of a 2^(p+2) design are
combined under a three-level factor, yielding a 3^1 2^(p+2) experiment; (2)
two carefully selected factors from a 2^(p+2) design are combined to obtain
a 4^1 2^p design. We consider both aspects for the design of a 1/8 fraction
of a 4^1 3^1 2^5 experiment (48 observations) to investigate a DNA
amplification technique. The experiment is of the split-plot type, because
the main effects of two factors had to be confounded with runs of a piece
of equipment (whole-plots), while the other factors were varied between
vials (subplots) contained within the equipment. We confounded an
additional effect to avoid the usual difficulty in evaluating the
whole-plot effects in unreplicated experiments. Both whole-plot and
subplot effects can then be evaluated with half-normal plots. The analysis
is illustrated with the results of the experiment.
Journal: Journal of Applied Statistics
Pages: 409-420
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723611
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723611
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:409-420
Template-Type: ReDIF-Article 1.0
Author-Name: Carmen Acuna
Author-X-Name-First: Carmen
Author-X-Name-Last: Acuna
Author-Name: Joseph Horowitz
Author-X-Name-First: Joseph
Author-X-Name-Last: Horowitz
Title: A statistical approach to the resolution of point sources
Abstract:
The Rayleigh criterion in optics states that two point sources of equal
intensity are 'barely resolved' when the maximum of the diffraction
pattern of one source overlaps the first minimum of the diffraction
pattern of the second source. Although useful for rough comparisons of
optical systems, such a criterion does not take into account the
randomness in the detection process and does not tell whether sources can
actually be distinguished. We present a statistical approach that
addresses these issues. From quantum optics, the photon counts in the
pixels are independent Poisson random variables with means that depend on
the distance 2 theta between the sources. Resolving the sources corresponds
to testing H0: theta = 0 vs Ha: theta > 0, under conditions that make the
information number zero at theta = 0. We define resolution as the
(asymptotic) power function of the likelihood ratio test rather than as a
single number. The asymptotic distribution of the test statistic is
derived under H0 and under contiguous alternatives. The results are
illustrated by an application to a sky survey to detect binary stars using
the Hubble space telescope.
Journal: Journal of Applied Statistics
Pages: 421-436
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723620
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:421-436
Template-Type: ReDIF-Article 1.0
Author-Name: M. L. Aggarwal
Author-X-Name-First: M. L.
Author-X-Name-Last: Aggarwal
Author-Name: A. Goel
Author-X-Name-First: A.
Author-X-Name-Last: Goel
Author-Name: S. R. Chowdhury
Author-X-Name-First: S. R.
Author-X-Name-Last: Chowdhury
Title: Catalogue of group structures for two-level fractional factorial designs
Abstract:
Taguchi introduced the concept of split-unit design to sort factors into
different groups with respect to difficulties involved in changing the
levels of factors. Li et al. have developed all possible group structures
for eight factors in an L16 orthogonal array for resolution IV
split-plot designs. Chen et al. have searched for the best designs
according to various criteria for two-level fractional factorial designs and have
presented a catalogue. In this paper, we have developed an algorithm for
generating group structures and possible allocations for various 2^(n-k)
fractional factorial designs that correspond to the designs given by Chen
et al.
Journal: Journal of Applied Statistics
Pages: 437-452
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723639
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723639
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:437-452
Template-Type: ReDIF-Article 1.0
Author-Name: Raviprakash Salagame
Author-X-Name-First: Raviprakash
Author-X-Name-Last: Salagame
Author-Name: Russell Barton
Author-X-Name-First: Russell
Author-X-Name-Last: Barton
Title: Factorial hypercube designs for spatial correlation regression
Abstract:
The problem of generating a good experimental design for spatial
correlation regression is studied in this paper. The quality of fit
generated by random designs, Latin hypercube designs and factorial designs
is studied for a particular response surface that arises in inkjet
printhead design. These studies indicate that the quality of fit generated
by spatial correlation models is highly dependent on the choice of design.
A design strategy that we call 'factorial hypercubes' is introduced as a
new method. This method can be thought of as an example of a more general
class of hybrid designs. The quality of fit generated by these designs is
compared with that of other methods. These comparisons indicate a better
fit and fewer numerical problems with factorial hypercubes.
Journal: Journal of Applied Statistics
Pages: 453-474
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723648
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723648
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:453-474
Template-Type: ReDIF-Article 1.0
Author-Name: Nien Fan Zhang
Author-X-Name-First: Nien Fan
Author-X-Name-Last: Zhang
Title: Detection capability of residual control chart for stationary process data
Abstract:
In recent years, methods for dealing with autocorrelated data in the
statistical process control environment have been proposed. A primary
method is based on modeling the process data and applying control charts
to the residuals. However, the residual charts do not have the same
properties as the traditional charts. In the literature, there has been no
systematic study on the detection capability of the residual chart for
stationary processes. This article develops a measure of the detection
capability of the residual chart for general stationary processes.
Conditions under which the residual chart reduces or increases the
detection capability are given. The relationships between the detection
capability and the average run length of the residual chart are also
established.
Journal: Journal of Applied Statistics
Pages: 475-492
Issue: 4
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723657
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723657
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:4:p:475-492
Template-Type: ReDIF-Article 1.0
Author-Name: Thaddeus Tarpey
Author-X-Name-First: Thaddeus
Author-X-Name-Last: Tarpey
Title: Estimating principal points of univariate distributions
Abstract:
The term 'principal points' originated in a problem of determining
'typical' heads for the design of protection masks, as described by Flury.
Two principal points in the mask example correspond to a small and a large
size. Principal points are cluster means for theoretical distributions,
and sample cluster means from a k -means algorithm are non-parametric
estimators of principal points. This paper demonstrates that maximum
likelihood estimators and semi-parametric estimators based on symmetry
constraints typically perform much better than the k-means estimators.
Asymptotic results on the efficiency of these estimators of two principal
points for four symmetric univariate distributions are given. Simulation
results are provided to examine the performance of the estimators for
finite sample sizes. Finally, the different estimators of two principal
points are compared using the head dimension data for the design of
protection masks.
Journal: Journal of Applied Statistics
Pages: 499-512
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723503
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723503
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:499-512
Template-Type: ReDIF-Article 1.0
Author-Name: I. L. Dryden
Author-X-Name-First: I. L.
Author-X-Name-Last: Dryden
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Author-Name: A. N. Walder
Author-X-Name-First: A. N.
Author-X-Name-Last: Walder
Title: Review of the use of context in statistical image analysis
Abstract:
This paper is a review of the use of contextual information in
statistical image analysis. After defining what we mean by 'context', we
describe the Bayesian approach to high-level image analysis using
deformable templates. We describe important aspects of work on character
recognition and syntactic pattern recognition; in particular, aspects of
the work which are relevant to scene understanding. We conclude with a
review of some work on knowledge-based systems which use context to aid
object recognition.
Journal: Journal of Applied Statistics
Pages: 513-538
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723512
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723512
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:513-538
Template-Type: ReDIF-Article 1.0
Author-Name: Reay-Chen Wang
Author-X-Name-First: Reay-Chen
Author-X-Name-Last: Wang
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Title: Minimum average fraction inspected for continuous sampling plan CSP-1 under inspection error
Abstract:
In this paper, we present a further modification of Endres's method to
solve the problem of minimizing the average fraction inspected (AFI)
for the continuous sampling plan CSP-1 under inspection error. The
measures of average outgoing quality under perfect and imperfect
replacement conditions are considered. The formulae for searching the
smallest clearance number i for minimizing the AFI for a CSP-1 plan are
also provided. The solution procedure of the proposed method is more
reliable, clearer and easier than that of Endres.
Journal: Journal of Applied Statistics
Pages: 539-548
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723521
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723521
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:539-548
Template-Type: ReDIF-Article 1.0
Author-Name: Gilles Ducharme
Author-X-Name-First: Gilles
Author-X-Name-Last: Ducharme
Title: Consistent selection of the actual model in regression analysis
Abstract:
In regression analysis, a best subset of regressors is usually selected
by minimizing Mallows's C_p statistic or some other equivalent criterion,
such as the Akaike information criterion or cross-validation. It is
known that the resulting procedure suffers from a lack of consistency that
can lead to a model with too many variables. For this reason, corrections
have been proposed that yield consistent procedures. The object of this
paper is to show that these corrected criteria, although asymptotically
consistent, are usually too conservative for finite sample sizes. The
paper also proposes a new correction of Mallows's statistic that yields
better results. A simulation study is conducted that shows that the
proposed criterion performs well in a variety of situations.
Journal: Journal of Applied Statistics
Pages: 549-558
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723530
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723530
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:549-558
Template-Type: ReDIF-Article 1.0
Author-Name: E. Ayuga Tellez
Author-X-Name-First: E. Ayuga
Author-X-Name-Last: Tellez
Author-Name: C. Ayuga Tellez
Author-X-Name-First: C. Ayuga
Author-X-Name-Last: Tellez
Author-Name: C. Gonzalez Garcia
Author-X-Name-First: C. Gonzalez
Author-X-Name-Last: Garcia
Author-Name: E. Martinez Falero
Author-X-Name-First: E. Martinez
Author-X-Name-Last: Falero
Title: Estimation of non-parametric regression in the analysis of the anti-inflammatory activity of diverse extracts of Sideritis foetens
Abstract:
A procedure to choose the best non-parametric estimator from among all
non-parametric methods to fit regression curves is described. The
methodology that is proposed prevents a lack of fit at the edges of the
regression curve. The method is summed up in a few steps to facilitate its
application by researchers. The procedure is applied to the determination
of various curves that explain the anti-inflammatory activity of diverse
extracts of Sideritis foetens and phenylbutazone against the time elapsed
from the application of the agent which provokes the inflammation.
Discussion shows that it is possible to obtain valid conclusions about the
effects of the different products and to establish comparisons between
them. Such conclusions are not possible when starting from the classical
statistics methods usually employed in pharmacology.
Journal: Journal of Applied Statistics
Pages: 559-572
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723549
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723549
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:559-572
Template-Type: ReDIF-Article 1.0
Author-Name: Wieslaw Madry
Author-X-Name-First: Wieslaw
Author-X-Name-Last: Madry
Title: A statistical approach to multivariate evaluation of diversity with respect to quantitative characteristics in cereal germplasm collections
Abstract:
The aim of this paper is to address the problem of adapting some
multivariate statistical methods (MANOVA, cluster analysis with
simultaneous test procedures T^2_max based on Roy's union-intersection rule
and canonical variate analysis) and describing their possible usage in
the evaluation and interpretation of phenotypic diversity with regard
to quantitative traits in cereal collections. The presented procedures are
used in a case where experimental data have been obtained from
single-replicated trials conducted at the same location over a few years.
In such cases, the data form a non-orthogonal connected accessions x years
cross-classification with zero or one observation in a given subclass. The
application of the suggested procedures is illustrated by a numerical
example of a winter rye collection from the Plant Breeding and
Acclimatization Institute in Radzikow near Warsaw (Poland).
Journal: Journal of Applied Statistics
Pages: 573-588
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723558
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723558
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:573-588
Template-Type: ReDIF-Article 1.0
Author-Name: Russell Boyles
Author-X-Name-First: Russell
Author-X-Name-Last: Boyles
Title: Using the chi-square statistic to monitor compositional process data
Abstract:
We investigate the use of the chi-square control chart as a simple
multivariate method for shopfloor monitoring of compositional process
data. Although this chart is usually considered to be applicable only with
multinomial process data, we show that it is also valid, in a certain
asymptotic sense, for compositional data that arise from the Dirichlet
distribution. For general compositional data, we show that the chi-square
statistic can be used for process monitoring, provided that we make a
simple adjustment to the degrees of freedom in the chi-square reference
distribution. This method is illustrated and compared in four examples
with the T^2 chart based on a log-ratio transformation of the data.
Journal: Journal of Applied Statistics
Pages: 589-602
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723567
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723567
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:589-602
Template-Type: ReDIF-Article 1.0
Author-Name: Hisashi Tanizaki
Author-X-Name-First: Hisashi
Author-X-Name-Last: Tanizaki
Title: Power comparison of non-parametric tests: Small-sample properties from Monte Carlo experiments
Abstract:
Non-parametric tests that deal with two samples include scores tests
(such as the Wilcoxon rank sum test, normal scores test, logistic scores
test, Cauchy scores test, etc.) and Fisher's randomization test. Because
the non-parametric tests generally require a large amount of computational
work, there are few studies on small-sample properties, although
asymptotic properties with regard to various aspects were studied in the
past. In this paper, the non-parametric tests are compared with the t
-test through Monte Carlo experiments. Also, we consider testing
structural changes as an application in economics.
Journal: Journal of Applied Statistics
Pages: 603-632
Issue: 5
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723576
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723576
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:5:p:603-632
Template-Type: ReDIF-Article 1.0
Author-Name: Raymond Stefani
Author-X-Name-First: Raymond
Author-X-Name-Last: Stefani
Title: Survey of the major world sports rating systems
Abstract:
Using a common framework, this paper presents a survey of the major world
sports rating systems (WSRSs) in skiing (sponsored by the International
Skiing Federation (FIS)), men's tennis (Association of Tennis
Professionals (ATP)), women's tennis (Women's Tennis Association (WTA)),
soccer (Federation of International Football Associations (FIFA)) and golf
(Royal and Ancient Golf Club of St Andrews). These systems are not
otherwise available in the literature. Each of the WSRSs has three phases:
first, the observed results are weighted to provide points for each
competition; second, these points are combined to provide a seasonal
value; third, the seasonal values are combined to provide a rating. The
final result or placement (and not the score or time) is the most
important factor in determining points for a given competition. In skiing,
men's tennis and women's tennis, the rating is calculated from results
over one season, while three seasons are used in golf and six seasons are
used in soccer. In cross-country skiing and men's tennis, the seasonal
value is calculated from the sum of the best values from that season's
competitions. In alpine skiing and women's tennis, the sum of all values
from that season's competitions is used. In golf and soccer, an averaging
process is used. Besides potentially encouraging more entries, a 'best'
system and one using all values also generate simple integer ratings
rather than the decimal ratings obtained with an averaging system. The
simplest system is that of FIS in skiing, where one table of points is
used for all alpine and cross-country disciplines. In contrast,
considering that soccer (as a sport) prides itself on the simplicity of
the game, it is surprising that FIFA's system is so complex. It is also
surprising in soccer that a 'friendly' (often a pick-up exhibition used
for player development) counts two-thirds as much as does a World Cup
final played before a worldwide TV audience. It is hoped that this survey
will serve as a valuable resource for those studying sports rating
systems.
Journal: Journal of Applied Statistics
Pages: 635-646
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723387
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723387
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:635-646
Template-Type: ReDIF-Article 1.0
Author-Name: E. M. Qannari
Author-X-Name-First: E. M.
Author-X-Name-Last: Qannari
Author-Name: E. Vigneau
Author-X-Name-First: E.
Author-X-Name-Last: Vigneau
Author-Name: M. Semenou
Author-X-Name-First: M.
Author-X-Name-Last: Semenou
Title: New approach in biased regression
Abstract:
An optimization problem which provides a new characterization for ridge
regression is discussed. A variant of this optimization problem leads to a
new family of biased estimators that includes the Stein estimation method
and principal components regression as particular cases. The whole
approach is illustrated on the basis of real data sets.
Journal: Journal of Applied Statistics
Pages: 647-658
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723396
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723396
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:647-658
Template-Type: ReDIF-Article 1.0
Author-Name: Murari Singh
Author-X-Name-First: Murari
Author-X-Name-Last: Singh
Author-Name: Michael Jones
Author-X-Name-First: Michael
Author-X-Name-Last: Jones
Title: Estimating time to detect time trends in continuous cropping
Abstract:
In long-term field trials comparing different sequences of crops and
husbandry practices, the identification and understanding of trends in
productivity over time is an important issue of sustainable crop
production. This paper presents a statistical technique for the estimation
of time trends in yield variables of a seasonal annual crop under
continuous cropping. The estimation procedure incorporates the correlation
structure, which is assumed to follow first-order autocorrelation in the
errors that arise over time on the same plot. Because large differences in
annual rainfall have a major effect on crop performance, rainfall has been
allowed for in the estimation of the time trends. Expressions for the
number of years (time) required to detect statistically significant time
trends have been obtained. Illustrations are based on a 7-year data set of
grain and straw yields from a trial in northern Syria. Although agronomic
interpretation is not intended in this paper, the barley yield data
indicated that a significant time trend can apparently be detected even in
a suboptimal data set of 7 years' duration.
Journal: Journal of Applied Statistics
Pages: 659-670
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723404
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723404
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:659-670
Template-Type: ReDIF-Article 1.0
Author-Name: M. A. Kaboudan
Author-X-Name-First: M. A.
Author-X-Name-Last: Kaboudan
Title: Non-traditional analysis of stock returns
Abstract:
An investigation of the prices of eight individual stocks showed that
price-change returns are significantly less complex than are time-dependent
returns. Time-dependent returns computed every 15, 30 and 45 minutes were
found to be more complex, using a complexity measure. Complexity is
quantified by the factor by which the estimated correlation dimension of
an observed series is multiplied when its original sequence is randomly
shuffled.
Journal: Journal of Applied Statistics
Pages: 671-688
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723413
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723413
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:671-688
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: M. Kalyanasundaram
Author-X-Name-First: M.
Author-X-Name-Last: Kalyanasundaram
Title: Determination of an attribute single sampling scheme
Abstract:
This paper presents procedures for the selection of a new sampling scheme
called 'single sampling scheme' (SSS). It presents a compact table for the
selection of an SSS indexed by various combinations of entry parameters.
The advantages of SSSs are discussed. The basis for the construction of
the table is also given.
Journal: Journal of Applied Statistics
Pages: 689-696
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723422
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723422
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:689-696
Template-Type: ReDIF-Article 1.0
Author-Name: Vic Barnett
Author-X-Name-First: Vic
Author-X-Name-Last: Barnett
Author-Name: Karen Moore
Author-X-Name-First: Karen
Author-X-Name-Last: Moore
Title: Best linear unbiased estimates in ranked-set sampling with particular reference to imperfect ordering
Abstract:
Ranked-set sampling is a widely used sampling procedure when sample
observations are expensive or difficult to obtain. It departs from simple
random sampling by seeking to spread the observations in the sample widely
over the distribution or population. This is achieved by ranking methods
which may need to employ concomitant information. The ranked-set sample
mean is known to be more efficient than the corresponding simple random
sample mean. Instead of the ranked-set sample mean, this paper considers
the corresponding optimal estimator: the ranked-set best linear unbiased
estimator. This is shown to be more efficient, even for normal data, but
particularly for skew data, such as from an exponential distribution. The
corresponding forms of the estimators are quite distinct from the
ranked-set sample mean. Improvement holds where the ordering is perfect or
imperfect, with the prospect of imperfect ordering being explored through
the use of concomitants. In addition, the corresponding optimal linear
estimator of a scale parameter is also discussed. The results are applied
to a biological problem that involves the estimation of root weights for
experimental plants, where the expense of measurement implies the need to
minimize the number of observations taken.
Journal: Journal of Applied Statistics
Pages: 697-710
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723431
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723431
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:697-710
Template-Type: ReDIF-Article 1.0
Author-Name: A. I. Khuri
Author-X-Name-First: A. I.
Author-X-Name-Last: Khuri
Title: Quantile dispersion graphs for analysis of variance estimates of variance components
Abstract:
The exact distribution of an analysis of variance estimator of a variance
component is obtained by determining its quantiles on the basis of R. B.
Davies' algorithm. A plot of these quantiles provides useful information
concerning the efficiency of the estimator, including the extent to which
it can be negative. Furthermore, the variability in the values of each
quantile is assessed by varying the values of the variance components for
the model under consideration. The maximum and minimum of such quantile
values can then be determined. A plot of the maxima and minima for various
selected quantiles produces the so-called 'quantile dispersion graphs'.
These graphs can be used to provide a comprehensive picture of the quality
of estimation obtained with a particular design. They also provide an
effective graphical tool for comparing designs on the basis of their
estimation capabilities.
Journal: Journal of Applied Statistics
Pages: 711-722
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723440
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723440
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:711-722
Template-Type: ReDIF-Article 1.0
Author-Name: Ram Mudambi
Author-X-Name-First: Ram
Author-X-Name-Last: Mudambi
Title: Estimating turning points using polynomial regression
Abstract:
This paper describes a method for estimating regime switches in
non-monotonic relationships, using polynomial regressions. Data from the
UK financial services industry are used to illustrate the technique. The
methodology provides a means of statistically ascertaining the existence
of turning points, as well as a means of locating them, should they exist.
While the methodology is most suited to applications that involve
cross-sectional data, it may also be useful in short-horizon time series
turning point prediction.
Journal: Journal of Applied Statistics
Pages: 723-732
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723459
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723459
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:723-732
Template-Type: ReDIF-Article 1.0
Author-Name: Eileen O'Donnell
Author-X-Name-First: Eileen
Author-X-Name-Last: O'Donnell
Author-Name: G. Geoffrey Vining
Author-X-Name-First: G. Geoffrey
Author-X-Name-Last: Vining
Title: Mean squared error of prediction approach to the analysis of a combined array
Abstract:
The combined array provides a powerful, more statistically rigorous
alternative to Taguchi's crossed-array approach to robust parameter
design. The combined array assumes a single linear model in the control
and the noise factors. One may then find conditions for the control
factors which will minimize an appropriate loss function that involves the
noise factors. The most appropriate loss function is often simply the
resulting process variance, recognizing that the noise factors are
actually random effects in the process. Because the major focus of such an
experiment is to optimize the estimated process variance, it is vital to
understand the resulting prediction properties. This paper develops the
mean squared error for the estimated process variance for the combined
array approach, under the assumption that the model is correctly
specified. Specific combined arrays are compared for robustness. A
practical example outlines how this approach may be used to select
appropriate combined arrays within a particular experimental situation.
Journal: Journal of Applied Statistics
Pages: 733-746
Issue: 6
Volume: 24
Year: 1997
X-DOI: 10.1080/02664769723468
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769723468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:24:y:1997:i:6:p:733-746
Template-Type: ReDIF-Article 1.0
Author-Name: Eric Hillson
Author-X-Name-First: Eric
Author-X-Name-Last: Hillson
Author-Name: Jaxk Reeves
Author-X-Name-First: Jaxk
Author-X-Name-Last: Reeves
Author-Name: Charlotte Mcmillan
Author-X-Name-First: Charlotte
Author-X-Name-Last: Mcmillan
Title: A statistical signalling model for use in surveillance of adverse drug reaction data
Abstract:
This paper presents a statistically superior lag-adjusted model for
detecting increased frequency of reports of adverse drug event (ADE)
rates. The effect of a significant lag time between ADE occurrence and
report dates is studied. This paper analyzes ADE data of this nature by
proposing a statistical model that utilizes a lag density function, and
develops an 'exact' procedure to monitor drugs that have a low incidence
of ADEs.
The approach determines statistically whether a change in the frequency of
a specific ADE exists between two predetermined time intervals. There
exist immense public health implications associated with the early
detection of serious ADEs. The reduced risk of unfavorable outcomes
associated with medication therapy is the goal of all involved. Simulated
illustrations and discussion are provided, along with a detailed FORTRAN
program used to implement the newly suggested lag-adjusted procedure.
Journal: Journal of Applied Statistics
Pages: 23-40
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823287
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823287
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:23-40
Template-Type: ReDIF-Article 1.0
Author-Name: Sang-Jun Park
Author-X-Name-First: Sang-Jun
Author-X-Name-Last: Park
Author-Name: Bong-Jin Yum
Author-X-Name-First: Bong-Jin
Author-X-Name-Last: Yum
Title: Optimal design of accelerated life tests under modified stress loading methods
Abstract:
Most of the previous work on optimal design of accelerated life test
(ALT) plans has assumed instantaneous changes in stress levels, which may
not be possible or desirable in practice, because of the limited
capability of test equipment, possible stress shocks or the presence of
undesirable failure modes. We consider the case in which stress levels are
changed at a finite rate, and develop two types of ALT plan under the
assumptions of exponential lifetimes of test units and type I censoring.
One type of plan is the modified step-stress ALT plan, and the other type
is the modified constant-stress ALT plan. These two plans are compared in
terms of the asymptotic variance of the maximum likelihood estimator of
the log mean lifetime for the use condition (i.e. avar[ln θ(0)]).
Computational results indicate that, for both types of plan,
avar[ln θ(0)] is not sensitive to the stress-increasing rate R, if R is
greater than or equal to 10, say, in the standardized scale. This implies
that the proposed stress loading method can be used effectively with
little loss in statistical efficiency. In terms of avar[ln θ(0)], the
modified step-stress ALT generally performs better than the modified
constant-stress ALT, unless R or the probability of failure until the
censoring time under a certain stress-increasing rate is small. We also
compare the progressive-stress ALT plan with the above two modified ALT
plans in terms of avar[ln θ(0)], using the optimal stress-increasing rate
R* determined for the progressive-stress ALT plan. We find that the
proposed ALTs perform better than the progressive-stress ALT for the
parameter values considered.
Journal: Journal of Applied Statistics
Pages: 41-62
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823296
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823296
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:41-62
Template-Type: ReDIF-Article 1.0
Author-Name: John Stonehouse
Author-X-Name-First: John
Author-X-Name-Last: Stonehouse
Author-Name: Guy Forrester
Author-X-Name-First: Guy
Author-X-Name-Last: Forrester
Title: Robustness of the t and U tests under combined assumption violations
Abstract:
When the assumptions of parametric statistical tests for the difference
between two means are violated, it is commonly advised that non-parametric
tests are a more robust substitute. The history of the investigation of
this issue is summarized. The robustness of the t-test was evaluated by
repeated computer testing for differences between samples from two
populations of equal means but non-normal distributions and with different
variances and sample sizes. Two common alternatives to t, Welch's
approximate t and the Mann-Whitney U-test, were evaluated in the same way.
The t-test is sufficiently robust for use in all likely cases, except
when skew is severe or when population variances and sample sizes both
differ. The Welch test satisfactorily addressed the latter problem, but
was itself sensitive to departures from normality. Contrary to its popular
reputation, the U-test showed a dramatic 'lack of robustness' in many
cases, largely because it is sensitive to population differences other
than between means, so it is not properly a 'non-parametric analogue' of
the t-test, as it is too often described.
Journal: Journal of Applied Statistics
Pages: 63-74
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823304
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823304
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:63-74
Template-Type: ReDIF-Article 1.0
Author-Name: Siu-Keung Tse
Author-X-Name-First: Siu-Keung
Author-X-Name-Last: Tse
Author-Name: Hak-Keung Yuen
Author-X-Name-First: Hak-Keung
Author-X-Name-Last: Yuen
Title: Expected experiment times for the Weibull distribution under progressive censoring with random removals
Abstract:
This paper considers the expected experiment times for
Weibull-distributed lifetimes under type II progressive censoring, with
the numbers of removals being random. The formula to compute the expected
experiment times is given. A detailed numerical study of this expected
time is carried out for different combinations of model parameters.
Furthermore, the ratio of the expected experiment time under this type of
progressive censoring to the expected experiment time under complete
sampling is studied.
Journal: Journal of Applied Statistics
Pages: 75-83
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823313
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823313
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:75-83
Template-Type: ReDIF-Article 1.0
Author-Name: Wai-Yuan Tan
Author-X-Name-First: Wai-Yuan
Author-X-Name-Last: Tan
Author-Name: Si Chin Tang
Author-X-Name-First: Si Chin
Author-X-Name-Last: Tang
Author-Name: Sho Rong Lee
Author-X-Name-First: Sho Rong
Author-X-Name-Last: Lee
Title: Estimation of HIV seroconversion and effects of age in the San Francisco homosexual population
Abstract:
Using San Francisco city clinic cohort data, we estimate the HIV
seroconversion distribution by both non-parametric and parametric methods,
and illustrate the effects of age on this distribution. The non-parametric
methods include the Turnbull method, the Bacchetti method, the
expectation, maximization and smoothing (EMS) method and the penalized
spline method. The seroconversion density curves estimated by these
non-parametric methods are bimodal in nature, with obvious effects of age.
As a result of the bimodal nature of the seroconversion curves, the
parametric models considered are mixtures of two distributions taken from
the generalized log-logistic distribution with three parameters, the
Weibull distribution and the log-normal distribution. In terms of the
logarithm of the likelihood values, it appears that the non-parametric
methods with smoothing as well as without smoothing (i.e. the Turnbull
method) provided much better fits than did the parametric models. Among
the non-parametric methods, the EMS and the spline estimates are more
appealing, because the unsmoothed Turnbull estimates are very unstable and
because the Bacchetti estimates have a longer tail. Among the parametric
models, the mixture of a generalized log-logistic distribution with three
parameters and a Weibull distribution or a log-normal distribution
provided better fits than did other mixtures of parametric models.
Journal: Journal of Applied Statistics
Pages: 85-102
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823322
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823322
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:85-102
Template-Type: ReDIF-Article 1.0
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Title: Chain sampling plan for variables inspection
Abstract:
This paper extends the concept of chain sampling to variables inspection
when the standard deviation of the normally distributed characteristic is
known. A discussion of the shape of the known sigma single-sampling
variables plan is given. The chain sampling plan for variables inspection
will be useful when testing is costly or destructive.
Journal: Journal of Applied Statistics
Pages: 103-109
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823331
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823331
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:103-109
Template-Type: ReDIF-Article 1.0
Author-Name: Hongzhu Qiao
Author-X-Name-First: Hongzhu
Author-X-Name-Last: Qiao
Author-Name: Chris Tsokos
Author-X-Name-First: Chris
Author-X-Name-Last: Tsokos
Title: Best efficient estimates of the intensity function of the power law process
Abstract:
We develop a general statistical procedure to obtain the best efficient
linear combination of existing estimators of the parameter of a
probability process. This procedure is used to obtain the best efficient
estimates of the shape parameter, the intensity failure function and its
reciprocal of the power law process. These estimates are important in the
study of reliability growth modelling. The effectiveness of our findings
is illustrated analytically and numerically, using real data and numerical
simulations.
Journal: Journal of Applied Statistics
Pages: 111-120
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823340
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823340
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:111-120
Template-Type: ReDIF-Article 1.0
Author-Name: M. Yahyah
Author-X-Name-First: M.
Author-X-Name-Last: Yahyah
Author-Name: A. Baines
Author-X-Name-First: A.
Author-X-Name-Last: Baines
Author-Name: D. N. Joanes
Author-X-Name-First: D. N.
Author-X-Name-Last: Joanes
Title: Graphical approach to model adequacy based on exact and near replicates
Abstract:
In this paper, we present an intuitive graphical approach to model
validity, which, although to some extent subjective, can be extremely
valuable for both presentation and interpretation purposes. In particular,
the idea behind such a procedure arises naturally through the generation
of a sequence of elements derived from the residuals about a fitted
graduating function, based on datum points that are identical or that are
relatively close together in a multi-dimensional factor space.
Journal: Journal of Applied Statistics
Pages: 121-129
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823359
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823359
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:121-129
Template-Type: ReDIF-Article 1.0
Author-Name: J. E. Toler
Author-X-Name-First: J. E.
Author-X-Name-Last: Toler
Author-Name: P. M. Burrows
Author-X-Name-First: P. M.
Author-X-Name-Last: Burrows
Title: Genotypic performance over environmental arrays: A non-linear grouping protocol
Abstract:
A non-linear model for examining genotypic responses across an array of
environments is contrasted with the 'joint regression' formulation, and a
rigorous approach to hypothesis testing using the conditional error
principle is demonstrated. The model is extended to cater for situations
where single straight-line response patterns fail to characterize
genotypic behaviors over an environmental array: a combination of two
straight lines, with slope β1 in below-average and β2 in above-average
environments, is offered as the simplest representation of convex and
concave patterns. A protocol for classifying genotypes according to the
results of hypothesis tests, i.e. H(β1 = β2) and H(β1 = β2 = 1), is
presented. A doubly desirable response pattern is convex (β1 < 1 < β2),
while a doubly undesirable pattern is concave (β1 > 1 > β2).
Journal: Journal of Applied Statistics
Pages: 131-143
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823368
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823368
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:131-143
Template-Type: ReDIF-Article 1.0
Author-Name: Myung Geun Kim
Author-X-Name-First: Myung Geun
Author-X-Name-Last: Kim
Title: Local influence on a test of linear hypothesis in multiple regression model
Abstract:
The local influence method is adapted to investigate the influence of
observations on testing the linear hypothesis. The method provides
information about individually or jointly influential observations in
performing the test, which the usual diagnostic methods do not yield. An
example is presented for illustration of the method.
Journal: Journal of Applied Statistics
Pages: 145-152
Issue: 1
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823377
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823377
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:1:p:145-152
Template-Type: ReDIF-Article 1.0
Author-Name: C. A. Glasbey
Author-X-Name-First: C. A.
Author-X-Name-Last: Glasbey
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Title: A review of image-warping methods
Abstract:
Image warping is a transformation which maps all positions in one image
plane to positions in a second plane. It arises in many image analysis
problems, whether in order to remove optical distortions introduced by a
camera or a particular viewing perspective, to register an image with a
map or template, or to align two or more images. The choice of warp is a
compromise between a smooth distortion and one which achieves a good
match. Smoothness can be ensured by assuming a parametric form for the
warp or by constraining it using differential equations. Matching can be
specified by points to be brought into alignment, by local measures of
correlation between images, or by the coincidence of edges. Parametric and
non-parametric approaches to warping, and matching criteria, are reviewed.
Journal: Journal of Applied Statistics
Pages: 155-171
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823151
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823151
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:155-171
Template-Type: ReDIF-Article 1.0
Author-Name: Edvin Bredrup
Author-X-Name-First: Edvin
Author-X-Name-Last: Bredrup
Author-Name: Li-Chun Zhang
Author-X-Name-First: Li-Chun
Author-X-Name-Last: Zhang
Title: Imperfectly shuffled decks in bridge
Abstract:
In this paper, we study the distribution tables of imperfectly shuffled
hands in a bridge game under a conditional Markov chain model, based on
which a simple approximate test on the randomness of the decks is derived.
The idea is to examine whether a stochastic process is a compound
hypergeometric process, through the number of its ties, and it is easily
adapted to similar situations.
Journal: Journal of Applied Statistics
Pages: 173-179
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823160
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823160
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:173-179
Template-Type: ReDIF-Article 1.0
Author-Name: James Koziol
Author-X-Name-First: James
Author-X-Name-Last: Koziol
Title: A non-parametric index of tracking
Abstract:
A two-sample version of the non-parametric index of tracking for
longitudinal data introduced by Foulkes and Davis is described. The index
is based on a multivariate U -statistic, and provides a measure of the
stochastic ordering of the underlying growth curves of the samples. The
utility of the U -statistic approach is explored with two applications
related to growth curves and repeated measures analyses.
Journal: Journal of Applied Statistics
Pages: 181-191
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823179
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823179
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:181-191
Template-Type: ReDIF-Article 1.0
Author-Name: James Taylor
Author-X-Name-First: James
Author-X-Name-Last: Taylor
Author-Name: Derek Bunn
Author-X-Name-First: Derek
Author-X-Name-Last: Bunn
Title: Combining forecast quantiles using quantile regression: Investigating the derived weights, estimator bias and imposing constraints
Abstract:
A novel proposal for combining forecast distributions is to use quantile
regression to combine quantile estimates. We consider the usefulness of
the resultant linear combining weights. If the quantile estimates are
unbiased, then there is strong intuitive appeal for omitting the constant
and constraining the weights to sum to unity in the quantile regression.
However, we show that suppressing the constant renders one of the main
attractive features of quantile regression invalid. We establish necessary
and sufficient conditions for unbiasedness of a quantile estimate, and
show that a combination with zero constant and weights that sum to unity
is not necessarily unbiased.
Journal: Journal of Applied Statistics
Pages: 193-206
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823188
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823188
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:193-206
Template-Type: ReDIF-Article 1.0
Author-Name: A. B. M. Zohrul Kabir
Author-X-Name-First: A. B. M. Zohrul
Author-X-Name-Last: Kabir
Title: Estimation of Weibull distribution parameters for irregular interval group failure data with unknown failure times
Abstract:
This paper presents three methods for estimating Weibull distribution
parameters for the case of irregular interval group failure data with
unknown failure times. The methods are based on the concepts of the
piecewise linear distribution function (PLDF), an average interval failure
rate (AIFR) and sequential updating of the distribution function (SUDF),
and use an analytical approach similar to that of Ackoff and Sasieni for
regular interval group data. Results from a large number of simulated case
problems generated with specified values of Weibull distribution
parameters have been presented, which clearly indicate that the SUDF
method produces near-perfect parameter estimates for all types of failure
pattern. The performances of the PLDF and AIFR methods have been evaluated
by goodness-of-fit testing and statistical confidence limits on the shape
parameter. It has been found that, while the PLDF method produces
acceptable parameter estimates, the AIFR method may fail for low and high
shape parameter values that represent the cases of random and wear-out
types of failure. A real-life application of the proposed methods is also
presented, by analyzing failures of hydrogen make-up compressor valves in
a petroleum refinery.
Journal: Journal of Applied Statistics
Pages: 207-219
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823197
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823197
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:207-219
Template-Type: ReDIF-Article 1.0
Author-Name: Mukhtar Ali
Author-X-Name-First: Mukhtar
Author-X-Name-Last: Ali
Title: Probability models on horse-race outcomes
Abstract:
A number of models have been examined for modelling probability based on
rankings. Most prominent among these are the gamma and normal probability
models. The accuracy of these models in predicting the outcomes of horse
races is investigated in this paper. The parameters of these models are
estimated by the maximum likelihood method, using the information on win
pool fractions. These models are used to estimate the probabilities that
race entrants finish second or third in a race. These probabilities are
then compared with the corresponding objective probabilities estimated
from actual race outcomes. The data are obtained from over 15 000 races.
It is found that all the models tend to overestimate the probability of a
horse finishing second or third when the horse has a high probability of
such a result, but underestimate the probability of a horse finishing
second or third when this probability is low.
Journal: Journal of Applied Statistics
Pages: 221-229
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823205
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823205
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:221-229
Template-Type: ReDIF-Article 1.0
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Title: Fisher's repeated normal integral function and shape distributions
Abstract:
Fisher has presented various applications of the repeated integral of the
normal distribution function. A new application has appeared in shape
distributions. This work has led to the search for an explicit expression
for the repeated normal integral function in place of an infinite series
expansion for this function first given by Fisher. We provide an explicit
expression for this function with a finite number of terms.
Journal: Journal of Applied Statistics
Pages: 231-235
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823214
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823214
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:231-235
Template-Type: ReDIF-Article 1.0
Author-Name: Dalton Andrade
Author-X-Name-First: Dalton
Author-X-Name-Last: Andrade
Author-Name: Julio Singer
Author-X-Name-First: Julio
Author-X-Name-Last: Singer
Title: Profile analysis for randomized complete block experiments
Abstract:
We consider the use of standard univariate and multivariate methods for
profile analysis of randomized complete block experiments. Although the
analysis for the case where the block-by-time interaction is included in the
model parallels that used for factorial experiments, situations where such
interaction is not present may not be handled in the same way. We identify
hypotheses for which the standard analysis may be applied, as well as
those for which some adaptation is required. We also indicate how to
implement such analyses via existing computer software.
Journal: Journal of Applied Statistics
Pages: 237-244
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823223
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823223
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:237-244
Template-Type: ReDIF-Article 1.0
Author-Name: Jin Zhang
Author-X-Name-First: Jin
Author-X-Name-Last: Zhang
Title: Tests for multiple upper or lower outliers in an exponential sample
Abstract:
Tk = [x(n-k+1) + … + x(n)]/Σxi (T*k = [x(1) + … + x(k)]/Σxi) is the
maximum likelihood ratio test statistic for k upper (lower) outliers in
an exponential sample x1, …, xn. The null distributions of Tk for
k = 1, 2 were given by Fisher and by Kimber and Stevens, while those of
T*k (k = 1, 2) were given by Lewis and Fieller. In this paper, the simple
null distributions of Tk and T*k are found for all possible values of k,
and percentage points are tabulated for k = 1, 2, …, 8. In addition, we
find a way of determining k, which can reduce the masking or 'swamping'
effects.
Journal: Journal of Applied Statistics
Pages: 245-255
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823232
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823232
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:245-255
Template-Type: ReDIF-Article 1.0
Author-Name: Jin Zhang
Author-X-Name-First: Jin
Author-X-Name-Last: Zhang
Author-Name: Xueren Wang
Author-X-Name-First: Xueren
Author-X-Name-Last: Wang
Title: Unmasking test for multiple upper or lower outliers in normal samples
Abstract:
The discordancy test for multiple outliers is complicated by problems of
masking and swamping. The key to the settlement of the question lies in
the determination of k , i.e. the number of 'contaminants' in a sample.
Great efforts have been made to solve this problem in recent years, but no
effective method has been developed. In this paper, we present two ways of
determining k , free from the effects of masking and swamping, when
testing upper (lower) outliers in normal samples. Examples are given to
illustrate the methods.
Journal: Journal of Applied Statistics
Pages: 257-261
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823241
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823241
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:257-261
Template-Type: ReDIF-Article 1.0
Author-Name: D. R. Anderson
Author-X-Name-First: D. R.
Author-X-Name-Last: Anderson
Author-Name: K. P. Burnham
Author-X-Name-First: K. P.
Author-X-Name-Last: Burnham
Author-Name: G. C. White
Author-X-Name-First: G. C.
Author-X-Name-Last: White
Title: Comparison of Akaike information criterion and consistent Akaike information criterion for model selection and statistical inference from capture-recapture studies
Abstract:
We compare properties of parameter estimators under Akaike information
criterion (AIC) and 'consistent' AIC (CAIC) model selection in a nested
sequence of open population capture-recapture models. These models consist
of product multinomials, where the cell probabilities are parameterized in
terms of survival (φi) and capture (pi) probabilities for each time
interval i. The sequence of models is derived from 'treatment' effects
that might be (1) absent, model H0; (2) only acute, model H2p; or (3)
acute and chronic, lasting several time intervals, model H3. Using a 3^5
factorial design, 1000 repetitions were simulated for each of 243 cases.
The true number of parameters ranged from 7 to 42, and the sample size
ranged from approximately 470 to 55 000 per case. We focus on the quality
of the inference about the model parameters and model structure that
results from the two selection criteria. We use achieved confidence
interval coverage as an integrating metric to judge what constitutes a
'properly parsimonious' model, and contrast the performance of these two
model selection criteria for a wide range of models, sample sizes,
parameter values and study interval lengths. AIC selection resulted in
models in which the parameters were estimated with relatively little bias.
However, these models exhibited asymptotic sampling variances that were
somewhat too small, and achieved confidence interval coverage that was
somewhat below the nominal level. In contrast, CAIC-selected models were
too simple, the parameter estimators were often substantially biased, the
asymptotic sampling variances were substantially too small and the
achieved coverage was often substantially below the nominal level. An
example case illustrates a pattern: with 20 capture occasions, 300
previously unmarked animals are released at each occasion, and the
survival and capture probabilities in the control group on each occasion
were 0.9 and 0.8 respectively using model H3. There was a strong acute
treatment effect on the first survival (φ1) and first capture probability
(p2), and smaller, chronic effects on the second and third survival
probabilities (φ2 and φ3), as well as on the second capture probability
(p3); the sample size for each repetition was approximately 55 000. CAIC
selection led to a model with exactly these effects in only nine of the
1000 repetitions, compared with 467 times under AIC selection. Under CAIC
selection, even the two acute effects were detected only 555 times,
compared with 998 for AIC selection. AIC selection exhibited a balance
between underfitted and overfitted models (270 versus 263), while CAIC
tended strongly to select underfitted models. CAIC-selected models were
overly parsimonious and poor as a basis for statistical inferences about
important model parameters or structure. We recommend the use of the AIC
and not the CAIC for analysis and inference from capture-recapture data
sets.
Journal: Journal of Applied Statistics
Pages: 263-282
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823250
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823250
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:263-282
Template-Type: ReDIF-Article 1.0
Author-Name: Victor Guerrero
Author-X-Name-First: Victor
Author-X-Name-Last: Guerrero
Author-Name: Edmundo Berumen
Author-X-Name-First: Edmundo
Author-X-Name-Last: Berumen
Title: Forecasting electricity consumption with extra-model information provided by consumers
Abstract:
Univariate time series models make efficient use of available historical
records of electricity consumption for short-term forecasting. However,
the information (expectations) provided by electricity consumers in an
energy-saving survey, even though qualitative, was considered to be
particularly important, because the consumers' perception of the future
may take into account the changing economic conditions. Our approach to
forecasting electricity consumption combines historical data with
expectations of the consumers in an optimal manner, using the technique of
restricted forecasts. The same technique can be applied in some other
forecasting situations in which additional information, besides the
historical record of a variable, is available in the form of expectations.
Journal: Journal of Applied Statistics
Pages: 283-299
Issue: 2
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823269
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823269
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:2:p:283-299
Template-Type: ReDIF-Article 1.0
Author-Name: Hans-Joachim Mittag
Author-X-Name-First: Hans-Joachim
Author-X-Name-Last: Mittag
Author-Name: Dietmar Stemann
Author-X-Name-First: Dietmar
Author-X-Name-Last: Stemann
Title: Gauge imprecision effect on the performance of the X-S control chart
Abstract:
This paper examines the effect of stochastic measurement error (gauge
imprecision) on the performance of Shewhart-type X-S control charts. It is
shown that gauge imprecision may seriously affect the ability of the chart
to detect process disturbances quickly or, depending on the point in time
when the error occurs, the probability of erroneously signalling an
out-of-control process state.
Journal: Journal of Applied Statistics
Pages: 307-317
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823043
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823043
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:307-317
Template-Type: ReDIF-Article 1.0
Author-Name: Duolao Wang
Author-X-Name-First: Duolao
Author-X-Name-Last: Wang
Author-Name: Mike Murphy
Author-X-Name-First: Mike
Author-X-Name-Last: Murphy
Title: Use of a mixture model for the analysis of contraceptive-use duration among long-term users
Abstract:
This paper introduces a mixture model that combines proportional hazards
regression with logistic regression for the analysis of survival data, and
describes its parameter estimation via an expectation maximization
algorithm. The mixture model is then applied to analyze the determinants
of the timing of intrauterine device (IUD) discontinuation and long-term
IUD use, utilizing 14 639 instances of IUD use by Chinese women. The
results show that socio-economic and demographic characteristics of women
have different influences on the acceleration or deceleration of the
timing of stopping IUD use and on the likelihood of long-term IUD use.
Journal: Journal of Applied Statistics
Pages: 319-332
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823052
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:319-332
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Sherman
Author-X-Name-First: Michael
Author-X-Name-Last: Sherman
Author-Name: F. Michael Speed
Author-X-Name-First: F. Michael
Author-X-Name-Last: Speed
Title: Analysis of tidal data via the blockwise bootstrap
Abstract:
We analyze tidal data from Port Mansfield, TX, using Kunsch's blockwise
bootstrap in the regression setting. In particular, we estimate the
variability of parameter estimates in a harmonic analysis via block
subsampling of residuals from a least-squares fit. We see that naive
least-squares variance estimates can be either too large or too small,
depending on the strength of correlation and the design matrix. We argue
that the block bootstrap is a simple, omnibus method of accounting for
correlation in a regression model with correlated errors.
Journal: Journal of Applied Statistics
Pages: 333-340
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823061
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823061
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:333-340
Template-Type: ReDIF-Article 1.0
Author-Name: R. Vijayaraghavan
Author-X-Name-First: R.
Author-X-Name-Last: Vijayaraghavan
Author-Name: V. Soundararajan
Author-X-Name-First: V.
Author-X-Name-Last: Soundararajan
Title: Design and evaluation of skip-lot sampling inspection plans with double-sampling plan as the reference plan
Abstract:
This paper presents a design for skip-lot sampling inspection plans with
the double-sampling plan as the reference plan, so as to reduce the sample
size and produce more efficient plans in return for the same sampling
effort. The efficiency of the proposed plan compared with that of the
conventional double-sampling plan is also discussed. The need for smaller
acceptance numbers under the plan is highlighted. Methods of selecting the
plan indexed by the acceptable quality level and limiting quality level,
and by the acceptable quality level and average outgoing quality level are
also presented.
Journal: Journal of Applied Statistics
Pages: 341-348
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823070
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823070
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:341-348
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Author-Name: S. C. Bagui
Author-X-Name-First: S. C.
Author-X-Name-Last: Bagui
Title: Identification of confounded design and its interactions
Abstract:
Kane has discussed a simple method for identifying the confounded
interactions in 2^n factorial experiments when a replication consists of
(1) two blocks and (2) more than two blocks. It should be noted that
Kane's method holds only for (1) regular designs and (2) when one
interaction is confounded. In the present investigation, we propose a new
way of identifying the confounded designs and the confounded interactions
in 2^n factorial experiments. Furthermore, the same method is extended to
3^n and s^n factorial experiments.
Journal: Journal of Applied Statistics
Pages: 349-356
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823089
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823089
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:349-356
Template-Type: ReDIF-Article 1.0
Author-Name: M. M. Shoukri
Author-X-Name-First: M. M.
Author-X-Name-Last: Shoukri
Author-Name: M. Attanasio
Author-X-Name-First: M.
Author-X-Name-Last: Attanasio
Author-Name: J. M. Sargeant
Author-X-Name-First: J. M.
Author-X-Name-Last: Sargeant
Title: Parametric versus semi-parametric models for the analysis of correlated survival data: A case study in veterinary epidemiology
Abstract:
Correlated survival data arise frequently in biomedical and epidemiologic
research, because each patient may experience multiple events or because
there exists clustering of patients or subjects, such that failure times
within the cluster are correlated. In this paper, we investigate the
appropriateness of the semi-parametric Cox regression and of the
generalized estimating equations as models for clustered failure time data
that arise from an epidemiologic study in veterinary medicine. The
semi-parametric approach is compared with a proposed fully parametric
frailty model. The frailty component is assumed to follow a gamma
distribution. Estimates of the fixed covariate effects were obtained by
maximizing the likelihood function, while an estimate of the variance
component (frailty parameter) was obtained from a profile likelihood
construction.
Journal: Journal of Applied Statistics
Pages: 357-374
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823098
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823098
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:357-374
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Todd Edwards
Author-X-Name-First: Christopher Todd
Author-X-Name-Last: Edwards
Title: Non-parametric procedure for knockout tournaments
Abstract:
In a seeded knockout tournament, where teams have some preassigned
strength, do we have any assurances that the best team in fact has won? Is
there some insight to be gained by considering which teams beat which
other teams, solely by examining the seeds? We pose an answer to these
questions by using the difference in the seeds of the two players as the
basis for a test statistic. We offer several models for the underlying
probability structure to examine the null distribution and power functions
and determine these for small tournaments (less than five teams). One
structure each for 8 teams and 16 teams is examined, and we conjecture an
asymptotic normal distribution for the test statistic.
Journal: Journal of Applied Statistics
Pages: 375-385
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823106
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823106
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:375-385
Template-Type: ReDIF-Article 1.0
Author-Name: Sadao Tomizawa
Author-X-Name-First: Sadao
Author-X-Name-Last: Tomizawa
Author-Name: Takashi Seo
Author-X-Name-First: Takashi
Author-X-Name-Last: Seo
Author-Name: Hideharu Yamamoto
Author-X-Name-First: Hideharu
Author-X-Name-Last: Yamamoto
Title: Power-divergence-type measure of departure from symmetry for square contingency tables that have nominal categories
Abstract:
For square contingency tables that have nominal categories, Tomizawa
considered two kinds of measure to represent the degree of departure from
symmetry. This paper proposes a generalization of those measures. The
proposed measure is expressed by using the average of the power divergence
of Cressie and Read, or the average of the diversity index of Patil and
Taillie. Special cases of the proposed measure include Tomizawa's
measures. The proposed measure would be useful for comparing the degree of
departure from symmetry in several tables.
Journal: Journal of Applied Statistics
Pages: 387-398
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823115
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:387-398
Template-Type: ReDIF-Article 1.0
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Author-Name: S. Ganesalingam
Author-X-Name-First: S.
Author-X-Name-Last: Ganesalingam
Title: Zero acceptance number quick switching system for compliance sampling
Abstract:
The zero acceptance number plan is invariably used for compliance
sampling and safety inspection of products. The disadvantage of such a
plan is that its discriminating power between good and bad lots is poor.
This paper presents a quick switching system that has zero acceptance
numbers, with a provision for the resubmission of lots not accepted during
normal inspection. The proposed system is found to require a smaller
average sample size, and possesses greater discriminating power.
Journal: Journal of Applied Statistics
Pages: 399-407
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823124
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823124
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:399-407
Template-Type: ReDIF-Article 1.0
Author-Name: F. Javier Trivez
Author-X-Name-First: F. Javier
Author-X-Name-Last: Trivez
Author-Name: Javier Nievas
Author-X-Name-First: Javier
Author-X-Name-Last: Nievas
Title: Analyzing the effects of level shifts and temporary changes on the identification of ARIMA models
Abstract:
The presence of outliers in time series gives rise to important effects
on the sample autocorrelation coefficients. In the case where these
outliers are not adequately treated, their presence causes errors in the
identification of the stochastic process generator of the time series
under study. In this respect, Chan has demonstrated that, independent of
the underlying process of the outlier-free series, a level shift (LS) at
the limit (i.e. asymptotically and considering an LS of a sufficiently
large size) will lead to the identification of non-stationary processes;
with respect to a temporary change (TC), this will lead, again at the
limit, to the identification of an AR(1) autoregressive process with a
coefficient equal to the dampening factor that defines this TC. The
objective of this paper is to analyze, by way of a simulation exercise,
how large the LS and TC present in the time series must be for the
limiting result to be relevant, in the sense of seriously affecting the
instruments used at the identification stage of the ARIMA models, i.e. the
sample autocorrelation function and the sample partial autocorrelation
function.
Journal: Journal of Applied Statistics
Pages: 409-424
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823133
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823133
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:409-424
Template-Type: ReDIF-Article 1.0
Author-Name: Stan Li
Author-X-Name-First: Stan
Author-X-Name-Last: Li
Title: Bayesian object matching
Abstract:
A Bayesian approach to object matching is presented. An object and a
scene are each represented by features, such as critical points, line
segments and surface patches, constrained by unary properties and
contextual relations. The matching is presented as a labeling problem,
where each feature in the scene is assigned (associated with) a feature of
the known model objects. The prior distribution of a scene's labeling is
modeled as a Markov random field, which encodes the between-object
constraints. The conditional distribution of the observed features labeled
is assumed to be Gaussian, which encodes the within-object constraints. An
optimal solution is defined as a maximum a posteriori estimate.
Relationships with previous work are discussed. Experimental results are
shown.
Journal: Journal of Applied Statistics
Pages: 425-443
Issue: 3
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823142
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823142
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:3:p:425-443
Template-Type: ReDIF-Article 1.0
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Title: Tightened single-level continuous sampling plan
Abstract:
In this paper, a new tightening concept has been incorporated into the
single-level continuous sampling plan CSP-1, such that quality degradation
will warrant sampling inspection to cease beyond a certain number of
sampled items, until new evidence of good quality is established. The
expressions of the performance measures for this new plan, such as the
operating characteristic, average outgoing quality and average fraction
inspected, are derived using a Markov chain model. The advantage of the
tightened CSP-1 plan is that it is possible to lower the average outgoing
quality limit.
Journal: Journal of Applied Statistics
Pages: 451-461
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822945
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822945
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:451-461
Template-Type: ReDIF-Article 1.0
Author-Name: Fritz Efaw
Author-X-Name-First: Fritz
Author-X-Name-Last: Efaw
Title: Test of alternative strike settlement models
Abstract:
This paper extends our understanding of what determines the length of
strikes, by comparing two alternative models of the strike settlement
process, while simultaneously allowing for variation in this process as a
result of economic conditions and unobserved heterogeneity. Some
inconclusive support is found for the view that settlement proceeds by one
or both sides presenting terms for acceptance or rejection, rather than by
both sides yielding ground toward an intermediate position. Strike
duration is found to be shortest at peaks of business cycles, and
settlement is not duration dependent.
Journal: Journal of Applied Statistics
Pages: 463-474
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822954
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:463-474
Template-Type: ReDIF-Article 1.0
Author-Name: Anita Ghatak
Author-X-Name-First: Anita
Author-X-Name-Last: Ghatak
Title: Aggregate consumption functions for India: A cointegration analysis under structural changes, 1919-86
Abstract:
This paper extends cointegration methodology to include the effect of
possible structural changes on aggregate consumption behaviour in India
during 1919-86. The only cointegrated relation is found to be a dynamic
linear regression of lag order two, with 1944 as the year in which
structural change began. The estimated short-run marginal propensity to
consume (MPC) is greater than the long-run MPC. The estimates of the MPC
are different from previous estimates for the Indian economy based on
conventional econometrics. The initial year of structural change has been
selected by extending the method of Perron and that of Zivot and Andrews.
Journal: Journal of Applied Statistics
Pages: 475-488
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822963
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822963
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:475-488
Template-Type: ReDIF-Article 1.0
Author-Name: Gary Koop
Author-X-Name-First: Gary
Author-X-Name-Last: Koop
Title: Carbon dioxide emissions and economic growth: A structural approach
Abstract:
This paper uses data for 44 countries from 1970 to 1990 to investigate
the relationship between economic growth and carbon dioxide emissions.
Empirical results are obtained from a structural model from the empirical
growth literature modified to include environmental 'bads'. Results
suggest that richer countries exhibit technical progress in a way that
economizes on carbon dioxide emissions but that poorer countries do not.
Furthermore, there is no indication that the growth process is leading
poorer countries to move towards the adoption of the same
pollution-ameliorating technology as characterizes richer countries.
Journal: Journal of Applied Statistics
Pages: 489-515
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822972
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822972
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:489-515
Template-Type: ReDIF-Article 1.0
Author-Name: Alexei Dmitrienko
Author-X-Name-First: Alexei
Author-X-Name-Last: Dmitrienko
Author-Name: Z. Govindarajulu
Author-X-Name-First: Z.
Author-X-Name-Last: Govindarajulu
Title: The 'demon' problem of Youden: Exponential case
Abstract:
We consider the problem of finding the probability of a sample mean
falling above the (n - k)th-order statistic in a random sample of size n.
Explicit expressions are obtained for the exponential distribution. Some
applications that pertain to testing for outliers and goodness of fit are
given.
Journal: Journal of Applied Statistics
Pages: 517-523
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822981
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822981
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:517-523
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Title: Direct estimation of the percentile 'p-value' for the one-sample median test
Abstract:
In this paper we outline and illustrate an easy-to-use inference
procedure for directly calculating the approximate bootstrap
percentile-type p-value for the one-sample median test, i.e. we calculate
the bootstrap p-value without resampling, by using a fractional-order-
statistics-based approach. The method parallels earlier work on
fractional-order-statistics-based non-parametric bootstrap percentile-type
confidence intervals for quantiles. Monte Carlo simulation studies are
performed, which illustrate that the fractional-order-statistics-based
approach to the one-sample median test has accurate type I error control
for small samples over a wide range of distributions; is easy to
calculate; and is preferable to the sign test in terms of type I error
control and power. Furthermore, the fractional-order-statistics-based
median test is easily generalized to testing that any quantile has some
hypothesized value; for example, tests for the upper or lower quartile may
be performed using the same framework.
Journal: Journal of Applied Statistics
Pages: 525-533
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822990
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822990
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:525-533
Template-Type: ReDIF-Article 1.0
Author-Name: C. D. Lai
Author-X-Name-First: C. D.
Author-X-Name-Last: Lai
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Author-Name: M. Xie
Author-X-Name-First: M.
Author-X-Name-Last: Xie
Title: Effects of correlation on fraction non-conforming statistical process control procedures
Abstract:
High-yield production processes that involve a low fraction
non-conforming are becoming more common, and the limitations of the
standard control charting procedures for such processes are well known.
This paper examines the control procedures based on the conforming unit
run lengths applied to near-zero-defect processes in the presence of
serial correlation. Using a correlated binomial model, a few control
schemes are investigated and control limits are derived. The results
reduce to the traditional case when the measurements are independent.
However, it is shown that the false alarm rate cannot be reduced to below
the amount of serial correlation present in the process.
Journal: Journal of Applied Statistics
Pages: 535-543
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823007
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823007
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:535-543
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Veevers
Author-X-Name-First: Alan
Author-X-Name-Last: Veevers
Title: Viability and capability indexes for multiresponse processes
Abstract:
The viability index Vr is introduced as an intuitively appealing measure
of the capability potential of a process. It is related to the well-known
index Cp but has some advantages over it. The statistical properties of Vr
are readily obtainable and, unlike Cp, it extends naturally to
multi-response processes. The multivariate viability index Vrn is defined,
discussed and illustrated using an example from the minerals sector.
Journal: Journal of Applied Statistics
Pages: 545-558
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823016
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823016
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:545-558
Template-Type: ReDIF-Article 1.0
Author-Name: Nien Fan Zhang
Author-X-Name-First: Nien Fan
Author-X-Name-Last: Zhang
Title: Estimating process capability indexes for autocorrelated data
Abstract:
Process capability indexes are widely used in the manufacturing
industries and by supplier companies in process assessments and in the
evaluation of purchasing decisions. One concern about using the process
capability indexes is the assumption of the mutual independence of the
process data, because, in process industries, process data are often
autocorrelated. This paper discusses the use of the process capability
indexes Cp and Cpk when the process data are autocorrelated. Interval
estimation procedures for Cp and Cpk are proposed and their properties are
studied.
Journal: Journal of Applied Statistics
Pages: 559-574
Issue: 4
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769823025
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769823025
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:4:p:559-574
Template-Type: ReDIF-Article 1.0
Author-Name: Anita Ghatak
Author-X-Name-First: Anita
Author-X-Name-Last: Ghatak
Title: Vector autoregression modelling and forecasting growth of South Korea
Abstract:
In this paper, we have estimated vector
autoregression (VAR), Bayesian vector autoregression (BVAR) and vector
error-correction models (VECMs) using annual time-series data of South
Korea for 1950-94. We find evidence supporting the view that growth of
real per-capita income has been aided by income, investment and export
growth, as well as government spending and exchange rate policies. The
VECMs provide better forecasts of growth than do the VAR and BVAR models
for both short-term and long-term predictions.
Journal: Journal of Applied Statistics
Pages: 579-592
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822837
File-URL: http://hdl.handle.net/10.1080/02664769822837
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:579-592
Template-Type: ReDIF-Article 1.0
Author-Name: Larisa Matejic
Author-X-Name-First: Larisa
Author-X-Name-Last: Matejic
Title: Testing for brain anomalies: A hippocampus study
Abstract:
A mathematical classification method is
presented to show how numerical tests for abnormal anatomical shape change
can be used to study geometrical shape changes of the hippocampus in
relation to the occurrence of schizophrenia. The method uses the
well-known best Bayesian decision rule for two simple hypotheses.
Furthermore, the technique is illustrated by applying the hypothesis
testing method to some preliminary hippocampal data. The data pool
available for the experiment consisted of 10 subjects, five of whom were
diagnosed with schizophrenia and five of whom were not.
Even though the information used in the experiment is limited and the
number of subjects is relatively small, we are confident that the
mathematical classification method presented is of significance and can be
used successfully, given proper data, as a diagnostic tool.
Journal: Journal of Applied Statistics
Pages: 593-600
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822846
File-URL: http://hdl.handle.net/10.1080/02664769822846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:593-600
Template-Type: ReDIF-Article 1.0
Author-Name: M. Hosseini
Author-X-Name-First: M.
Author-X-Name-Last: Hosseini
Author-Name: R. G. Carpenter
Author-X-Name-First: R. G.
Author-X-Name-Last: Carpenter
Author-Name: K. Mohammad
Author-X-Name-First: K.
Author-X-Name-Last: Mohammad
Title: Identification of outlying height and weight data in the Iranian National Health Survey 1990-92
Abstract:
Data on the weights and heights of
children 2-18 years old in Iran were obtained in a National Health Survey
of 10 660 families in 1990-92. Data were 'cleaned' in 1 year age groups.
After excluding gross outliers by inspection of bivariate scatter plots,
Box-Cox power transformations were used to normalize the distributions of
height and weight. If a multivariate Box-Cox power transformation to
normality exists, then it is equivalent to normalizing the data variable
by variable. After excluding gross outliers, exclusions based on the
Mahalanobis distance were almost identical to those identified by Hadi's
iterative procedure, because the percentages of outliers were small. In
all, 1% of the observations were gross outliers and a further 0.4% were
identified by multivariate analysis. Review of records showed that the
outliers identified by multivariate analysis resulted from data-processing
errors. After transformation and 'cleaning', the data quality was
excellent and suitable for the construction of growth charts.
Journal: Journal of Applied Statistics
Pages: 601-612
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822855
File-URL: http://hdl.handle.net/10.1080/02664769822855
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:601-612
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Prescott
Author-X-Name-First: Philip
Author-X-Name-Last: Prescott
Author-Name: Norman R. Draper
Author-X-Name-First: Norman R.
Author-X-Name-Last: Draper
Title: Mixture designs for constrained components in orthogonal blocks
Abstract:
It is often the case in mixture
experiments that some of the ingredients, such as additives or
flavourings, are included with proportions constrained to lie in a
restricted interval, while the majority of the mixture is made up of a
particular ingredient used as a filler. The experimental region in such
cases is restricted to a parallelepiped in or near one corner of the full
simplex region. In this paper, orthogonally blocked designs with two
experimental blends on each edge of the constrained region are considered
for mixture experiments with three and four ingredients. The optimal
symmetric orthogonally blocked designs within this class are determined
and it is shown that even better designs are obtained for the asymmetric
situation, in which some experimental blends are taken at the vertices of
the experimental region. Some examples are given to show how these ideas
may be extended to identify good designs in three and four blocks.
Finally, an example is included to illustrate how to overcome the problems
of collinearity that sometimes occur when fitting quadratic models to
experimental data from mixture experiments in which some of the ingredient
proportions are restricted to small values.
Journal: Journal of Applied Statistics
Pages: 613-638
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822864
File-URL: http://hdl.handle.net/10.1080/02664769822864
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:613-638
Template-Type: ReDIF-Article 1.0
Author-Name: James Carpenter
Author-X-Name-First: James
Author-X-Name-Last: Carpenter
Title: Assessing parameter uncertainty via bootstrap likelihood ratio confidence regions
Abstract:
In this paper, we show that, under
certain regularity conditions, constructing likelihood ratio confidence
regions using a bootstrap estimate of the distribution of the likelihood
ratio statistic, instead of the usual chi-squared approximation, leads to
regions which have a coverage error of O(n^-2), which is the same as that
achieved using a Bartlett-corrected likelihood ratio statistic. We use the
bootstrap method to assess the uncertainty associated with dose-response
parameters that arise in models for the Japanese atomic bomb survivors data.
Journal: Journal of Applied Statistics
Pages: 639-649
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822873
File-URL: http://hdl.handle.net/10.1080/02664769822873
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:639-649
Template-Type: ReDIF-Article 1.0
Author-Name: James F. Reed
Author-X-Name-First: James F.
Author-X-Name-Last: Reed
Title: Contributions to adaptive estimation
Abstract:
There are many statistics which can be
used to characterize data sets and provide valuable information regarding
the data distribution, even for large samples. Traditional measures, such
as skewness and kurtosis, mentioned in introductory statistics courses,
are rarely applied. A variety of other measures of tail length, skewness
and tail weight have been proposed, which can be used to describe the
underlying population distribution. Adaptive statistical procedures change
the estimator of location, depending on sample characteristics. The
success of these estimators depends on correctly classifying the
underlying distribution model. Advocates of adaptive distribution testing
propose to proceed by assuming (1) that an appropriate model, say Omega,
belongs to the family {Omega_1, Omega_2, ..., Omega_k}, and (2) that the
model selection process is statistically
independent of the hypothesis testing. We review the development of
adaptive linear estimators and adaptive maximum-likelihood estimators.
Journal: Journal of Applied Statistics
Pages: 651-669
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822882
File-URL: http://hdl.handle.net/10.1080/02664769822882
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:651-669
Template-Type: ReDIF-Article 1.0
Author-Name: M. J. Baxter
Author-X-Name-First: M. J.
Author-X-Name-Last: Baxter
Author-Name: N. H. Gale
Author-X-Name-First: N. H.
Author-X-Name-Last: Gale
Title: Testing for multivariate normality via univariate tests: A case study using lead isotope ratio data
Abstract:
Samples from ore bodies, mined for
copper in antiquity, can be characterized by measurements on three lead
isotope ratios. Given sufficient samples, it is possible to estimate the
lead isotope field-a three-dimensional construct-that characterizes the
ore body. For the purposes of estimating the extent of a field, or
assessing whether bronze artefacts could have been made using copper from
a particular field, it is often assumed that fields have a trivariate
normal distribution. Using recently published data, for which the sample
sizes are larger than usual, this paper casts doubt on this assumption. A
variety of tests of univariate normality are applied, both to the original
lead isotope ratios and to transformations of them based on principal
component analysis; the paper can be read as a case study in the use of
tests of univariate normality for assessing multivariate normality. This
is not an optimal approach, but is sufficient in the cases considered to
suggest that fields are, in fact, 'non-normal'. A direct test of
multivariate normality confirms this. Some implications for the use of
lead isotope ratio data in archaeology are discussed.
Journal: Journal of Applied Statistics
Pages: 671-683
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822891
File-URL: http://hdl.handle.net/10.1080/02664769822891
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:671-683
Template-Type: ReDIF-Article 1.0
Author-Name: Christine A. Ribic
Author-X-Name-First: Christine A.
Author-X-Name-Last: Ribic
Author-Name: Thomas W. Miller
Author-X-Name-First: Thomas W.
Author-X-Name-Last: Miller
Title: Evaluation of alternative model selection criteria in the analysis of unimodal response curves using CART
Abstract:
We investigated CART performance with a
unimodal response curve for one continuous response and four continuous
explanatory variables, where two variables were important (i.e. directly
related to the response) and the other two were not. We explored
performance under three relationship strengths and two explanatory
variable conditions: equal importance and one variable four times as
important as the other. We compared CART variable selection performance
using three tree-selection rules ('minimum risk', 'minimum risk
complexity', 'one standard error') to stepwise polynomial ordinary least
squares (OLS) under four sample size conditions. The one-standard-error
and minimum-risk-complexity methods performed about as well as stepwise
OLS with large sample sizes when the relationship was strong. With weaker
relationships, equally important explanatory variables and larger sample
sizes, the one-standard-error and minimum-risk-complexity rules performed
better than stepwise OLS. With weaker relationships and explanatory
variables of unequal importance, tree-structured methods did not perform
as well as stepwise OLS. Comparing performance within tree-structured
methods, with a strong relationship and equally important explanatory
variables, the one-standard-error rule was more likely to choose the
correct model than were the other tree-selection rules. The
minimum-risk-complexity rule was more likely to choose the correct model
than were the other tree-selection rules (1) with weaker relationships and
equally important explanatory variables; and (2) under all relationship
strengths when explanatory variables were of unequal importance and sample
sizes were lower.
Journal: Journal of Applied Statistics
Pages: 685-698
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822909
File-URL: http://hdl.handle.net/10.1080/02664769822909
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:685-698
Template-Type: ReDIF-Article 1.0
Author-Name: Christine Gunter
Author-X-Name-First: Christine
Author-X-Name-Last: Gunter
Author-Name: Colin Rallings
Author-X-Name-First: Colin
Author-X-Name-Last: Rallings
Author-Name: Michael Thrasher
Author-X-Name-First: Michael
Author-X-Name-Last: Thrasher
Title: Calculating the total vote where the district magnitude is greater than one: A test of some algorithms using British local election data
Abstract:
Electoral analysis using aggregate data
relies on the availability of accurate voting statistics. One vital piece
of information, often missing from official electoral returns,
particularly British local government elections, is the total number of
valid ballot papers. This figure is essential for the calculation of
electoral turnout. When voters have a single vote and official information
about the number of ballot papers issued is missing, a figure for the
total vote can still be derived. However, local elections in Britain
frequently use a system of multiple-member wards, where voters have as
many votes as there are seats to be filled. In such cases, calculating the
total vote and, hence, the turnout does present a real problem. It cannot
be assumed that all voters will use their full quota of votes or that
voters will cast a ballot in favour of a single party. This paper develops
and tests different algorithms for calculating the total vote in such
circumstances. We conclude that the accuracy of an algorithm is closely
related to the structure of party competition. The findings of this paper
have a number of important implications. First, the difficulties in
calculating the turnout in multiple-member wards are identified. This will
inform the debate about public participation in the local electoral
process. Second, the method for deriving a figure for the total vote has
an important bearing on a number of other statistics widely employed in
electoral analysis.
Journal: Journal of Applied Statistics
Pages: 699-706
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822918
File-URL: http://hdl.handle.net/10.1080/02664769822918
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:699-706
Template-Type: ReDIF-Article 1.0
Author-Name: Uditha Balasooriya
Author-X-Name-First: Uditha
Author-X-Name-Last: Balasooriya
Author-Name: Sutaip L. C. Saw
Author-X-Name-First: Sutaip L. C.
Author-X-Name-Last: Saw
Title: Reliability sampling plans for the two-parameter exponential distribution under progressive censoring
Abstract:
This paper presents reliability sampling
plans for the two-parameter exponential distribution under progressive
censoring. These sampling plans are quite useful to practitioners, because
they provide savings in resources and in total test time. Furthermore,
they offer the flexibility to remove functioning test specimens from
further testing at various stages of the experimentation. In the
construction of these sampling plans, the operating characteristic curve
is derived using the exact distributional properties of maximum likelihood
estimators. An example is given to illustrate the application of the
proposed sampling plans.
Journal: Journal of Applied Statistics
Pages: 707-714
Issue: 5
Volume: 25
Year: 1998
Month: 6
X-DOI: 10.1080/02664769822927
File-URL: http://hdl.handle.net/10.1080/02664769822927
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:5:p:707-714
Template-Type: ReDIF-Article 1.0
Author-Name: Vladimir Brajkovic
Author-X-Name-First: Vladimir
Author-X-Name-Last: Brajkovic
Title: Mechanics of microelectrics examined by design of experiments techniques
Abstract:
We live in a world full of variations. We need to understand its sources
and we need a scientific method for predicting it, for reducing it and for
controlling it. Statistical thinking is the only way to deal with
variations. Continuous improvement means continuously solving the
variation problem. But this relies on a successful marriage of theory and
practice; experience is insufficient without theory. The theory needs to
be taught. There is no substitute for knowledge (Logothetis, 1991).
Journal: Journal of Applied Statistics
Pages: 723-731
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822710
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822710
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:723-731
Template-Type: ReDIF-Article 1.0
Author-Name: Reay-Chen Wang
Author-X-Name-First: Reay-Chen
Author-X-Name-Last: Wang
Title: Minimum average fraction inspected for short-run CSP-1 plan
Abstract:
This paper presents details of the calculation of the average outgoing
quality limit (AOQL) for a short-run CSP-1 plan based on Yang's renewal
process approach. A solution procedure is developed to find the unique
combination (i,f) that will meet the AOQL requirement, while also
minimizing the average fraction inspected for the short-run CSP-1 plan when
the process average p (> AOQL) and production run length R are known.
Journal: Journal of Applied Statistics
Pages: 733-738
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822729
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822729
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:733-738
Template-Type: ReDIF-Article 1.0
Author-Name: Ling Chen
Author-X-Name-First: Ling
Author-X-Name-Last: Chen
Title: Improved penalized mean for estimating the mean concentration of contaminants
Abstract:
Chen and Jernigan proposed a non-parametric, conservative method that
involved using a penalized mean to estimate the average concentration of
contaminants in soils. The method assumes a random sample obtained from a
whole site involved in the US Superfund program. However, in some cases,
about 10% of known data are collected from the 'hot spots'. In this paper,
two procedures are proposed to use the information from hot spots data or
an extreme value to estimate the mean concentration of contaminants. These
procedures are evaluated using a data set of chromium concentrations from
one of the Environmental Protection Agency's toxic waste sites. The
simulation results show that these new procedures are cost-effective.
Journal: Journal of Applied Statistics
Pages: 739-750
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822738
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822738
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:739-750
Template-Type: ReDIF-Article 1.0
Author-Name: Brenton Clarke
Author-X-Name-First: Brenton
Author-X-Name-Last: Clarke
Author-Name: Toby Lewis
Author-X-Name-First: Toby
Author-X-Name-Last: Lewis
Title: An outlier problem in the determination of ore grade
Abstract:
Data from recordings of ore assays from the Western Australian goldfields
provide motivation to devise new tests for outliers when observations are
distributed with the same mean but differing variances. In the case of
equal variances, tests for a single outlier reduce to well-known tests of
discordancy. A block discordancy test for k outliers is also described.
The question of whether or not one should omit any observation(s) in the
calculation of the mean recoverable gold content is addressed in the
context of whether or not the data contain outliers, as judged by a normal
model for the 'logged' ore assay values. The given data suggest that
models with 'logged' values that follow long-tailed approximately normal
distributions may be appropriate.
Journal: Journal of Applied Statistics
Pages: 751-762
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822747
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822747
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:751-762
Template-Type: ReDIF-Article 1.0
Author-Name: Xavier De Luna
Author-X-Name-First: Xavier
Author-X-Name-Last: De Luna
Title: Projected polynomial autoregression for prediction of stationary time series
Abstract:
Polynomial autoregressions are usually considered to be unrealistic
models for time series. However, this paper shows that they can
successfully be used when the purpose of the time series study is to
provide forecasts. A projection scheme inspired from projection pursuit
regression and feedforward artificial neural networks is used in order to
avoid an explosion of the number of parameters when considering a large
number of lags. The estimation of the parameters of the projected
polynomial autoregressions is a non-linear least-squares problem. A
consistency result is proved. A simulation study shows that the naive use
of the common final prediction error criterion is inappropriate to
identify the best projected polynomial autoregression. An explanation of
this phenomenon is given and a correction to the criterion is proposed. An
important feature of the polynomial predictors introduced in this paper is
their simple implementation, which allows for automatic use. This is
illustrated with real data for the three-month US Treasury Bill.
Journal: Journal of Applied Statistics
Pages: 763-775
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822756
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822756
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:763-775
Template-Type: ReDIF-Article 1.0
Author-Name: Reza Modarres
Author-X-Name-First: Reza
Author-X-Name-Last: Modarres
Author-Name: Joseph Gastwirth
Author-X-Name-First: Joseph
Author-X-Name-Last: Gastwirth
Title: Hybrid test for the hypothesis of symmetry
Abstract:
In recent years, McWilliams and Tajuddin have proposed new and more
powerful non-parametric tests of symmetry for continuous distributions
about a known center. In this paper, we propose a simple non-parametric
two-stage procedure based on the sign test and a percentile-modified
two-sample Wilcoxon test. The small-sample properties of this test,
Tajuddin's test, McWilliams' test and a modified runs test of Modarres and
Gastwirth are investigated in a Monte Carlo simulation study. The
simulations indicate that, for a wide variety of asymmetric alternatives
in the lambda family, the hybrid test is more powerful than are existing
tests in the literature.
Journal: Journal of Applied Statistics
Pages: 777-783
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822765
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822765
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:777-783
Template-Type: ReDIF-Article 1.0
Author-Name: Yoshikazu Ojima
Author-X-Name-First: Yoshikazu
Author-X-Name-Last: Ojima
Title: General formulae for expectations, variances and covariances of the mean squares for staggered nested designs
Abstract:
Staggered nested experimental designs are the most popular class of
unbalanced nested designs. Using a special notation which covers the
particular structure of the staggered nested design, this paper
systematically derives the canonical form for the arbitrary m-factors.
Under the normality assumption for every random variable, a vector
comprising m canonical variables from each experimental unit is normally
independently and identically distributed. Every sum of squares used in
the analysis of variance (ANOVA) can be expressed as the sum of squares of
the corresponding canonical variables. Hence, general formulae for the
expectations, variances and covariances of the mean squares are directly
obtained from the canonical form. Applying the formulae, the explicit
forms of the ANOVA estimators of the variance components and unbiased
estimators of the ratios of the variance components are introduced in this
paper. The formulae are easily applied to obtain the variances and
covariances of any linear combinations of the mean squares, especially the
ANOVA estimators of the variance components. These results are
effectively applied for the standardization of measurement methods.
Journal: Journal of Applied Statistics
Pages: 785-799
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822774
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822774
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:785-799
Template-Type: ReDIF-Article 1.0
Author-Name: W. L. Pearn
Author-X-Name-First: W. L.
Author-X-Name-Last: Pearn
Title: New generalization of process capability index Cpk
Abstract:
The process capability index Cpk has been widely used in manufacturing
industry to provide numerical measures of process potential and
performance. As noted by many quality control researchers and
practitioners, Cpk is yield-based and is independent of the target T. This
fails to account for process centering with symmetric tolerances, and
presents an even greater problem with asymmetric tolerances. To overcome
the problem, several generalizations of Cpk have been proposed to handle
processes with asymmetric tolerances. Unfortunately, these generalizations
understate or overstate the process capability in many cases, so reflect
the process potential and performance inaccurately. In this paper, we
first introduce a new index C''pk, which is shown to be superior to
the existing generalizations of Cpk. We then investigate the statistical
properties of the natural estimator of C''pk, assuming that the
process is normally distributed.
Journal: Journal of Applied Statistics
Pages: 801-810
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822783
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:801-810
Template-Type: ReDIF-Article 1.0
Author-Name: Graham Upton
Author-X-Name-First: Graham
Author-X-Name-Last: Upton
Title: Rounding halves
Abstract:
This paper examines the consequences of requiring that data measured as
multiples of a half should be reported as integers. General formulae are
given for the mean and variance of rounded values. The formulae are
applied in the context of fibre counting, where fibres that overlap a
boundary are given a value of 1/2.
Journal: Journal of Applied Statistics
Pages: 811-816
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822792
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:811-816
Template-Type: ReDIF-Article 1.0
Author-Name: Grace Montepiedra
Author-X-Name-First: Grace
Author-X-Name-Last: Montepiedra
Title: Application of genetic algorithms to the construction of exact D-optimal designs
Abstract:
This paper studies the application of genetic algorithms to the
construction of exact D-optimal experimental designs. The concept of
genetic algorithms is introduced in the general context of the problem of
finding optimal designs. The algorithm is then applied specifically to
finding exact D-optimal designs for three different types of model. The
performance of genetic algorithms is compared with that of the modified
Fedorov algorithm in terms of computing time and relative efficiency.
Finally, potential applications of genetic algorithms to other optimality
criteria and to other types of model are discussed, along with some open
problems for possible future research.
Journal: Journal of Applied Statistics
Pages: 817-826
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822800
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822800
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:817-826
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Title: Robustness of complete diallel crosses plans to the unavailability of one block
Abstract:
The present investigation involved the estimation of the general
combining ability of CDC plans subject to the unavailability of one block
for Griffing's system IV. Further, it has been shown that CDC plans are
fairly robust to the unavailability of one block.
Journal: Journal of Applied Statistics
Pages: 827-837
Issue: 6
Volume: 25
Year: 1998
X-DOI: 10.1080/02664769822819
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769822819
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:25:y:1998:i:6:p:827-837
Template-Type: ReDIF-Article 1.0
Author-Name: Saling Huang
Author-X-Name-First: Saling
Author-X-Name-Last: Huang
Author-Name: Morton Brown
Author-X-Name-First: Morton
Author-X-Name-Last: Brown
Title: A Markov chain model for longitudinal categorical data when there may be non-ignorable non-response
Abstract:
Longitudinal data with non-response occur in studies where the same
subject is followed over time but data for each subject may not be
available at every time point. When the response is categorical and the
response at time t depends on the response at the previous time points, it
may be appropriate to model the response using a Markov model. We
generalize a second-order Markov model to include a non-ignorable
non-response mechanism. Simulation is used to study the properties of the
estimators. Large sample sizes are necessary to ensure that the algorithm
converges and that the asymptotic properties of the estimators can be
used.
Journal: Journal of Applied Statistics
Pages: 5-18
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922610
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922610
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:5-18
Template-Type: ReDIF-Article 1.0
Author-Name: L. Y. Chan
Author-X-Name-First: L. Y.
Author-X-Name-Last: Chan
Title: Optimal orthogonal block designs for a quadratic mixture model for three components
Abstract:
In experiments with mixtures that involve process variables, if the
response function is expressed as the sum of a function of mixture
components and a function of process variables, then the parameters in the
mixture part and in the process part can be estimated independently using
orthogonal block designs. This paper is concerned with such a block design
for parameter estimation in the mixture part of a quadratic mixture model
for three mixture components. The behaviour of the eigenvalues of the
moment matrix of the design is investigated in detail, the design is
optimized according to E- and A-optimality criteria, and the results are
compared with a known result on D-optimality. It is found that
this block design is robust with respect to these different optimality
criteria against the shifting of experimental points. As a result, we
recommend experimental points of the form (a, b, c) in the simplex S2,
where c=0, b=1-a, and a can be any value in the range 0.17+/-0.02.
Journal: Journal of Applied Statistics
Pages: 19-34
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922629
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922629
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:19-34
Template-Type: ReDIF-Article 1.0
Author-Name: Wai-Sum Chan
Author-X-Name-First: Wai-Sum
Author-X-Name-Last: Chan
Title: Exact joint forecast regions for vector autoregressive models
Abstract:
Assume that a k-element vector time series follows a vector
autoregressive (VAR) model. Obtaining simultaneous forecasts of the k
elements of the vector time series is an important problem. Based on the
Bonferroni inequality, Lutkepohl (1991) derived the procedures which
construct the conservative joint forecast regions for the VAR model. In
this paper, we propose to use an exact method which provides shorter
prediction intervals than does the Bonferroni method. Three illustrative
examples are given for comparison of the various VAR forecasting
procedures.
Journal: Journal of Applied Statistics
Pages: 35-44
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922638
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:35-44
Template-Type: ReDIF-Article 1.0
Author-Name: Guadalupe Gomez
Author-X-Name-First: Guadalupe
Author-X-Name-Last: Gomez
Author-Name: M. Luz Calle
Author-X-Name-First: M. Luz
Author-X-Name-Last: Calle
Title: Non-parametric estimation with doubly censored data
Abstract:
Data from longitudinal studies in which an initiating event and a
subsequent event occur in sequence are called 'doubly censored' data if
the time of both events is interval-censored. This paper is concerned with
using doubly censored data to estimate the distribution function of the
so-called 'duration time', i.e. the elapsed time between the originating
event and the subsequent event. The paper proposes a generalization of the
Gomez and Lagakos two-step method for the case where both the time to the
initiating event and the duration time are continuous. This approach is
applied to estimate the AIDS-latency time from a cohort of haemophiliacs.
Journal: Journal of Applied Statistics
Pages: 45-58
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922647
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922647
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:45-58
Template-Type: ReDIF-Article 1.0
Author-Name: W. J. Krzanowski
Author-X-Name-First: W. J.
Author-X-Name-Last: Krzanowski
Title: Antedependence models in the analysis of multi-group high-dimensional data
Abstract:
Antedependence modelling has previously been shown to be useful for
two-group discriminant analysis of high-dimensional data. In this paper,
the theory of such models is extended to multi-group discriminant analysis
and to canonical variate analysis for data display. The application of
antedependence models of orders 1, 2 and 3 to spectroscopic analyses of
rice samples is described, and the results are compared with those from
standard methods based on principal component scores calculated from the
data.
Journal: Journal of Applied Statistics
Pages: 59-67
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922656
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922656
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:59-67
Template-Type: ReDIF-Article 1.0
Author-Name: Donald Martin
Author-X-Name-First: Donald
Author-X-Name-Last: Martin
Title: Paired comparison models applied to the design of the Major League baseball play-offs
Abstract:
This paper presents an analysis of the effect of various baseball
play-off configurations on the probability of advancing to the World
Series. Play-off games are assumed to be independent. Several paired
comparison models are considered for modeling the probability of a home
team winning a single game as a function of the winning percentages of the
contestants over the course of the season. The uniform and logistic
regression models are both adequate, whereas the Bradley-Terry model
(modified for within-pair order effects, i.e. the home field advantage)
is not. The single-game probabilities are then used to compute the
probability of winning the play-offs under various structures. The extra
round of play-offs, instituted in 1994, significantly lowers the
probability of the team with the best record advancing to the World
Series, whereas home field advantage and the different possible
play-off draws have a minimal effect.
Journal: Journal of Applied Statistics
Pages: 69-80
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922665
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922665
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:69-80
Template-Type: ReDIF-Article 1.0
Author-Name: Glen Meeden
Author-X-Name-First: Glen
Author-X-Name-Last: Meeden
Title: Interval estimators for the population mean for skewed distributions with a small sample size
Abstract:
In finite population sampling, it has long been known that, for small
sample sizes, when sampling from a skewed population, the usual
frequentist intervals for the population mean cover the true value less
often than their stated frequency of coverage. Recently, a non-informative
Bayesian approach to some problems in finite population sampling has been
developed, which is based on the 'Polya posterior'. For large sample
sizes, these methods often closely mimic standard frequentist methods. In
this paper, a modification of the 'Polya posterior', which employs the
weighted Polya distribution, is shown to give interval estimators with
improved coverage properties for problems with skewed populations and
small sample sizes. This approach also yields improved tests for
hypotheses about the mean of a skewed distribution.
Journal: Journal of Applied Statistics
Pages: 81-96
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922674
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:81-96
Template-Type: ReDIF-Article 1.0
Author-Name: Seymour Geisser
Author-X-Name-First: Seymour
Author-X-Name-Last: Geisser
Title: Remarks on the 'Bayesian' method of moments
Abstract:
Zellner has proposed a novel methodology for estimating structural
parameters and predicting future observables based on two moments of a
subjective distribution and the application of the maximum entropy
principle-all in the absence of an explicit statistical model or
likelihood function for the data. He calls his procedure the 'Bayesian
method of moments' (BMOM). In a recent paper in this journal, Green and
Strawderman applied the BMOM to a model for slash pine plantations. It is
our view that there are inconsistencies between BMOM and Bayesian
(conditional) probability, as we explain in this paper.
Journal: Journal of Applied Statistics
Pages: 97-101
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922683
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:97-101
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Young
Author-X-Name-First: Peter
Author-X-Name-Last: Young
Title: Recursive and en-bloc approaches to signal extraction
Abstract:
In the literature on unobservable component models, three main
statistical instruments have been used for signal extraction: fixed
interval smoothing (FIS), which derives from Kalman's seminal work on
optimal state-space filter theory in the time domain;
Wiener-Kolmogorov-Whittle optimal signal extraction (OSE) theory, which is
normally set in the frequency domain and dominates the field of classical
statistics; and regularization, which was developed mainly by numerical
analysts but is referred to as 'smoothing' in the statistical literature
(such as smoothing splines, kernel smoothers and local regression).
Although some minor recognition of the interrelationship between these
methods can be discerned from the literature, no clear discussion of their
equivalence has appeared. This paper exposes clearly the
interrelationships between the three methods; highlights important
properties of the smoothing filters used in signal extraction; and
stresses the advantages of the FIS algorithms as a practical solution to
signal extraction and smoothing problems. It also emphasizes the
importance of the classical OSE theory as an analytical tool for obtaining
a better understanding of the problem of signal extraction.
Journal: Journal of Applied Statistics
Pages: 103-128
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922692
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922692
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:103-128
Template-Type: ReDIF-Article 1.0
Author-Name: M. F. Ramalhoto
Author-X-Name-First: M. F.
Author-X-Name-Last: Ramalhoto
Author-Name: M. Morais
Author-X-Name-First: M.
Author-X-Name-Last: Morais
Title: Shewhart control charts for the scale parameter of a Weibull control variable with fixed and variable sampling intervals
Abstract:
In this paper, we are concerned with pure statistical Shewhart control
charts for the scale parameter of a three-parameter Weibull control
variable, with location, scale and shape parameters, under fixed (FSI)
and variable (VSI) sampling intervals. The location and shape parameters
are assumed to be known. We consider two-sided, and lower and upper
one-sided, Shewhart control charts and their FSI and VSI versions. They
jointly control the mean and the variance of the Weibull control
variable X. The pivotal statistic of these control charts is the
maximum-likelihood estimator of the scale parameter for the Nth random
sample XN=(X1N,X2N,…,XnN) of the Weibull control variable X. The design
and performance of these control charts are studied. Two criteria, the
'comparability criterion' (or 'matched criterion') under control and the
'primordial criterion', are imposed on their design. The performance of
these control charts is measured using the average time to signal. For
the VSI versions, the constant that defines the partition of the
'continuation region' is obtained through the 'comparability criterion'
under control. The monotonic behaviour of the average time to signal, as
a function of the magnitude of the shift suffered by the target value of
the scale parameter, is studied. We show that the average time to signal
of all the control charts studied in this paper depends neither on the
location parameter nor on the target value of the scale parameter and,
under control, does not depend on the shape parameter, when Delta (the
probability of a false alarm) and n (the sample size) are fixed. All the
control charts satisfy the 'primordial criterion' and, on average, all
of them (except the two-sided VSI chart, for which we were not able to
ascertain proof) are quicker in detecting the shift as its magnitude
increases. We conjecture, and the numerical example considered does not
contradict this, that the same is true for the two-sided VSI control
chart. We prove that, under the average time to signal criterion, the
VSI versions are always preferable to their FSI versions. In the case of
the one-sided control charts, under the 'comparability criterion', the
VSI version is always preferable to the FSI version, and this advantage
increases with the extent of the shift. Our one-sided control charts
perform better and have more powerful statistical properties than our
two-sided control chart. A numerical example with n=5, a unit target
value of the scale parameter, shape parameter values 0.5, 1.0 and 2.0,
and Delta=1/370.4 is presented for the two-sided, and the lower and
upper one-sided, control charts. The numerical results are given in
tables and figures, and the joint influence of the shift magnitude and
the shape parameter on the average time to signal is illustrated.
Journal: Journal of Applied Statistics
Pages: 129-160
Issue: 1
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922700
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922700
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:1:p:129-160
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph Brian Adams
Author-X-Name-First: Joseph Brian
Author-X-Name-Last: Adams
Title: Predicting pickle harvests using a parametric feedforward neural network
Abstract:
Feedforward networks have demonstrated their ability to model non-linear
data. Despite this success, their use as a statistical analysis tool has
been limited by the persistent assumption that these networks can only be
implemented as non-parametric models. In fact, a feedforward network can
be used for parametric modeling, with the result that many of the common
parametric testing procedures can be applied to the nonlinear network. In
this paper, a feedforward network for predicting the biological growth
rate of pickles is developed. Using this network, the parametric nature of
the network is demonstrated. Once trained, the network model is tested
using standard parametric methods. In order to facilitate this testing, it
is first necessary to develop a method for calculating the degrees of
freedom of the neural network and the residual covariance matrix. It is
shown that the number of degrees of freedom is determined by the number
of parameters that actually contribute to an output.
the covariance matrix can be created by adapting the error matrix. Using
these results, the trained network is tested using a simple F-statistic.
Journal: Journal of Applied Statistics
Pages: 165-176
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922502
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922502
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:165-176
Template-Type: ReDIF-Article 1.0
Author-Name: Nedret Billor
Author-X-Name-First: Nedret
Author-X-Name-Last: Billor
Title: An application of the local influence approach to ridge regression
Abstract:
In this study, the method of local influence, which was introduced by
Cook as a general tool for assessing the influence of local departures
from the underlying assumptions, is applied to ridge regression, by
defining the maximum pseudo-likelihood ridge estimator obtained using the
augmentation approach, because this method is suitable for
likelihood-based models. In addition, an alternative local influence
approach suggested by Billor and Loynes is applied to ridge regression. A
comparison of these approaches and an example are given.
Journal: Journal of Applied Statistics
Pages: 177-183
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922511
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922511
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:177-183
Template-Type: ReDIF-Article 1.0
Author-Name: K. Kalirajan
Author-X-Name-First: K.
Author-X-Name-Last: Kalirajan
Title: Stochastic varying coefficients gravity model: An application in trade analysis
Abstract:
In international trade analysis between countries, the central theme is
the examination of whether or not there are any significant differences
between the actual trade and potential trade, given the determinants of
trade flows. Thus, estimating the potential trade is an important
component in trade analysis. The objective of this paper is to suggest a
methodology to estimate the potential trade flows between countries, using
the gravity model, which has been established in the literature as the
most successful empirical trade flow equation, usually producing a good
fit. The application of the method has been demonstrated using trade flows
between Australia and its trading partners in the Indian Ocean Rim.
Journal: Journal of Applied Statistics
Pages: 185-193
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922520
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922520
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:185-193
Template-Type: ReDIF-Article 1.0
Author-Name: L. S. Kaushik
Author-X-Name-First: L. S.
Author-X-Name-Last: Kaushik
Title: Partial diallel crosses based on three associate class association schemes
Abstract:
Designs of partial diallel crosses obtained by including parents based on
rectangular and cubic association schemes have been presented. In
addition, a simplified method of their analysis by making use of latent
roots and idempotent matrices has also been presented. The method has been
illustrated with the help of numerical data.
Journal: Journal of Applied Statistics
Pages: 195-201
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922539
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922539
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:195-201
Template-Type: ReDIF-Article 1.0
Author-Name: Scott Richter
Author-X-Name-First: Scott
Author-X-Name-Last: Richter
Title: Nearly exact tests in factorial experiments using the aligned rank transform
Abstract:
A procedure is studied that uses rank-transformed data to perform exact
and estimated exact tests, which is an alternative to the commonly used
F-ratio test procedure. First, a common parametric test statistic is
computed using rank-transformed data, where two methods of ranking-ranks
taken for the original observations and ranks taken after aligning the
observations-are studied. Significance is then determined using either the
exact permutation distribution of the statistic or an estimate of this
distribution based on a random sample of all possible permutations.
Simulation studies compare the performance of this method with the normal
theory parametric F-test and the traditional rank transform procedure.
Power and nominal type I error rates are compared under conditions when
normal theory assumptions are satisfied, as well as when these assumptions
are violated. The method is studied for a two-factor factorial arrangement
of treatments in a completely randomized design and for a split-unit
experiment. The power of the tests rivals the parametric F-test when
normal theory assumptions are satisfied, and is usually superior when
normal theory assumptions are not satisfied. Based on the evidence of this
study, the exact aligned rank procedure appears to be the overall best
choice for performing tests in a general factorial experiment.
Journal: Journal of Applied Statistics
Pages: 203-217
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922548
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922548
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:203-217
Template-Type: ReDIF-Article 1.0
Author-Name: James Hughes
Author-X-Name-First: James
Author-X-Name-Last: Hughes
Author-Name: Elizabeth Savoca
Author-X-Name-First: Elizabeth
Author-X-Name-Last: Savoca
Title: Accounting for censoring in duration data: An application to estimating the effect of legal reforms on the duration of medical malpractice disputes
Abstract:
Using a sample of medical malpractice insurance claims closed between 1
October 1985 and 1 October 1989 in the USA, we estimate the impact of
legal reforms on the longevity of disputes, via a competing risks model
that accounts for length-biased sampling and a finite sampling horizon. We
find that only the 'English rule'-a rule which requires the loser at trial
to pay all legal expenses-shortens the duration of disputes. Our results
for this law also show that failure to correct for length-biased sampling
can incorrectly imply that the English rule lengthens the time needed for
settlement and litigation. Our estimates also suggest that tort reforms
that place additional procedural hurdles in the plaintiffs' paths tend to
lengthen the time to disposition. Here, correction for a finite sampling
horizon substantially changes the inferences with regard to the effect of
this reform on duration.
Journal: Journal of Applied Statistics
Pages: 219-228
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922557
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922557
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:219-228
Template-Type: ReDIF-Article 1.0
Author-Name: R. Vijayaraghavan
Author-X-Name-First: R.
Author-X-Name-Last: Vijayaraghavan
Title: Procedure for the selection of CSP-M one level skip-lot sampling inspection plans that have a single-sampling plan with acceptance number zero as the reference plan
Abstract:
This paper presents a procedure for the selection of CSP-M one-level
skip-lot sampling plans, designated as CSP-MSkSP, that have a
single-sampling plan with acceptance number zero as the reference plan.
The parameters of the plan are determined when two points on the operating
characteristic curve are specified: the acceptable quality level p1, with
its associated producer's risk, and the limiting quality level p2, with
its associated consumer's risk.
Journal: Journal of Applied Statistics
Pages: 229-233
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922566
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922566
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:229-233
Template-Type: ReDIF-Article 1.0
Author-Name: Sture Holm
Author-X-Name-First: Sture
Author-X-Name-Last: Holm
Author-Name: Kerstin Wiklander
Author-X-Name-First: Kerstin
Author-X-Name-Last: Wiklander
Title: Simultaneous estimation of location and dispersion in two-level fractional factorial designs
Abstract:
The reduction of variation is one of the obvious goals in quality
improvement. The identification of factors affecting the dispersion is a
step towards this goal. In this paper, the problem of estimating location
effects and dispersion effects simultaneously in unreplicated factorial
experiments is considered. By making a one-to-one transformation of the
response variables, the study of the quadratic functions becomes clearer.
The transformation also gives a natural motivation to the model of the
variances of the original variables. The covariances of the transformed
responses appear as parameters in the variances of the original variables.
Results of Hadamard products are used for deriving these covariances. The
method of estimating dispersion effects is shown in two illustrations. In
a 2^4 factorial design, the essential covariance matrix of the transformed
variables is also presented. The method is also illustrated in a 2^(5-1)
fractional design with a model that is saturated in this context.
Journal: Journal of Applied Statistics
Pages: 235-242
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922575
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922575
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:235-242
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Wludyka
Author-X-Name-First: Peter
Author-X-Name-Last: Wludyka
Title: Two non-parametric, analysis-of-means-type tests for homogeneity of variances
Abstract:
After a brief review of the literature, two non-parametric tests for
homogeneity of variances are presented. The first test is based on the
analysis of means for ranks, which is a non-parametric version of the
analysis of means (ANOM) that uses ranks as input for an ANOM test. The
second test uses inverse normal scores of the ranks of scale
transformations of the observations as input to the ANOM. Both homogeneity
of variances tests can be presented in a graphical form, which makes it
easy for practitioners to assess the practical and the statistical
significance. A Monte Carlo study is used to show that these tests have
power comparable with that of well-known robust tests for homogeneity of
variances.
Journal: Journal of Applied Statistics
Pages: 243-256
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922584
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922584
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:243-256
Template-Type: ReDIF-Article 1.0
Author-Name: K. K. W. Yau
Author-X-Name-First: K. K. W.
Author-X-Name-Last: Yau
Title: Estimation of surgeon effects in the analysis of post-operative colorectal cancer patients data
Abstract:
There has been increasing interest in the assessment of surgeon effects
for survival data of post-operative cancer patients. In particular, the
measurement of surgeon's surgical performance after eliminating
significant risk variables is considered. The generalized linear mixed
model approach, which assumes log-normally distributed surgeon effects in
the hazard function, is adopted to assess the random surgeon effects for
post-operative colorectal cancer patient data. The method extends the
traditional Cox's proportional hazards regression model, by including a
random component in the linear predictor. Estimation is accomplished by
constructing an appropriate log-likelihood function in the spirit of the
best linear unbiased predictor method, and is extended to obtain residual
maximum likelihood estimates. As a result of the non-proportionality of
the hazard of colon and rectal cancer, the data are analyzed separately
according to these two kinds of cancer. Significant risk variables are
identified. The 'predictions' of random surgeon effects are obtained and
their association with the rank of surgeon is examined.
Journal: Journal of Applied Statistics
Pages: 257-272
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922593
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922593
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:257-272
Template-Type: ReDIF-Article 1.0
Author-Name: P. J. Harrison
Author-X-Name-First: P. J.
Author-X-Name-Last: Harrison
Title: Statistical process control and model monitoring
Abstract:
This paper is concerned with model monitoring and quality control
schemes, which are founded on a decision theoretic formulation. After
identifying unacceptable weaknesses associated with Wald, sequential
probability ratio test (SPRT) and Cuscore monitors, the Bayes decision
monitor is developed. In particular, the paper focuses on what is termed a
'popular decision scheme' (PDS) for which the monitoring run loss
functions are specified simply in terms of two indifference qualities.
For most applications, the PDS results in forward cumulative sum tests of
functions of the observations. For many exponential family applications,
the PDS is equivalent to well-used SPRTs and Cusums. In particular, a neat
interpretation of V-mask cusum chart settings is derived when
simultaneously running two symmetric PDSs. However, apart from providing a
decision theoretic basis for monitoring, sensible procedures occur in
applications for which SPRTs and Cuscores are particularly unsatisfactory.
Average run lengths (ARLs) are given for two special cases, and the
inadequacy of the Wald and similar ARL approximations is revealed.
Generalizations and applications to normal and dynamic linear models are
discussed. The paper concludes by deriving conditions under which
sequences of forward and backward sequential or Cusum chart tests are
equivalent.
Journal: Journal of Applied Statistics
Pages: 273-292
Issue: 2
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922601
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922601
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:2:p:273-292
Template-Type: ReDIF-Article 1.0
Author-Name: Uttam Bandyopadhyay
Author-X-Name-First: Uttam
Author-X-Name-Last: Bandyopadhyay
Author-Name: Atanu Biswas
Author-X-Name-First: Atanu
Author-X-Name-Last: Biswas
Title: Sequential-type nonparametric test using Mann-Whitney statistics
Abstract:
The paper provides a nonparametric test for the identity of two
continuous univariate distribution functions when observations are drawn
in pairs from the populations, by adopting a sampling scheme which, using
Mann-Whitney scores, generalizes the existing inverse binomial sampling
technique. Some exact performance characteristics of the proposed test are
formulated and compared numerically with existing competitors of the
proposed test. The applicability of the proposed test is illustrated using
real-life data.
Journal: Journal of Applied Statistics
Pages: 301-308
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922412
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922412
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:301-308
Template-Type: ReDIF-Article 1.0
Author-Name: Reay-Chen Wang
Author-X-Name-First: Reay-Chen
Author-X-Name-Last: Wang
Title: Designing a variable sampling plan based on Taguchi's loss function
Abstract:
This paper discusses the problem of designing a new variable sampling
plan. Suppose that the lot quality characteristic obeys an exponential
distribution. Adopting Taguchi's loss function, the objective is to design
a plan under which the producer's risk of rejecting a lot that has a
specified average loss per item is no greater than alpha, and the
consumer's risk of accepting a lot that has a specified average loss per
item is no greater than beta. The method of designing this plan is an
extension of the method used by Derman and Ross.
Journal: Journal of Applied Statistics
Pages: 309-313
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922421
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922421
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:309-313
Template-Type: ReDIF-Article 1.0
Author-Name: G. Galliani
Author-X-Name-First: G.
Author-X-Name-Last: Galliani
Author-Name: F. Filippini
Author-X-Name-First: F.
Author-X-Name-Last: Filippini
Author-Name: F. Screpanti
Author-X-Name-First: F.
Author-X-Name-Last: Screpanti
Title: A queuing-theory-based approach to evaluate the efficiency of a network of automated stations and of a communication system
Abstract:
The Regional Meteorological Service for the Emilia-Romagna Region manages
a network of automatic weather stations equipped with electronic sensors
suitable for measuring meteorological parameters. The automatic stations
consist of electronic instruments, which are subject to failures at more
or less frequent intervals. A summary of their performance is necessary.
In this paper, we compare the results of the summary, such as the
contiguous absence or simultaneous inactivity of different stations, with
theoretical simulations in order to evaluate the nature and recurrence of
the failures. A single- and multi-server queue simulation model was also
used to evaluate the performance of the data transmission system, so as to
optimize the communications system.
Journal: Journal of Applied Statistics
Pages: 315-326
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922430
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922430
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:315-326
Template-Type: ReDIF-Article 1.0
Author-Name: U. S. Pasaribu
Author-X-Name-First: U. S.
Author-X-Name-Last: Pasaribu
Title: Statistical assumptions underlying the fitting of the Michaelis-Menten equation
Abstract:
An experiment was carried out to test the various assumptions usually
made when evaluating statistical procedures for estimating the parameters
of the Michaelis-Menten equation, which describes enzyme-catalyzed
reactions. The usual assumption of normality is not strongly supported,
but is probably not too unreasonable. We study the variation in
experimental results and, in consequence, a more complex model is
proposed, which incorporates extra components of variation associated with
substrate levels and different days. The model is fitted using the EM
algorithm.
Journal: Journal of Applied Statistics
Pages: 327-341
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922449
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:327-341
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Title: Calculating nonparametric confidence intervals for quantiles using fractional order statistics
Abstract:
In this paper, we provide an easy-to-program algorithm for constructing
the preselected 100(1 - alpha)% nonparametric confidence interval for an
arbitrary quantile, such as the median or quartile, by approximating the
distribution of the linear interpolation estimator of the quantile
function Q_L(u) = (1 - epsilon)X_[n'u] + epsilon X_([n'u]+1)
with the distribution of the fractional order statistic Q_I(u) = X_(n'u),
as defined by Stigler, where n' = n + 1, epsilon = n'u - [n'u], and [.]
denotes the floor function.
probabilities. An application to the extreme-value problem in flood data
analysis in hydrology is illustrated.
Journal: Journal of Applied Statistics
Pages: 343-353
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922458
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922458
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:343-353
Template-Type: ReDIF-Article 1.0
Author-Name: Takafumi Isogai
Author-X-Name-First: Takafumi
Author-X-Name-Last: Isogai
Title: Power transformation of the F distribution and a power normal family
Abstract:
To transform the F distribution to a normal distribution, two types of
formula for power transformation of the F variable are introduced. One
formula is an extension of the Wilson-Hilferty transformation for the
chi-squared variable, and the other type is based on the median of the F
distribution. Combining those two formulas, a simple formula for the
median of the F distribution is derived, and its numerical accuracy is
evaluated. Simplification of the formula of the Wilson-Hilferty
transformation, through the median formula, leads us to construct a power
normal family from the generalized F distribution. Unlike the Box-Cox
power normal family, our family has a property that the covariance
structure of the maximum-likelihood estimates of the parameters is
invariant under a scale transformation of the response variable. Numerical
examples are given to show the difference between the two power normal
families.
Journal: Journal of Applied Statistics
Pages: 355-371
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922467
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:355-371
Template-Type: ReDIF-Article 1.0
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Title: Estimation of torsion
Abstract:
Spinal deformity is a more common problem in children than is usually
realized, and early diagnosis is highly desirable. One current measure of
detection is quite crude, with an angle being taken by hand from X-rays.
In this paper, we present some thoughts and exploratory results for
assisting orthopaedic surgeons, by estimating torsion.
Journal: Journal of Applied Statistics
Pages: 373-381
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922476
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922476
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:373-381
Template-Type: ReDIF-Article 1.0
Author-Name: Wai-Yin Poon
Author-X-Name-First: Wai-Yin
Author-X-Name-Last: Poon
Title: Sources of heterogeneity in distributions with ordered categorical variables
Abstract:
The chi-squared statistic is used to test the homogeneity for several
groups in a contingency table. However, it may be inappropriate to apply
the test when ordinal categories are involved. If it can be assumed that
the ordinal categorical variables are realizations of underlying
continuous random variables, then it is possible to study the properties
of different groups in a relative sense. Assuming that the distributions
of the continuous variables are in the same family and that the thresholds
that define the categories are invariant across groups, we propose a
procedure to test homogeneity and to address the sources of heterogeneity
in different groups. An example based on a real data set is used to
demonstrate the practical applicability of the suggested method.
Journal: Journal of Applied Statistics
Pages: 383-392
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922485
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922485
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:383-392
Template-Type: ReDIF-Article 1.0
Author-Name: Luis Rodriguez-Carvajal
Author-X-Name-First: Luis
Author-X-Name-Last: Rodriguez-Carvajal
Title: Multivariate AB-BA crossover trial
Abstract:
One way to analyze the AB-BA crossover trial with multivariate response
is proposed. The multivariate model is given and the assumptions
discussed. Two possibilities for the treatment effects hypothesis are
considered. The statistical tests include the use of Hotelling's T²
statistic, and a transformation equivalent to that of Jones and Kenward
for the univariate case. Data from a nutrition experiment in Mexico
illustrate the method. The multiple comparisons are carried out using
Bonferroni intervals and the validity of the assumptions is explored. The
main conclusions include the finding that some of the assumptions are not
a requirement for the multivariate analysis; however, the sample sizes are
important.
Journal: Journal of Applied Statistics
Pages: 393-403
Issue: 3
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922494
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922494
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:3:p:393-403
Template-Type: ReDIF-Article 1.0
Author-Name: Pai-Lien Chen
Author-X-Name-First: Pai-Lien
Author-X-Name-Last: Chen
Author-Name: Estrada Bernard
Author-X-Name-First: Estrada
Author-X-Name-Last: Bernard
Author-Name: Pranab Sen
Author-X-Name-First: Pranab
Author-X-Name-Last: Sen
Title: A Markov chain model used in analyzing disease history applied to a stroke study
Abstract:
In clinical research, study subjects may experience multiple events that
are observed and recorded periodically. To analyze transition patterns of
disease processes, it is desirable to use those multiple events over time
in the analysis. This study proposes a multi-state Markov model with
piecewise transition probability, which is able to accommodate
periodically observed clinical data without a time homogeneity assumption.
Models with ordinal outcomes that incorporate covariates are also
discussed. The proposed models are illustrated by an analysis of the
severity of morbidity in a monthly follow-up study for patients with
spontaneous intracerebral hemorrhage.
Journal: Journal of Applied Statistics
Pages: 413-422
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922304
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922304
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:413-422
Template-Type: ReDIF-Article 1.0
Author-Name: Kaushik Ghosh
Author-X-Name-First: Kaushik
Author-X-Name-Last: Ghosh
Author-Name: S. Rao Jammalamadaka
Author-X-Name-First: S. Rao
Author-X-Name-Last: Jammalamadaka
Author-Name: Mangalam Vasudaven
Author-X-Name-First: Mangalam
Author-X-Name-Last: Vasudaven
Title: Change-point problems for the von Mises distribution
Abstract:
A generalized likelihood ratio procedure and a Bayes procedure are
considered for change-point problems for the mean direction of the von
Mises distribution, both when the concentration parameter is known and
when it is unknown. These tests are based on sample resultant lengths.
Tables that list critical values of these test statistics are provided.
These tests are shown to be valid even when the data come from other
similar unimodal circular distributions. Some empirical studies of powers
of these test procedures are also incorporated.
Journal: Journal of Applied Statistics
Pages: 423-434
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922313
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922313
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:423-434
Template-Type: ReDIF-Article 1.0
Author-Name: Claes Cassel
Author-X-Name-First: Claes
Author-X-Name-Last: Cassel
Author-Name: Peter Hackl
Author-X-Name-First: Peter
Author-X-Name-Last: Hackl
Author-Name: Anders Westlund
Author-X-Name-First: Anders
Author-X-Name-Last: Westlund
Title: Robustness of partial least-squares method for estimating latent variable quality structures
Abstract:
Latent variable structural models and the partial least-squares (PLS)
estimation procedure have found increased interest since being used in the
context of customer satisfaction measurement. The well-known property that
the estimates of the inner structure model are inconsistent implies biased
estimates for finite sample sizes. A simplified version of the structural
model that is used for the Swedish Customer Satisfaction Index (SCSI)
system has been used to generate simulated data and to study the PLS
algorithm in the presence of three inadequacies: (i) skew instead of
symmetric distributions for manifest variables; (ii) multi-collinearity
within blocks of manifest and between latent variables; and (iii)
misspecification of the structural model (omission of regressors). The
simulation results show that the PLS method is quite robust against these
inadequacies. The bias that is caused by the inconsistency of PLS
estimates is substantially increased only for extremely skewed
distributions and for the erroneous omission of a highly relevant latent
regressor variable. The estimated scores of the latent variables are
always in very good agreement with the true values and seem to be
unaffected by the inadequacies under investigation.
Journal: Journal of Applied Statistics
Pages: 435-446
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922322
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922322
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:435-446
Template-Type: ReDIF-Article 1.0
Author-Name: Magnar Lillegard
Author-X-Name-First: Magnar
Author-X-Name-Last: Lillegard
Author-Name: Steinar Engen
Author-X-Name-First: Steinar
Author-X-Name-Last: Engen
Title: Exact confidence intervals generated by conditional parametric bootstrapping
Abstract:
Conditional parametric bootstrapping is defined as the samples obtained
by performing the simulations in such a way that the estimator is kept
constant and equal to the estimate obtained from the data. Order
statistics of the bootstrap replicates of the parameter chosen in each
simulation provide exact confidence intervals, in a probabilistic sense,
in models with one parameter under quite general conditions. The method is
still exact in the case of nuisance parameters when these are location and
scale parameters, and the bootstrapping is based on keeping the
maximum-likelihood estimates constant. The method is also exact if there
exists a sufficient statistic for the nuisance parameters and if the
simulations are performed conditioning on this statistic. The technique
may also be used to construct prediction intervals. These are generally
not exact, but are likely to be good approximations.
Journal: Journal of Applied Statistics
Pages: 447-459
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922331
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922331
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:447-459
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Magnus
Author-X-Name-First: Jan
Author-X-Name-Last: Magnus
Author-Name: Franc Klaassen
Author-X-Name-First: Franc
Author-X-Name-Last: Klaassen
Title: The final set in a tennis match: Four years at Wimbledon
Abstract:
We consider the 'final' (deciding) set in a tennis match. We examine
whether it is true that the chances for both players to win the match are
equal at the beginning of the final set, even though they were not equal
at the beginning of the match. We also test whether it is easier for an
unseeded woman to beat a seeded player than it is for an unseeded man, and
whether male players are more closely equal in quality than are females.
We examine whether the service dominance decreases in long matches, and
whether winning the 'pre-final' set provides an advantage in the final
set. We use almost 90 000 points at Wimbledon to test all five hypotheses.
Journal: Journal of Applied Statistics
Pages: 461-468
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922340
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922340
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:461-468
Template-Type: ReDIF-Article 1.0
Author-Name: Sueli Mingoti
Author-X-Name-First: Sueli
Author-X-Name-Last: Mingoti
Title: Bayesian estimator for the total number of distinct species when quadrat sampling is used
Abstract:
A Bayesian estimator for the total number of distinct species present in
the region of investigation is constructed when the quadrat sampling
procedure is used to collect a sample of species. The estimator is based
on a model similar to that used by Mingoti and Meeden, and uses as a
special case the zero truncated negative binomial distribution as a prior
distribution for the true number S of distinct species in the region.
Confidence intervals are also obtained. Simple comparisons with the
first-order jackknife estimator and the empirical Bayesian estimator are
performed.
Journal: Journal of Applied Statistics
Pages: 469-483
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922359
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922359
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:469-483
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Fernando
Author-X-Name-First: Francisco
Author-X-Name-Last: Fernando
Author-Name: Ribeiro Ramos
Author-X-Name-First: Ribeiro
Author-X-Name-Last: Ramos
Title: Underreporting of purchases of port wine
Abstract:
In this paper, we develop a new approach for modelling underreported
Poisson counts. The parameters of the model are estimated by Markov chain
Monte Carlo simulation. An application to a real data set from a
Portuguese marketing survey illustrates the fruitfulness of the approach.
We find that purchases of bottles of port wine increase significantly with
income class and the size of the household.
Journal: Journal of Applied Statistics
Pages: 485-494
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922368
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922368
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:485-494
Template-Type: ReDIF-Article 1.0
Author-Name: Eric Schoen
Author-X-Name-First: Eric
Author-X-Name-Last: Schoen
Title: Designing fractional two-level experiments with nested error structures
Abstract:
A common feature of experiments with a random blocking factor and
split-plot experiments is their nested error structure. This paper proposes
a general strategy to handle fractional two-level experiments with such
error structures. The strategy aims to create error strata with sufficient
numbers of contrasts to separate active effects from inactive effects. The
strategy also details the construction of treatment generators, given the
constraints of a predetermined error structure. The key elements of the
strategy are illustrated with a chemical experiment that has 16 factors
and 32 runs blocked according to working days, and a cheese-making
experiment that has 11 factors and 128 runs, divided over milk supplies as
whole plots, curds productions as subplots and sets of identically treated
cheeses as sub-subplots.
Journal: Journal of Applied Statistics
Pages: 495-508
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922377
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922377
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:495-508
Template-Type: ReDIF-Article 1.0
Author-Name: Stephen Walker
Author-X-Name-First: Stephen
Author-X-Name-Last: Walker
Title: The uniform power distribution
Abstract:
This paper introduces a generalization of the normal distribution: the
uniform power distribution. It is a symmetric and unimodal family of
distributions, defined on the real line, and is closely related to the
exponential power family. The exponential power family was introduced to
allow the modelling of kurtosis. The uniform power family matches the
exponential power family with respect to the range of kurtosis. However,
whereas the exponential power family is somewhat difficult to work with,
the contrary is true for the uniform power family.
Journal: Journal of Applied Statistics
Pages: 509-517
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922386
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922386
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:509-517
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Zhang
Author-X-Name-First: Paul
Author-X-Name-Last: Zhang
Title: Omnibus test of normality using the Q statistic
Abstract:
A new statistical procedure for testing normality is proposed. The Q
statistic is derived as the ratio of two linear combinations of the
ordered random observations. The coefficients of the linear combinations
utilize the expected values of the order statistics from the
standard normal distribution. This omnibus test detects deviations
from normality that result from either skewness or kurtosis.
The statistic is independent of the origin and the scale under the null
hypothesis of normality, and the null distribution of Q can be very well
approximated by the Cornish-Fisher expansion. The powers for various
alternative distributions were compared with several other test statistics
by simulations.
Journal: Journal of Applied Statistics
Pages: 519-528
Issue: 4
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922395
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922395
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:4:p:519-528
Template-Type: ReDIF-Article 1.0
Author-Name: Herbert Büning
Author-X-Name-First: Herbert
Author-X-Name-Last: Büning
Title: Adaptive Jonckheere-type tests for ordered alternatives
Abstract:
Testing against ordered alternatives in the c-sample location problem
plays an important role in statistical practice. The parametric test
proposed by Barlow et al. (in the following, called the 'B-test') is an
appropriate test under the model of normality. For non-normal data,
however, there are rank tests which have higher power than the B-test,
such as the Jonckheere test or so-called Jonckheere-type tests introduced
and studied by Büning and Kössler. However, we usually have no information
about the underlying distribution. Thus, an adaptive test should be
applied which takes into account the given data set. Two versions of such
an adaptive test are proposed, which are based on the concept introduced
by Hogg in 1974. These adaptive tests are compared with each of the single
Jonckheere-type tests in the adaptive scheme and also with the B-test. It
is shown via Monte Carlo simulation that the adaptive tests behave well
over a broad class of symmetric distributions with short, medium and long
tails, as well as for asymmetric distributions.
Journal: Journal of Applied Statistics
Pages: 541-551
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922214
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922214
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:541-551
Template-Type: ReDIF-Article 1.0
Author-Name: Jerry Dechert
Author-X-Name-First: Jerry
Author-X-Name-Last: Dechert
Author-Name: Kenneth Case
Author-X-Name-First: Kenneth
Author-X-Name-Last: Case
Title: An economic model for clinical quality control
Abstract:
With increased focus on reducing costs in the healthcare industry, the
economic aspects of quality control for clinical laboratories must be
addressed. In order to evaluate the economic performance of statistical
quality control approaches used in the clinical setting, an economic model
is developed. Although the economic model is applied specifically to the
clinical laboratory in this research, it is easily generalized for use in
a wide variety of industry applications. Use of the economic model is
illustrated through the comparison of traditional approaches to clinical
quality control. Recommendations concerning the performance of the
traditional approaches to clinical quality control are provided.
Journal: Journal of Applied Statistics
Pages: 553-562
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922223
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922223
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:553-562
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Author-Name: Naimesh Desai
Author-X-Name-First: Naimesh
Author-X-Name-Last: Desai
Title: Robustness of a complete diallel crosses plan with an unequal number of crosses to the unavailability of one block
Abstract:
The present investigation involved the estimation of the general
combining ability of complete diallel crosses (CDC) plans with unequal
numbers of crosses, subject to the unavailability of one block for
Griffing's system IV. Further, it has been shown that CDC plans with
unequal numbers of crosses are fairly robust to the unavailability of one
block.
Journal: Journal of Applied Statistics
Pages: 563-577
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922232
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922232
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:563-577
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Taylor
Author-X-Name-First: Paul
Author-X-Name-Last: Taylor
Author-Name: David Hand
Author-X-Name-First: David
Author-X-Name-Last: Hand
Title: Finding 'superclassifications' with an acceptable misclassification rate
Abstract:
Cluster analysis methods are based on measures of 'distance' between
objects. Sometimes the objects have an internal structure, and use of this
can be made when defining such distances. This leads to non-standard
cluster analysis methods. We illustrate with an application in which the
objects are themselves classes and the aim is to produce clusters of
classes which minimize the error rate of a supervised classification rule.
For supervised classification problems with more than a handful of
classes, there may exist groups of classes which are well separated from
other groups, even though individual classes are not all well separated.
In such cases, the overall misclassification rate is a crude measure of
performance and more subtle measures, taking note of subgroup separation,
are desirable. The fact that points can be assigned accurately to groups,
if not to individual classes, can sometimes be practically useful.
Journal: Journal of Applied Statistics
Pages: 579-590
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922241
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922241
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:579-590
Template-Type: ReDIF-Article 1.0
Author-Name: Habshah Midi
Author-X-Name-First: Habshah
Author-X-Name-Last: Midi
Title: Preliminary estimators for robust non-linear regression estimation
Abstract:
In this paper, the robustness of weighted non-linear least-squares
estimation based on some preliminary estimators is examined. The
preliminary estimators are the L1-norm estimates proposed by Schlossmacher,
by El-Attar et al., by Koenker and Park, and by Lawrence and Arthur. A
numerical example is presented to compare the robustness of the weighted
non-linear least-squares approach when based on the preliminary estimators
of Schlossmacher (HS), El-Attar et al. (HEA), Koenker and Park (HKP), and
Lawrence and Arthur (HLA). The study shows that the HEA estimator is as
robust as the HKP estimator. However, the HEA estimator posed certain
computational problems and required more storage and computing time.
Journal: Journal of Applied Statistics
Pages: 591-600
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922250
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922250
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:591-600
Template-Type: ReDIF-Article 1.0
Author-Name: M. Carme Ruiz De Villa
Author-X-Name-First: M. Carme Ruiz
Author-X-Name-Last: De Villa
Author-Name: M. Salome
Author-X-Name-First: M.
Author-X-Name-Last: Salome
Author-Name: E. Cabral
Author-X-Name-First: E.
Author-X-Name-Last: Cabral
Author-Name: Eduardo Escrich Escriche
Author-X-Name-First: Eduardo Escrich
Author-X-Name-Last: Escriche
Author-Name: Montse Solanas
Author-X-Name-First: Montse
Author-X-Name-Last: Solanas
Title: A non-parametric regression approach to repeated measures analysis in cancer experiments
Abstract:
The validity conditions for univariate or multivariate analyses of
repeated measures are highly sensitive to the usual assumptions. In cancer
experiments, the data are frequently heteroscedastic and strongly
correlated with time, and standard analyses do not perform well.
Alternative non-parametric approaches can contribute to an analysis of
these longitudinal data. This paper describes a method for such
situations, using the results from a comparative experiment in which
tumour volume is evaluated over time. First, we apply the non-parametric
approach proposed by Raz in constructing a randomization F test for
comparing treatments. A local polynomial fit is conducted to estimate the
growth curves and confidence intervals for each treatment. Finally, this
technique is used to estimate the velocity of tumour growth.
Journal: Journal of Applied Statistics
Pages: 601-611
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922269
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922269
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:601-611
Template-Type: ReDIF-Article 1.0
Author-Name: A. J. Scallan
Author-X-Name-First: A. J.
Author-X-Name-Last: Scallan
Title: Regression modelling of interval-censored failure time data using the Weibull distribution
Abstract:
A method is described for fitting the Weibull distribution to
failure-time data which may be left, right or interval censored. The
method generalizes the auxiliary Poisson approach and, as such, means that
it can be easily programmed in statistical packages with macro programming
capabilities. Examples are given of fitting such models and an
implementation in the GLIM package is used for illustration.
Journal: Journal of Applied Statistics
Pages: 613-618
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922278
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922278
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:613-618
Template-Type: ReDIF-Article 1.0
Author-Name: Alexandra Mello Schmidt
Author-X-Name-First: Alexandra Mello
Author-X-Name-Last: Schmidt
Author-Name: Dani Gamerman
Author-X-Name-First: Dani
Author-X-Name-Last: Gamerman
Author-Name: Ajax Moreira
Author-X-Name-First: Ajax
Author-X-Name-Last: Moreira
Title: An adaptive resampling scheme for cycle estimation
Abstract:
Bayesian dynamic linear models (DLMs) are useful in time series
modelling, because of the flexibility that they offer for obtaining a
good forecast. They are based on a decomposition of the relevant factors
which explain the behaviour of the series through a series of state
parameters. Nevertheless, the DLM as developed by West and Harrison depends
on additional quantities, such as the variance of the system disturbances,
which, in practice, are unknown. These are referred to here as
'hyper-parameters' of the model. In this paper, DLMs with autoregressive
components are used to describe time series that show cyclic behaviour.
The marginal posterior distribution for state parameters can be obtained
by weighting the conditional distribution of state parameters by the
marginal distribution of hyper-parameters. In most cases, the joint
distribution of the hyper-parameters can be obtained analytically but the
marginal distributions of the components cannot, so requiring numerical
integration. We propose to obtain samples of the hyper-parameters by a
variant of the sampling importance resampling method. A few applications
are shown with simulated and real data sets.
Journal: Journal of Applied Statistics
Pages: 619-641
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922287
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922287
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:619-641
Template-Type: ReDIF-Article 1.0
Author-Name: Sharifah Sakinah Aidid
Author-X-Name-First: Sharifah Sakinah
Author-X-Name-Last: Aidid
Author-Name: Mick Silver
Author-X-Name-First: Mick
Author-X-Name-Last: Silver
Title: Modelling market shares by segments using volatility
Abstract:
This paper presents the results of market share modelling for individual
segments of the UK tea market using scanner panel data. The study is novel
in its introduction of the use of volatility as one of the bases for
segmentation, others being usage, loyalty or switching between product
types and product forms. The segmentation is undertaken on an a priori,
quasi-experimental basis, allowing nested tests of constancy of
elasticities across segments. The estimated equations (using seemingly
unrelated regressions) benefit from extensive specification, including
four different forms for the price variable, four variables for
promotion, and six for product characteristic, distribution and
macroeconomic variables. Tests for the constancy of the parameters across
segments show the segmentation to be successful.
Journal: Journal of Applied Statistics
Pages: 643-660
Issue: 5
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922296
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922296
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:5:p:643-660
Template-Type: ReDIF-Article 1.0
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Author-Name: Gilberto De Araújo Pereira
Author-X-Name-First: Gilberto De Araújo
Author-X-Name-Last: Pereira
Title: Use of exponential power distributions for mixture models in the presence of covariates
Abstract:
In this paper, we present a Bayesian analysis of exponential power
mixture models in the presence of a covariate. Considering Gibbs sampling
with Metropolis-Hastings algorithms, we obtain Monte Carlo estimates for
the posterior quantities of interest.
Journal: Journal of Applied Statistics
Pages: 669-679
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922115
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:669-679
Template-Type: ReDIF-Article 1.0
Author-Name: Yssa Dewoody
Author-X-Name-First: Yssa
Author-X-Name-Last: Dewoody
Author-Name: V. T. Gururaj
Author-X-Name-First: V. T.
Author-X-Name-Last: Gururaj
Author-Name: Clyde Martin
Author-X-Name-First: Clyde
Author-X-Name-Last: Martin
Title: Assessing risk for rare events
Abstract:
This paper develops a method for assessing the risk for rare events based
on the following scenario. There exists a large population with an unknown
percentage p of defects. A sample of size N is drawn from the population
and, in the sample, 0 defects are drawn. Given these data, we want to
determine the probability that no more than n defects will be found in
another random sample of size N drawn from the population. Estimates on the
range of p and n are calculated from a derived joint distribution which
depends on p, n and N. Asymptotic risk results based on an infinite sample
are then developed. It is shown that these results are applicable even
with relatively small sample spaces.
Journal: Journal of Applied Statistics
Pages: 681-687
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922124
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922124
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:681-687
Template-Type: ReDIF-Article 1.0
Author-Name: L. Duchateau
Author-X-Name-First: L.
Author-X-Name-Last: Duchateau
Author-Name: D. L. Berkvens
Author-X-Name-First: D. L.
Author-X-Name-Last: Berkvens
Author-Name: G. J. Rowlands
Author-X-Name-First: G. J.
Author-X-Name-Last: Rowlands
Title: Decision rules for small vaccine experiments with binary outcomes based on conditional and expected power and size of the Fisher-exact test
Abstract:
Vaccine experiments with a binary outcome typically use a small number of
animals for financial and ethical reasons. The choice of a design,
characterized by the total number of animals and the allocation of animals
to treated and control groups, needs to be based on an assessment of
change in expected size and power, with corresponding changes in the
nominal significance level. This paper shows how an analysis of the
conditional and the expected size and power of the Fisher-exact test,
given predicted values for the proportions of success in control and
treated groups, can lead to appropriate decision rules.
Journal: Journal of Applied Statistics
Pages: 689-699
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922133
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922133
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:689-699
Template-Type: ReDIF-Article 1.0
Author-Name: Yue Fang
Author-X-Name-First: Yue
Author-X-Name-Last: Fang
Author-Name: John Zhang
Author-X-Name-First: John
Author-X-Name-Last: Zhang
Title: Performance of control charts for autoregressive conditional heteroscedastic processes
Abstract:
This paper examines the robustness of control schemes to data conditional
heteroscedasticity. Overall, the results show that the control schemes
which do not account for heteroscedasticity fail in providing reliable
information on the status of the process. Consequently, incorrect
conclusions will be drawn by applying these procedures in the presence of
data conditional heteroscedasticity. Control charts with time-varying
control limits are shown to be useful in that context.
Journal: Journal of Applied Statistics
Pages: 701-714
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922142
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922142
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:701-714
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Harris
Author-X-Name-First: Peter
Author-X-Name-Last: Harris
Author-Name: Mark Hann
Author-X-Name-First: Mark
Author-X-Name-Last: Hann
Author-Name: Simon Kirby
Author-X-Name-First: Simon
Author-X-Name-Last: Kirby
Author-Name: John Dearden
Author-X-Name-First: John
Author-X-Name-Last: Dearden
Title: Interval estimation of the median effective dose for a logistic dose-response curve
Abstract:
In 1986, Williams showed how, assuming a logistic dose-response curve,
one can construct a confidence interval for the median effective dose from
the asymptotic likelihood ratio test. He gave reasons for preferring this
likelihood ratio interval to the established interval calculated by
applying Fieller's theorem to the maximum-likelihood estimates. Here, we
assess the impact of applying a Bartlett adjustment to the likelihood
ratio statistic and introduce the score test as an alternative approach
for constructing a confidence interval for the median effective dose.
Journal: Journal of Applied Statistics
Pages: 715-722
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922151
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922151
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:715-722
Template-Type: ReDIF-Article 1.0
Author-Name: Ulric Lund
Author-X-Name-First: Ulric
Author-X-Name-Last: Lund
Title: Least circular distance regression for directional data
Abstract:
Least-squares regression is not appropriate when the response variable is
circular, and can lead to erroneous results. The reason for this is that
the squared difference is not an appropriate measure of distance on the
circle. In this paper, a circular analog to least-squares regression is
presented for predicting a circular response variable by another circular
variable and a set of linear covariates. An alternative maximum-likelihood
formulation yields the same regression parameter estimates. Under the
maximum-likelihood model, asymptotic standard errors of the parameter
estimates are obtained. As an example, the regression model is used to
model data from a marine biology study.
Journal: Journal of Applied Statistics
Pages: 723-733
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922160
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922160
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:723-733
Template-Type: ReDIF-Article 1.0
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Author-Name: A. N. Walder
Author-X-Name-First: A. N.
Author-X-Name-Last: Walder
Author-Name: E. Berry
Author-X-Name-First: E.
Author-X-Name-Last: Berry
Author-Name: D. Sharples
Author-X-Name-First: D.
Author-X-Name-Last: Sharples
Author-Name: P. A. Millner
Author-X-Name-First: P. A.
Author-X-Name-Last: Millner
Author-Name: R. A. Dickson
Author-X-Name-First: R. A.
Author-X-Name-Last: Dickson
Title: Assessing spinal shape
Abstract:
Idiopathic scoliosis is the most common spinal deformity, affecting
perhaps as many as 5% of children. Early recognition of the condition is
essential for optimal treatment. A widely used technique for
identification is based on a somewhat crude angle measurement from a
frontal spinal X-ray. Here, we provide a technique and new summary
statistical measures for classifying spinal shape, and present results
obtained from clinical X-rays.
Journal: Journal of Applied Statistics
Pages: 735-745
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922179
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922179
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:735-745
Template-Type: ReDIF-Article 1.0
Author-Name: A. J. Scallan
Author-X-Name-First: A. J.
Author-X-Name-Last: Scallan
Title: Fitting a mixture distribution to complex censored survival data using generalized linear models
Abstract:
Mixture models may arise for a variety of reasons in survival data
analysis. This paper shows how such models that involve potentially
complex cross-classification by covariates may be easily fitted using a
package such as GLIM. The method employs an auxiliary Poisson-binomial
model in order to find the maximum-likelihood estimates of the model
parameters, and has been implemented using GLIM macros.
Journal: Journal of Applied Statistics
Pages: 747-753
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922188
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922188
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:747-753
Template-Type: ReDIF-Article 1.0
Author-Name: Yuehjen Shao
Author-X-Name-First: Yuehjen
Author-X-Name-Last: Shao
Author-Name: Yue-Fa Lin
Author-X-Name-First: Yue-Fa
Author-X-Name-Last: Lin
Author-Name: Soe-Tsyr Yuan
Author-X-Name-First: Soe-Tsyr
Author-X-Name-Last: Yuan
Title: Integrated application of time series multiple-interventions analysis and knowledge-based reasoning
Abstract:
This study examines the data that result from multiple promotional
strategies when the data are autocorrelated. Time series intervention
analysis is the traditional way to analyze such data, focusing on the
effects of a single or a few interventions. Time series intervention
analysis delivers good results, provided that there is a known and
predetermined schedule of future interventions. This study opts for a
different type of analysis. Instead of adopting the traditional time
series intervention analysis with only one or a few interventions, this
study explores the possibility of integrating time series intervention
analysis and a knowledge-based system to analyze multiple-interventions
data. This integrated approach does not require attempts to ascertain the
effects of future interventions. Through the analysis of actual promotion
data, this study shows the benefits of using the proposed method.
Journal: Journal of Applied Statistics
Pages: 755-766
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922197
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922197
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:755-766
Template-Type: ReDIF-Article 1.0
Author-Name: I. H. Tajuddin
Author-X-Name-First: I. H.
Author-X-Name-Last: Tajuddin
Title: A comparison between two simple measures of skewness
Abstract:
In 1995, Arnold and Groeneveld introduced the measure of skewness gammaM
in terms of F(mode), the cumulative probability of a random variable less
than or equal to the mode of the distribution. They assumed that the mode
of a distribution exists and is unique. Independently, in 1996, the
present author arrived at the measure of skewness T, which is given in
terms of F(mean). This measure possesses desirable properties and is
equally simple. The measure gammaM satisfies -1 <= gammaM <= 1, with 1 (-1)
indicating extreme right (left) skewness. However, the measure T can take
on any value on the real line; hence, an equivalent measure gammaT is
considered and is compared with gammaM. We consider a variety of families
of distributions and include in our study other measures of skewness of
interest. Skewness values are easily obtained using MINITAB programs.
Journal: Journal of Applied Statistics
Pages: 767-774
Issue: 6
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922205
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922205
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:6:p:767-774
Template-Type: ReDIF-Article 1.0
Author-Name: José Nilo Binongo
Author-X-Name-First: José Nilo
Author-X-Name-Last: Binongo
Author-Name: M. W. A. Smith
Author-X-Name-First: M. W. A.
Author-X-Name-Last: Smith
Title: A bridge between statistics and literature: The graphs of Oscar Wilde's literary genres
Abstract:
The availability of computing devices and the proliferation of electronic
texts (the so-called 'e-texts') in centres for literary and linguistic
computing in major universities have encouraged non-traditional
applications of statistics. With the drudgery of computation and text
encoding diminished, research in the field of computational stylistics is
accelerating. In this paper, it is shown how projections onto the
Cartesian plane of 25-dimensional vectors related to the frequency of
occurrence of 25 prepositions can distinguish between Oscar Wilde's plays
and essays. Such an application illustrates that it is possible to find
unusual and intriguing examples of how statistics can impinge on
unexpected territory.
Journal: Journal of Applied Statistics
Pages: 781-787
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922025
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922025
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:781-787
Template-Type: ReDIF-Article 1.0
Author-Name: Zhenmin Chen
Author-X-Name-First: Zhenmin
Author-X-Name-Last: Chen
Title: A simple exact method for testing hypotheses about the shape parameter of a log-normal distribution
Abstract:
The log-normal distribution is a useful lifetime distribution in many
areas. The survival function of a log-normal distribution cannot be
expressed in closed form. This makes it difficult to develop exact
statistical methods for parameter estimation when censoring occurs. This
article proposes a simple and exact method for conducting statistical
tests about the shape parameter of a log-normal distribution. Necessary
tables are provided based on Monte Carlo simulation. The method can be
used for type II censored data. Compared with existing exact methods,
this method uses fewer tables and involves simpler calculations.
Journal: Journal of Applied Statistics
Pages: 789-805
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922034
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922034
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:789-805
Template-Type: ReDIF-Article 1.0
Author-Name: Paul De Bruin
Author-X-Name-First: Paul
Author-X-Name-Last: De Bruin
Author-Name: Philip Hans Franses
Author-X-Name-First: Philip Hans
Author-X-Name-Last: Franses
Title: Forecasting power-transformed time series data
Abstract:
When there is an interest in forecasting the growth rates as well as the
levels of a single macro-economic time series, a practitioner faces the
question of whether a forecasting model should be constructed for growth
rates, for levels, or for both. In this paper, we investigate this issue
for 10 US (un-)employment series, where we evaluate the forecasts from a
non-linear time series model for power-transformed data. Our main finding
is that models for growth rates (levels) do not automatically result in
the most accurate forecasts of growth rates (levels).
Journal: Journal of Applied Statistics
Pages: 807-815
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922043
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922043
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:807-815
Template-Type: ReDIF-Article 1.0
Author-Name: Yadolah Dodge
Author-X-Name-First: Yadolah
Author-X-Name-Last: Dodge
Author-Name: Ali Hadi
Author-X-Name-First: Ali
Author-X-Name-Last: Hadi
Title: Simple graphs and bounds for the elements of the hat matrix
Abstract:
In regression analysis, the matrix H = X(X'X)^{-1}X' is known as the 'hat'
or 'projection' matrix, among other names. It has been studied by many
authors from different perspectives. The main area of study has been the
type of measure best adapted to detect leverage points in linear
regression. For computational reasons, these measures were originally
based on the diagonal elements of the hat matrix. In the present paper, we
propose a very simple procedure for identifying leverage groups. The
procedure is based on upper and lower bounds for the diagonal and the
off-diagonal elements of H. These upper and lower bounds can easily be
shown on an index plot of the elements of H.
Journal: Journal of Applied Statistics
Pages: 817-823
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922052
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:817-823
Template-Type: ReDIF-Article 1.0
Author-Name: T. Lehtonen
Author-X-Name-First: T.
Author-X-Name-Last: Lehtonen
Author-Name: J. -O. Malmberg
Author-X-Name-First: J. -O.
Author-X-Name-Last: Malmberg
Title: Do two competing frequencies differ significantly?
Abstract:
When testing the equality of two population frequencies, one well-known
and common situation is that the test is based on two independent samples.
In this paper, we consider the other interesting case, in which the
comparison is actually within a single population and the test is based on
a single sample.
Journal: Journal of Applied Statistics
Pages: 825-830
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922061
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922061
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:825-830
Template-Type: ReDIF-Article 1.0
Author-Name: Ralf Ostermark
Author-X-Name-First: Ralf
Author-X-Name-Last: Ostermark
Author-Name: Rune Hoglund
Author-X-Name-First: Rune
Author-X-Name-Last: Hoglund
Title: Simulating competing cointegration tests in a bivariate system
Abstract:
In this paper, we consider the size and power of a set of cointegration
tests in a number of Monte Carlo simulations. The behaviour of the
competing methods is investigated in different situations, including
different levels of variance and correlation in the error processes. The
impact of violations of the common factor restriction (CFR) implied by the
Engle-Granger framework is studied in these situations. The reactions to
changes in the CFR condition depend on the error correlation. When the
correlation is non-positive, the power increases with increasing CFR
violations for the error correction model (ECM) test, while the other
tests react in the opposite direction. We also note the reaction to
differences in the error variances in the data-generating process. For
positive correlation and equal variances, the reaction to changes in the
CFR violations differs somewhat between the tests. We conclude that the
ECM and the Z-tests show the best performance over different parameter
combinations. In most situations the ECM is best. Therefore, if we had to
recommend a unit root test, it would be the ECM, especially for small
samples. However, we do not think that one should use just one test, but
two or more. Of course, the portfolio of tests we have considered here
only represents a subset of the possible tests.
Journal: Journal of Applied Statistics
Pages: 831-846
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922070
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922070
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:831-846
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph Pigeon
Author-X-Name-First: Joseph
Author-X-Name-Last: Pigeon
Author-Name: Joseph Heyse
Author-X-Name-First: Joseph
Author-X-Name-Last: Heyse
Title: A cautionary note about assessing the fit of logistic regression models
Abstract:
Logistic regression is a popular method of relating a binary response to
one or more potential covariables or risk factors. In 1980, Hosmer and
Lemeshow proposed a method for assessing the goodness of fit of logistic
regression models. This test is based on a chi-squared statistic that
compares the observed and expected cell frequencies in the 2 × g table, as
found by sorting the observations by predicted probabilities and forming g
groups. We have noted that the test may be sensitive to situations where
there are low expected cell frequencies. Further, several commonly used
statistical packages apply the Hosmer-Lemeshow test, but do so in
different ways, and none of the packages we considered alerted the user to
the potential difficulty with low expected cell frequencies. An alternative
goodness-of-fit test is illustrated which seems to offer an advantage
over the popular Hosmer-Lemeshow test, by reducing the likelihood of small
expected counts and, potentially, sharpening the interpretation. An
example is provided which demonstrates these ideas.
Journal: Journal of Applied Statistics
Pages: 847-853
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922089
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922089
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:847-853
Template-Type: ReDIF-Article 1.0
Author-Name: Peiming Wang
Author-X-Name-First: Peiming
Author-X-Name-Last: Wang
Author-Name: Martin Puterman
Author-X-Name-First: Martin
Author-X-Name-Last: Puterman
Title: Markov Poisson regression models for discrete time series. Part 1: Methodology
Abstract:
This paper proposes and investigates a class of Markov Poisson regression
models in which Poisson rate functions of covariates are conditional on
unobserved states which follow a finite-state Markov chain. Features of
the proposed model, estimation, inference, bootstrap confidence intervals,
model selection and other implementation issues are discussed. Monte Carlo
studies suggest that the proposed estimation method is accurate and
reliable for single- and multiple-subject time series data; the choice of
starting probabilities for the Markov process has little effect on the
parameter estimates; and penalized likelihood criteria are reliable for
determining the number of states. Part 2 provides applications of the
proposed model.
Journal: Journal of Applied Statistics
Pages: 855-869
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922098
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922098
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:855-869
Template-Type: ReDIF-Article 1.0
Author-Name: Peiming Wang
Author-X-Name-First: Peiming
Author-X-Name-Last: Wang
Author-Name: Martin Puterman
Author-X-Name-First: Martin
Author-X-Name-Last: Puterman
Title: Markov Poisson regression models for discrete time series. Part 2: Applications
Abstract:
This paper applies the Markov Poisson regression methodology of Wang and
Puterman to the analysis of seizure frequencies in an epilepsy clinical
trial and counts of poliomyelitis cases. The analysis of the poliomyelitis
data is compared with that of Zeger.
Journal: Journal of Applied Statistics
Pages: 871-882
Issue: 7
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769922106
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769922106
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:7:p:871-882
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: M. Kalyanasundaram
Author-X-Name-First: M.
Author-X-Name-Last: Kalyanasundaram
Title: Determination of conditional double sampling scheme
Abstract:
In this paper, a new sampling scheme called the 'conditional double
sampling scheme' (CDSS) is proposed. A compact table is presented
for the selection of a CDSS indexed by various combinations of entry
parameters. Advantages of the CDSS over the single sampling scheme are
discussed. The basis for the construction of the table is given.
Journal: Journal of Applied Statistics
Pages: 893-902
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921909
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921909
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:893-902
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Title: Specification limit under a quality loss function
Abstract:
The purpose of this paper is to present the problem of selecting a lower
specification limit under Taguchi's quality loss function. Considering
that the product quality characteristic obeys an exponential distribution,
we propose a modification of the method of Kapur and Wang for the economic
design of the specification limit.
Journal: Journal of Applied Statistics
Pages: 903-908
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921918
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921918
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:903-908
Template-Type: ReDIF-Article 1.0
Author-Name: John Cooper
Author-X-Name-First: John
Author-X-Name-Last: Cooper
Title: Artificial neural networks versus multivariate statistics: An application from economics
Abstract:
An artificial neural network is a computer model that mimics the brain's
ability to classify patterns or to make forecasts based on past
experience. This paper explains the underlying theory of the widely used
back-propagation algorithm and applies this procedure to a problem from
the field of international economics, namely the identification of
countries that are likely to seek a rescheduling of their international
debt-service obligations. A comparison of the results with those obtained
from three multivariate statistical procedures applied to the same data
set suggests that neural networks are worthy of consideration by the
applied economist.
Journal: Journal of Applied Statistics
Pages: 909-921
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921927
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921927
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:909-921
Template-Type: ReDIF-Article 1.0
Author-Name: Graham Horgan
Author-X-Name-First: Graham
Author-X-Name-Last: Horgan
Title: Using wavelets for data smoothing: A simulation study
Abstract:
Wavelet shrinkage has been proposed as a highly adaptable approach to
signal smoothing, which can produce optimum results in some senses. This
paper examines the performance of the method as a function of its
parameters, by simulation for time series showing gradual, rapid and
discontinuous variations, for a range of signal-to-noise ratios. Some
general conclusions are drawn. The effects of the choice of wavelet,
choice of threshold and choice of resolution cut-off are considered. The
use of the residual autocorrelation as a diagnostic tool is suggested.
Journal: Journal of Applied Statistics
Pages: 923-932
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921936
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921936
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:923-932
Template-Type: ReDIF-Article 1.0
Author-Name: Juneyoung Lee
Author-X-Name-First: Juneyoung
Author-X-Name-Last: Lee
Author-Name: Andre Khuri
Author-X-Name-First: Andre
Author-X-Name-Last: Khuri
Title: Graphical technique for comparing designs for random models
Abstract:
Methods for comparing designs for a random (or mixed) linear model have
focused primarily on criteria based on single-valued functions. In
general, these functions are difficult to use, because of their complex
forms, in addition to their dependence on the model's unknown variance
components. In this paper, a graphical approach is presented for comparing
designs for random models. The one-way model is used for illustration. The
proposed approach is based on using quantiles of an estimator of a
function of the variance components. The dependence of these quantiles on
the true values of the variance components is depicted by plotting the
so-called quantile dispersion graphs (QDGs), which provide a comprehensive
picture of the quality of estimation obtained with a given design. The
QDGs can therefore be used to compare several candidate designs. Two
methods of estimation of variance components are considered, namely
analysis of variance and maximum-likelihood estimation.
Journal: Journal of Applied Statistics
Pages: 933-947
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921945
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921945
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:933-947
Template-Type: ReDIF-Article 1.0
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Title: Directional statistics and shape analysis
Abstract:
This paper highlights distributional connections between directional
statistics and shape analysis. In particular, we provide a test of
uniformity for highly dispersed shapes, using the standard techniques of
directional statistics. We exploit the isometric transformation from
triangular shapes to a sphere in three dimensions, to provide a rich class
of shape distributions. A link between the Fisher distribution and the
complex Bingham distribution is re-examined. Some extensions to
higher-dimensional shapes are outlined.
Journal: Journal of Applied Statistics
Pages: 949-957
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921954
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:949-957
Template-Type: ReDIF-Article 1.0
Author-Name: Antonietta Mira
Author-X-Name-First: Antonietta
Author-X-Name-Last: Mira
Title: Distribution-free test for symmetry based on Bonferroni's measure
Abstract:
We propose a test based on Bonferroni's measure of skewness. The test
detects the asymmetry of a distribution function about an unknown median.
We study the asymptotic distribution of the given test statistic and
provide a consistent estimate of its variance. The asymptotic relative
efficiency of the proposed test is computed along with Monte Carlo
estimates of its power. This allows us to perform a comparison of the test
based on Bonferroni's measure with other tests for symmetry.
Journal: Journal of Applied Statistics
Pages: 959-972
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921963
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921963
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:959-972
Template-Type: ReDIF-Article 1.0
Author-Name: Ray Okafor
Author-X-Name-First: Ray
Author-X-Name-Last: Okafor
Title: Using an empirical Bayes model to estimate currency exchange rate
Abstract:
An empirical Bayes (EB) model to estimate the exchange rate of a national
currency is described. The national currency under consideration is
typically non-convertible, and is generally associated with a weak economy
of a Third World country. We take the Nigerian currency as an example.
Using theta as a generic notation for the exchange rate parameter, a
sequence of sample mean estimates theta_i^MN (i = 1, 2, …, m) is
generated over m time periods. An EB model is formulated for the
theta_i^MN, from which the empirical Bayes estimates theta_i^EB are
calculated. The performances of theta^EB and the Central Bank of Nigeria
(CBN) estimate theta^CBN are compared. On several performance measures,
theta^EB is shown to be superior to theta^CBN.
Journal: Journal of Applied Statistics
Pages: 973-983
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921972
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921972
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:973-983
Template-Type: ReDIF-Article 1.0
Author-Name: Paulo Rodrigues
Author-X-Name-First: Paulo
Author-X-Name-Last: Rodrigues
Author-Name: Denise Osborn
Author-X-Name-First: Denise
Author-X-Name-Last: Osborn
Title: Performance of seasonal unit root tests for monthly data
Abstract:
This paper uses Monte Carlo simulations to analyze the performance of
several seasonal unit root tests for monthly time series. The tests are
those of Dickey, Hasza and Fuller (DHF), Hylleberg, Engle, Granger and Yoo
(HEGY), and Osborn, Chui, Smith and Birchenhall (OCSB). The unit root test
of Dickey and Fuller (DF) is also considered. The results indicate that
users have to be particularly cautious when applying the monthly version
of the HEGY test. In general, the DHF and OCSB tests are preferable in
terms of size and power, but these procedures may impose invalid
restrictions. An empirical illustration is undertaken for UK two-digit
industrial production indicators.
Journal: Journal of Applied Statistics
Pages: 985-1004
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921981
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921981
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:985-1004
Template-Type: ReDIF-Article 1.0
Author-Name: Rosa Bernardini Papalia
Author-X-Name-First: Rosa Bernardini
Author-X-Name-Last: Papalia
Title: Local generalized method of moments estimation based on kernel weights: An application to panel data
Abstract:
This paper presents and applies a local generalized method of moments
(LGMM) estimator for regression functions. The method is an extension of
previous results obtained by Gozalo and Linton. The LGMM estimation
procedure can be applied to estimate a mean regression function and its
derivatives at an interior point x , without making explicit assumptions
about its functional form. The method has been applied to estimate dynamic
models based on panel data.
Journal: Journal of Applied Statistics
Pages: 1005-1015
Issue: 8
Volume: 26
Year: 1999
X-DOI: 10.1080/02664769921990
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664769921990
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:26:y:1999:i:8:p:1005-1015
Template-Type: ReDIF-Article 1.0
Author-Name: R. D. Baker
Author-X-Name-First: R. D.
Author-X-Name-Last: Baker
Title: Application of a new discrete distribution
Abstract:
In epidemiology, an infection lasting n weeks may be monitored by taking
weekly serum samples. If tests on samples are independent Bernoulli trials
with probability q of correctly testing positive, the apparent duration of
infection (from the first positive test to the last positive test
inclusive) may be less than n weeks. This distribution of apparent length
also arises when plants in a row of n each have a probability q of
germinating, for example. This distribution is shown to be related to that
of the number of tails obtained when tossing a coin until two heads are
obtained, in a maximum of n tosses. The properties of the 'apparent
length' distribution are described, and some compounded (mixed)
distributions that can be derived from it are also discussed. The
distribution was used to estimate the underlying distribution of the
duration of infection, in a longitudinal study of infections of children.
The methodology was also used to estimate the proportion of infectious
episodes that were not detected. It can be similarly used to correct
episode durations and rates in longitudinal studies in which episodes of
any kind are detected by regular sampling.
Journal: Journal of Applied Statistics
Pages: 5-21
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021790
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021790
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:5-21
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: M. Kalyanasundaram
Author-X-Name-First: M.
Author-X-Name-Last: Kalyanasundaram
Title: Generalized tightened two-level continuous sampling plans
Abstract:
In 1955, Lieberman and Solomon introduced multi-level (MLP) continuous
sampling plans. Derman et al. then extended the multi-level plans as
tightened multi-level plans (MLP-T). In this paper, a generalization of
MLP-T with two sampling levels is presented. Using a Markov chain model,
expressions for the performance measures of the general MLP-T plans are
derived. Tables are also presented for the selection of general MLP-T
plans with two sampling levels when the acceptable quality level, limiting
quality level, indifference quality level and average outgoing quality
level are specified.
Journal: Journal of Applied Statistics
Pages: 23-38
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021808
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021808
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:23-38
Template-Type: ReDIF-Article 1.0
Author-Name: Olcay Arslan
Author-X-Name-First: Olcay
Author-X-Name-Last: Arslan
Author-Name: Nedret Billor
Author-X-Name-First: Nedret
Author-X-Name-Last: Billor
Title: Robust Liu estimator for regression based on an M-estimator
Abstract:
Consider the regression model y = beta_0 1 + X beta + epsilon. Recently,
the Liu estimator, which is an alternative biased estimator
beta_L(d) = (X'X + I)^{-1}(X'X + dI) beta_OLS, where 0 < d < 1 is a
parameter, has been proposed to overcome multicollinearity. The advantage
of beta_L(d) over the ridge estimator beta_R(k) is that beta_L(d) is a
linear function of d. Therefore, it is easier to choose d than to choose
k in the ridge estimator. However, beta_L(d) is obtained by shrinking the
ordinary least squares (OLS) estimator using the matrix
(X'X + I)^{-1}(X'X + dI), so the presence of outliers in the y direction
may affect the beta_L(d) estimator. To cope with this combined problem of
multicollinearity and outliers, we propose an alternative class of
Liu-type M-estimators (LM-estimators), obtained by shrinking an
M-estimator beta_M, instead of the OLS estimator, using the matrix
(X'X + I)^{-1}(X'X + dI).
Journal: Journal of Applied Statistics
Pages: 39-47
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021817
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021817
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:39-47
Template-Type: ReDIF-Article 1.0
Author-Name: Jyoti Divecha
Author-X-Name-First: Jyoti
Author-X-Name-Last: Divecha
Title: Search for suitable incomplete block designs for complete diallel cross systems
Abstract:
The suitability of incomplete block designs for each complete diallel
cross system I, II, III and IV, under the general genetic model is
examined, and a set of necessary conditions obtained. In this connection,
modifications in available designs are suggested and illustrated. A table
of suitable designs with higher efficiency for complete diallel cross
systems is presented.
Journal: Journal of Applied Statistics
Pages: 49-62
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021826
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021826
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:49-62
Template-Type: ReDIF-Article 1.0
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Author-Name: Simon P. J. Kirby
Author-X-Name-First: Simon P. J.
Author-X-Name-Last: Kirby
Author-Name: Peter Harris
Author-X-Name-First: Peter
Author-X-Name-Last: Harris
Author-Name: John C. Dearden
Author-X-Name-First: John C.
Author-X-Name-Last: Dearden
Title: Interval estimation of the 90% effective dose: A comparison of bootstrap resampling methods with some large-sample approaches
Abstract:
A number of recent studies have looked at the coverage probabilities of
various common parametric methods of interval estimation of the median
effective dose (ED50) for a logistic dose-response curve. There has been
comparatively little work done on more extreme effective doses. In this
paper, the interval estimation of the 90% effective dose (ED90) will be
of principal interest. We provide a comparison of four parametric methods
of interval construction with four methods based on bootstrap resampling.
Journal: Journal of Applied Statistics
Pages: 63-73
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021835
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021835
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:63-73
Template-Type: ReDIF-Article 1.0
Author-Name: Takafumi Isogai
Author-X-Name-First: Takafumi
Author-X-Name-Last: Isogai
Title: Analysis of factorial experiments for survival data with long-tailed distributions
Abstract:
Two left-truncated survival data sets are collected in one-way factorial
designs to examine the quality of products. We cannot specify our survival
function completely, and only know that the tail has a power functional
form of its argument. Thus, our problem is a left-truncated one with
incomplete survivor functions. One of our data sets is the case where the
usual analysis of variance (ANOVA) may be adapted. The other is a repeated
measurement case. We note that the likelihood function is expressed as a
product of conditional and marginal likelihood functions. Estimates of
power parameters are always obtained by the conditional likelihood.
Location parameters describing treatment effects are included in the
marginal likelihood only, and their estimates are undetermined, because of
missing values resulting from left truncation. However, in the ANOVA case,
we show that a common structure of power parameters and some simple
assumptions about the missing values enable us to construct an approximate
F test for treatment effects through the marginal likelihood. This result
is extended to a regression case. With the data in repeated measurements,
a systematic variation of the power parameters and an apparent deviation
from our presupposed model make an application of the ANOVA mentioned
impossible, and compel us to generalize our model. By using the ratio of
those generalized models, we show that a descriptive model for evaluating
treatment effects can be constructed.
Journal: Journal of Applied Statistics
Pages: 75-101
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021844
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021844
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:75-101
Template-Type: ReDIF-Article 1.0
Author-Name: Dejian Lai
Author-X-Name-First: Dejian
Author-X-Name-Last: Lai
Author-Name: Barry Davis
Author-X-Name-First: Barry
Author-X-Name-Last: Davis
Author-Name: Robert Hardy
Author-X-Name-First: Robert
Author-X-Name-Last: Hardy
Title: Fractional Brownian motion and clinical trials
Abstract:
The purpose of this paper is to extend the widely used classical Brownian
motion technique for monitoring clinical trial data to a larger class of
stochastic processes, i.e. fractional Brownian motion, and compare these
results. The beta-blocker heart attack trial is presented as an example to
illustrate both methods.
Journal: Journal of Applied Statistics
Pages: 103-108
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021853
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021853
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:103-108
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Nelder
Author-X-Name-First: J. A.
Author-X-Name-Last: Nelder
Title: Functional marginality and response-surface fitting
Abstract:
Well-formed polynomials contain the marginal terms of all terms; for
example, they contain both x_1 and x_2 if x_1 x_2 is present. Such models
have a goodness of fit that is invariant to linear transformations of the
x variables. Recently, selection procedures have been proposed which may
not give well-formed polynomials. Analysis of two data sets for which
non-well-formed polynomials have been selected shows that conversion to
well-formed polynomials is beneficial in terms of goodness of fit, as well
as giving fits invariant to linear transformation of the x variables. It
is concluded that selection procedures should search among well-formed
polynomials only.
Journal: Journal of Applied Statistics
Pages: 109-112
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021862
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021862
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:109-112
Template-Type: ReDIF-Article 1.0
Author-Name: Erhard Reschenhofer
Author-X-Name-First: Erhard
Author-X-Name-Last: Reschenhofer
Title: Modification of autoregressive fractionally integrated moving average models for the estimation of persistence
Abstract:
In this paper, it is proposed to modify autoregressive fractionally
integrated moving average (ARFIMA) processes by introducing an additional
parameter to comply with the criticism of Hauser et al. (1999) that
ARFIMA processes are not appropriate for the estimation of persistence,
because of the degenerate behavior of their spectral densities at
frequency zero. When fitting these modified ARFIMA processes to the US
GNP, it turns out that the estimated spectra are very similar to those
obtained with conventional ARFIMA models, indicating that, in this special
case, the disadvantage of ARFIMA models cited by Hauser et al. (1999) does
not seriously affect the estimation of persistence. However, according to
the results of a goodness-of-fit test applied to the estimated spectra,
both the ARFIMA models and the modified ARFIMA models seem to overfit the
data in the neighborhood of frequency zero.
Journal: Journal of Applied Statistics
Pages: 113-118
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021871
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021871
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:113-118
Template-Type: ReDIF-Article 1.0
Author-Name: Jon Vilasuso
Author-X-Name-First: Jon
Author-X-Name-Last: Vilasuso
Author-Name: David Katz
Author-X-Name-First: David
Author-X-Name-Last: Katz
Title: Estimates of the likelihood of extreme returns in international stock markets
Abstract:
This study applies extreme-value theory to daily international
stock-market returns to determine (1) whether or not returns follow a
heavy-tailed stable distribution, (2) the likelihood of an extreme return,
such as a 20% drop in a single day, and (3) whether or not the likelihood
of an extreme event has changed since October 1987. Empirical results
reject a heavy-tailed stable distribution for returns. Instead, a
Student-t distribution or an autoregressive conditional heteroscedastic
process is better able to capture the salient features of returns. We find
that the likelihood of a large single-day return differs widely across
markets and, for the G-7 countries, the 1987 stock-market drop appears to
be largely an isolated event. A drop of this magnitude, however, is not
rare in the case of Hong Kong. Finally, there is only limited evidence
that the chance of a large single-day decline is more likely since the
October 1987 market drop; however, exceptions include stock markets in
Germany, The Netherlands and the UK.
Journal: Journal of Applied Statistics
Pages: 119-130
Issue: 1
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021880
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021880
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:1:p:119-130
Template-Type: ReDIF-Article 1.0
Author-Name: David Aadland
Author-X-Name-First: David
Author-X-Name-Last: Aadland
Title: Distribution and interpolation using transformed data
Abstract:
This paper addresses the distribution and interpolation of time series
that have been subject to various data transformations. Monte Carlo
experiments are performed, which suggest that failure to account for these
data transformations may lead to serious errors in estimation.
Journal: Journal of Applied Statistics
Pages: 141-156
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021682
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021682
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:141-156
Template-Type: ReDIF-Article 1.0
Author-Name: H. Oztas Ayhan
Author-X-Name-First: H. Oztas
Author-X-Name-Last: Ayhan
Title: Estimators of vital events in dual-record systems
Abstract:
Dual-record system estimation has been widely used to obtain vital events
in the past. Because of the weakness of the statistical assumptions of the
model, as well as the biases involved in the estimators, its use became
limited. The proposed estimators for dual-record systems are based on
further division of the cells of the original table. The results show
that the proposed estimators reduce the underestimation of the total
counts when compared with the classical Chandra Sekar-Deming estimator.
Journal: Journal of Applied Statistics
Pages: 157-169
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021691
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021691
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:157-169
Template-Type: ReDIF-Article 1.0
Author-Name: M. Kalyanasundaram
Author-X-Name-First: M.
Author-X-Name-Last: Kalyanasundaram
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Title: Determination of variable-lot-size attribute sampling plan indexed by the acceptable quality level and average outgoing quality level for continuous production
Abstract:
In this paper, procedures and tables for the selection of a
variable-lot-size attribute sampling plan for continuous production are
given, and the advantages of this plan relative to a fixed-lot-size plan
are also discussed.
Journal: Journal of Applied Statistics
Pages: 171-175
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021709
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021709
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:171-175
Template-Type: ReDIF-Article 1.0
Author-Name: Alberto Luceno
Author-X-Name-First: Alberto
Author-X-Name-Last: Luceno
Author-Name: George E. P. Box
Author-X-Name-First: George E. P.
Author-X-Name-Last: Box
Title: Influence of the sampling interval, decision limit and autocorrelation on the average run length in Cusum charts
Abstract:
This paper shows how the average run length for a one-sided Cusum chart
varies as a function of the length of the sampling interval between
consecutive observations, the decision limit for the Cusum statistic, and
the amount of autocorrelation between successive observations. It is shown
that the rate of false alarms can be decreased considerably, without
modifying the rate of valid alarms, by decreasing the sampling interval
and appropriately increasing the decision interval. It is also shown that
this can be done even when the shorter sampling interval induces moderate
autocorrelation between successive observations.
Journal: Journal of Applied Statistics
Pages: 177-183
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021718
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021718
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:177-183
Template-Type: ReDIF-Article 1.0
Author-Name: Victor Guerrero
Author-X-Name-First: Victor
Author-X-Name-Last: Guerrero
Title: Selecting a linearizing power transformation for time series
Abstract:
A method is proposed for choosing a power transformation that allows a
univariate time series to be adequately represented by a straight line, in
an exploratory analysis of the data. The method is quite simple and
enables the analyst to measure local and global curvature in the data. A
description of the pattern followed by the data is obtained as a
by-product of the method. A specific form of the coefficient of
determination is suggested to discriminate among several combinations of
estimates of the index of the transformation and the slope of the straight
line. Some results related to the degree of differencing required to make
the time series stationary are also exploited. The usefulness of the
proposal is illustrated with four empirical applications: two using
demographic data and two concerning market studies. These
examples are provided in line with the spirit of an exploratory analysis,
rather than as a complete or confirmatory analysis of the data.
Journal: Journal of Applied Statistics
Pages: 185-195
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021727
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021727
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:185-195
Template-Type: ReDIF-Article 1.0
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Author-Name: Wei-Ming Luh
Author-X-Name-First: Wei-Ming
Author-X-Name-Last: Luh
Title: Normalized Johnson's transformation one-sample trimmed t for non-normality
Abstract:
The present study suggests the use of the normalized Johnson
transformation trimmed t statistic in the one-sample case when the
assumption of normality is violated. The performance of the proposed
method was evaluated by Monte Carlo simulation, and was compared with the
conventional Student t statistic, the trimmed t statistic and the
normalized Johnson's transformation untrimmed t statistic respectively.
The simulated results indicate that the proposed method can control type I
error very well and that its power is greater than the other competitors
for various conditions of non-normality. The method can be easily computer
programmed and provides an alternative for the conventional t test.
Journal: Journal of Applied Statistics
Pages: 197-203
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021736
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021736
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:197-203
Template-Type: ReDIF-Article 1.0
Author-Name: R. Southworth
Author-X-Name-First: R.
Author-X-Name-Last: Southworth
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Author-Name: C. C. Taylor
Author-X-Name-First: C. C.
Author-X-Name-Last: Taylor
Title: Transformation- and label-invariant neural network for the classification of landmark data
Abstract:
One method of expressing coarse information about the shape of an object
is to describe the shape by its landmarks, which can be taken as
meaningful points on the outline of an object. We consider a situation in
which we want to classify shapes into known populations based on their
landmarks, invariant to the location, scale and rotation of the shapes. A
neural network method for transformation-invariant classification of
landmark data is presented. The method is compared with the
(non-transformation-invariant) complex Bingham rule; the two techniques
are tested on two sets of simulated data, and on data that arise from mice
vertebrae. Despite the advantage that the complex Bingham rule derives
from its use of rotation information, the neural network method compares
favourably.
Journal: Journal of Applied Statistics
Pages: 205-215
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021745
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021745
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:205-215
Template-Type: ReDIF-Article 1.0
Author-Name: Leann Myers
Author-X-Name-First: Leann
Author-X-Name-Last: Myers
Author-Name: Stephanie Broyles
Author-X-Name-First: Stephanie
Author-X-Name-Last: Broyles
Title: Regression coefficient analysis for correlated binomial outcomes
Abstract:
At present, the generalized estimating equation (GEE) and weighted
least-squares (WLS) regression methods are the most widely used methods
for analyzing correlated binomial data; both are easily implemented using
existing software packages. We propose an alternative technique, i.e.
regression coefficient analysis (RCA), for this type of data. In RCA, a
regression equation is computed for each of n individuals; regression
coefficients are averaged across the n equations to produce a regression
equation, which predicts marginal probabilities and which can be tested to
address hypotheses of different slopes between groups, slopes different
from zero, different intercepts, etc. The method is computationally simple
and can be performed using standard software. Simulations and examples are
used to compare the power and robustness of RCA with those of the standard
GEE and WLS methods. We find that RCA is comparable with the GEE method
under the conditions tested, and suggest that RCA, within specified
limitations, is a viable alternative to the GEE and WLS methods in the
analysis of correlated binomial data.
Journal: Journal of Applied Statistics
Pages: 217-234
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021754
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021754
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:217-234
Template-Type: ReDIF-Article 1.0
Author-Name: Tapio Nummi
Author-X-Name-First: Tapio
Author-X-Name-Last: Nummi
Title: Analysis of growth curves under measurement errors
Abstract:
In this paper, we propose a method for the analysis of growth curve
models when the regressor variable may also be measured with error. Two
classes of structure for errors in regressors are discussed. For complete
and balanced data, estimators for the model parameters are derived under
the maximum-likelihood framework. Numerical examples are provided to
illustrate the proposed technique.
Journal: Journal of Applied Statistics
Pages: 235-243
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021763
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021763
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:235-243
Template-Type: ReDIF-Article 1.0
Author-Name: J. Tyrcha
Author-X-Name-First: J.
Author-X-Name-Last: Tyrcha
Author-Name: R. Sundberg
Author-X-Name-First: R.
Author-X-Name-Last: Sundberg
Author-Name: P. Lindskog
Author-X-Name-First: P.
Author-X-Name-Last: Lindskog
Author-Name: B. Sundstrom
Author-X-Name-First: B.
Author-X-Name-Last: Sundstrom
Title: Statistical modelling and saddle-point approximation of tail probabilities for accumulated splice loss in fibre-optic networks
Abstract:
Tail probabilities are calculated by saddle-point approximation in a
probabilistic-statistical model for the accumulated splice loss that
results from a number of fusion splices in the installation of fibre-optic
networks. When these probabilities, representing the risk of exceeding a
specified total loss, can be controlled and kept low, the requirements on
the individual losses can be substantially relaxed from their customary
settings. As a consequence, it should be possible to save considerable
installation time and cost. The probabilistic model, which can be
theoretically motivated, states that the individual loss is basically
exponentially distributed, but with a Gaussian contribution added and
truncated at a set value, and that the loss is additive over splices. An
extensive set of installation data fitted well with this model, except for
occasional high losses. Therefore, the model described was extended to
allow for a frequency of unspecified high losses of this sort. It is also
indicated how the model parameters can be estimated from data.
Journal: Journal of Applied Statistics
Pages: 245-256
Issue: 2
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021772
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021772
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:2:p:245-256
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Amaral
Author-X-Name-First: J. A.
Author-X-Name-Last: Amaral
Author-Name: M. B. Rosario
Author-X-Name-First: M. B.
Author-X-Name-Last: Rosario
Author-Name: M. T. Paixao
Author-X-Name-First: M. T.
Author-X-Name-Last: Paixao
Title: Data and projections of HIV and AIDS in Portugal
Abstract:
Projections of AIDS incidence are critical for assessing future
healthcare needs. This paper focuses on the method of back-calculation for
obtaining forecasts. The first problem faced was the need to account for
reporting delays and underreporting of cases, and to adjust the
incidence data. The method used to estimate the reporting delay
distribution is based on Poisson regression and involves cross-classifying
each reported case by calendar time of diagnosis and reporting delay. The
adjusted AIDS incidence data are then used to obtain short-term
projections and lower bounds on the size of the AIDS epidemic. The
estimation procedure 'back-calculates' from AIDS incidence data using the
incubation period distribution to obtain estimates of the numbers
previously infected. These numbers are then projected forward. The problem
can be shown to reduce to estimating the size of a multinomial population.
The expectation-maximization (EM) algorithm is used to obtain
maximum-likelihood estimates when the density of infection times is
parametrized as a step function. The methodology is applied to AIDS
incidence data in Portugal for four different transmission categories:
injecting drug users, sexual transmission (homosexual/bisexual and
heterosexual contact) and other, mainly haemophilia and blood transfusion
related, to obtain short-term projections and an estimate of the minimum
size of the epidemic.
Journal: Journal of Applied Statistics
Pages: 269-279
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021592
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021592
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:269-279
Template-Type: ReDIF-Article 1.0
Author-Name: Ying Wang Wong
Author-X-Name-First: Ying Wang
Author-X-Name-Last: Wong
Author-Name: Siu Hung Cheung
Author-X-Name-First: Siu Hung
Author-X-Name-Last: Cheung
Title: Simultaneous pairwise multiple comparisons in a two-way analysis of covariance model
Abstract:
Pairwise comparison procedures are important and popular statistical
techniques in many disciplines, such as physiology and agrobiology. In
this paper, we seek to derive the statistical methods which enable one to
perform pairwise comparisons in a two-way analysis of covariance model.
The overall family-wise type I error rate is controlled at a designated
level. The procedures are outlined for simultaneous inferences among
treatment means. Numerical examples are given to illustrate our testing
procedure.
Journal: Journal of Applied Statistics
Pages: 281-291
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021600
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021600
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:281-291
Template-Type: ReDIF-Article 1.0
Author-Name: Kang-Mo Jung
Author-X-Name-First: Kang-Mo
Author-X-Name-Last: Jung
Title: Local influence assessment in canonical correlation analysis
Abstract:
The local influence method is adapted to canonical correlation analysis
for the purpose of investigating the influence of observations. We
consider a perturbation based on the empirical distribution function. An
illustrative example is given to show the effectiveness of the local
influence method for the identification of influential observations.
Journal: Journal of Applied Statistics
Pages: 293-301
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021619
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:293-301
Template-Type: ReDIF-Article 1.0
Author-Name: Trine Kvist
Author-X-Name-First: Trine
Author-X-Name-Last: Kvist
Author-Name: Henrik Gislason
Author-X-Name-First: Henrik
Author-X-Name-Last: Gislason
Author-Name: Poul Thyregod
Author-X-Name-First: Poul
Author-X-Name-Last: Thyregod
Title: Using continuation-ratio logits to analyze the variation of the age composition of fish catches
Abstract:
Major sources of information for the estimation of the size of the fish
stocks and the rate of their exploitation are samples from which the age
composition of catches may be determined. However, the age composition in
the catches often varies as a result of several factors. Stratification of
the sampling is desirable, because it leads to better estimates of the age
composition, and the corresponding variances and covariances. The analysis
is impeded by the fact that the response is ordered categorical. This
paper introduces an easily applicable method to analyze such data. The
method combines continuation-ratio logits and the theory for generalized
linear mixed models. Continuation-ratio logits are designed for ordered
multinomial response and have the feature that the associated
log-likelihood splits into separate terms for each category level. Thus,
generalized linear mixed models can be applied separately to each level of
the logits. The method is illustrated by the analysis of age-composition
data collected from the Danish sandeel fishery in the North Sea in 1993.
The significance of possible sources of variation is evaluated, and
formulae for estimating the proportions of each age group and their
variance-covariance matrix are derived.
Journal: Journal of Applied Statistics
Pages: 303-319
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021628
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021628
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:303-319
Template-Type: ReDIF-Article 1.0
Author-Name: Jack Lee
Author-X-Name-First: Jack
Author-X-Name-Last: Lee
Author-Name: Kuo-Ching Liu
Author-X-Name-First: Kuo-Ching
Author-X-Name-Last: Liu
Title: Bayesian analysis of a general growth curve model with predictions using power transformations and AR(1) autoregressive dependence
Abstract:
In this paper, we consider a Bayesian analysis of the unbalanced
(general) growth curve model with AR(1) autoregressive dependence, while
applying the Box-Cox power transformations. We propose exact, simple and
Markov chain Monte Carlo approximate parameter estimation and prediction
of future values. Numerical results are illustrated with real and
simulated data.
Journal: Journal of Applied Statistics
Pages: 321-336
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021637
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021637
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:321-336
Template-Type: ReDIF-Article 1.0
Author-Name: Yeong-Tzay Su
Author-X-Name-First: Yeong-Tzay
Author-X-Name-Last: Su
Author-Name: Chyi-Lyi Kathleen Liang
Author-X-Name-First: Chyi-Lyi Kathleen
Author-X-Name-Last: Liang
Title: Using multivariate rank sum tests to evaluate effectiveness of computer applications in teaching business statistics
Abstract:
Arguments about using computer facilities in classroom teaching have
received a lot of attention over time. Computer facilities can be helpful
for demonstrating real-world applications, while poor data or
inappropriate case studies might hinder the use of computer programs in
classroom teaching. In this paper, we examine the impact that using
computer programs to teach business statistics has on students in
the Krannert School of Management at Purdue University. The results show
that students are attracted to the interactive computer programs designed
for the business statistics course, and students are more motivated to
attend classes when computer programs are applied in teaching.
Furthermore, computer programs help students to understand confusing
topics, and students feel that teaching them to use computer facilities
really improves their own abilities to apply similar programs in analyzing
real-world problems.
Journal: Journal of Applied Statistics
Pages: 337-345
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021646
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021646
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:337-345
Template-Type: ReDIF-Article 1.0
Author-Name: Sifa Mvoi
Author-X-Name-First: Sifa
Author-X-Name-Last: Mvoi
Author-Name: Yan-Xia Lin
Author-X-Name-First: Yan-Xia
Author-X-Name-Last: Lin
Title: Criteria for estimating the variance function used in the asymptotic quasi-likelihood approach
Abstract:
The estimation of the variance function of a linear regression model used
in the asymptotic quasi-likelihood approach is considered. It is shown
that the variance function used in the determination of the asymptotic
quasi-likelihood estimates encompasses the variance functions commonly
found in the literature. Selection criteria of the most appropriate
estimate of the variance function for given data are established. These
criteria are based on a graphical technique and a chi-squared test.
Journal: Journal of Applied Statistics
Pages: 347-362
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021655
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021655
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:347-362
Template-Type: ReDIF-Article 1.0
Author-Name: Milad Sawiris
Author-X-Name-First: Milad
Author-X-Name-Last: Sawiris
Title: Optimum grouping and the boundary problem
Abstract:
Given a set of n elements or observations that form a continuous
variable, it is required to divide their distribution into k homogeneous
groups, where k > 2, and the purpose is to minimize the within-groups
variance. This paper investigates procedures for such a division and shows
how to find the boundaries separating the groups.
Journal: Journal of Applied Statistics
Pages: 363-371
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021664
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:363-371
Template-Type: ReDIF-Article 1.0
Author-Name: A. M. C. Vieira
Author-X-Name-First: A. M. C.
Author-X-Name-Last: Vieira
Author-Name: J. P. Hinde
Author-X-Name-First: J. P.
Author-X-Name-Last: Hinde
Author-Name: C. G. B. Demetrio
Author-X-Name-First: C. G. B.
Author-X-Name-Last: Demetrio
Title: Zero-inflated proportion data models applied to a biological control assay
Abstract:
Biological control of pests is an important branch of entomology,
providing environmentally friendly forms of crop protection. Bioassays are
used to find the optimal conditions for the production of parasites and
strategies for application in the field. In some of these assays,
proportions are measured and, often, these data have an inflated number of
zeros. In this work, six models will be applied to data sets obtained from
biological control assays for Diatraea saccharalis, a common pest in
sugar cane production. A natural choice for modelling proportion data is
the binomial model. The second model will be an overdispersed version of
the binomial model, estimated by a quasi-likelihood method. This model was
initially built to model overdispersion generated by individual
variability in the probability of success. When interest is only in the
positive proportion data, models can be based on the truncated binomial
distribution and its overdispersed version. The last two models include
the zero proportions and are based on a finite mixture model with the
binomial distribution or its overdispersed version for the positive data.
Here, we will present the models, discuss their estimation and compare the
results.
Journal: Journal of Applied Statistics
Pages: 373-389
Issue: 3
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760021673
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760021673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:3:p:373-389
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Title: Modified tightened two-level continuous sampling plans
Abstract:
In this paper, a modification is proposed on the tightened two-level
continuous sampling plan. The tightened two-level plan is one of the three
tightened multi-level continuous sampling plans of Derman et al. (1957)
with two sampling levels. A modified tightened two-level continuous
sampling plan is considered, for which the rules concerning partial
inspection depend, in part, on the length of time it takes to decide that
the process quality is good enough that 100% inspection may be suspended
(e.g. the time required to find i consecutive items free of defects).
Using a Markov chain model, expressions for the performance measures of
the modified MLP-T-2 plan are derived. The modified MLP-T-2 plan is shown
to be identical to the MLP-T-2 plan. Tables are also presented for the
selection of the modified MLP-T-2 plan when the AQL or LQL and AOQL are
specified.
Journal: Journal of Applied Statistics
Pages: 397-409
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003597
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003597
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:397-409
Template-Type: ReDIF-Article 1.0
Author-Name: Bei-Hung Chang
Author-X-Name-First: Bei-Hung
Author-X-Name-Last: Chang
Author-Name: Stuart Lipsitz
Author-X-Name-First: Stuart
Author-X-Name-Last: Lipsitz
Author-Name: Christine Waternaux
Author-X-Name-First: Christine
Author-X-Name-Last: Waternaux
Title: Logistic regression in meta-analysis using aggregate data
Abstract:
We derived two methods to estimate the logistic regression coefficients
in a meta-analysis when only the 'aggregate' data (mean values) from each
study are available. The estimators we proposed are the discriminant
function estimator and the reverse Taylor series approximation. These two
methods gave similar estimates in an example with individual-level data.
However, when aggregate data were used, the discriminant function
estimates were quite different from the others. A
simulation study was then performed to evaluate the performance of these
two estimators as well as the estimator obtained from the model that
simply uses the aggregate data in a logistic regression model. The
simulation study showed that all three estimators are biased. The bias
increases as the variance of the covariate increases. The distribution
type of the covariates also affects the bias. In general, the estimator
from the logistic regression using the aggregate data has less bias and
better coverage probabilities than the other two estimators. We concluded
that analysts should be cautious in using aggregate data to estimate the
parameters of the logistic regression model for the underlying individual
data.
Journal: Journal of Applied Statistics
Pages: 411-424
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003605
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003605
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:411-424
Template-Type: ReDIF-Article 1.0
Author-Name: Jan-Olof Johansson
Author-X-Name-First: Jan-Olof
Author-X-Name-Last: Johansson
Title: Modelling the surface structure of newsprint
Abstract:
The Gibbs distribution is often used to model micro-textures. This
includes a definition of a neighbourhood system. If a micro-texture
contains a large-scale variation, the neighbourhood system will be large,
which implies many parameters in the corresponding Gibbs distribution. The
estimation of the parameters for such models will be difficult and time
consuming. I suggest, in this paper, a separation of the micro-texture
into a large-scale variation and a small-scale variation and model each
source of variation with a Gibbs distribution. This method is applied on
full-tone print of newsprint to model the variation caused by print
mottle. In this application, the large-scale variation is mainly caused by
fibre flocculation and clustering and the small-scale variation contains
the variation of the fibres and fines on and between the clusters. The
separate description of these two variations makes it possible to relate
different kinds of paper qualities to the appropriate source of variation.
Journal: Journal of Applied Statistics
Pages: 425-438
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003614
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:425-438
Template-Type: ReDIF-Article 1.0
Author-Name: H. J. Khamis
Author-X-Name-First: H. J.
Author-X-Name-Last: Khamis
Title: The two-stage delta-corrected Kolmogorov-Smirnov test
Abstract:
The delta-corrected Kolmogorov-Smirnov test has been shown to be
uniformly more powerful than the classical Kolmogorov-Smirnov test for
small to moderate sample sizes. However, the delta-corrected test consists
of two tests, leading to a slight inflation of the experimentwise type I
error rate. The critical values of the delta-corrected test are adjusted
to take into account the two-stage nature of the test, ensuring an
experimentwise error rate at the nominal level. A power study confirms
that the resulting so-called two-stage delta-corrected test is uniformly
more powerful than the classical Kolmogorov-Smirnov test, with power
improvements of up to 46 percentage points.
Journal: Journal of Applied Statistics
Pages: 439-450
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003623
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:439-450
Template-Type: ReDIF-Article 1.0
Author-Name: Myung Geun Kim
Author-X-Name-First: Myung Geun
Author-X-Name-Last: Kim
Title: Outliers and influential observations in the structural errors-in-variables model
Abstract:
The influence of observations on the parameter estimates for the simple
structural errors-in-variables model with no equation error is
investigated using the local influence method. Residuals themselves are
not sufficient for detecting outliers. The likelihood displacement
approach is useful for outlier detection especially when a masking
phenomenon is present. An illustrative example is provided.
Journal: Journal of Applied Statistics
Pages: 451-460
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003632
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003632
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:451-460
Template-Type: ReDIF-Article 1.0
Author-Name: C. D. Lai
Author-X-Name-First: C. D.
Author-X-Name-Last: Lai
Author-Name: M. Xie
Author-X-Name-First: M.
Author-X-Name-Last: Xie
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Title: Study of a Markov model for a high-quality dependent process
Abstract:
For high-quality processes, non-conforming items are seldom observed and
the traditional p (or np) charts are not suitable for monitoring the state
of the process. A type of chart based on the count of cumulative
conforming items has recently been introduced and it is especially useful
for automatically collected one-at-a-time data. However, in such a case,
it is common that the process characteristics become dependent as items
produced one after another are inspected. In this paper, we study the
problem of process monitoring when the process is of high quality and
measurement values possess a certain serial dependence. The problem of
assuming independence is examined and a Markov model for this type of
process is studied, upon which suitable control procedures can be
developed.
Journal: Journal of Applied Statistics
Pages: 461-473
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003641
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003641
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:461-473
Template-Type: ReDIF-Article 1.0
Author-Name: M. M. Shoukri
Author-X-Name-First: M. M.
Author-X-Name-Last: Shoukri
Author-Name: O. Demirkaya
Author-X-Name-First: O.
Author-X-Name-Last: Demirkaya
Title: Sample size requirements to test the equality of raters' precision
Abstract:
Exact and approximate methods are developed to calculate the required
number of subjects n in a repeatability study, where repeatability is
measured by the precision of measurements made by a rater. The exact
method is based on power calculations under the non-null distribution of
the multiple coefficient of determination, which requires intensive
numerical computation. The approximate method is based on predictions from
families of non-linear curves fitted by the method of least squares.
Journal: Journal of Applied Statistics
Pages: 483-494
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003669
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:483-494
Template-Type: ReDIF-Article 1.0
Author-Name: Zheng Wang
Author-X-Name-First: Zheng
Author-X-Name-Last: Wang
Title: An algorithm for generalized monotonic smoothing
Abstract:
In this paper, an algorithm for Generalized Monotonic Smoothing (GMS) is
developed as an extension to exponential family models of the monotonic
smoothing techniques proposed by Ramsay (1988, 1998a,b). A two-step
algorithm is used to estimate the coefficients of bases and the linear
term. We show that the algorithm can be embedded into the iterative
re-weighted least square algorithm that is typically used to estimate the
coefficients in Generalized Linear Models. Thus, the GMS estimator can be
computed using existing routines in S-plus and other statistical software.
We apply the GMS model to the Down's syndrome data set and compare the
results with those from Generalized Additive Model estimation. The choice
of smoothing parameter and testing of monotonicity are also discussed.
Journal: Journal of Applied Statistics
Pages: 495-507
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003678
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003678
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:495-507
Template-Type: ReDIF-Article 1.0
Author-Name: Jon Woodroof
Author-X-Name-First: Jon
Author-X-Name-Last: Woodroof
Title: Bootstrapping: As easy as 1-2-3
Abstract:
The bootstrap is a powerful non-parametric statistical technique for
making probability-based inferences about a population parameter. Through
a Monte-Carlo resampling simulation, bootstrapping empirically generates a
statistic's entire distribution. From this simulated distribution,
inferences can be made about a population parameter. Assumptions about
normality are not required. In general, despite its power, bootstrapping
has been used relatively infrequently in social science research, and this
is particularly true for business research. This under-utilization is
likely due to a combination of a general lack of understanding of the
bootstrap technique and the difficulty with which it has traditionally
been implemented. Researchers in the various fields of business should be
familiar with this powerful statistical technique. The purpose of this
paper is to explain how this technique works using Lotus 1-2-3, a software
package with which business people are very familiar.
Journal: Journal of Applied Statistics
Pages: 509-517
Issue: 4
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050003687
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050003687
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:4:p:509-517
Template-Type: ReDIF-Article 1.0
Author-Name: David Hand
Author-X-Name-First: David
Author-X-Name-Last: Hand
Author-Name: Niall Adams
Author-X-Name-First: Niall
Author-X-Name-Last: Adams
Title: Defining attributes for scorecard construction in credit scoring
Abstract:
In many domains, simple forms of classification rules are needed because
of requirements such as ease of use. A particularly simple form splits
each variable into just a few categories, assigns weights to the
categories, sums the weights for a new object to be classified, and
produces a classification by comparing the score with a threshold. Such
instruments are often called scorecards. We describe a way to find the
best partition of each variable using a simulated annealing strategy. We
present theoretical and empirical comparisons of two such additive models,
one based on weights of evidence and another based on logistic regression.
Journal: Journal of Applied Statistics
Pages: 527-540
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076371
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076371
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:527-540
Template-Type: ReDIF-Article 1.0
Author-Name: Yoshikazu Ojima
Author-X-Name-First: Yoshikazu
Author-X-Name-Last: Ojima
Title: Generalized staggered nested designs for variance components estimation
Abstract:
Staggered nested experimental designs are the most popular class of
unbalanced nested designs in practical fields. The most important features
of the staggered nested design are that it has a very simple open-ended
structure and each sum of squares in the analysis of variance has almost
the same degrees of freedom. Based on the features, a class of unbalanced
nested designs that is a generalization of the staggered nested design is
proposed in this paper. Formulae for the estimation of variance components
and their sums are provided. Comparing the variances of the estimators to
the staggered nested designs, it is found that some of the generalized
staggered nested designs are more efficient than the traditional staggered
nested design in estimating some of the variance components and their
sums. An example is provided for illustration.
Journal: Journal of Applied Statistics
Pages: 541-553
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076380
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076380
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:541-553
Template-Type: ReDIF-Article 1.0
Author-Name: Mustafa Yilmaz
Author-X-Name-First: Mustafa
Author-X-Name-Last: Yilmaz
Author-Name: Sangit Chatterjee
Author-X-Name-First: Sangit
Author-X-Name-Last: Chatterjee
Title: Patterns of NBA team performance from 1950 to 1998
Abstract:
This paper examines team performance in the NBA over the last five
decades. It was motivated by two previous observational studies, one of
which studied the winning percentages of professional baseball teams over
time, while the other examined individual player performance in the NBA.
These studies considered professional sports as evolving systems, a view
proposed by evolutionary biologist Stephen Jay Gould, who wrote
extensively on the disappearance of .400 hitters in baseball. Gould argued
that the disappearance is actually a sign of improvement in the quality of
play, reflected in the reduction of variability in hitting performance.
The previous studies reached similar conclusions in terms of winning
percentages of baseball teams, and performance of individual players in
basketball. This paper uses multivariate measures of team performance in
the NBA to see if similar characteristics of evolution can be observed.
The conclusion does not appear to be clearly affirmative, as in previous
studies, and possible reasons for this are discussed.
Journal: Journal of Applied Statistics
Pages: 555-566
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076399
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076399
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:555-566
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Title: A composite quantile function estimator with applications in bootstrapping
Abstract:
In this note we define a composite quantile function estimator in order
to improve the accuracy of the classical bootstrap procedure in small
sample setting. The composite quantile function estimator employs a
parametric model for modelling the tails of the distribution and uses the
simple linear interpolation quantile function estimator to estimate
quantiles lying between 1/(n+1) and n/(n+1). The method is easily
programmed using standard software packages and has general applicability.
It is shown that the composite quantile function estimator improves the
bootstrap percentile interval coverage for a variety of statistics and is
robust to misspecification of the parametric component. Moreover, it is
also shown that the composite quantile function based approach
surprisingly outperforms the parametric bootstrap for a variety of small
sample situations.
Journal: Journal of Applied Statistics
Pages: 567-577
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076407
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076407
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:567-577
Template-Type: ReDIF-Article 1.0
Author-Name: Hui Li
Author-X-Name-First: Hui
Author-X-Name-Last: Li
Author-Name: Robert Malkin
Author-X-Name-First: Robert
Author-X-Name-Last: Malkin
Title: An approximate Bayesian up-down method for estimating a percentage point on a dose-response curve
Abstract:
While the up-down method for estimating a percentage point on a
dose-response curve has received considerable attention, a general
Bayesian solution to the up-down design and estimation has never been
presented, probably due to its computational complexity both in design and
use. This paper presents a theoretical approach for up-down experimental
designs with unknown location and slope parameters, and a practical
approach for their use. The simplex method is used to find the optimal
starting dose level and step sizes that minimize the expected root mean
square error for a fixed number of observations and a reduced number of
step sizes. The Bayesian estimate is then approximated by a polynomial
formula. The coefficients of the formula are also chosen using simplex
minimization. Two example solutions are given with uniform-uniform and
normal-gamma joint prior distributions, showing that the simplifying
assumptions make the method far easier to use with only a marginal
increase in expected root mean square error. We show how to adapt these
prior distributions to a wide range of frequently encountered
applications.
Journal: Journal of Applied Statistics
Pages: 579-587
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076416
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076416
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:579-587
Template-Type: ReDIF-Article 1.0
Author-Name: Murari Singh
Author-X-Name-First: Murari
Author-X-Name-Last: Singh
Author-Name: Michael Jones
Author-X-Name-First: Michael
Author-X-Name-Last: Jones
Title: Statistical estimation of time-trends in two-course crop rotations
Abstract:
An assessment of time-trends in yield parameters is essential to the
utilization of data from long-term field trials for the comparison of
different crop rotations and input regimes, and the identification of
sustainable production systems. The barley-vetch rotation established at
Breda in northern Syria has provided the basis for estimation of the
time-trends in yield data from selected treatments in a two-course crop
rotation trial. The model used for the estimation accounts for the effect
of rainfall, a major determinant of each annual yield value, and the
first-order autocorrelation structure in the errors arising from the same
plot over time. An expression for the minimum number of cycles required to
detect a significant time-trend has been obtained. Results from the
barley-vetch rotation under two fertilizer regimes have been discussed.
Journal: Journal of Applied Statistics
Pages: 589-597
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076425
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076425
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:589-597
Template-Type: ReDIF-Article 1.0
Author-Name: Govind Mudholkar
Author-X-Name-First: Govind
Author-X-Name-Last: Mudholkar
Author-Name: Deo Kumar Srivastava
Author-X-Name-First: Deo Kumar
Author-X-Name-Last: Srivastava
Title: A class of robust stepwise alternatives to Hotelling's T 2 tests
Abstract:
Hotelling's T 2 test is known to be optimal under multivariate normality
and is reasonably validity-robust when the assumption fails. However, some
recently introduced robust test procedures have superior power properties
and reasonable type I error control with non-normal populations. These,
including the tests due to Tiku & Singh (1982), Tiku & Balakrishnan (1988)
and Mudholkar & Srivastava (1999b, c), are asymptotically valid but are
useful with moderate size samples only if the population dimension is
small. A class of B-optimal modifications of the stepwise alternatives to
Hotelling's T 2 introduced by Mudholkar & Subbaiah (1980) are simple to
implement and essentially equivalent to the T 2 test even with small
samples. In this paper we construct and study the robust versions of these
modified stepwise tests using trimmed means instead of sample means. We
use the robust one- and two-sample trimmed-t procedures as in Mudholkar
et al. (1991) and propose statistics based on combining them. The results
of an extensive Monte Carlo experiment show that the robust alternatives
provide excellent type I error control and a substantial gain in power.
Journal: Journal of Applied Statistics
Pages: 599-619
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076434
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076434
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:599-619
Template-Type: ReDIF-Article 1.0
Author-Name: Kelly Zou
Author-X-Name-First: Kelly
Author-X-Name-Last: Zou
Author-Name: W. J. Hall
Author-X-Name-First: W. J.
Author-X-Name-Last: Hall
Title: Two transformation models for estimating an ROC curve derived from continuous data
Abstract:
A receiver operating characteristic (ROC) curve is a plot of two survival
functions, derived separately from the diseased and healthy samples. A
special feature is that the ROC curve is invariant to any monotone
transformation of the measurement scale. We propose and analyse
semiparametric and parametric transformation models for this two-sample
problem. Following an unspecified or specified monotone transformation, we
assume that the healthy and diseased measurements have two normal
distributions with different means and variances. Maximum likelihood
algorithms for estimating ROC curve parameters are developed. The proposed
methods are illustrated on the marker CA125 in the diagnosis of gastric
cancer.
Journal: Journal of Applied Statistics
Pages: 621-631
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076443
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076443
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:621-631
Template-Type: ReDIF-Article 1.0
Author-Name: Fred Huffer
Author-X-Name-First: Fred
Author-X-Name-Last: Huffer
Author-Name: Cheolyong Park
Author-X-Name-First: Cheolyong
Author-X-Name-Last: Park
Title: A test for multivariate structure
Abstract:
We present a test for detecting 'multivariate structure' in data sets.
This procedure consists of transforming the data to remove the
correlations, then discretizing the data and, finally, studying the cell
counts in the resulting contingency table. A formal test can be performed
using the usual chi-squared test statistic. We give the limiting
distribution of the chi-squared statistic and also present simulation
results to examine the accuracy of this limiting distribution in finite
samples. Several examples show that our procedure can detect a variety of
different types of structure. Our examples include data with clustering,
digitized speech data, and residuals from a fitted time series model. The
chi-squared statistic can also be used as a test for multivariate
normality.
Journal: Journal of Applied Statistics
Pages: 633-650
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076452
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076452
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:633-650
Template-Type: ReDIF-Article 1.0
Author-Name: Sueli Mingoti
Author-X-Name-First: Sueli
Author-X-Name-Last: Mingoti
Title: A stepwise Bayesian estimator for the total number of distinct species in finite populations: Sampling by elements
Abstract:
A stepwise Bayesian estimator for the total number of distinct species in
the region of investigation is constructed when sampling by elements is
used to collect the sample of species. The species in the region are
supposed to be divided into two groups: the first containing those species
the researcher believes are present in the region and the second group
containing the species in the region which are completely unknown to the
researcher. The abundance values of the second group are assumed to
follow a Dirichlet distribution. Under this model, the obtained stepwise
Bayesian estimator is an extension of that proposed by Lewins & Joanes
(1984). When the negative binomial distribution is chosen as a prior
distribution for the true number T of species in the region, the stepwise
estimator takes a simple form. It is then shown that the estimator
proposed by Hill (1979) is a particular case and that the stepwise
Bayesian estimator can also be similar to the estimator proposed by
Mingoti (1999) for quadrat sampling. Some results of a simulation study
are presented as well as one application using abundance data and another
in the estimation of population size when capture and recapture methods
are used.
Journal: Journal of Applied Statistics
Pages: 651-670
Issue: 5
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050076461
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050076461
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:5:p:651-670
Template-Type: ReDIF-Article 1.0
Author-Name: Jason Abrevaya
Author-X-Name-First: Jason
Author-X-Name-Last: Abrevaya
Title: Testing for a treatment effect in a heterogeneous population: A modified sign-test statistic and a leapfrog statistic
Abstract:
This paper proposes two non-parametric statistics that test for a
treatment effect in a heterogeneous population. In the model considered,
data on two examinations for both a control and a treatment group are
needed to perform the test. The model allows for individual (fixed)
effects that may be correlated with the choice of treatment. In addition,
the model allows for an unspecified, monotonic transformation of the
response variable. The techniques are illustrated by testing whether high
levels of unemployment-benefit eligibility affect the consumption patterns
of unemployed American workers.
Journal: Journal of Applied Statistics
Pages: 679-687
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081852
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081852
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:679-687
Template-Type: ReDIF-Article 1.0
Author-Name: J. M. Tapia Garcia
Author-X-Name-First: J. M. Tapia
Author-X-Name-Last: Garcia
Author-Name: A. Martin Andres
Author-X-Name-First: A. Martin
Author-X-Name-Last: Andres
Title: Optimal unconditional critical regions for 2 × 2 multinomial trials
Abstract:
Analysing a 2 × 2 table is one of the most frequent problems in applied
research (particularly in epidemiology). When the table arises from a
2 × 2 multinomial trial (or the case of double dichotomy), the appropriate
test for independence is an unconditional one, like those of Barnard
(1947), which, although they date from long ago, have only been
developed (because of computational problems) in the last ten years.
Among the different possible versions, the optimal one (Martin Andres & Tapia
Garcia, 1999) is Barnard's original version, but the calculation time (even
today) is excessive. This paper offers critical region tables for that
version, which behave well compared with those of Shuster (1992). The tables
are of particular use for researchers wishing to obtain significant
results for very small sample sizes (N ≤ 50).
Journal: Journal of Applied Statistics
Pages: 689-695
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081861
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081861
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:689-695
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Author-Name: Chao-Yu Chou
Author-X-Name-First: Chao-Yu
Author-X-Name-Last: Chou
Title: Design of a CSP-1 plan based on regret-balanced criterion
Abstract:
This article explores the problem of designing a CSP-1 plan with the
specified average outgoing quality limit (AOQL), the acceptable quality
level (AQL), and the limiting quality level (LQL) value. By adopting the
regret-balanced criterion under the producer's and consumer's interests of
quality, we can design the optimal CSP-1 plan.
Journal: Journal of Applied Statistics
Pages: 697-701
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081870
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:697-701
Template-Type: ReDIF-Article 1.0
Author-Name: Wilfried De Corte
Author-X-Name-First: Wilfried
Author-X-Name-Last: De Corte
Title: Using order statistics to assess the sampling variability of personnel selection utility estimates
Abstract:
Virtually all models for the utility of personnel selection are based on
the average criterion score of the predictor-selected applicants. This
paper indicates how standard results from the theory on order statistics
can be used to determine the expected value, the standard error and the
sampling distribution of the average criterion score statistic when a
finite number of employees is selected. Exact as well as approximate
results are derived and it is shown how these results can be used to
construct intervals that will contain, with a given probability 1 - α,
the average criterion score associated with a particular implementation of
the personnel selection. These interval estimates are particularly helpful
to the selection practitioner because they can be used to state the
confidence level with which the selection payoff will be above a specific
value. In addition, for most realistic selection scenarios, it is found
that the corresponding utility interval estimate is quite large. For
situations in which multiple selections are performed over time, the
utility intervals are, however, smaller.
Journal: Journal of Applied Statistics
Pages: 703-713
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081889
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081889
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:703-713
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Author-Name: P. C. Biswas
Author-X-Name-First: P. C.
Author-X-Name-Last: Biswas
Title: Robust designs for diallel crosses against the missing of one block
Abstract:
Dey & Midha (1996) showed that some of the complete diallel cross
plans, obtained by using triangular partially balanced designs with two
associate classes, are optimal. In this paper, it is shown that
these optimal designs for diallel crosses are also robust against the
unavailability of one block.
Journal: Journal of Applied Statistics
Pages: 715-723
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081898
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081898
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:715-723
Template-Type: ReDIF-Article 1.0
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Author-Name: M. Bebbington
Author-X-Name-First: M.
Author-X-Name-Last: Bebbington
Title: Combined continuous lot by lot acceptance sampling plan
Abstract:
For production processes involving low fraction non-conforming, the
sample sizes of the usual attribute inspection plans are very large. A
continuous sampling plan for such processes would also require either a
large clearance interval or a large sampling fraction. This paper
simplifies the approach of combining the lot by lot and continuous
sampling plans recommended by Pesotchinsky (1987) and provides various
performance measures for the combined plan. A discussion of the choice of
the parameters is also given.
Journal: Journal of Applied Statistics
Pages: 725-730
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081906
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:725-730
Template-Type: ReDIF-Article 1.0
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Author-Name: Wei-Ming Luh
Author-X-Name-First: Wei-Ming
Author-X-Name-Last: Luh
Title: Testing methods for the one-way fixed effects ANOVA models of log-normal samples
Abstract:
For one-way fixed effects models of log-normal data with unequal variances,
the present study proposes a method to deal with heterogeneity. An
appropriate hypothesis-testing procedure is demonstrated, and one of the
approximate tests, such
as the Alexander-Govern test, Welch test or James second-order test, is
applied to control Type I error rate. Monte Carlo simulation is used to
investigate the performance of the F test for log-scale, the F test for
original scale, the James second-order test, the Welch test, and the
Alexander-Govern test. The simulated results and real data analysis show
that the proposed method is valid and powerful.
Journal: Journal of Applied Statistics
Pages: 731-738
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081915
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:731-738
Template-Type: ReDIF-Article 1.0
Author-Name: Dov Ingman
Author-X-Name-First: Dov
Author-X-Name-Last: Ingman
Author-Name: Boris Lipnik
Author-X-Name-First: Boris
Author-X-Name-Last: Lipnik
Title: Loss-based optimal control statistics for control charts
Abstract:
This work proposes a means for interconnecting optimal sample statistics
with parameters of the process output distribution irrespective of the
specific way in which these parameters change during transition to the
out-of-control state (jumps, trends, cycles, etc). The approach, based on
minimization of the loss incurred by the two types of decision errors,
leads to a unique sample statistic and, therefore, to a single control
chart. The optimal sample statistics are obtained as a solution of the
developed optimal boundary equation. The paper demonstrates that, for
particular conditions, this equation leads to the same statistics as are
obtained through the Neyman-Pearson fundamental lemma. Application
examples of the approach when the process output distribution is Gamma and
Weibull are given. A special loss function representing out-of-control
state detection as a pattern recognition problem is presented.
Journal: Journal of Applied Statistics
Pages: 739-756
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081924
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081924
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:739-756
Template-Type: ReDIF-Article 1.0
Author-Name: G. K. Kanji
Author-X-Name-First: G. K.
Author-X-Name-Last: Kanji
Author-Name: Osama Hasan Arif
Author-X-Name-First: Osama Hasan
Author-X-Name-Last: Arif
Title: Median rankit control chart by the quantile approach
Abstract:
It is desirable that the data for a statistical control chart be normally
distributed. However, if the data are not normal, then a transformation
can be used, e.g. Box-Cox transformations, to produce a suitable control
chart. In this paper we will discuss a quantile approach to produce a
control chart and to estimate median rankit for various non-normal
distributions. We will also provide examples of logistic data to indicate
how a quantile approach could be used to construct a control chart for a
non-normal distribution using a median rankit.
Journal: Journal of Applied Statistics
Pages: 757-770
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081933
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081933
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:757-770
Template-Type: ReDIF-Article 1.0
Author-Name: V. Soundararajan
Author-X-Name-First: V.
Author-X-Name-Last: Soundararajan
Author-Name: M. Palanivel
Author-X-Name-First: M.
Author-X-Name-Last: Palanivel
Title: Quick switching variables single sampling (QSVSS) system indexed by AQL and AOQL
Abstract:
Procedures and tables are given for the selection of a 'Quick Switching
Single Sampling Variable System' for given AQL and AOQL, whenever rejected
lots are 100% inspected and non-conforming units are replaced.
Journal: Journal of Applied Statistics
Pages: 771-778
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081942
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081942
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:771-778
Template-Type: ReDIF-Article 1.0
Author-Name: Leopold Simar
Author-X-Name-First: Leopold
Author-X-Name-Last: Simar
Author-Name: Paul Wilson
Author-X-Name-First: Paul
Author-X-Name-Last: Wilson
Title: A general methodology for bootstrapping in non-parametric frontier models
Abstract:
The Data Envelopment Analysis method has been extensively used in the
literature to provide measures of firms' technical efficiency. These
measures allow rankings of firms by their apparent performance. The
underlying frontier model is non-parametric since no particular functional
form is assumed for the frontier model. Since the observations result from
some data-generating process, the statistical properties of the estimated
efficiency measures are essential for their interpretations. In the
general multi-output multi-input framework, the bootstrap seems to offer
the only means of inferring these properties (i.e. to estimate the bias
and variance, and to construct confidence intervals). This paper proposes
a general methodology for bootstrapping in frontier models, extending the
more restrictive method proposed in Simar & Wilson (1998) by allowing for
heterogeneity in the structure of efficiency. A numerical example with
real data is provided to illustrate the methodology.
Journal: Journal of Applied Statistics
Pages: 779-802
Issue: 6
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050081951
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050081951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:6:p:779-802
Template-Type: ReDIF-Article 1.0
Author-Name: A. Martin Andres
Author-X-Name-First: A. Martin
Author-X-Name-Last: Andres
Author-Name: I. Herranz Tejedor
Author-X-Name-First: I. Herranz
Author-X-Name-Last: Tejedor
Title: On the minimum expected quantity for the validity of the chi-squared test in 2 × 2 tables
Abstract:
A 2 × 2 contingency table can often be analysed in an exact fashion by
using Fisher's exact test and in an approximate fashion by using the
chi-squared test with Yates' continuity correction, and it is
traditionally held that the approximation is valid when the minimum
expected quantity E satisfies E ≥ 5. Unfortunately, little research has been
carried out into this belief, other than that it is necessary to establish
a bound E > E*, that the condition E ≥ 5 may not be the most
appropriate (Martin Andres et al., 1992) and that E* is not a constant,
but usually increases with the growth of the sample size (Martin Andres &
Herranz Tejedor, 1997). In this paper, the authors conduct a theoretical-experimental
study from which they ascertain that the E* value (which is very
variable and frequently quite a lot greater than 5) is strongly related to
the magnitude of the skewness of the underlying hypergeometric
distribution, and that bounding the skewness is equivalent to bounding E
(which is the best control procedure). The study enables the
expression for the above-mentioned E* (which in turn depends on the number
of tails in the test, the alpha error used, the total sample size, and the
minimum marginal imbalance) to be estimated. The authors also show that E*
generally increases with the sample size and with the marginal imbalance,
although it does reach a maximum. Some general and very conservative
validity conditions are E ≥ 35.53 (one-tailed test) and E ≥ 7.45
(two-tailed test) for nominal alpha errors in 1% ≤ α ≤ 10%. The
traditional condition E ≥ 5 is only valid when the samples are small and
one of the marginals is very balanced; alternatively, the condition
E ≥ 5.5 is valid for small samples or a very balanced marginal. Finally, it is
proved that the chi-squared test is always valid in tables where both
marginals are balanced, and that the maximum skewness permitted is related
to the maximum value of the bound E*, to its value for tables with at
least one balanced marginal and to the minimum value that those marginals
must have (in non-balanced tables) for the chi-squared test to be valid.
Journal: Journal of Applied Statistics
Pages: 807-820
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120506
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120506
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:807-820
Template-Type: ReDIF-Article 1.0
Author-Name: Jesus Gonzalo
Author-X-Name-First: Jesus
Author-X-Name-Last: Gonzalo
Author-Name: Tae-Hwy Lee
Author-X-Name-First: Tae-Hwy
Author-X-Name-Last: Lee
Title: On the robustness of cointegration tests when series are fractionally integrated
Abstract:
This paper shows that when series are fractionally integrated, but unit
root tests wrongly indicate that they are I(1), Johansen likelihood ratio
(LR) tests tend to find too much spurious cointegration, while the
Engle-Granger test presents a more robust performance. This result holds
asymptotically as well as in finite samples. The different performance of
these two methods is due to the fact that they are based on different
principles. The Johansen procedure is based on maximizing correlations
(canonical correlation) while Engle-Granger minimizes variances (in the
spirit of principal components).
Journal: Journal of Applied Statistics
Pages: 821-827
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120515
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:821-827
Template-Type: ReDIF-Article 1.0
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Author-Name: C. Kandasamy
Author-X-Name-First: C.
Author-X-Name-Last: Kandasamy
Title: Design of generalized CSP-C continuous sampling plan
Abstract:
In this paper, the concept of an acceptance number is incorporated into the
single-level continuous sampling plan CSP-1. The advantage of the proposed
plan, designated the CSP-C plan, is that it achieves a reduction in the
average fraction inspected at good quality levels. Nomographs for the
design of the proposed plan are presented. Expressions for the
performance measures of this new plan, such as the OC, AOQ and AFI, are also
provided.
Journal: Journal of Applied Statistics
Pages: 829-841
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120524
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:829-841
Template-Type: ReDIF-Article 1.0
Author-Name: Wei-Ming Luh
Author-X-Name-First: Wei-Ming
Author-X-Name-Last: Luh
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Title: Approximate transformation trimmed mean methods to the test of simple linear regression slope equality
Abstract:
To deal with the problem of non-normality and heteroscedasticity, the
current study proposes applying approximate transformation trimmed mean
methods to the test of simple linear regression slope equality. The
distribution-free slope estimates are first trimmed on both sides and then
the test statistic t is transformed by Johnson's method for each group to
correct non-normality. Lastly, an approximate test such as the James
second-order test, the Welch test, or the DeShon-Alexander test, which are
robust for heterogeneous variances, is applied to test the equality of
regression slopes. Bootstrap methods and Monte Carlo simulation results
show that the proposed methods provide protection against both unusual y
values and unusual x values. The new methods are valid
alternatives for testing simple linear regression slopes when
heteroscedastic variances and non-normality are present.
Journal: Journal of Applied Statistics
Pages: 843-857
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120533
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120533
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:843-857
Template-Type: ReDIF-Article 1.0
Author-Name: Arthur Pewsey
Author-X-Name-First: Arthur
Author-X-Name-Last: Pewsey
Title: Problems of inference for Azzalini's skew-normal distribution
Abstract:
This paper considers various unresolved inference problems for the
skew-normal distribution. We give reasons why the direct
parameterization should not be used as a general basis for estimation, and
consider method of moments and maximum likelihood estimation for the
distribution's centred parameterization. Large sample theory results are
given for the method of moments estimators, and numerical approaches for
obtaining maximum likelihood estimates are discussed. Simulation is used
to assess the performance of the two types of estimation. We also present
procedures for testing for departures from the limiting folded normal
distribution. Data on the percentage body fat of elite athletes are used
to illustrate some of the issues raised.
Journal: Journal of Applied Statistics
Pages: 859-870
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120542
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120542
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:859-870
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Lai Tang
Author-X-Name-First: Man-Lai
Author-X-Name-Last: Tang
Title: On tests of linearity for dose response data: Asymptotic, exact conditional and exact unconditional tests
Abstract:
The approximate chi-square statistic, X²Q, which is calculated as the
difference between the usual chi-square statistic for heterogeneity and
the Cochran-Armitage trend test statistic, has been widely applied to test
the linearity assumption for dose-response data. This statistic can be
shown to be asymptotically distributed as chi-square with K - 2 degrees of
freedom. However, this asymptotic property could be quite questionable if
the sample size is small, or if there is a high degree of sparseness or
imbalance in the data. In this article, we consider how exact tests based
on this X²Q statistic can be performed. Both the exact conditional and
unconditional versions will be studied. Interesting findings include: (i)
the exact conditional test is extremely sensitive to a small change in
dosages, which may eventually produce a degenerate exact conditional
distribution; and (ii) the exact unconditional test avoids the problem of
degenerate distribution and is shown to be less sensitive to the change in
dosages. A real example involving an animal carcinogenesis experiment as
well as a fictitious data set will be used for illustration purposes.
Journal: Journal of Applied Statistics
Pages: 871-880
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120551
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120551
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:871-880
Template-Type: ReDIF-Article 1.0
Author-Name: Stephen Salter
Author-X-Name-First: Stephen
Author-X-Name-Last: Salter
Author-Name: Neville Topham
Author-X-Name-First: Neville
Author-X-Name-Last: Topham
Title: Side betting and playing the National Lottery: An exercise in policy design
Abstract:
This paper demonstrates a methodology for estimating the frequencies with
which numbers are selected by National Lottery players, utilizing a
twofold approach of a Multi-Response Non-Linear Regression model in
conjunction with a suggested approximation function for number selections,
which leads to an explanation of number choice in terms of the spatial
effects of form design. It shows that, in a marketplace, side betting is
complementary to the main online draw product, and that market forces
produce close substitutes if side betting on the National Lottery is prohibited.
Journal: Journal of Applied Statistics
Pages: 881-899
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120560
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120560
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:881-899
Template-Type: ReDIF-Article 1.0
Author-Name: R. Vijayaraghavan
Author-X-Name-First: R.
Author-X-Name-Last: Vijayaraghavan
Title: Design and evaluation of skip-lot sampling plans of type SkSP-3
Abstract:
This paper presents a system of skip-lot sampling inspection plans
designated as SkSP-3 based on the principle of a continuous sampling plan
of type CSP-2. Expressions for performance measures such as Operating
Characteristic function and ASN function are derived by the Markov chain
approach. Selection of SkSP-3 with a single sampling plan having
acceptance number zero as the reference plan is discussed.
Journal: Journal of Applied Statistics
Pages: 901-908
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120579
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120579
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:901-908
Template-Type: ReDIF-Article 1.0
Author-Name: George Wesolowsky
Author-X-Name-First: George
Author-X-Name-Last: Wesolowsky
Title: Detecting excessive similarity in answers on multiple choice exams
Abstract:
This paper provides a simple and robust method for detecting cheating.
Unlike some methods, non-cheating behaviour, rather than cheating behaviour, is
modelled, because this requires the fewest assumptions. The main concern is
the prevention of false accusations. The model is suitable for screening
large classes and the results are simple to interpret. Simulation and the
Bonferroni inequality are used to prevent false accusation due to 'data
dredging'. The model has received considerable application in practice and
has been verified through the adjacent seating method.
Journal: Journal of Applied Statistics
Pages: 909-921
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120588
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120588
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:909-921
Template-Type: ReDIF-Article 1.0
Author-Name: Beatrice Giglio
Author-X-Name-First: Beatrice
Author-X-Name-Last: Giglio
Author-Name: Eva Riccomagno
Author-X-Name-First: Eva
Author-X-Name-Last: Riccomagno
Author-Name: Henry Wynn
Author-X-Name-First: Henry
Author-X-Name-Last: Wynn
Title: Gröbner basis strategies in regression
Abstract:
The Gröbner basis method in experimental design (Pistone & Wynn, 1996) is
developed in a practical setting. The computational algebraic techniques
(Gröbner bases in particular) are coupled with statistical strategies, and
the links to more standard approaches are made. A new method of analysing a
non-orthogonal experiment based on the Gröbner basis method is introduced.
Examples are given utilizing the approaches.
Journal: Journal of Applied Statistics
Pages: 923-938
Issue: 7
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050120597
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050120597
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:7:p:923-938
Template-Type: ReDIF-Article 1.0
Author-Name: Chao-Yu Chou
Author-X-Name-First: Chao-Yu
Author-X-Name-Last: Chou
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Author-Name: Hui-Rong Liu
Author-X-Name-First: Hui-Rong
Author-X-Name-Last: Liu
Title: Economic-statistical design of X-bar charts for non-normal data by considering quality loss
Abstract:
When the X-bar control chart is used to monitor a process, three
parameters should be determined: the sample size, the sampling interval
between successive samples, and the control limits of the chart. Duncan
presented a cost model to determine the three parameters for an X-bar
chart. Alexander et al. combined Duncan's cost model with the Taguchi loss
function to present a loss model for determining the three parameters. In
this paper, the Burr distribution is employed to conduct the
economic-statistical design of X-bar charts for non-normal data.
Alexander's loss model is used as the objective function, and the
cumulative function of the Burr distribution is applied to derive the
statistical constraints of the design. An example is presented to
illustrate the solution procedure. From the results of the sensitivity
analyses, we find that small values of the skewness coefficient have no
significant effect on the optimal design; however, a larger value of
skewness coefficient leads to a slightly larger sample size and sampling
interval, as well as wider control limits. Meanwhile, an increase in the
kurtosis coefficient results in an increase in the sample size and wider
control limits.
Journal: Journal of Applied Statistics
Pages: 939-951
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173274
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173274
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:939-951
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Fader
Author-X-Name-First: Peter
Author-X-Name-Last: Fader
Author-Name: Bruce Hardie
Author-X-Name-First: Bruce
Author-X-Name-Last: Hardie
Title: A note on modelling underreported Poisson counts
Abstract:
In this paper we present a parsimonious model for the analysis of
underreported Poisson count data. In contrast to previously developed
methods, we are able to derive analytic expressions for the key marginal
posterior distributions that are of interest. The usefulness of this model
is explored via a re-examination of previously analysed data covering the
purchasing of port wine (Ramos, 1999).
Journal: Journal of Applied Statistics
Pages: 953-964
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173283
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173283
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:953-964
Template-Type: ReDIF-Article 1.0
Author-Name: Wei-Ming Luh
Author-X-Name-First: Wei-Ming
Author-X-Name-Last: Luh
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Title: Johnson's transformation two-sample trimmed t and its bootstrap method for heterogeneity and non-normality
Abstract:
The present study investigates the performance of Johnson's
transformation trimmed t statistic, Welch's t test, Yuen's trimmed t,
Johnson's transformation untrimmed t test, and the corresponding bootstrap
methods for the two-sample case with small/unequal sample sizes when the
distribution is non-normal and variances are heterogeneous. The Monte
Carlo simulation is conducted in two-sided as well as one-sided tests.
When the variance is proportional to the sample size, Yuen's trimmed t is
as good as Johnson's transformation trimmed t. However, when the variance
is disproportional to the sample size, the bootstrap Yuen's trimmed t and
the bootstrap Johnson's transformation trimmed t are recommended in
one-sided tests. For two-sided tests, Johnson's transformation trimmed t
is not only valid but also powerful in comparison to the bootstrap
methods.
Journal: Journal of Applied Statistics
Pages: 965-973
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173292
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173292
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:965-973
Template-Type: ReDIF-Article 1.0
Author-Name: Steen Magnussen
Author-X-Name-First: Steen
Author-X-Name-Last: Magnussen
Title: Unequal probability sampling in fixed area plots of stem volume with and without prior inclusion probabilities
Abstract:
The impact of guessing auxiliary population attributes, as opposed to
relying on actual values from a prior survey, was quantified for three
unequal probability sampling methods of tree stem volume (biomass).
Reasonable prior guesses (no-list sampling) yielded, in five populations
and 35 combinations of population size and sample size, results on par
with sampling with known auxiliary predictors (list sampling). Realized
sample sizes were slightly inflated in no-list sampling with probability
proportional to predictions (PPP). Mean absolute differences from true
totals and root mean square errors in no-list-sampling schemes were only
slightly above those achieved with list sampling. Stratified sampling
generally outperformed PPP and systematic sampling, yet the latter is
recommended due to consistency between observed and expected mean square
errors and overall robustness against a systematic bias in no-list
settings.
Journal: Journal of Applied Statistics
Pages: 975-990
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173300
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173300
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:975-990
Template-Type: ReDIF-Article 1.0
Author-Name: Carlos Mate
Author-X-Name-First: Carlos
Author-X-Name-Last: Mate
Author-Name: Rafael Calderon
Author-X-Name-First: Rafael
Author-X-Name-Last: Calderon
Title: Exploring the characteristics of rotating electric machines with factor analysis
Abstract:
Applications of multivariate statistics in engineering are hard to find,
apart from those in quality control. However, we think that further
insight into some technological cases may be gained by using adequate
multivariate analysis tools. In this paper, we propose a review of the key
parameters of rotating electric machines with factor analysis. This
statistical technique allows not only the reduction of the dimension of
the case we are analysing, but also reveals subtle relationships between
the variables under study. We show an application of this methodology by
studying the interrelations between the key variables in an electric
machine, in this case the squirrel-cage induction motor. Through a
step-by-step presentation of the case study, we deal with some of the
topics an applied researcher may face, such as the rotation of the
original factors, the extraction of higher-order factors and the
development of the exploratory model. As a result, we present a worthwhile
framework to both confirm our previous knowledge and capture unexplored
facts. Moreover, it may provide a new approach to describing and
understanding the design, performance and operating characteristics of
these machines.
Journal: Journal of Applied Statistics
Pages: 991-1006
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173319
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173319
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:991-1006
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Nelder
Author-X-Name-First: J. A.
Author-X-Name-Last: Nelder
Title: Quasi-likelihood and pseudo-likelihood are not the same thing
Abstract:
Models described as using quasi-likelihood (QL) are often using a
different approach based on the normal likelihood, which I call
pseudo-likelihood. The two approaches are described and contrasted, and an
example is used to illustrate the advantages of the QL approach proper.
Journal: Journal of Applied Statistics
Pages: 1007-1011
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173328
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173328
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:1007-1011
Template-Type: ReDIF-Article 1.0
Author-Name: M. K. Sharma
Author-X-Name-First: M. K.
Author-X-Name-Last: Sharma
Title: Application of PBIB designs in CDC Method IV
Abstract:
In this paper we propose the use of some partially balanced incomplete
block designs for blocking in complete diallel cross Method IV (Griffing,
1956) to deal with the situation when it is not desirable for all crosses
to be accommodated in the block of a traditional randomized block design.
A method is also proposed to analyse the Mating-Environment designs for
estimating the general combining ability effect of lines.
Journal: Journal of Applied Statistics
Pages: 1013-1019
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173337
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173337
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:1013-1019
Template-Type: ReDIF-Article 1.0
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Author-Name: Panagiotis Mantalos
Author-X-Name-First: Panagiotis
Author-X-Name-Last: Mantalos
Title: A simple investigation of the Granger-causality test in integrated-cointegrated VAR systems
Abstract:
The size and power of various generalization tests for the
Granger-causality in integrated-cointegrated VAR systems are considered.
By using Monte Carlo methods, properties of eight versions of the test are
studied in two different forms: the standard form and the modified form of
Dolado & Lutkepohl (1996), whose study was confined to properties of the
Wald test only. In their study, as well as in ours, both the standard and the
modified Wald tests are shown to perform badly especially in small
samples. We find, however, that the corrected LR tests exhibit correct
size even in small samples. The power of the test is higher when the true
VAR(2) model is estimated, and the modified test loses information by
estimating the extra coefficients. The same is true when considering the
power results in the VAR(3) model, where the power of the tests is somewhat
lower than in the VAR(2).
Journal: Journal of Applied Statistics
Pages: 1021-1031
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173346
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173346
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:1021-1031
Template-Type: ReDIF-Article 1.0
Author-Name: Siu Keung Tse
Author-X-Name-First: Siu Keung
Author-X-Name-Last: Tse
Author-Name: Chunyan Yang
Author-X-Name-First: Chunyan
Author-X-Name-Last: Yang
Author-Name: Hak-Keung Yuen
Author-X-Name-First: Hak-Keung
Author-X-Name-Last: Yuen
Title: Statistical analysis of Weibull distributed lifetime data under Type II progressive censoring with binomial removals
Abstract:
This paper considers the analysis of Weibull distributed lifetime data
observed under Type II progressive censoring with random removals, where
the number of units removed at each failure time follows a binomial
distribution. Maximum likelihood estimators of the parameters and their
asymptotic variances are derived. The expected time required to complete
the life test under this censoring scheme is investigated.
Journal: Journal of Applied Statistics
Pages: 1033-1043
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173355
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173355
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:1033-1043
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Wright
Author-X-Name-First: Peter
Author-X-Name-Last: Wright
Title: Choosing a lower specification limit for an exponential process with 'the larger the better' tolerance: A simple, exact solution
Abstract:
Chen (1999) proposed an economic design, using Taguchi's quality loss
function, for choosing a producer's lower specification limit η for a
product with a quality characteristic that has an exponential distribution
with mean θ and 'the larger the better' tolerance. Chen (1999)
developed an approximate solution that is applicable when 0.5 ≤ m/θ
≤ 0.7 and that requires numerical minimization. We derive a simple, exact
solution that is applicable for all values of m/θ and does not
require numerical minimization.
Journal: Journal of Applied Statistics
Pages: 1045-1049
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173364
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173364
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:1045-1049
Template-Type: ReDIF-Article 1.0
Author-Name: Zhenlin Yang
Author-X-Name-First: Zhenlin
Author-X-Name-Last: Yang
Author-Name: Min Xie
Author-X-Name-First: Min
Author-X-Name-Last: Xie
Title: Process monitoring of exponentially distributed characteristics through an optimal normalizing transformation
Abstract:
Many process characteristics follow an exponential distribution, and
control charts based on such a distribution have attracted a lot of
attention. However, traditional control limits may not be appropriate
because of the lack of symmetry. In this paper, process monitoring through
a normalizing power transformation is studied. The traditional individual
measurement control charts can be used based on the transformed data. The
properties of this control chart are investigated. A comparison with the
chart when using probability limits is also carried out for cases of known
and estimated parameters. Without losing much accuracy, even compared with
the exact probability limits, the power transformation approach can easily
be used to produce charts that can be interpreted when the normality
assumption is valid.
Journal: Journal of Applied Statistics
Pages: 1051-1063
Issue: 8
Volume: 27
Year: 2000
X-DOI: 10.1080/02664760050173373
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760050173373
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:27:y:2000:i:8:p:1051-1063
Template-Type: ReDIF-Article 1.0
Author-Name: S. A. Al-Awadhi
Author-X-Name-First: S. A.
Author-X-Name-Last: Al-Awadhi
Author-Name: P. H. Garthwaite
Author-X-Name-First: P. H.
Author-X-Name-Last: Garthwaite
Title: Prior distribution assessment for a multivariate normal distribution: An experimental study
Abstract:
A variety of methods of eliciting a prior distribution for a multivariate
normal (MVN) distribution have recently been proposed. This paper reports
an experiment in which 16 meteorologists used the methods to quantify
their opinions about climatology variables. Our results compare prior
models and show, in particular, that it can be better to assume the mean
and variance of an MVN distribution are independent a priori, rather than
to model opinion by the conjugate prior distribution. Using a proper
scoring rule, different forms of assessment task are examined and
alternative ways of estimating parameters are compared. To quantify
opinion about means, it proved preferable to ask directly about the means
rather than individual observations while, to quantify opinion about the
variance matrix, it was best to ask about deviations from the mean.
Further results include recommendations for the way parameters of the
prior distribution are estimated.
Journal: Journal of Applied Statistics
Pages: 5-23
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011563
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011563
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:5-23
Template-Type: ReDIF-Article 1.0
Author-Name: P. K. Tsay
Author-X-Name-First: P. K.
Author-X-Name-Last: Tsay
Author-Name: A. Chao
Author-X-Name-First: A.
Author-X-Name-Last: Chao
Title: Population size estimation for capture-recapture models with applications to epidemiological data
Abstract:
The capture-recapture method is applied to estimate the population size
of a target population based on ascertainment data in epidemiological
applications. We generalize the three-list case of Chao & Tsay (1998) to
situations where more than three lists are available. An estimation
procedure is presented using the concept of sample coverage, which can be
interpreted as a measure of overlap information among multiple list
records. When there is enough overlap, an estimator of the total
population size is proposed. The bootstrap method is used to construct a
variance estimator and confidence interval. If the overlap rate is
relatively low, then the population size cannot be precisely estimated and
thus only a lower (upper) bound is proposed for positively (negatively)
dependent lists. The proposed method is applied to two data sets, one with
a high and one with a low overlap rate.
Journal: Journal of Applied Statistics
Pages: 25-36
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011572
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011572
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:25-36
Template-Type: ReDIF-Article 1.0
Author-Name: Camil Fuchs
Author-X-Name-First: Camil
Author-X-Name-Last: Fuchs
Author-Name: Morton Brown
Author-X-Name-First: Morton
Author-X-Name-Last: Brown
Title: Summary measurements and screening in clinical trials with replicate observations
Abstract:
Repeating measurements of efficacy variables in clinical trials may be
desirable when the measurement may be affected by ambient conditions. When
such measurements are repeated at baseline and at the end of therapy,
statistical questions relate to: (1) the best summary measurement to use
for a subject when there is a possibility that some observations are
contaminated and have increased variances; and (2) the effect of screening
procedures which exclude outliers based on within- and between-subject
contamination tests. We study these issues in two stages, each using a
different set of models. The first stage deals only with the choice of the
summary measure. The simulation results show that in some cases of
contamination, the power achieved by the tests based on the median exceeds
that achieved by the tests based on the mean of the replicates. However,
even when we use the median, there are cases when contamination leads to a
considerable loss in power. The combined issue of the best summary
measurement and the effect of screening is studied in the second stage.
The tests use either the observed data or the data after screening for
outliers. The simulation results demonstrate that the power depends on the
screening procedure as well as on the test statistic used in the study. We
found that for the extent and magnitude of contamination considered,
within-subject screening has a minimal effect on the power of the tests
when there are at least three replicates; as a result, we found no
advantage in the use of screening procedures for within-subject
contamination. On the other hand, the use of a between-subject screening
for outliers increases the power of the test procedures. However, even
with the use of screening procedures, heterogeneity of variances can
greatly reduce the power of the study.
Journal: Journal of Applied Statistics
Pages: 37-51
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011581
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011581
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:37-51
Template-Type: ReDIF-Article 1.0
Author-Name: D. Gregori
Author-X-Name-First: D.
Author-X-Name-Last: Gregori
Author-Name: C. Rocco
Author-X-Name-First: C.
Author-X-Name-Last: Rocco
Author-Name: S. Miocic
Author-X-Name-First: S.
Author-X-Name-Last: Miocic
Author-Name: L. Mestroni
Author-X-Name-First: L.
Author-X-Name-Last: Mestroni
Title: Estimating the frequency of familial dilated cardiomyopathy in the presence of misclassification errors
Abstract:
Dilated cardiomyopathy is a disease of unknown cause characterized by
dilation and impaired function of one or both ventricles. Most cases are
believed to be sporadic, although familial forms have been detected. The
familial form has been estimated to have a relative frequency of about
25%. Since, except for familial history, familial form has no other
characteristics that could help in classifying the two diseases, the
estimate of the frequency of the familial form should take into account a
possible misclassification error. In our study, 100 cases were randomly
selected in a prospective series of 350 patients. Out of them, 28 index
cases were included in the analysis: 12 were known to be familial, and 88
were believed to be sporadic. After extensive clinical examination of the
relatives, 3 patients supposed to have a sporadic form were found to have
a familial form. 13 cases had a confirmed sporadic disease. Models in the
Log-Linear Product class (LLP) have been used to separate classification
errors from underlying patterns of disease incidence. The most
conservative crude estimate of the misclassification error is 16.1% (CI
0.22- 23.27%), which leads to a crude estimate of the frequency of the
familial form of about 60%. An estimate of the disease frequency, adjusted
for taking into consideration the sampling plan, is 40.93% (CI
32.29-44.17%). The results are consistent with the hypothesis that genetic
factors are still underestimated, although they represent a major cause of
the disease.
Journal: Journal of Applied Statistics
Pages: 53-62
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011590
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:53-62
Template-Type: ReDIF-Article 1.0
Author-Name: Krishan Lal
Author-X-Name-First: Krishan
Author-X-Name-Last: Lal
Author-Name: V. K. Gupta
Author-X-Name-First: V. K.
Author-X-Name-Last: Gupta
Author-Name: Lalmohan Bhar
Author-X-Name-First: Lalmohan
Author-X-Name-Last: Bhar
Title: Robustness of designed experiments against missing data
Abstract:
This paper investigates the robustness of designed experiments for
estimating linear functions of a subset of parameters in a general linear
model against the loss of any t (≥ 1) observations. Necessary and
sufficient conditions for robustness of a design under a homoscedastic
model are derived. It is shown that a design robust under a homoscedastic
model is also robust under a general heteroscedastic model with correlated
observations. As a particular case, necessary and sufficient conditions
are obtained for the robustness of block designs against the loss of data.
Simple sufficient conditions are also provided for the binary block
designs to be robust against the loss of data. Some classes of designs,
robust up to three missing observations, are identified. A-efficiency of
the residual design is evaluated for certain block designs for several
patterns of two missing observations. The efficiency of the residual
design has also been worked out when all the observations in any two
blocks, not necessarily disjoint, are lost. The lower bound to
A-efficiency has also been obtained for the loss of t observations.
Finally, a general expression is obtained for the efficiency of the
residual design when all the observations of m (≥ 1) disjoint blocks are
lost.
Journal: Journal of Applied Statistics
Pages: 63-79
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011608
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011608
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:63-79
Template-Type: ReDIF-Article 1.0
Author-Name: P. J. Lindsey
Author-X-Name-First: P. J.
Author-X-Name-Last: Lindsey
Title: Adapting sample size calculations to repeated measurements in clinical trials
Abstract:
Many of the repeated-measures sample size calculation methods presented
in the literature are not suitable when: (i) the different treatments
are assumed to be equal on average at baseline time due to randomization;
and (ii) the experimenters are interested in a pre-specified
difference to be detected after a specific time period. The method
presented here has been developed for those cases where a multivariate
normal distribution can reasonably be assumed. It is likelihood-based and
has been designed to be flexible enough to handle repeated-measures
models, including a non-linear change in time, and an arbitrary
correlation structure.
Journal: Journal of Applied Statistics
Pages: 81-89
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011617
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011617
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:81-89
Template-Type: ReDIF-Article 1.0
Author-Name: Dayanand Naik
Author-X-Name-First: Dayanand
Author-X-Name-Last: Naik
Author-Name: Shantha Rao
Author-X-Name-First: Shantha
Author-X-Name-Last: Rao
Title: Analysis of multivariate repeated measures data with a Kronecker product structured covariance matrix
Abstract:
In this article we consider a set of t repeated measurements on p
variables (or characteristics) on each of the n individuals. Thus, data on
each individual is a p × t matrix. The n individuals themselves may be
divided and randomly assigned to g groups. Analysis of these data using a
MANOVA model, assuming that the data on an individual has a covariance
matrix which is a Kronecker product of two positive definite matrices, is
considered. The well-known Satterthwaite type approximation to the
distribution of a quadratic form in normal variables is extended to the
distribution of a multivariate quadratic form in multivariate normal
variables. The multivariate tests using this approximation are developed
for testing the usual hypotheses. Results are illustrated on a data set. A
method for analysing unbalanced data is also discussed.
Journal: Journal of Applied Statistics
Pages: 91-105
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011626
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:91-105
Template-Type: ReDIF-Article 1.0
Author-Name: Raffaella Piccarreta
Author-X-Name-First: Raffaella
Author-X-Name-Last: Piccarreta
Title: A new measure of nominal-ordinal association
Abstract:
A new measure for evaluating the strength of the association between a
nominal variable and an ordered categorical response variable is
introduced. The introduction of a new measure is justified by analysing
the characteristics of a measure of the nominal-ordinal association
proposed by Agresti (1981), especially with respect to the problem of the
'choice' of a predictive variable. The sample-based version of the index
is studied, and its asymptotic standard error and asymptotic distribution
are derived. Simulations are considered to evaluate the adequacy of the
asymptotic approximation determined, following Goodman & Kruskal (1963).
Journal: Journal of Applied Statistics
Pages: 107-120
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011635
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011635
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:107-120
Template-Type: ReDIF-Article 1.0
Author-Name: R. R. L. Kantam
Author-X-Name-First: R. R. L.
Author-X-Name-Last: Kantam
Author-Name: K. Rosaiah
Author-X-Name-First: K.
Author-X-Name-Last: Rosaiah
Author-Name: G. Srinivasa Rao
Author-X-Name-First: G. Srinivasa
Author-X-Name-Last: Rao
Title: Acceptance sampling based on life tests: Log-logistic model
Abstract:
The problem of acceptance sampling when the life test is truncated at a
preassigned time is considered. For various acceptance numbers, confidence
levels and values of the ratio of the fixed experimental time to the
specified average life, the minimum sample size necessary to ensure the
specified average life, are obtained under the assumption that the
lifetime variate of the test items follows a distribution belonging to
Burr's family XII of distributions - called the log-logistic model. The
operating characteristic values of the sampling plans and producer's risk
are presented. The results are illustrated by an example.
Journal: Journal of Applied Statistics
Pages: 121-128
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011644
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011644
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:121-128
Template-Type: ReDIF-Article 1.0
Author-Name: Ashis Sengupta
Author-X-Name-First: Ashis
Author-X-Name-Last: Sengupta
Author-Name: Chandranath Pal
Author-X-Name-First: Chandranath
Author-X-Name-Last: Pal
Title: On optimal tests for isotropy against the symmetric wrapped stable-circular uniform mixture family
Abstract:
The family of Symmetric Wrapped Stable (SWS) distributions can be widely
used for modelling circular data. Mixtures of Circular Uniform (CU) with
the former also have applications as a larger family of circular
distributions to incorporate possible outliers. Restricting ourselves to
such a mixture, we derive the locally most powerful invariant (LMPI) test
for the hypothesis of isotropy or randomness of directions-expressed in
terms of the null value of the mixing proportion, p, in the model. Global
monotonicity of the power function of the test is established. The test is
also consistent. Power values of the test for some selected parameter
combinations, obtained through simulation reveal quite encouraging
performances even for moderate sample sizes. The P3 approach (SenGupta,
1991; Pal & SenGupta, 2000) for unknown p and ρ, and the non-regular case
of unknown a, the index parameter, are also discussed. A real-life example
is presented to illustrate the inadequacy of the circular normal
distribution as a circular model. This example is also used to demonstrate
the applications of the LMPI test, the optimal P3 test and a Davies-motivated
test (Davies, 1977, 1987). Finally, a goodness-of-fit test performed on
the data establishes the plausibility of the above SWS-CU mixture model
for real-life problems encountered in practical situations.
Journal: Journal of Applied Statistics
Pages: 129-143
Issue: 1
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120011653
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120011653
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:1:p:129-143
Template-Type: ReDIF-Article 1.0
Author-Name: Mehmetcik Bayazit
Author-X-Name-First: Mehmetcik
Author-X-Name-Last: Bayazit
Author-Name: Hafzullah Aksoy
Author-X-Name-First: Hafzullah
Author-X-Name-Last: Aksoy
Title: Using wavelets for data generation
Abstract:
Wavelets are proposed as a non-parametric data generation tool. The idea
behind the suggested method is decomposition of data into its details and
later reconstruction by summation of the details randomly to generate new
data. A Haar wavelet is used because of its simplicity. The method is
applied to annual and monthly streamflow series taken from Turkey and the USA.
It is found to give good results for non-skewed data, as well as in the
presence of auto-correlation.
Journal: Journal of Applied Statistics
Pages: 157-166
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016073
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016073
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:157-166
Template-Type: ReDIF-Article 1.0
Author-Name: Teresa Aparicio
Author-X-Name-First: Teresa
Author-X-Name-Last: Aparicio
Author-Name: Inmaculada Villanua
Author-X-Name-First: Inmaculada
Author-X-Name-Last: Villanua
Title: The asymptotically efficient version of the information matrix test in binary choice models. A study of size and power
Abstract:
As Newey (1985) and Orme (1988) argue in the context of discrete binary
choice models, the test of the information matrix (IM) is sensitive to
heteroscedasticity and the incorrect distribution of the error term, with
both these problems leading to inconsistency of the estimators obtained.
This paper uses simulation experiments to analyse the size and power of
the asymptotically efficient version of this test, with the aim of
obtaining evidence on its capacity to detect such specification errors,
considering different alternatives.
Journal: Journal of Applied Statistics
Pages: 167-182
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016082
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016082
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:167-182
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Congdon
Author-X-Name-First: Peter
Author-X-Name-Last: Congdon
Title: Predicting adverse infant health outcomes using routine screening variables: Modelling the impact of interdependent risk factors
Abstract:
This paper sets out a methodology for risk assessment of pregnancies in
terms of adverse outcomes such as low birth-weight and neonatal mortality
in a situation of multiple but possibly interdependent major dimensions of
risk. In the present analysis, the outcome is very low birth-weight and
the observed risk indicators are assumed to be linked to three main
dimensions: socio-demographic, bio-medical status, and fertility history.
Summary scores for each mother under each risk dimension are derived from
observed indicators and used as the basis for a multidimensional
classification to high or low risk. A fully Bayesian method of
implementation is applied to estimation and prediction. A case study is
presented of very low birth-weight singleton livebirths over 1991-93 in a
health region covering North West London and parts of the adjacent South
East of England, with validating predictions to maternities in 1994.
Journal: Journal of Applied Statistics
Pages: 183-197
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016091
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016091
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:183-197
Template-Type: ReDIF-Article 1.0
Author-Name: Abhijit Gupta
Author-X-Name-First: Abhijit
Author-X-Name-Last: Gupta
Title: Optimization of product performance of a paint formulation using a mixture experiment
Abstract:
A paint manufacturing company was facing the problem of Vehicle
Separation and Settling in one of its prime products. These two
abnormalities are, in general, opposing in nature. The manufacturer tried
several modifications in the existing recipe for the product but failed to
control them. Experimentation was carried out using mixture design, a
special type of designed experiment, and quadratic response surface models
were fitted for both the responses. Finally, the optimum formulation was
obtained by simultaneously optimizing the two response surface models.
During the determination of the optimal formulation, different methods were
compared. The optimum formulation is currently being used for regular
manufacturing.
Journal: Journal of Applied Statistics
Pages: 199-213
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016109
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016109
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:199-213
Template-Type: ReDIF-Article 1.0
Author-Name: Doğan Argac
Author-X-Name-First: Doğan
Author-X-Name-Last: Argac
Author-Name: Kepher Makambi
Author-X-Name-First: Kepher
Author-X-Name-Last: Makambi
Author-Name: Joachim Hartung
Author-X-Name-First: Joachim
Author-X-Name-Last: Hartung
Title: A note on testing the nullity of the between group variance in the one-way random effects model under variance heterogeneity
Abstract:
In an unbalanced and heteroscedastic one-way random effects model, we
compare, by way of simulation, several test statistics for testing the
null hypothesis that the variance of the random effects, also named the
between group variance, is zero. These tests are the classical F-test, the
test proposed by Jeyaratnam & Othman, the Welch test, and a modified
version of Welch's test.
Journal: Journal of Applied Statistics
Pages: 215-222
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016118
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016118
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:215-222
Template-Type: ReDIF-Article 1.0
Author-Name: Jack Lee
Author-X-Name-First: Jack
Author-X-Name-Last: Lee
Author-Name: W. H. Lien
Author-X-Name-First: W. H.
Author-X-Name-Last: Lien
Title: Bayesian analysis of a growth curve model with power transformation, random effects and AR(1) dependence
Abstract:
In this paper we devote ourselves to a general growth curve model with
power transformation, random effects and AR(1) dependence via a Bayesian
approach. Two priors are proposed and both parameter estimation and
prediction of future values are considered. Some numerical results with a
set of real data are also given.
Journal: Journal of Applied Statistics
Pages: 223-238
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016127
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016127
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:223-238
Template-Type: ReDIF-Article 1.0
Author-Name: Youngjo Lee
Author-X-Name-First: Youngjo
Author-X-Name-Last: Lee
Title: Can we recover information from concordant pairs in binary matched pairs?
Abstract:
When possible values of a response variable are limited, distributional
assumptions about random effects may not be checkable. This may cause a
distribution-robust estimator, such as the conditional maximum likelihood
estimator, to be recommended; however, it does not utilize all the
information in the data. We show how, with binary matched pairs, the
hierarchical likelihood can be used to recover information from concordant
pairs, giving an improvement over the conditional maximum likelihood
estimator without losing distribution-robustness.
Journal: Journal of Applied Statistics
Pages: 239-246
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016136
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016136
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:239-246
Template-Type: ReDIF-Article 1.0
Author-Name: Donald Martin
Author-X-Name-First: Donald
Author-X-Name-Last: Martin
Title: Influence functions applied to the estimation of mean rain rate
Abstract:
In this paper we illustrate the usefulness of influence functions for
studying properties of various statistical estimators of mean rain rate
using space-borne radar data. In Martin (1999), estimators using
censoring, minimum chi-square, and least squares are compared in terms of
asymptotic variance. Here, we use influence functions to consider
robustness properties of the same estimators. We also obtain formulas for
the asymptotic variance of the estimators using influence functions, and
thus show that they may also be used for studying relative efficiency. The
least squares estimator, although less efficient, is shown to be more
robust in the sense that it has the smallest gross-error sensitivity. In
some cases, influence functions associated with the estimators reveal
counterintuitive behaviour. For example, observations that are less than
the mean rain rate may increase the estimated mean. The additional
information gleaned from influence functions may be used to understand
better and improve the estimation procedures themselves.
Journal: Journal of Applied Statistics
Pages: 247-258
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016145
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016145
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:247-258
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Author-Name: Yong Bin Lim
Author-X-Name-First: Yong Bin
Author-X-Name-Last: Lim
Title: Bayesian analysis of time series Poisson data
Abstract:
This paper provides a practical simulation-based Bayesian analysis of
parameter-driven models for time series Poisson data with the AR(1) latent
process. The posterior distribution is simulated by a Gibbs sampling
algorithm. Full conditional posterior distributions of unknown variables
in the model are given in convenient forms for the Gibbs sampling
algorithm. The case with missing observations is also discussed. The
methods are applied to real polio data from 1970 to 1983.
Journal: Journal of Applied Statistics
Pages: 259-271
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016154
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016154
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:259-271
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Roddam
Author-X-Name-First: Andrew
Author-X-Name-Last: Roddam
Title: An approximate maximum likelihood procedure for parameter estimation in multivariate discrete data regression models
Abstract:
This paper considers an alternative to iterative procedures used to
calculate maximum likelihood estimates of regression coefficients in a
general class of discrete data regression models. These models can include
both marginal and conditional models and also local regression models. The
classical estimation procedure is generally via a Fisher-scoring algorithm
and can be computationally intensive for high-dimensional problems. The
alternative method proposed here is non-iterative and is likely to be more
efficient in high-dimensional problems. The method is demonstrated on two
different classes of regression models.
Journal: Journal of Applied Statistics
Pages: 273-279
Issue: 2
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760020016163
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760020016163
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:2:p:273-279
Template-Type: ReDIF-Article 1.0
Author-Name: M. Arvidsson
Author-X-Name-First: M.
Author-X-Name-Last: Arvidsson
Author-Name: P. Kammerlind
Author-X-Name-First: P.
Author-X-Name-Last: Kammerlind
Author-Name: A. Hynen
Author-X-Name-First: A.
Author-X-Name-Last: Hynen
Author-Name: B. Bergman
Author-X-Name-First: B.
Author-X-Name-Last: Bergman
Title: Identification of factors influencing dispersion in split-plot experiments
Abstract:
As split-plot designs are commonly used in robust design it is important
to identify factors in these designs that influence the dispersion of the
response variable. In this article, the Bergman-Hynen method, developed
for identification of dispersion effects in unreplicated experiments, is
modified to be used in the context of split-plot experiments. The
modification of the Bergman-Hynen method enables identification of factors
that influence specific variance components in unreplicated two-level
fractional factorial split-plot experiments. An industrial example is used
to illustrate the proposed method.
Journal: Journal of Applied Statistics
Pages: 269-283
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034027
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034027
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:269-283
Template-Type: ReDIF-Article 1.0
Author-Name: George Box
Author-X-Name-First: George
Author-X-Name-Last: Box
Title: Statistics for discovery
Abstract:
The question is discussed of why investigators in engineering and the
physical sciences rarely use statistical methods. It is argued that
statistics has in the past been overly influenced by the needs of
mathematics rather than those of scientific learning and discovery.
Remedies are suggested.
Journal: Journal of Applied Statistics
Pages: 285-299
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034036
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034036
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:285-299
Template-Type: ReDIF-Article 1.0
Author-Name: Roland Caulcutt
Author-X-Name-First: Roland
Author-X-Name-Last: Caulcutt
Title: Why is Six Sigma so successful?
Abstract:
There can be little doubt that Motorola, General Electric, Black and
Decker, Allied Signal (now Honeywell), ABB and Bombardier, have achieved
impressive business performance in recent years. Their annual reports
document this success. Furthermore, in several cases, the Annual Report
clearly attributes this success to having followed a Six Sigma strategy.
Not surprisingly, many other companies wish to learn what Six Sigma can do
for them, and their first question is 'What exactly is Six Sigma?'.
Unfortunately it is rather difficult, if not impossible, to define Six
Sigma in one or two sentences. This paper identifies the essential
elements of Six Sigma. Some are obvious, such as the extensive use of
statistical techniques by employees known as Blackbelts. However, other
more subtle, but very important, features of Six Sigma are concealed
within the business culture of these successful companies. It is clear to
those who have participated in this success, that any company embarking on
Six Sigma will not succeed if it focuses on statistics whilst failing to
develop a supporting culture.
Journal: Journal of Applied Statistics
Pages: 301-306
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034045
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034045
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:301-306
Template-Type: ReDIF-Article 1.0
Author-Name: P. R. G. Chambers
Author-X-Name-First: P. R. G.
Author-X-Name-Last: Chambers
Author-Name: J. L. Piggott
Author-X-Name-First: J. L.
Author-X-Name-Last: Piggott
Author-Name: S. Y. Coleman
Author-X-Name-First: S. Y.
Author-X-Name-Last: Coleman
Title: SPC—a team effort for process improvement across four Area Control Centres
Abstract:
This paper describes an innovative application of statistical process
control to the online remote control of the UK's gas transportation
networks. The gas industry went through a number of changes in ownership,
regulation, access to networks, organization and management culture in the
1990s. The application of SPC was motivated by these changes along with
the desire to apply the best industrial statistics theory to practical
problems. The work was initiated by a studentship, with the technology
gradually being transferred to the industry. The combined efforts of
control engineers and statisticians helped develop a novel SPC system.
Having set up the control limits, a system was devised to automatically
update and publish the control charts on a daily basis. The charts and an
associated discussion forum are available to both managers and control
engineers throughout the country at their desktop PCs. The paper describes
methods of involving people to design first-class systems to achieve
continual process improvement. It describes how the traditional benefits
of SPC can be realized in a 'distal team working' and 'soft systems'
context of four Area Control Centres, controlling a system delivering two
thirds of the UK's energy needs.
Journal: Journal of Applied Statistics
Pages: 307-324
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034054
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034054
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:307-324
Template-Type: ReDIF-Article 1.0
Author-Name: S. Y. Coleman
Author-X-Name-First: S. Y.
Author-X-Name-Last: Coleman
Author-Name: G. Arunakumar
Author-X-Name-First: G.
Author-X-Name-Last: Arunakumar
Author-Name: F. Foldvary
Author-X-Name-First: F.
Author-X-Name-Last: Foldvary
Author-Name: R. Feltham
Author-X-Name-First: R.
Author-X-Name-Last: Feltham
Title: SPC as a tool for creating a successful business measurement framework
Abstract:
Many companies are trying to get to the bottom of what their main
objectives are and what their business should be doing. The new Six Sigma
approach concentrates on clarifying business strategy and making sure that
everything relates to company objectives. It is vital to clarify each part
of the business in such a way that everyone can understand the causes of
variation that can lead to improvements in processes and performance. This
paper describes a situation where the full implementation of SPC
methodology has made possible a visual and widely appreciated summary of
the performance of one important aspect of the business. The major part of
the work was identifying the core objectives and deciding how to
encapsulate each of them in one or more suitable measurements. The next
step was to review the practicalities of obtaining the measurements and
their reliability and representativeness. Finally, the measurements were
presented in chart form and the more traditional steps of SPC analysis
were commenced. Data from fast changing business environments are prone to
many different problems, such as the short previous span of typical data,
strange distributions and other uncertainties. Issues surrounding these
and the eventual extraction of a meaningful set of information will be
discussed in the paper. The measurement framework has proved very useful
and, from an initial circulation of a handful of people, it now forms an
important part of an information process that provides responsible
managers with valuable control information. The measurement framework is
kept fresh and vital by constant review and modifications. Improved
electronic data collection and dissemination of the report has proved very
important.
Journal: Journal of Applied Statistics
Pages: 325-334
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034063
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034063
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:325-334
Template-Type: ReDIF-Article 1.0
Author-Name: David Bruce
Author-X-Name-First: David
Author-X-Name-Last: Bruce
Author-Name: Shirley Coleman
Author-X-Name-First: Shirley
Author-X-Name-Last: Coleman
Title: Improving communication via quantitative management
Abstract:
Villa Soft Drinks Ltd, established in 1884, manufactures and bottles
spring waters and carbonates for both the growing adult soft drinks market
and the more traditional soft drinks market. The company employs just over
100 people split between the manufacturing site in Sunderland and the head
office and distribution centre in Washington. One of the fundamental
problems affecting the day-to-day running of Villa, and most companies, is
communication. There is a lack of awareness of the impact that changes in
one department have on other departments (e.g. if production efficiency is
increased by 10%, what impact will this have on warehousing?). Villa had
recently identified key performance indicators (KPIs) to monitor all
aspects of manufacturing performance on a regular basis. This enabled the
current production situation to be evaluated and helped familiarize staff
with charts and measurements. The use of Pareto analysis and problem
solving techniques helped to boost efficiency and utilization. Key
performance indicators were then developed in most other departments and
are monitored and displayed regularly. The KPIs can be used further to
improve transparency across the company by incorporating them in an
interactive, interpretative tool to aid communication and understanding at
all levels of the company. Individual departmental flow diagrams will be
linked together to represent how the company operates. The diagrams will
include both material flow and information flow. These data will then be
organized in a software package and the end result will be a fully
integrated simulation of the company in which any variable can be altered
to demonstrate the effect this has on other departments and therefore the
company as a whole. This will be an extremely valuable tool for the
company as it will have many different applications, such as calculating
manning requirements, identifying potential cycle time reductions and
optimizing warehouse space.
Journal: Journal of Applied Statistics
Pages: 335-341
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034072
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034072
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:335-341
Template-Type: ReDIF-Article 1.0
Author-Name: S. Y. Coleman
Author-X-Name-First: S. Y.
Author-X-Name-Last: Coleman
Author-Name: A. Gordon
Author-X-Name-First: A.
Author-X-Name-Last: Gordon
Author-Name: P. R. Chambers
Author-X-Name-First: P. R.
Author-X-Name-Last: Chambers
Title: SPC—making it work for the gas transportation business
Abstract:
Transco is the main provider of gas transportation to domestic and
commercial customers in mainland Britain. Gas arrives in Britain at a
steady rate but is consumed with a distinct diurnal pattern. The safe and
timely movement of gas from arrival at the beach in various places in
Britain to delivery at burners is the main driver for System Operations.
The movement of gas is meticulously controlled and monitored resulting in
a mass of information on pressure, flow and temperature. Gas is stored
temporarily in various storage vessels and is moved around the pipes and
in and out of storage as demand dictates. Demand is mostly dictated by the
weather and is therefore subject to much variation. Transco and its
predecessors have been transporting gas for over 50 years and are very
successful as judged by their excellent safety record and the continual
delivery of gas. Nevertheless, the company wished to improve itself and
make further use of the many measurements collected. SPC is ideal for
improving communication and understanding through increased visibility of
data. All companies have special issues to face when they implement SPC,
and this paper describes the way these were dealt with in System
Operations and the lessons learnt along the way. The first part describes
how performance measures were chosen for investigation. It includes a
novel use of correlation between output and day-to-day conditions, which
was successfully turned into a measure to check the uncheckable. The
second part is about the issues involved with early application of SPC
when features of the system are still unexplained. SPC has helped enhance
understanding of the complex transportation process, encouraged team work,
improved performance and provided an objective means of decision making.
Journal: Journal of Applied Statistics
Pages: 343-351
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034081
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034081
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:343-351
Template-Type: ReDIF-Article 1.0
Author-Name: M. A. A. Cox
Author-X-Name-First: M. A. A.
Author-X-Name-Last: Cox
Title: Towards the implementation of a universal control chart and estimation of its average run length using a spreadsheet: An artificial neural network is employed to model the parameters in a special case
Abstract:
A control chart procedure has previously been proposed (Champ et al.,
1991) for which the Shewhart X-bar chart, the cumulative sum chart,
and the exponentially weighted moving average chart are special cases. The
rapid and easy production of these charts, plus many others, is proposed
using spreadsheets. In addition, for all these novel charts, the average
run lengths are generated as a guide to their likely behaviour. The
cumulative sum chart is widely employed in quality control and is
considered in greater detail. Charts are designed to exhibit acceptable
average run lengths both when the process is in control and when it is out
of control. A functional technique for parameter selection for such a chart is
functional technique for parameter selection for such a chart is
introduced that results in target average run lengths. It employs the
method of artificial neural networks to derive appropriate coefficients.
This approach may be extended to any of the charts previously introduced.
Journal: Journal of Applied Statistics
Pages: 353-364
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034090
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034090
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:353-364
Template-Type: ReDIF-Article 1.0
Author-Name: Trevor Cox
Author-X-Name-First: Trevor
Author-X-Name-Last: Cox
Title: Multidimensional scaling used in multivariate statistical process control
Abstract:
This paper considers the use of multidimensional scaling techniques in
multivariate statistical process control. Principal components analysis,
multiple principal components analysis, partial least squares and PARAFAC
models have already been established as useful methods for such, but it
should be possible to widen the portfolio of techniques to include others
that come under the multidimensional scaling class. Some of these are
briefly described, namely classical scaling, non-metric scaling, biplots
and Procrustes analysis, and are then used on some gas transportation data
provided by Transco.
Journal: Journal of Applied Statistics
Pages: 365-378
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034108
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034108
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:365-378
Template-Type: ReDIF-Article 1.0
Author-Name: E. J. Godolphin
Author-X-Name-First: E. J.
Author-X-Name-Last: Godolphin
Title: Observable trend-projecting state-space models
Abstract:
Much attention has focused in recent years on the use of state-space
models for describing and forecasting industrial time series. However,
several state-space models that are proposed for such data series are not
observable and do not have a unique representation, particularly in
situations where the data history suggests marked seasonal trends. This
raises major practical difficulties since it becomes necessary to impose
one or more constraints and this implies a complicated error structure on
the model. The purpose of this paper is to demonstrate that state-space
models are useful for describing time series data for forecasting purposes
and that there are trend-projecting state-space components that can be
combined to provide observable state-space representations for specified
data series. This result is particularly useful for seasonal or
pseudo-seasonal time series. A well-known data series is examined in some
detail and several observable state-space models are suggested and
compared favourably with the constrained observable model.
Journal: Journal of Applied Statistics
Pages: 379-389
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034117
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034117
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:379-389
Template-Type: ReDIF-Article 1.0
Author-Name: T. N. Goh
Author-X-Name-First: T. N.
Author-X-Name-Last: Goh
Title: A pragmatic approach to experimental design in industry
Abstract:
The importance of statistically designed experiments in industry has been
well recognized. However, the use of 'design of experiments' is still not
pervasive, owing in part to the inefficient learning process experienced
by many non-statisticians. In this paper, the nature of design of
experiments, in contrast to the usual statistical process control
techniques, is discussed. It is then pointed out that for design of
experiments to be appreciated and applied, appropriate approaches should
be taken in training, learning and application. Perspectives based on the
concepts of objective setting and design under constraints can be used to
facilitate the experimenters' formulation of plans for collection,
analysis and interpretation of empirical information. A review is made of
the expanding role of design of experiments in the past several decades,
with comparisons made of the various formats and contexts of experimental
design applications, such as Taguchi methods and Six Sigma. The trend of
development shows that, from the realm of scientific research to business
improvement, the competitive advantage offered by design of experiments is
being increasingly felt.
Journal: Journal of Applied Statistics
Pages: 391-398
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034126
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034126
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:391-398
Template-Type: ReDIF-Article 1.0
Author-Name: G. Robin Henderson
Author-X-Name-First: G. Robin
Author-X-Name-Last: Henderson
Title: EWMA and industrial applications to feedback adjustment and control
Abstract:
In his book 'Out of the Crisis' the late Dr Edwards Deming asserted that
'if anyone adjusts a stable process to try to compensate for a result that
is undesirable, or for a result that is extra good, the output will be
worse than if he had left the process alone'. His famous funnel
experiments supported this assertion. The development of the control chart
by Dr Walter Shewhart stemmed from an approach made to him by the
management of a Western Electric Company plant because of their awareness
that adjustments made to processes often made matters worse. However, many
industrial processes are such that the mean values of product quality
characteristics shift and drift over time so that, instead of sequences of
independent observations to which Deming's assertion applies, process
owners are faced with autocorrelated data. The truth of Dr Deming's
assertion is demonstrated, both theoretically and via computer simulation.
The use of the Exponentially Weighted Moving Average (EWMA) for process
monitoring is demonstrated and, for situations where process data exhibit
autocorrelation, its use for feedback adjustment is discussed and
demonstrated. Finally, successful applications of process improvements
using EWMA-based control algorithms are discussed.
Journal: Journal of Applied Statistics
Pages: 399-407
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034135
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034135
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:399-407
Template-Type: ReDIF-Article 1.0
Author-Name: M. Weighell
Author-X-Name-First: M.
Author-X-Name-Last: Weighell
Author-Name: E. B. Martin
Author-X-Name-First: E. B.
Author-X-Name-Last: Martin
Author-Name: A. J. Morris
Author-X-Name-First: A. J.
Author-X-Name-Last: Morris
Title: The statistical monitoring of a complex manufacturing process
Abstract:
This paper describes the development of a multivariate statistical
process performance monitoring scheme for a high-speed polyester film
production facility. The objective for applying multivariate statistical
process control (MSPC) was to improve product consistency, detect process
changes and disturbances and increase operator awareness of the impact of
both routine maintenance and unusual events. The background to MSPC is
briefly described and the various stages in the development of an at-line
MSPC representation for the production line are described. A number of
case studies are used to illustrate the power of the methodology,
highlighting its potential to assist in process maintenance, the detection
of changes in process operation and the potential for the identification
of badly tuned controller loops.
Journal: Journal of Applied Statistics
Pages: 409-425
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034144
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034144
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:409-425
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas Montgomery
Author-X-Name-First: Douglas
Author-X-Name-Last: Montgomery
Title: Opportunities and challenges for industrial statisticians
Abstract:
The last 20 years have seen significant advances in the use of
statistical methodology in industry, with applications in new product
design and development, optimization and control of manufacturing
processes, and in the service industries. The field of industrial
statistics has emerged as an important branch of statistical science that
focuses on this application environment. Yet as applications of statistics
in industry have expanded, creating many new opportunities for the modern
industrial statistician, many new challenges have arisen. Some of these
challenges are technical, while others have managerial and organizational
aspects. There are also important concerns pertaining to training and
education. This presentation focuses on some of these issues, and
identifies some potential solutions.
Journal: Journal of Applied Statistics
Pages: 427-439
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034153
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034153
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:427-439
Template-Type: ReDIF-Article 1.0
Author-Name: P. Pongcharoen
Author-X-Name-First: P.
Author-X-Name-Last: Pongcharoen
Author-Name: D. J. Stewardson
Author-X-Name-First: D. J.
Author-X-Name-Last: Stewardson
Author-Name: C. Hicks
Author-X-Name-First: C.
Author-X-Name-Last: Hicks
Author-Name: P. M. Braiden
Author-X-Name-First: P. M.
Author-X-Name-Last: Braiden
Title: Applying designed experiments to optimize the performance of genetic algorithms used for scheduling complex products in the capital goods industry
Abstract:
Conventional optimization approaches, such as Linear Programming, Dynamic
Programming and Branch-and-Bound methods are well established for solving
relatively simple scheduling problems. Algorithms such as Simulated
Annealing, Taboo Search and Genetic Algorithms (GA) have recently been
applied to large combinatorial problems. Owing to the complex nature of
these problems it is often impossible to search the whole problem space
and an optimal solution cannot, therefore, be guaranteed. A BiCriteria
Genetic Algorithm (BCGA) has been developed for the scheduling of complex
products with multiple resource constraints and deep product structure.
This GA identifies and corrects infeasible schedules and takes account of
the early supply of components and assemblies, late delivery of final
products and capacity utilization. The research has used manufacturing
data obtained from a capital goods company. Genetic Algorithms include a
number of parameters, including the probabilities of crossover and
mutation, the population size and the number of generations. The BCGA
scheduling tool provides 16 alternative crossover operations and eight
different mutation mechanisms. The overall objective of this study was to
develop an efficient design-of-experiments approach to identify genetic
algorithm operators and parameters that produce solutions with minimum
total cost. The case studies were based upon a complex, computationally
intensive scheduling problem that was insoluble using conventional
approaches. This paper describes an efficient sequential experimental
strategy that enabled this work to be performed within a reasonable time.
The first stage was a screening experiment, which had a fractional
factorial embedded within a half Latin-square design. The second stage was
a half-fraction design with a reduced number of GA operators. The results
are compared with previous studies. It is demonstrated that, in this case,
improved GA performance was achieved using the experimental strategy
proposed. The appropriate genetic operators and parameters may be case
specific, leading to the view that experimental design may be the best way
to proceed when finding the 'best' combination of GA operators and
parameters.
Journal: Journal of Applied Statistics
Pages: 441-455
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034162
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034162
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:441-455
Template-Type: ReDIF-Article 1.0
Author-Name: Elsayed Elamir
Author-X-Name-First: Elsayed
Author-X-Name-Last: Elamir
Author-Name: Allan Seheult
Author-X-Name-First: Allan
Author-X-Name-Last: Seheult
Title: Control charts based on linear combinations of order statistics
Abstract:
The last 20 years have seen an increasing emphasis on statistical process
control as a practical approach to reducing variability in industrial
applications. Control charts are used to detect problems such as outliers
or excess variability in subgroup means that may have a special cause. We
describe an approach to the computation of control limits for
exponentially weighted moving average control charts where the usual
statistics in classical charts are replaced by linear combinations of
order statistics; in particular, the trimmed mean and Gini's mean
difference instead of the mean and range, respectively. Control limits are
derived, and simulated average run length experiments show the trimmed
control charts to be less influenced by extreme observations than their
classical counterparts, and lead to tighter control limits. An example is
given that illustrates the benefits of the proposed charts.
Journal: Journal of Applied Statistics
Pages: 457-468
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034171
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034171
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:457-468
Template-Type: ReDIF-Article 1.0
Author-Name: Dave Stewardson
Author-X-Name-First: Dave
Author-X-Name-Last: Stewardson
Author-Name: Shirley Coleman
Author-X-Name-First: Shirley
Author-X-Name-Last: Coleman
Title: Using the Summed Rank Cusum for monitoring environmental data from industrial processes
Abstract:
Environmental issues have become a hot topic recently, especially those
surrounding industrial outputs. Effluents, emissions, outflows,
by-products, waste materials, product de-commissioning, land reclamation
and energy consumption are all the subject of monitoring, either under new
legislation or through economic necessity. Many types of environmental
data, however, are often difficult to understand or measure because of
their unusual distribution of values. Standard methods of monitoring these
data types often fail or are unwieldy. The scarcity of events, small volume
measurements and the unusual time scales sometimes involved add to the
complexity of the task. One recently developed monitoring technique is the
Summed Rank Cusum (SRC) that applies non-parametric methods to a standard
chart. The SRC can be used diagnostically and this paper describes the
application of this new tool to three data sets, each derived from a
different problem area. These are measuring industrial effluent, assessing
the levels of potentially harmful proteins produced by an industrial
process and industrial land reclamation in the face of harmful waste
materials. The use of the SRC to spot change points in time
retrospectively is described. The paper also shows the use of SRC in the
significant-difference testing mode, which is applied via the use of
spreadsheets. Links to other similar methods described in the literature
are given and formulae describing the statistical nature of the
transformation are shown. These practical demonstrations illustrate that
the graphical interpretation of the method appears to help considerably in
practice when trying to find time-series change points. The charts are an
effective graphical retrospective monitoring technique when dealing with
non-normal data. The method is easy to apply and may help considerably in
dealing with environmental data in the industrial setting when standard
methods are not appropriate. Further work is continuing on the more
theoretical aspects of the method.
Journal: Journal of Applied Statistics
Pages: 469-484
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034180
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034180
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:469-484
Template-Type: ReDIF-Article 1.0
Author-Name: Dave Stewardson
Author-X-Name-First: Dave
Author-X-Name-Last: Stewardson
Author-Name: David Porter
Author-X-Name-First: David
Author-X-Name-Last: Porter
Author-Name: Tony Kelly
Author-X-Name-First: Tony
Author-X-Name-Last: Kelly
Title: The dangers posed by saddle points, and other problems, when using central composite designs
Abstract:
This paper discusses two problems, which can occur when using central
composite designs (CCDs), that are not generally covered in the literature
but can lead to wrong decisions, and therefore incorrect models, if they are
ignored. Most industrial-based experimental designs are sequential. This
usually involves running as few initial tests as possible, while getting
enough information as is needed to provide a reasonable approximation to
reality (the screening stage). The CCD design strategy generally requires
the running of a full or fractional factorial design (the cube or
hypercube) with one or more additional centre points. The cube is
augmented, if deemed necessary, by additional experiments known as
star-points. The major problems highlighted here concern the decision to
run the star points or not. If the difference between the average response
at the centre of the design and the average of the cube results is
significant, there is probably a need for one or more quadratic terms in
the predictive model. If not, then a simpler model that includes only main
effects and interactions is usually considered sufficient. This test for
'curvature' in a main effect will often fail if the design space contains
or surrounds a saddle-point. Such a point may disguise the need for a
quadratic term. This paper describes the occurrence of a real saddle-point
from an industrial project and how this was overcome. The second problem
occurs because the cube and star point portions of a CCD are sometimes run
as orthogonal blocks. Indeed, theory would suggest that this is the
correct procedure. However in the industrial context, where minimizing the
total number of tests is at a premium, this can lead to designs with star
points a long way from the cube. In such a situation, were the curvature
test to be found non-significant, we could end up with a model that predicted
well within the cube portion of the design space but that would be
unreliable in the balance of the total area of investigation. The paper
discusses just such a design, one that disguised the real need for a
quadratic term.
Journal: Journal of Applied Statistics
Pages: 485-495
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034199
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034199
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:485-495
Template-Type: ReDIF-Article 1.0
Author-Name: Carl Scarrott
Author-X-Name-First: Carl
Author-X-Name-Last: Scarrott
Author-Name: Granville Tunnicliffe Wilson
Author-X-Name-First: Granville Tunnicliffe
Author-X-Name-Last: Wilson
Title: Building a statistical model to predict reactor temperatures
Abstract:
This paper describes the various stages in building a statistical model
to predict temperatures in the core of a reactor, and compares the
benefits of this model with those of a physical model. We give a brief
background to this study and the applications of the model to rapid online
monitoring and safe operation of the reactor. We describe the methods of
correlation and two-dimensional spectral analysis, which we use to
identify the effects that are incorporated in a spatial regression model
for the measured temperatures. These effects are related to the age of the
reactor fuel and the spatial geometry of the reactor. A remaining
component of the temperature variation is a slowly varying temperature
surface modelled by smooth functions with constrained coefficients. We
assess the accuracy of the model for interpolating temperatures throughout
the reactor, when measurements are available only at a reduced set of
spatial locations, as is the case in most reactors. Further possible
improvements to the model are discussed.
Journal: Journal of Applied Statistics
Pages: 497-511
Issue: 3-4
Volume: 28
Year: 2001
Keywords: spatial prediction, two-dimensional spectra, linear mixed model,
X-DOI: 10.1080/02664760120034207
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034207
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:497-511
Template-Type: ReDIF-Article 1.0
Author-Name: Seung-Ho Kang
Author-X-Name-First: Seung-Ho
Author-X-Name-Last: Kang
Author-Name: Chul Ahn
Author-X-Name-First: Chul
Author-X-Name-Last: Ahn
Title: Regression coefficient analysis for correlated binomial outcomes
Abstract:
Journal: Journal of Applied Statistics
Pages: 513-514
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120034216
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120034216
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:513-514
Template-Type: ReDIF-Article 1.0
Author-Name: Leann Myers
Author-X-Name-First: Leann
Author-X-Name-Last: Myers
Author-Name: Stephanie Broyles
Author-X-Name-First: Stephanie
Author-X-Name-Last: Broyles
Title: Response
Abstract:
Journal: Journal of Applied Statistics
Pages: 515-515
Issue: 3-4
Volume: 28
Year: 2001
X-DOI: 10.1080/026647601300073221
File-URL: http://www.tandfonline.com/doi/abs/10.1080/026647601300073221
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:3-4:p:515-515
Template-Type: ReDIF-Article 1.0
Author-Name: I. Bairamov
Author-X-Name-First: I.
Author-X-Name-Last: Bairamov
Author-Name: S. Kotz
Author-X-Name-First: S.
Author-X-Name-Last: Kotz
Author-Name: M. Bekci
Author-X-Name-First: M.
Author-X-Name-Last: Bekci
Title: New generalized Farlie-Gumbel-Morgenstern distributions and concomitants of order statistics
Abstract:
We consider a generalization of the bivariate Farlie-Gumbel-Morgenstern
(FGM) distribution by introducing additional parameters. For the
generalized FGM distribution, the admissible range of the association
parameter allowing positive quadrant dependence property is shown.
Distributional properties of concomitants for this generalized FGM
distribution are studied. Recurrence relations between moments of
concomitants are presented.
Journal: Journal of Applied Statistics
Pages: 521-536
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047861
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047861
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:521-536
Template-Type: ReDIF-Article 1.0
Author-Name: Ling-Yau Chan
Author-X-Name-First: Ling-Yau
Author-X-Name-Last: Chan
Author-Name: Ying-Nan Guan
Author-X-Name-First: Ying-Nan
Author-X-Name-Last: Guan
Title: A- and D-optimal designs for a log contrast model for experiments with mixtures
Abstract:
A- and D-optimal designs are investigated for a log contrast model
suggested by Aitchison & Bacon-Shone for experiments with mixtures. It is
proved that when the number of mixture components q is an even integer, A-
and D-optimal designs are identical; and when q is an odd integer, A- and
D-optimal designs are different, but they share some common support points
and are very close to each other in efficiency. Optimal designs with a
minimum number of support points are also constructed for 3, 4, 5 and 6
mixture components.
Journal: Journal of Applied Statistics
Pages: 537-546
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047870
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:537-546
Template-Type: ReDIF-Article 1.0
Author-Name: Francesco Pauli
Author-X-Name-First: Francesco
Author-X-Name-Last: Pauli
Author-Name: Stuart Coles
Author-X-Name-First: Stuart
Author-X-Name-Last: Coles
Title: Penalized likelihood inference in extreme value analyses
Abstract:
Models for extreme values are usually based on detailed asymptotic
arguments, for which strong ergodic assumptions such as stationarity, or
prescribed perturbations from stationarity, are required. In most
applications of extreme value modelling such assumptions are not
satisfied, but the type of departure from stationarity is either unknown
or complex, making asymptotic calculations unfeasible. This has led to
various approaches in which standard extreme value models are used as
building blocks for conditional or local behaviour of processes, with more
general statistical techniques being used at the modelling stage to handle
the non-stationarity. This paper presents another approach in this
direction based on penalized likelihood. There are some advantages to this
particular approach: the method has a simple interpretation; computations
for estimation are relatively straightforward using standard algorithms;
and a simple reinterpretation of the model enables broader inferences,
such as confidence intervals, to be obtained using MCMC methodology.
Methodological details together with applications to both athletics and
environmental data are given.
Journal: Journal of Applied Statistics
Pages: 547-560
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047889
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047889
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:547-560
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Garren
Author-X-Name-First: Steven
Author-X-Name-Last: Garren
Author-Name: Richard Smith
Author-X-Name-First: Richard
Author-X-Name-Last: Smith
Author-Name: Walter Piegorsch
Author-X-Name-First: Walter
Author-X-Name-Last: Piegorsch
Title: Bootstrap goodness-of-fit test for the beta-binomial model
Abstract:
A common question in the analysis of binary data is how to deal with
overdispersion. One widely advocated sampling distribution for
overdispersed binary data is the beta-binomial model. For example, this
distribution is often used to model litter effects in toxicological
experiments. Testing the null hypothesis of a beta-binomial distribution
against all other distributions is difficult, however, when the litter
sizes vary greatly. Herein, we propose a test statistic based on combining
Pearson statistics from individual litter sizes, and estimate the p-value
using bootstrap techniques. A Monte Carlo study confirms the accuracy and
power of the test against a beta-binomial distribution contaminated with a
few outliers. The method is applied to data from environmental toxicity
studies.
Journal: Journal of Applied Statistics
Pages: 561-571
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047898
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047898
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:561-571
Template-Type: ReDIF-Article 1.0
Author-Name: Subrata Ghatak
Author-X-Name-First: Subrata
Author-X-Name-Last: Ghatak
Author-Name: Jalal Siddiki
Author-X-Name-First: Jalal
Author-X-Name-Last: Siddiki
Title: The use of the ARDL approach in estimating virtual exchange rates in India
Abstract:
This paper applies the autoregressive distributed lag approach to
cointegration analysis in estimating the 'virtual exchange rate' (VER) in
India. The VER is the rate that would have prevailed had unconstrained
import demand equalled the constraint imposed by foreign exchange rationing;
it is used to approximate the 'price' of rationed foreign exchange
reserves. We highlight the shortcomings of the existing literature in
approximating equilibrium exchange rates in a less developed country such
as India and propose the VER approach for equilibrium rates, which uses
information from an estimated structural model. In this relationship,
black market real exchange rate (E_U) is a dependent variable and real
official exchange rates (E_O), the ratio of the foreign (r*) to the
domestic (r) interest rate (I), and official forex reserves (Q) are
explanatory variables. In our estimation, the VERs are higher than E_O by
about 10% in the short-run and 16% in the long-run.
Journal: Journal of Applied Statistics
Pages: 573-583
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047906
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:573-583
Template-Type: ReDIF-Article 1.0
Author-Name: W. J. Krzanowski
Author-X-Name-First: W. J.
Author-X-Name-Last: Krzanowski
Title: Data-based interval estimation of classification error rates
Abstract:
Leave-one-out and 632 bootstrap are popular data-based methods of
estimating the true error rate of a classification rule, but practical
applications almost exclusively quote only point estimates. Interval
estimation would provide better assessment of the future performance of
the rule, but little has been published on this topic. We first review
general-purpose jackknife and bootstrap methodology that can be used in
conjunction with leave-one-out estimates to provide prediction intervals
for true error rates of classification rules. Monte Carlo simulation is
then used to investigate coverage rates of the resulting intervals for
normal data, but the results are disappointing; standard intervals show
considerable overinclusion, intervals based on Edgeworth approximations or
random weighting do not perform well, and while a bootstrap approach
provides intervals with coverage rates closer to the nominal ones there is
still marked underinclusion. We then turn to intervals constructed from
632 bootstrap estimates, and show that much better results are obtained.
Although there is now some overinclusion, particularly for large training
samples, the actual coverage rates are sufficiently close to the nominal
rates for the method to be recommended. An application to real data
illustrates the considerable variability that can arise in practical
estimation of error rates.
Journal: Journal of Applied Statistics
Pages: 585-595
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047915
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:585-595
Template-Type: ReDIF-Article 1.0
Author-Name: Yong Lim
Author-X-Name-First: Yong
Author-X-Name-Last: Lim
Author-Name: B. S. So
Author-X-Name-First: B. S.
Author-X-Name-Last: So
Title: A note on the optimal number of centre runs in a second phase design of response surface methods
Abstract:
In searching for optimum conditions, response surface methods comprise two
phases. In the first phase, the method of steepest ascent with a 2^(k-p)
design is used to search for a region of improved response. The curvature
of the response surface is checked in the second phase. For testing the
evidence of curvature, a reasonable design is a 2^(k-p) fractional
factorial design augmented by centre runs. Using the c-optimality
criterion, the optimal number of centre runs is investigated.
Incorporating c-efficiencies for the curvature test with D-efficiencies
and G-efficiencies of CCDs for quadratic response surfaces, and adopting
the mini-max principle, i.e. maximizing the worst efficiency, we propose a
robust choice of the number of centre runs with respect to the three
optimality criteria.
Journal: Journal of Applied Statistics
Pages: 597-602
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047924
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047924
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:597-602
Template-Type: ReDIF-Article 1.0
Author-Name: Alvaro Montenegro
Author-X-Name-First: Alvaro
Author-X-Name-Last: Montenegro
Title: On sample size and precision in ordinary least squares
Abstract:
An expression relating estimation precision in the classical linear model
to the number of parameters k and the sample size n is illustrated. A rule
of thumb for the sample size is suggested.
Journal: Journal of Applied Statistics
Pages: 603-605
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047933
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047933
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:603-605
Template-Type: ReDIF-Article 1.0
Author-Name: G. Nuel
Author-X-Name-First: G.
Author-X-Name-Last: Nuel
Author-Name: S. Robin
Author-X-Name-First: S.
Author-X-Name-Last: Robin
Author-Name: C. P. Baril
Author-X-Name-First: C. P.
Author-X-Name-Last: Baril
Title: Predicting distances using a linear model: The case of varietal distinctness
Abstract:
Differences between plant varieties are based on phenotypic observations,
which are both space and time consuming. Moreover, the phenotypic data
result from the combined effects of genotype and environment. On the
contrary, molecular data are easier to obtain and give a direct access to
the genotype. In order to save experimental trials and to concentrate
efforts on the relevant comparisons between varieties, the relationship
between phenotypic and genetic distances is studied. It appears that the
classical genetic distances based on molecular data are not appropriate
for predicting phenotypic distances. In the linear model framework, we
define a new pseudo genetic distance, which is a prediction of the
phenotypic one. The distribution of this distance given the pseudo genetic
distance is established. Statistical properties of the predicted distance
are derived when the parameters of the model are either given or
estimated. We finally apply these results to distinguishing between 144
maize lines. This case study is very satisfactory because the use of
anonymous molecular markers (RFLP) leads to saving 29% of the trials with
an acceptable error risk. These results need to be confirmed on other
varieties and species and would certainly be improved by using genes
coding for phenotypic traits.
Journal: Journal of Applied Statistics
Pages: 607-621
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047942
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047942
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:607-621
Template-Type: ReDIF-Article 1.0
Author-Name: Peiming Wang
Author-X-Name-First: Peiming
Author-X-Name-Last: Wang
Title: Markov zero-inflated Poisson regression models for a time series of counts with excess zeros
Abstract:
This paper discusses a class of Markov zero-inflated Poisson regression
models for a time series of counts with the presence of excess zero
relative to a Poisson distribution, in which the frequency distribution
changes according to an underlying two-state Markov chain. Features of the
proposed model, estimation method based on the EM and quasi-Newton
algorithms, and other implementation issues are discussed. A Monte Carlo
study shows that the estimation method is accurate and reliable as long as
the sample size is reasonably large, and the choice of starting
probabilities for the Markov process has little impact on the parameter
estimates. The methodology is illustrated using daily numbers of phone
calls reporting faults for a mainframe computer system.
Journal: Journal of Applied Statistics
Pages: 623-632
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047951
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:623-632
Template-Type: ReDIF-Article 1.0
Author-Name: Seung-Ho Kang
Author-X-Name-First: Seung-Ho
Author-X-Name-Last: Kang
Author-Name: Chul Ahn
Author-X-Name-First: Chul
Author-X-Name-Last: Ahn
Title: Regression coefficient analysis for correlated binomial outcomes
Abstract:
Journal: Journal of Applied Statistics
Pages: 633-634
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120047960
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120047960
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:633-634
Template-Type: ReDIF-Article 1.0
Author-Name: Leann Myers
Author-X-Name-First: Leann
Author-X-Name-Last: Myers
Author-Name: Stephanie Broyles
Author-X-Name-First: Stephanie
Author-X-Name-Last: Broyles
Title: Authors' reply
Abstract:
Journal: Journal of Applied Statistics
Pages: 635-635
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/026647601750235952
File-URL: http://www.tandfonline.com/doi/abs/10.1080/026647601750235952
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:635-635
Template-Type: ReDIF-Article 1.0
Author-Name: Rainer Winkelmann
Author-X-Name-First: Rainer
Author-X-Name-Last: Winkelmann
Title: 'Under-reporting of purchases of port wine': A correction
Abstract:
Journal: Journal of Applied Statistics
Pages: 637-637
Issue: 5
Volume: 28
Year: 2001
X-DOI: 10.1080/026647601750235961
File-URL: http://www.tandfonline.com/doi/abs/10.1080/026647601750235961
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:5:p:637-637
Template-Type: ReDIF-Article 1.0
Author-Name: S. C. Bagui
Author-X-Name-First: S. C.
Author-X-Name-Last: Bagui
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Title: Efficiency balanced designs through reinforcement
Abstract:
In this investigation, general efficiency balanced (GEB) and efficiency
balanced (EB) designs with (v + t) treatments are constructed using (i)
balanced incomplete block (BIB), (ii) symmetrical BIB, (iii) f-resolvable
BIB, (iv) group divisible (GD) and (v) resolvable GD designs, with a
smaller number of replications and smaller block sizes.
Journal: Journal of Applied Statistics
Pages: 649-658
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059192
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059192
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:649-658
Template-Type: ReDIF-Article 1.0
Author-Name: Vicente Cancho
Author-X-Name-First: Vicente
Author-X-Name-Last: Cancho
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: Modeling the presence of immunes by using the exponentiated-Weibull model
Abstract:
In this paper the exponentiated-Weibull model is modified to model the
possibility that long-term survivors are present in the data. The
modification leads to an exponentiated-Weibull mixture model which
encompasses as special cases the exponential and Weibull mixture models
typically used to model such data. Inference for the model parameters is
considered via maximum likelihood and also via Bayesian inference by using
Markov chain Monte Carlo simulation. Model comparison is considered by
using likelihood ratio statistics and also the pseudo Bayes factor, which
can be computed by using the generated samples. An example of a data set
is considered for which the exponentiated-Weibull mixture model presents a
better fit than the Weibull mixture model. Results of simulation studies
are also reported, which show that the likelihood ratio statistic seems
to be somewhat deficient for small and moderate sample sizes.
Journal: Journal of Applied Statistics
Pages: 659-671
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059200
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059200
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:659-671
Template-Type: ReDIF-Article 1.0
Author-Name: Mark Glickman
Author-X-Name-First: Mark
Author-X-Name-Last: Glickman
Title: Dynamic paired comparison models with stochastic variances
Abstract:
In paired comparison experiments, the worth or merit of a unit is
measured through comparisons against other units. When paired comparison
outcomes are collected over time and the merits of the units may be
changing, it is often convenient to assume the data follow a non-linear
state-space model. Typical paired comparison state-space models that
assume a fixed (unknown) autoregressive variance do not account for the
possibility of sudden changes in the merits. This is a particular concern,
for example, in modeling cognitive ability in human development; cognitive
ability not only changes over time, but also can change abruptly. We
explore a particular extension of conventional state-space models for
paired comparison data that allows the state variance to vary
stochastically. Models of this type have recently been developed and
applied to modeling financial data, but can be seen to have applicability
in modeling paired comparison data. A filtering algorithm is also derived
that can be used in place of likelihood-based computations when the number
of objects being compared is large. Applications to National Football
League game outcomes and chess game outcomes are presented.
Journal: Journal of Applied Statistics
Pages: 673-689
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059219
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059219
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:673-689
Template-Type: ReDIF-Article 1.0
Author-Name: Andy Lee
Author-X-Name-First: Andy
Author-X-Name-Last: Lee
Author-Name: John Yick
Author-X-Name-First: John
Author-X-Name-Last: Yick
Author-Name: Yer Van Hui
Author-X-Name-First: Yer
Author-X-Name-Last: Van Hui
Title: Sensitivity of the portmanteau statistic in time series modeling
Abstract:
The portmanteau statistic is commonly used for testing goodness-of-fit of
time series models. However, this lack of fit test may depend on one or
several atypical observations in the series. We investigate the
sensitivity of the portmanteau statistic in the presence of additive
outliers. Diagnostics are developed to assess both local and global
influence. Three practical examples demonstrate the usefulness of the
proposed diagnostics.
Journal: Journal of Applied Statistics
Pages: 691-702
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059228
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059228
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:691-702
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Louzada-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada-Neto
Author-Name: Juan Carlos Pardo-Fernandez
Author-X-Name-First: Juan Carlos
Author-X-Name-Last: Pardo-Fernandez
Title: The effect of reparametrization on accelerated lifetime tests
Abstract:
Efficient reliability industrial experiments consist of submitting items
to accelerated life tests. It is of interest to obtain measures of the
reliability of the devices under the usual working conditions,
represented here by the mean lifetime. A practical problem refers to the
accuracy of interval estimation of the parameter of interest when the
sample size is small or moderate. In this paper, we describe the effect of
several reparametrizations on the accuracy of the interval estimation. We
propose a reparametrization that leads to accuracy while allowing
orthogonality between the parameters. The idea is to consider a
logarithmic reparametrization on orthogonal parameters in order to have
independent maximum likelihood estimates with good asymptotic normal
approximation. The study is illustrated by a data set from an accelerated
life test on pressurized containers of Kevlar/Epoxy 49.
Journal: Journal of Applied Statistics
Pages: 703-711
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059237
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059237
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:703-711
Template-Type: ReDIF-Article 1.0
Author-Name: Terence Mills
Author-X-Name-First: Terence
Author-X-Name-Last: Mills
Title: Business cycle asymmetry and duration dependence: An international perspective
Abstract:
The business cycle behaviour of macroeconomic variables has long been of
interest to economists, and attention has recently focused on two aspects
of this behaviour - the 'stylized facts' of cyclical asymmetry and
duration dependence. Cyclical asymmetry is where the economy behaves
differently over the expansion and recession phases of the business cycle.
Duration dependence, on the other hand, concerns the question of whether,
for example, the probability of a cyclical expansion is dependent on how
long the expansion has been running, or whether business cycle lengths
tend to cluster around a particular duration. Using an international data
set containing annual output per capita for 22 countries, we focus
attention on non-parametric techniques for extracting cyclical components
and for modelling and testing asymmetry and duration dependence. Once
outliers, primarily associated with wars, are omitted, there is little
international evidence of asymmetry. There is considerably more evidence
of duration dependence, which is detected in the majority of countries
using a variety of non-parametric tests. There is thus widespread evidence
against the constant hazard hypothesis that cyclical patterns occur simply
by chance. Business cycle durations do appear to cluster around certain
values, with the average duration being about 3.6 years.
Journal: Journal of Applied Statistics
Pages: 713-724
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059246
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059246
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:713-724
Template-Type: ReDIF-Article 1.0
Author-Name: Reik Oberrath
Author-X-Name-First: Reik
Author-X-Name-Last: Oberrath
Author-Name: Katrin Bohning-Gaese
Author-X-Name-First: Katrin
Author-X-Name-Last: Bohning-Gaese
Title: The Signed Mantel test to cope with autocorrelation in comparative analyses
Abstract:
In biology, medicine and anthropology, scientists try to reveal general
patterns when comparing different sampling units such as biological taxa,
diseases or cultures. A problem of such comparative data is that standard
statistical procedures are often inappropriate due to possible
autocorrelation within the data. Widespread causes of autocorrelation are
a shared geography or phylogeny of the sampling units. To cope with
possible autocorrelations within comparative data, we suggest a new
variant of the Mantel test. The Signed Mantel test evaluates the
relationship between two or more distance matrices and optionally allows
trait variables to be represented as signed distances (calculated as
signed differences or quotients). Considering the sign of distances takes into
account the direction of an effect found in the data. Since different
metrics exist to calculate the distance between two sampling units from
the raw data and because the test results often depend on the kind of
metric used, we suggest validating analysis by comparing the structures of
the raw and the distance data. We offer a computer program that is able to
construct both signed and absolute distance matrices, to perform both
customary and Signed Mantel tests, and to explore raw and distance data
visually.
Journal: Journal of Applied Statistics
Pages: 725-736
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059255
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059255
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:725-736
Template-Type: ReDIF-Article 1.0
Author-Name: Wai-Yin Poon
Author-X-Name-First: Wai-Yin
Author-X-Name-Last: Poon
Author-Name: Man-Lai Tang
Author-X-Name-First: Man-Lai
Author-X-Name-Last: Tang
Title: Influence measure in maximum likelihood estimate for models of lifetime data
Abstract:
We use the local influence approach to develop influence measures for
identifying observations that exert a disproportionate effect on the
maximum likelihood estimate of parameters in models for lifetime data. The
proposed method for developing influence measures can be applied to a wide
variety of models and we use the exponential model to illustrate the
details. In particular, we show that the proposed measure is equivalent to
the martingale residual under the exponential model.
Journal: Journal of Applied Statistics
Pages: 737-742
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059264
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:737-742
Template-Type: ReDIF-Article 1.0
Author-Name: Ralph Mansson
Author-X-Name-First: Ralph
Author-X-Name-Last: Mansson
Author-Name: Philip Prescott
Author-X-Name-First: Philip
Author-X-Name-Last: Prescott
Title: Missing values in replicated Latin squares
Abstract:
Designs based on any number of replicated Latin squares are examined for
their robustness against the loss of up to three observations randomly
scattered throughout the design. The information matrix for the treatment
effects is used to evaluate the average variances of the treatment
differences for each design in terms of the number of missing values and
the size of the design. The resulting average variances are used to assess
the overall robustness of the designs. In general, there are 16 different
situations for the case of three missing values when there are at least
three Latin square replicates in the design. Algebraic expressions may be
determined for all possible configurations, but here the best and worst
cases are given in detail. Numerical illustrations are provided for the
average variances, relative efficiencies, minimum and maximum variances
and the frequency counts, showing the effects of the missing values for a
range of design sizes and levels of replication.
Journal: Journal of Applied Statistics
Pages: 743-757
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059273
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059273
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:743-757
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Wenzel
Author-X-Name-First: Thomas
Author-X-Name-Last: Wenzel
Title: Hits-and-misses for the evaluation and combination of forecasts
Abstract:
Error measures for the evaluation of forecasts are usually based on the
size of the forecast errors. Common measures are, e.g. the mean squared
error (MSE), the mean absolute deviation (MAD) or the mean absolute
percentage error (MAPE). Alternative measures for the comparison of
forecasts are turning points or hits-and-misses, where an indicator loss
function is used to decide if a forecast is of high quality or not. Here,
we discuss the latter to obtain reliable combined forecasts. We apply
several combination techniques to a set of German macroeconomic data.
Furthermore, we perform a small simulation study for the combination of
two biased forecasts.
Journal: Journal of Applied Statistics
Pages: 759-773
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059282
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059282
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:759-773
Template-Type: ReDIF-Article 1.0
Author-Name: Arnold Zellner
Author-X-Name-First: Arnold
Author-X-Name-Last: Zellner
Title: Remarks on a 'critique' of the Bayesian Method of Moments
Abstract:
Journal: Journal of Applied Statistics
Pages: 775-778
Issue: 6
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120059291
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120059291
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:6:p:775-778
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Author-Name: Te-Shiang Cheng
Author-X-Name-First: Te-Shiang
Author-X-Name-Last: Cheng
Author-Name: Chao-Yu Chou
Author-X-Name-First: Chao-Yu
Author-X-Name-Last: Chou
Title: Minimum average fraction inspected for TCSP-1 plan
Abstract:
This paper presents the calculation of the average outgoing quality limit
(AOQL) for the tightened single-level continuous sampling plan (TCSP-1
plan) based on a numerical method. A solution procedure is developed to
find the parameters (i, f, k) that will meet the AOQL requirement, while
also minimizing the average fraction inspected (AFI) for the TCSP-1 plan
when the process average p̄ (> AOQL) is known.
Journal: Journal of Applied Statistics
Pages: 793-799
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074906
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:793-799
Template-Type: ReDIF-Article 1.0
Author-Name: Jeongwen Chiang
Author-X-Name-First: Jeongwen
Author-X-Name-Last: Chiang
Author-Name: Ching-Fan Chung
Author-X-Name-First: Ching-Fan
Author-X-Name-Last: Chung
Author-Name: Emily Cremers
Author-X-Name-First: Emily
Author-X-Name-Last: Cremers
Title: Promotions and the pattern of grocery shopping time
Abstract:
The histograms of interpurchase times for frequently purchased packaged
goods have consistently shown pronounced seven-day cycles. Evidence
indicates that the weekly spike phenomenon is the result of consumers'
regular shopping trip schedules. We explore the implications of this
peculiar regularity for the issue of consumer purchase timing acceleration.
Data for five product categories are examined. Promotions are found to
have little effect in accelerating purchase timing. In contrast,
conventional interpurchase time models are shown to overstate the effect
of promotions.
Journal: Journal of Applied Statistics
Pages: 801-819
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074997
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074997
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:801-819
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Author-Name: S. B. Shrivastava
Author-X-Name-First: S. B.
Author-X-Name-Last: Shrivastava
Title: A class of BIB designs with repeated blocks
Abstract:
Balanced incomplete block design (BIBD) with repeated blocks is studied
in detail. Methods of construction of BIB designs with repeated blocks are
developed so as to distinguish the usual BIBD and BIBD with repeated
blocks. One additional parameter, say d, is considered here, where d
denotes the number of distinct blocks present in the BIB design with
repeated blocks. Further, a class of BIB designs with parameters v = 7, b
= 28, r = 12, k = 3, λ = 4 has been constructed where, out of 15, 14 BIB
designs have repeated blocks. These 15 BIB designs, which have the same
parameters, are compared on the basis of the number of distinct blocks (d)
and the multiplicities of variance of elementary contrasts of the block
effect.
Journal: Journal of Applied Statistics
Pages: 821-833
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074915
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:821-833
Template-Type: ReDIF-Article 1.0
Author-Name: Tsung-Wu Ho
Author-X-Name-First: Tsung-Wu
Author-X-Name-Last: Ho
Title: Finite-sample properties of the bootstrap estimator in a Markov-switching model
Abstract:
The size distortion problem clearly indicates the limits of the
small-sample approximation in the Markov-switching regression model. This
paper shows that the bootstrap procedure can alleviate these effects. Our
Monte Carlo simulation results reveal that the bootstrap approximations
to the distribution of the maximum likelihood estimator can often be good
even when the sample size is small.
Journal: Journal of Applied Statistics
Pages: 835-842
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074924
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074924
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:835-842
Template-Type: ReDIF-Article 1.0
Author-Name: F. Huettmann
Author-X-Name-First: F.
Author-X-Name-Last: Huettmann
Author-Name: A. W. Diamond
Author-X-Name-First: A. W.
Author-X-Name-Last: Diamond
Title: Using PCA scores to classify species communities: An example for pelagic seabird distribution
Abstract:
Using Principal Component Analysis (PCA) in order to classify animal
communities from transect counts is a widely used method. One problem with
this approach is determining an appropriate cut-off point on the Principal
Component (PC) axis to separate communities. We have developed a method
using the distribution of PC scores of individual species along transects
from the PIROP (Programme Integre de Recherches sur les Oiseaux Pelagiques)
database for seabirds at sea in the Northwest Atlantic in winter 1965-
1992. This method can be applied generally to wildlife species, and also
facilitates the evaluation, justification and stratification of PCs and
community classifications in a transparent way. A typical application of
this method is shown for three Principal Components; spatial implications
of the cut-off decision for PCs are also discussed, e.g. for habitat
studies.
Journal: Journal of Applied Statistics
Pages: 843-853
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074933
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074933
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:843-853
Template-Type: ReDIF-Article 1.0
Author-Name: Miguel Garcia-Perez
Author-X-Name-First: Miguel
Author-X-Name-Last: Garcia-Perez
Author-Name: Vicente Nunez-Anton
Author-X-Name-First: Vicente
Author-X-Name-Last: Nunez-Anton
Title: Small-sample comparisons for power-divergence goodness-of-fit statistics for symmetric and skewed simple null hypotheses
Abstract:
Power-divergence goodness-of-fit statistics have asymptotically a
chi-squared distribution. Asymptotic results may not apply in small-sample
situations, and the exact significance of a goodness-of-fit statistic may
potentially be over- or under-stated by the asymptotic distribution.
Several correction terms have been proposed to improve the accuracy of the
asymptotic distribution, but their performance has only been studied for
the equiprobable case. We extend that research to skewed hypotheses.
Results are presented for one-way multinomials involving k = 2 to 6 cells
with sample sizes N = 20, 40, 60, 80 and 100 and nominal test sizes α =
0.1, 0.05, 0.01 and 0.001. Six power-divergence goodness-of-fit statistics
were investigated, and five correction terms were included in the study.
Our results show that skewness itself does not affect the accuracy of the
asymptotic approximation, which depends only on the magnitude of the
smallest expected frequency (whether this comes from a small sample with
the equiprobable hypothesis or a large sample with a skewed hypothesis).
Throughout the conditions of the study, the accuracy of the asymptotic
distribution seems to be optimal for Pearson's X2 statistic (the
power-divergence statistic of index λ = 1) when k > 3 and the
smallest expected frequency is as low as between 0.1 and 1.5 (depending on
the particular k, N and nominal test size), but a computationally
inexpensive improvement can be obtained in these cases by using a
moment-corrected χ2 distribution. If the smallest expected frequency is
even smaller, a normal correction yields accurate tests through the
log-likelihood-ratio statistic G2 (the power-divergence statistic of index
λ = 0).
Journal: Journal of Applied Statistics
Pages: 855-874
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074942
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074942
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:855-874
Template-Type: ReDIF-Article 1.0
Author-Name: Antonio Costa
Author-X-Name-First: Antonio
Author-X-Name-Last: Costa
Author-Name: M. A. Rahim
Author-X-Name-First: M. A.
Author-X-Name-Last: Rahim
Title: Economic design of X̄ charts with variable parameters: The Markov chain approach
Abstract:
This paper presents an economic design of X̄ control charts with variable
sample sizes, variable sampling intervals, and variable control limits.
The sample size n, the sampling interval h, and the control limit
coefficient k vary between minimum and maximum values, tightening or
relaxing the control. The control is relaxed when an X̄ value falls close
to the target and is tightened when an X̄ value falls far from the target.
A cost model is constructed that involves the cost of false alarms, the
cost of finding and eliminating the assignable cause, the cost associated
with production in an out-of-control state, and the cost of sampling and
testing. The assumption of an exponential distribution to describe the
length of time the process remains in control allows the application of
the Markov chain approach for developing the cost function. A
comprehensive study is performed to examine the economic advantages of
varying the X̄ chart parameters.
Journal: Journal of Applied Statistics
Pages: 875-885
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074951
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:875-885
Template-Type: ReDIF-Article 1.0
Author-Name: John Roberts
Author-X-Name-First: John
Author-X-Name-Last: Roberts
Author-Name: Devon Brewer
Author-X-Name-First: Devon
Author-X-Name-Last: Brewer
Title: Measures and tests of heaping in discrete quantitative distributions
Abstract:
Heaping is often found in discrete quantitative data based on subject
responses to open-ended interview questions or observer assessments.
Heaping occurs when subjects or observers prefer some set of numbers as
responses (e.g. multiples of 5) simply because of the features of this
set. Although heaping represents a common type of measurement error,
apparently no prior general measure of heaping exists. We present simple
measures and tests of heaping in discrete quantitative data, illustrate
them with data from an epidemiologic study, and evaluate the bias of these
statistics. These techniques permit formal measurement of heaping and
facilitate comparisons of the degree of heaping in data from different
samples, substantive domains, and data collection methods.
Journal: Journal of Applied Statistics
Pages: 887-896
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074960
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074960
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:887-896
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Rothery
Author-X-Name-First: Peter
Author-X-Name-Last: Rothery
Author-Name: David Roy
Author-X-Name-First: David
Author-X-Name-Last: Roy
Title: Application of generalized additive models to butterfly transect count data
Abstract:
We investigate the use of generalized additive models for describing
patterns in butterfly transect counts during the flight period. Models
were applied to sets of simulated data and to transect counts from the
British Butterfly Monitoring Scheme (BMS) recorded at a large number of
sites in the UK. The models successfully described patterns in counts in a
range of species with different life cycles and the approach can be used
to estimate an index of butterfly abundance allowing for missing counts.
The method could be extended to include other factors such as temperature,
sunshine, windspeed and time of day, and to examine potential biases
arising from variation in these factors.
Journal: Journal of Applied Statistics
Pages: 897-909
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074979
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074979
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:897-909
Template-Type: ReDIF-Article 1.0
Author-Name: Stephen Walker
Author-X-Name-First: Stephen
Author-X-Name-Last: Walker
Author-Name: Christopher Page
Author-X-Name-First: Christopher
Author-X-Name-Last: Page
Title: Generalized ridge regression and a generalization of the CP statistic
Abstract:
We consider a generalization of ridge regression and demonstrate
advantages over ridge regression. We provide an empirical Bayes method for
determining the ridge constants, using the Bayesian interpretation of
ridge estimators, and show that this coincides with a method based on a
generalization of the CP statistic and the non-negative garrote. These
provide an automatic variable selection procedure for the canonical
variables.
Journal: Journal of Applied Statistics
Pages: 911-922
Issue: 7
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120074988
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120074988
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:7:p:911-922
Template-Type: ReDIF-Article 1.0
Author-Name: Vic Barnett
Author-X-Name-First: Vic
Author-X-Name-Last: Barnett
Author-Name: Maria Cecilia Mendes Barreto
Author-X-Name-First: Maria Cecilia Mendes
Author-X-Name-Last: Barreto
Title: Estimators for a Poisson parameter using ranked set sampling
Abstract:
Using ranked set sampling, a viable BLUE estimator is obtained for
estimating the mean of a Poisson distribution. Its properties, such as
efficiency relative to the ranked set sample mean and to the maximum
likelihood estimator, have been calculated for different sample sizes and
values of the Poisson parameter. The estimator (termed the normal modified
r.s.s. estimator) is more efficient than both the ranked set sample mean
and the MLE. It is recommended as a reasonable estimator of the Poisson
mean (μ) to be used in a ranked set sampling environment.
Journal: Journal of Applied Statistics
Pages: 929-941
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076616
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076616
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:929-941
Template-Type: ReDIF-Article 1.0
Author-Name: Soren Bisgaard
Author-X-Name-First: Soren
Author-X-Name-Last: Bisgaard
Author-Name: Murat Kulahci
Author-X-Name-First: Murat
Author-X-Name-Last: Kulahci
Title: Switching-one-column follow-up experiments for Plackett-Burman designs
Abstract:
Industrial experiments are frequently performed sequentially using
two-level fractional factorial designs. In this context, a common strategy
for the design of follow-up experiments is to switch the signs in one
column. It is well known that this strategy, when applied to two-level
fractional factorial resolution III designs, will clear the main effect,
for which the switch was performed, from any confounding with any other
two-factor interactions and will also clear all the two-factor
interactions between that factor and the other main effects from any
confounding with other two-factor interactions. In this article, we extend
this result and show that this strategy applies to any orthogonal
two-level resolution III design and therefore specifically to any
two-level Plackett-Burman design.
Journal: Journal of Applied Statistics
Pages: 943-949
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076625
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:943-949
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Bourke
Author-X-Name-First: Patrick
Author-X-Name-Last: Bourke
Title: The geometric CUSUM chart with sampling inspection for monitoring fraction defective
Abstract:
The detection of an upward shift in the fraction defective of a
repetitive process is considered using the geometric CUSUM. This CUSUM
makes use of the information provided by the run-lengths of non-defective
items between successive defective items, and was initially developed for
the case of 100% inspection. This paper considers the geometric CUSUM
under sampling inspection, and emphasizes that the pattern of sampling
inspection can be quite haphazard without causing any difficulty for the
operation of the CUSUM. Two separate mechanisms for the occurrence of a
shift are considered. Methods for evaluating zero-state and steady-state
ARL are presented for both 100% inspection and sampling inspection.
Parameter choice is also considered, and recommendations made. Comparisons
with some np-charts are provided.
Journal: Journal of Applied Statistics
Pages: 951-972
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076643
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076643
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:951-972
Template-Type: ReDIF-Article 1.0
Author-Name: George Box
Author-X-Name-First: George
Author-X-Name-Last: Box
Author-Name: Ian Hau
Author-X-Name-First: Ian
Author-X-Name-Last: Hau
Title: Experimental designs when there are one or more factor constraints
Abstract:
In response surface methodology, designs of orders one or two are often
needed such that some or all the factor levels satisfy one or more linear
constraints. A method is discussed for obtaining such designs by
projection of a standard design onto the constraint hyperplane. It is
shown that a projected design obtained from a rotatable design is also
rotatable, and for a rotatable design that is also orthogonal (in
particular any orthogonal first-order design) a least squares analysis
carried out on the generating design supplies a least squares solution for
the constrained design subject to the constraints. Some useful properties
of the generating design, such as orthogonal blocking and fractionation
are retained in the projected design. Some second-order mixture designs
generated by two-level factorials are discussed.
Journal: Journal of Applied Statistics
Pages: 973-989
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076652
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076652
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:973-989
Template-Type: ReDIF-Article 1.0
Author-Name: V. R. Prayag
Author-X-Name-First: V. R.
Author-X-Name-Last: Prayag
Author-Name: S. A. Chiplonkar
Author-X-Name-First: S. A.
Author-X-Name-Last: Chiplonkar
Title: A multiple test for comparing two treatments with control: Interval hypotheses approach
Abstract:
In biological experiments, multiple comparison test procedures may lead
to a statistically significant difference in means. However, sometimes the
difference is not worthy of attention considering the inherent variation
in the characteristic. This may be due to the fact that the magnitude of
the change in the characteristic under study after receiving the treatment
is small, less than the natural biological variation. It then becomes the
job of the statistician to design a test that will remove this paradox,
such that the statistical significance will coincide with the biological
one. The present paper develops a multiple comparison test for comparing
two treatments with control by incorporating within-person variation in
forming interval hypotheses. Assuming common variance (unknown) for the
three groups (control and two treatments) and the width of the interval as
intra-individual variation (known), the distribution of the test statistic
is obtained as bivariate non-central t. A level α test procedure is
designed. A table of critical values for carrying out the test is
constructed for α = 0.05. The exact powers are computed for various values
of small sample sizes and parameters. The test is powerful for all values
of the parameters. The test was used to detect differences in zinc
absorption for two cereal diets compared with a control diet. After
application of our test, we arrived at the conclusion of homogeneity of
diets with the control diet. Dunnett's procedure, when applied to the same
data, concluded otherwise. The new test can also be applied to other data
situations in biology, medicine and agriculture.
Journal: Journal of Applied Statistics
Pages: 991-1001
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076661
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076661
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:991-1001
Template-Type: ReDIF-Article 1.0
Author-Name: Hassen Muttlak
Author-X-Name-First: Hassen
Author-X-Name-Last: Muttlak
Title: Regression estimators in extreme and median ranked set samples
Abstract:
The ranked set sampling (RSS) method as suggested by McIntyre (1952) may
be modified to come up with new sampling methods that can be made more
efficient than the usual RSS method. Two such modifications, namely
extreme and median ranked set sampling methods, are considered in this
study. These two methods are generally easier to use in the field and less
prone to problems resulting from errors in ranking. Two regression-type
estimators based on extreme ranked set sampling (ERSS) and median ranked
set sampling (MRSS) for estimating the population mean of the variable of
interest are considered in this study and compared with the
regression-type estimators based on RSS suggested by Yu & Lam (1997). It
turns out that when the variable of interest and the concomitant variable
jointly follow a bivariate normal distribution, the regression-type
estimator of the population mean based on ERSS dominates all other
estimators considered.
Journal: Journal of Applied Statistics
Pages: 1003-1017
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076670
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076670
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:1003-1017
Template-Type: ReDIF-Article 1.0
Author-Name: Key-Il Shin
Author-X-Name-First: Key-Il
Author-X-Name-Last: Shin
Author-Name: Hee-Jeong Kang
Author-X-Name-First: Hee-Jeong
Author-X-Name-Last: Kang
Title: A study on the effect of power transformation in the ARMA(p,q) model
Abstract:
In time series analysis, the Box-Cox power transformation is generally
used for variance stabilization. In this paper we show that the order and
the first step ahead forecast of the transformed model are approximately
invariant to those of the original model under certain assumptions on the
mean and variance. A small Monte Carlo simulation is performed to support
the results.
Journal: Journal of Applied Statistics
Pages: 1019-1028
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076689
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076689
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:1019-1028
Template-Type: ReDIF-Article 1.0
Author-Name: X. M. Tu
Author-X-Name-First: X. M.
Author-X-Name-Last: Tu
Author-Name: J. Kowalski
Author-X-Name-First: J.
Author-X-Name-Last: Kowalski
Author-Name: A. Begley
Author-X-Name-First: A.
Author-X-Name-Last: Begley
Author-Name: P. Houck
Author-X-Name-First: P.
Author-X-Name-Last: Houck
Author-Name: S. Mazumdar
Author-X-Name-First: S.
Author-X-Name-Last: Mazumdar
Author-Name: J. Miewald
Author-X-Name-First: J.
Author-X-Name-Last: Miewald
Author-Name: D. J. Buysse
Author-X-Name-First: D. J.
Author-X-Name-Last: Buysse
Author-Name: D. J. Kupfer
Author-X-Name-First: D. J.
Author-X-Name-Last: Kupfer
Title: Data recycling: A response to the changing technology from the statistical perspective with application to psychiatric sleep research
Abstract:
Rapid technological advances have resulted in continual changes in data
acquisition and reporting processes. While such advances have benefited
research in these areas, the changing technologies have, at the same time,
created difficulty for statistical analysis by generating outdated data
which are incompatible with data based on newer technology. Relationships
between these incompatible variables are complicated; not only are they
stochastic, but they also often depend on other variables, rendering even a
simple statistical analysis, such as estimation of a population mean,
difficult in the presence of mixed data formats. Thus, technological
advancement has brought forth, from the statistical perspective, a
methodological problem of the analysis of newer data with outdated data.
In this paper, we discuss general principles for addressing the
statistical issues related to the analysis of incompatible data. The
approach taken to the task at hand has three desirable properties: it is
readily understood, since it builds upon a linear regression setting; it
is flexible enough to allow for data incompatibility in either the response
or the covariate; and it is not computationally intensive. In addition,
inferences may be made for a latent variable of interest. Our
consideration of this problem is motivated by the analysis of delta wave
counts, as a surrogate for sleep disorder, in the sleep laboratory of the
Department of Psychiatry, University of Pittsburgh Medical Center, where
two major changes had occurred in the acquisition of this data, resulting
in three mixed formats. By developing appropriate methods for addressing
this issue, we provide statistical advancement that is compatible with
technological advancement.
Journal: Journal of Applied Statistics
Pages: 1029-1049
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076698
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076698
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:1029-1049
Template-Type: ReDIF-Article 1.0
Author-Name: Samia Adham
Author-X-Name-First: Samia
Author-X-Name-Last: Adham
Author-Name: Stephen Walker
Author-X-Name-First: Stephen
Author-X-Name-Last: Walker
Title: A multivariate Gompertz-type distribution
Abstract:
The Gompertz distribution has many applications, particularly in medical
and actuarial studies. However, there has been little recent work on the
Gompertz in comparison with its early investigation. The problem of
finding and analysing a bivariate (or multivariate) Gompertz distribution
is of interest and the focus of this paper. A search of the literature
suggests there is currently no multivariate or even useful bivariate
Gompertz distribution.
Journal: Journal of Applied Statistics
Pages: 1051-1065
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076706
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:1051-1065
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Yu Wong
Author-X-Name-First: Man-Yu
Author-X-Name-Last: Wong
Author-Name: Shuanglin Zhang
Author-X-Name-First: Shuanglin
Author-X-Name-Last: Zhang
Title: Degrees of freedom and the likelihood ratio test for the generalized Behrens-Fisher problem
Abstract:
Several methods for testing the difference between two group means of k
independent populations are compared. Simulation shows that the likelihood
ratio test with the Bartlett correction factor and the t test with
appropriate degrees of freedom perform better, particularly when the
sample size is small. However, the latter is very good for all
configurations.
Journal: Journal of Applied Statistics
Pages: 1067-1074
Issue: 8
Volume: 28
Year: 2001
X-DOI: 10.1080/02664760120076715
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120076715
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:28:y:2001:i:8:p:1067-1074
Template-Type: ReDIF-Article 1.0
Author-Name: Emmanuelle Cam
Author-X-Name-First: Emmanuelle
Author-X-Name-Last: Cam
Author-Name: Bernard Cadiou
Author-X-Name-First: Bernard
Author-X-Name-Last: Cadiou
Author-Name: James Hines
Author-X-Name-First: James
Author-X-Name-Last: Hines
Author-Name: Jean Yves Monnat
Author-X-Name-First: Jean Yves
Author-X-Name-Last: Monnat
Title: Influence of behavioural tactics on recruitment and reproductive trajectory in the kittiwake
Abstract:
Many studies have provided evidence that, in birds, inexperienced
breeders have a lower probability of breeding successfully. This is often
explained by lack of skills and knowledge, and sometimes late laying dates
in the first breeding attempt. There is growing evidence that in many
species with deferred reproduction, some prebreeders attend breeding
places, acquire territories and form pairs. Several behavioural tactics
assumed to be associated with territory acquisition have been described in
different species. These tactics may influence the probability of
recruiting in the breeding segment of the population, age of first
breeding, and reproductive success in the first breeding attempt. Here we
addressed the influence of behaviour ('squatting') during the prebreeding
period on demographic parameters (survival and recruitment probability) in
a long-lived colonial seabird species: the kittiwake. We also investigated
the influence of behaviour on reproductive trajectory. Squatters have a
higher survival and recruitment probability, and a higher probability of
breeding successfully in the first breeding attempt in all age-classes
where this category is represented. The influence of behaviour is mainly
expressed in the first reproduction. However, there is a relationship
between breeding success in the first occasion and subsequent occasions.
The influence of breeding success in the first breeding attempt on the
rest of the trajectory may indirectly reflect the influence of behaviour
on breeding success in the first occasion. The shape of the reproductive
trajectory is influenced by behaviour and age of first breeding. There is
substantial individual variation from the mean reproductive trajectory,
which is accounted for by heterogeneity in performance among individuals
in the first attempt, but there is no evidence of individual heterogeneity
in the rate of change over time in performance in subsequent breeding
occasions.
Journal: Journal of Applied Statistics
Pages: 163-185
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108502
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108502
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:163-185
Template-Type: ReDIF-Article 1.0
Author-Name: William Link
Author-X-Name-First: William
Author-X-Name-Last: Link
Author-Name: Evan Cooch
Author-X-Name-First: Evan
Author-X-Name-Last: Cooch
Author-Name: Emmanuelle Cam
Author-X-Name-First: Emmanuelle
Author-X-Name-Last: Cam
Title: Model-based estimation of individual fitness
Abstract:
Fitness is the currency of natural selection, a measure of the
propagation rate of genotypes into future generations. Its various
definitions have the common feature that they are functions of survival
and fertility rates. At the individual level, the operative level for
natural selection, these rates must be understood as latent features,
genetically determined propensities existing at birth. This conception of
rates requires that individual fitness be defined and estimated by
consideration of the individual in a modelled relation to a group of
similar individuals; the only alternative is to consider a sample of size
one, unless a clone of identical individuals is available. We present
hierarchical models describing individual heterogeneity in survival and
fertility rates and allowing for associations between these rates at the
individual level. We apply these models to an analysis of life histories
of Kittiwakes (Rissa tridactyla) observed at several colonies on the
Brittany coast of France. We compare Bayesian estimation of the population
distribution of individual fitness with estimation based on treating
individual life histories in isolation, as samples of size one (e.g.
McGraw & Caswell, 1996).
Journal: Journal of Applied Statistics
Pages: 207-224
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108700a
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108700a
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:207-224
Template-Type: ReDIF-Article 1.0
Author-Name: Kenneth Burnham
Author-X-Name-First: Kenneth
Author-X-Name-Last: Burnham
Author-Name: Gary White
Author-X-Name-First: Gary
Author-X-Name-Last: White
Title: Evaluation of some random effects methodology applicable to bird ringing data
Abstract:
Existing models for ring recovery and recapture data analysis treat
temporal variations in annual survival probability (S) as fixed effects.
Often there is no explainable structure to the temporal variation in
S1, …, Sk; random effects can then be a useful model: Si = E(S) + εi.
Here, the temporal variation in survival probability is treated as
random, with process variance E(εi²) = σ². This random effects
model can now be fitted in program MARK. Resultant inferences include
point and interval estimation for the process variation, σ², and
estimation of E(S) and var(E(S)), where the latter includes a component
for σ² as well as the traditional component for var(S̄|S). Furthermore,
the random effects model leads to shrinkage estimates, S̃i, as improved
(in mean squared error) estimators of Si compared with the MLE, Ŝi, from
the unrestricted time-effects model. Appropriate confidence intervals
based on the S̃i are also provided. In addition, AIC has been generalized
to random effects models. This paper presents results of a Monte Carlo
evaluation of inference performance under the simple random effects
model. Examined by simulation, under the simple one-group
Cormack-Jolly-Seber (CJS) model, are issues such as bias of the estimator
of σ², confidence interval coverage on σ², coverage and mean squared
error comparisons for inference about Si based on shrinkage versus
maximum likelihood estimators, and performance of AIC model selection
over three models: Si = S (no effects), Si = E(S) + εi (random effects),
and S1, …, Sk (fixed effects). For the cases simulated, the random
effects methods performed well and were uniformly better than the fixed
effects MLE for the Si.
Journal: Journal of Applied Statistics
Pages: 245-264
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108755
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108755
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:245-264
Template-Type: ReDIF-Article 1.0
Author-Name: Ian Nisbet
Author-X-Name-First: Ian
Author-X-Name-Last: Nisbet
Author-Name: Emmanuelle Cam
Author-X-Name-First: Emmanuelle
Author-X-Name-Last: Cam
Title: Test for age-specificity in survival of the common tern
Abstract:
Much effort in life-history theory has been addressed to the dependence
of life-history traits on age, especially the phenomenon of senescence and
its evolution. Although senescent declines in survival are well documented
in humans and in domestic and laboratory animals, evidence for their
occurrence and importance in wild animal species remains limited and
equivocal. Several recent papers have suggested that methodological issues
may contribute to this problem, and have encouraged investigators to
improve sampling designs and to analyse their data using recently
developed approaches to modelling of capture-mark-recapture data. Here we
report on a three-year, two-site, mark-recapture study of known-aged
common terns (Sterna hirundo) in the north-eastern USA. The study was
nested within a long-term ecological study in which large numbers of
chicks had been banded in each year for > 25 years. We used a range
of models to test the hypothesis of an influence of age on survival
probability. We also tested for a possible influence of sex on survival.
The cross-sectional design of the study (one year's parameter estimates)
avoided the possible confounding of effects of age and time. The study was
conducted at a time when one of the study sites was being colonized and
numbers were increasing rapidly. We detected two-way movements between the
sites and estimated movement probabilities in the year for which they
could be modelled. We also obtained limited data on emigration from our
study area to more distant sites. We found no evidence that survival
depended on either sex or age, except that survival was lower among the
youngest birds (ages 2-3 years). Despite the large number of birds
included in the study (1599 known-aged birds, 2367 total), confidence
limits on estimates of survival probability were wide, especially for the
oldest age-classes, so that a slight decline in survival late in life
could not have been detected. In addition, the cross-sectional design of
this study meant that a decline in survival probability within individuals
(actuarial senescence) could have been masked by heterogeneity in survival
probability among individuals (mortality selection). This emphasizes the
need for the development of modelling tools permitting separation of these
two phenomena, valid under field conditions in which the recapture
probabilities are less than one.
Journal: Journal of Applied Statistics
Pages: 65-83
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108467
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:65-83
Template-Type: ReDIF-Article 1.0
Author-Name: M. Dolores Ugarte
Author-X-Name-First: M. Dolores
Author-X-Name-Last: Ugarte
Title: Book Review
Abstract:
Journal: Journal of Applied Statistics
Pages: 669-669
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108881
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108881
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:669-669
Template-Type: ReDIF-Article 1.0
Author-Name: Victoria Dreitz
Author-X-Name-First: Victoria
Author-X-Name-Last: Dreitz
Author-Name: James Nichols
Author-X-Name-First: James
Author-X-Name-Last: Nichols
Author-Name: James Hines
Author-X-Name-First: James
Author-X-Name-Last: Hines
Author-Name: Robert Bennetts
Author-X-Name-First: Robert
Author-X-Name-Last: Bennetts
Author-Name: Wiley Kitchens
Author-X-Name-First: Wiley
Author-X-Name-Last: Kitchens
Author-Name: Donald Deangelis
Author-X-Name-First: Donald
Author-X-Name-Last: Deangelis
Title: The use of resighting data to estimate the rate of population growth of the snail kite in Florida
Abstract:
The rate of population growth (λ) is an important demographic parameter
used to assess the viability of a population and to develop management and
conservation agendas. We examined the use of resighting data to estimate λ
for the snail kite population in Florida from 1997-2000. The analyses
consisted of (1) a robust design approach that derives an estimate of λ
from estimates of population size and (2) the Pradel (1996) temporal
symmetry (TSM) approach that directly estimates λ using an open-population
capture-recapture model. Besides resighting data, both approaches required
information on the number of unmarked individuals that were sighted during
the sampling periods. The point estimates of λ differed between the robust
design and TSM approaches, but the 95% confidence intervals overlapped
substantially. We believe the differences may be the result of sparse data
and do not indicate the inappropriateness of either modelling technique.
We focused on the results of the robust design because this approach
provided estimates for all study years. Variation among these estimates
was smaller than levels of variation among ad hoc estimates based on
previously reported index statistics. We recommend that λ for snail kites
be estimated using capture-resighting methods rather than ad hoc counts.
Journal: Journal of Applied Statistics
Pages: 609-623
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108854
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108854
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:609-623
Template-Type: ReDIF-Article 1.0
Author-Name: David Otis
Author-X-Name-First: David
Author-X-Name-Last: Otis
Author-Name: Gary White
Author-X-Name-First: Gary
Author-X-Name-Last: White
Title: Re-analysis of a banding study to test the effects of an experimental increase in bag limits of mourning doves
Abstract:
In 1966-1971, eastern US states with hunting seasons on mourning doves
(Zenaida macroura) participated in a study designed to estimate the
effects of bag limit increases on population survival rates. More than 400
000 adult and juvenile birds were banded and released during this period,
and subsequent harvest and return of bands, together with total harvest
estimates from mail and telephone surveys of hunters, provided the
database for analysis. The original analysis used an ANOVA framework, and
resulted in inferences of no effect of bag limit increase on population
parameters (Hayne 1975). We used a logistic regression analysis to infer
that the bag limit increase did not cause a biologically significant
increase in harvest rate and thus the experiment could not provide any
insight into the relationship between harvest and annual survival rates.
Harvest rate estimates of breeding populations from geographical
subregions were used as covariates in a Program MARK analysis and revealed
an association between annual survival and harvest rates, although this
relationship is potentially confounded by a latitudinal gradient in
survival rates of dove populations. We discuss methodological problems
encountered in the analysis of these data, and provide recommendations for
future studies of the relationship between harvest and annual survival
rates of mourning dove populations.
Journal: Journal of Applied Statistics
Pages: 479-495
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108539
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108539
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:479-495
Template-Type: ReDIF-Article 1.0
Author-Name: Jean Clobert
Author-X-Name-First: Jean
Author-X-Name-Last: Clobert
Title: Capture-recapture and evolutionary ecology: Further comments
Abstract:
Journal: Journal of Applied Statistics
Pages: 53-56
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108773
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108773
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:53-56
Template-Type: ReDIF-Article 1.0
Author-Name: George Seber
Author-X-Name-First: George
Author-X-Name-Last: Seber
Author-Name: Carl Schwarz
Author-X-Name-First: Carl
Author-X-Name-Last: Schwarz
Title: Capture-recapture: Before and after EURING 2000
Abstract:
Capture-recapture studies and analyses have become an important tool for
the study of bird populations. One reason for the rapid advancement in
this area has been the EURING conferences where population biologists and
statisticians meet to review recent progress, identify areas that require
further work, and collaborate to solve real-world problems. In this
paper, we forecast the needs for future research in this area and review
the recent conference to try to identify which questions remain unsolved.
This EURING conference was dedicated to Dr George Seber, who was the author
of a number of key papers and whose name is synonymous with 'The
Estimation of Animal Abundance and Related Parameters' (Seber, 1982). He
has retired from working in this field.
Journal: Journal of Applied Statistics
Pages: 5-18
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108700
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108700
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:5-18
Template-Type: ReDIF-Article 1.0
Author-Name: Anne Loison
Author-X-Name-First: Anne
Author-X-Name-Last: Loison
Author-Name: Bernt-Erik Sæther
Author-X-Name-First: Bernt-Erik
Author-X-Name-Last: Sæther
Author-Name: Kurt Jerstad
Author-X-Name-First: Kurt
Author-X-Name-Last: Jerstad
Author-Name: Ole Wiggo Røstad
Author-X-Name-First: Ole Wiggo
Author-X-Name-Last: Røstad
Title: Disentangling the sources of variation in the survival of the European dipper
Abstract:
The population growth rate of the European dipper has been shown to
decrease with winter temperature and population size. We examine here the
demographic mechanism for this effect by analysing how these factors
affect the survival rate. Using more than 20 years of
capture-mark-recapture data (1974-1997) based on more than 4000 marked
individuals, we perform analyses using open capture-mark-recapture models.
This allowed us to estimate the annual apparent survival rates
(probability of surviving and staying on the study site from one year to
the next one) and the recapture probabilities. We partitioned the variance
of the apparent survival rates into sampling variance and process variance
using random effects models, and investigated which variables best
accounted for temporal process variation. Adult males and females had
similar apparent survival rates, with an average of 0.52 and a coefficient
of variation of 40%. Chick apparent survival was lower, averaging 0.06
with a coefficient of variation of 42%. Eighty percent of the variance in
apparent survival rates was explained by winter temperature and population
size for adults and 48% by winter temperature for chicks. The process
variance outweighed the sampling variance both for chick and adult
survival rates, which explains why the shrinkage estimates obtained under
random effects models were close to the maximum likelihood estimates. A
large proportion of
the annual variation in the apparent survival rate of chicks appears to be
explained by inter-year differences in dispersal rates.
Journal: Journal of Applied Statistics
Pages: 289-304
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108665
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108665
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:289-304
Template-Type: ReDIF-Article 1.0
Author-Name: J. D. Lebreton
Author-X-Name-First: J. D.
Author-X-Name-Last: Lebreton
Author-Name: R. Pradel
Author-X-Name-First: R.
Author-X-Name-Last: Pradel
Title: Multistate recapture models: Modelling incomplete individual histories
Abstract:
Multistate capture-recapture models are a natural generalization of the
usual one-site recapture models. Similarly, individuals are sampled on
discrete occasions, at which they may be captured or not. However,
contrary to the one-site case, the individuals can move within a finite
set of states between occasions. The growing interest in spatial aspects
of population dynamics presently contributes to making multistate models a
very promising tool for population biology. We review first the interest
and the potential of multistate models, in particular when they are used
with individual states as well as geographical sites. Multistate models
indeed constitute canonical capture-recapture models for individual
categorical covariates changing over time, and can be linked to
longitudinal studies with missing data and models such as hidden Markov
chains. Multistate models also provide a promising tool for handling
heterogeneity of capture, provided states related to capturability can be
defined and used. Such an approach could be relevant for population size
estimation in closed populations. Multistate models also constitute a
natural framework for mixtures of information in individual history data.
Presently, most models can be fit using program MARK. As an example, we
present a canonical model for multisite accession to reproduction, which
fully generalizes a classical one-site model. In the generalization
proposed, one can estimate simultaneously age-dependent rates of accession
to reproduction, natal and breeding dispersal. Finally, we discuss further
generalizations - such as a multistate generalization of growth rate
models and models for data where the state in which an individual is
detected is known with uncertainty - and prospects for software
development.
Journal: Journal of Applied Statistics
Pages: 353-369
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108638
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:353-369
Template-Type: ReDIF-Article 1.0
Author-Name: James Hines
Author-X-Name-First: James
Author-X-Name-Last: Hines
Author-Name: James Nichols
Author-X-Name-First: James
Author-X-Name-Last: Nichols
Title: Investigations of potential bias in the estimation of λ using Pradel's (1996) model for capture-recapture data
Abstract:
Pradel's (1996) temporal symmetry model permitting direct estimation and
modelling of population growth rate, λi, provides a potentially useful
tool for the study of population dynamics using marked animals. Because of
its recent publication date, the approach has not seen much use, and there
have been virtually no investigations directed at robustness of the
resulting estimators. Here we consider several potential sources of bias,
all motivated by specific uses of this estimation approach. We consider
sampling situations in which the study area expands with time and present
an analytic expression for the bias in λi. We next consider trap response
in capture probabilities and heterogeneous capture probabilities and
compute large-sample and simulation-based approximations of the resulting
bias in λi. These approximations indicate that trap response is an
especially important assumption violation that can produce substantial
bias. Finally, we consider losses on capture and emphasize the importance
of selecting the estimator for λi that is appropriate to the question
being addressed. For studies based on only sighting and resighting data,
Pradel's (1996) λi′ is the appropriate estimator.
Journal: Journal of Applied Statistics
Pages: 573-587
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108872
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108872
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:573-587
Template-Type: ReDIF-Article 1.0
Author-Name: James Nichols
Author-X-Name-First: James
Author-X-Name-Last: Nichols
Title: Discussion comments on: 'Occam's shadow: Levels of analysis in evolutionary ecology-- where to next?' by Cooch, Cam and Link
Abstract:
Journal: Journal of Applied Statistics
Pages: 49-52
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108449
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:49-52
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Conroy
Author-X-Name-First: Michael
Author-X-Name-Last: Conroy
Author-Name: Juan Carlos Senar
Author-X-Name-First: Juan Carlos
Author-X-Name-Last: Senar
Author-Name: Jordi Domenech
Author-X-Name-First: Jordi
Author-X-Name-Last: Domenech
Title: Analysis of individual- and time-specific covariate effects on survival of Serinus serinus in north-eastern Spain
Abstract:
We developed models for the analysis of recapture data for 2678 serins (
Serinus serinus ) ringed in north-eastern Spain since 1985. We
investigated several time- and individual-specific factors as potential
predictors of overall mortality and dispersal patterns, and of gender and
age differences in these patterns. Time-specific covariates included
minimum daily temperature, days below freezing, and abundance of a strong
competitor, siskins ( Carduelis spinus ) during winter, and maximum
temperature and rainfall during summer. Individual covariates included
body mass (i.e. body condition), and wing length (i.e. flying ability),
and interactions between body mass and environmental factors. We found
little support for a predictive relationship between environmental factors
and survival, but good evidence of relationships between body mass and
survival, especially for juveniles. Juvenile survival appears to vary in a
curvilinear manner with increasing mass, suggesting that there may exist
an optimal mass beyond which increases are detrimental. The mass-survival
relationship does seem to be influenced by at least one environmental
factor, namely the abundance of wintering siskins. When siskins are
abundant, increases in body mass appear to relate strongly to increasing
survival. When siskin numbers are average or low the relationship is
largely reversed, suggesting that the presence of strong competition
mitigates the otherwise largely negative aspects of greater body mass.
Wing length in juveniles also appears to be related positively to
survival, perhaps largely due to the influence of a few unusually large
juveniles with adult-like survival. Further work is needed to test these
relationships, ideally under experimentation.
Journal: Journal of Applied Statistics
Pages: 125-142
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108674
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:125-142
Template-Type: ReDIF-Article 1.0
Author-Name: Richard Barker
Author-X-Name-First: Richard
Author-X-Name-Last: Barker
Author-Name: David Fletcher
Author-X-Name-First: David
Author-X-Name-Last: Fletcher
Author-Name: Paul Scofield
Author-X-Name-First: Paul
Author-X-Name-Last: Scofield
Title: Measuring density dependence in survival from mark-recapture data
Abstract:
We discuss the analysis of mark-recapture data when the aim is to
quantify density dependence between survival rate and abundance. We
describe an analysis for a random effects model that includes a linear
relationship between abundance and survival using an errors-in-variables
regression estimator with analytical adjustment for approximate bias. The
analysis is illustrated using data from short-tailed shearwaters banded
for 48 consecutive years at Fisher Island, Tasmania, and Hutton's
shearwater banded at Kaikoura, New Zealand for nine consecutive years. The
Fisher Island data provided no evidence of a density-dependent
relationship between abundance and survival, and confidence interval
widths rule out anything but small density-dependent effects. The Hutton's
shearwater data were equivocal, with the analysis unable to rule out
anything but a very strong density-dependent relationship between survival
and abundance.
Journal: Journal of Applied Statistics
Pages: 305-313
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108782
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108782
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:305-313
Template-Type: ReDIF-Article 1.0
Author-Name: Jeffrey Spendelow
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Spendelow
Author-Name: James Nichols
Author-X-Name-First: James
Author-X-Name-Last: Nichols
Author-Name: James Hines
Author-X-Name-First: James
Author-X-Name-Last: Hines
Author-Name: Jean-Dominique Lebreton
Author-X-Name-First: Jean-Dominique
Author-X-Name-Last: Lebreton
Author-Name: Roger Pradel
Author-X-Name-First: Roger
Author-X-Name-Last: Pradel
Title: Modelling postfledging survival and age-specific breeding probabilities in species with delayed maturity: A case study of Roseate Terns at Falkner Island, Connecticut
Abstract:
We modelled postfledging survival and age-specific breeding probabilities
in endangered Roseate Terns ( Sterna dougallii ) at Falkner Island,
Connecticut, USA using capture-recapture data from 1988-1998 of birds
ringed as chicks and as adults. While no individuals bred as 2-year-olds
during this period, about three-quarters of the young that survived and
returned as 3-year-olds nested, and virtually all surviving birds had
begun breeding by the time they reached 5 years of age. We found no
evidence of temporal variation in the age of first breeding of birds from
different cohorts. There was significant temporal variation in the annual
survival of adults and the survival over the typical 3-year maturation
period of prebreeding birds, with extremely low values for both groups
from the 1991 breeding season. The estimated overwinter survival rate
(0.62) for adults from 1991-1992 was about three-quarters the usual rate
of about 0.83, but the low survival of fledglings from 1991 resulted in
less than 25% of the otherwise expected number of young from that cohort
returning as breeding birds; this suggests that fledglings suffered a
greater proportional decrease in survival than did adults. The survival
estimates of young from 1989 and 1990 show that these cohorts were not
negatively influenced by the events that decimated the young from 1991,
and the young from 1992 and 1993 had above-average survival estimates. The
apparent decrease since 1996 in development of fidelity of new recruits to
this site is suspected to be due mainly to nocturnal disturbance and
predation of chicks causing low productivity.
Journal: Journal of Applied Statistics
Pages: 385-405
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108764
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108764
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:385-405
Template-Type: ReDIF-Article 1.0
Author-Name: J. Andrew Royle
Author-X-Name-First: J. Andrew
Author-X-Name-Last: Royle
Author-Name: William Link
Author-X-Name-First: William
Author-X-Name-Last: Link
Title: Random effects and shrinkage estimation in capture-recapture models
Abstract:
We discuss the analysis of random effects in capture-recapture models,
and outline Bayesian and frequentist approaches to their analysis. Under
a normal model, random effects estimators derived from Bayesian or
frequentist considerations have a common form as shrinkage estimators. We
discuss some of the difficulties of analysing random effects using
traditional methods, and argue that a Bayesian formulation provides a
rigorous framework for dealing with these difficulties. In
capture-recapture models, random effects may provide a parsimonious
compromise between constant and completely time-dependent models for the
parameters (e.g. survival probability). We consider application of random
effects to band-recovery models, although the principles apply to more
general situations, such as Cormack-Jolly-Seber models. We illustrate
these ideas using a commonly analysed band recovery data set.
Journal: Journal of Applied Statistics
Pages: 329-351
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108746
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108746
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:329-351
Template-Type: ReDIF-Article 1.0
Author-Name: E. A. Catchpole
Author-X-Name-First: E. A.
Author-X-Name-Last: Catchpole
Author-Name: B. J. T. Morgan
Author-X-Name-First: B. J. T.
Author-X-Name-Last: Morgan
Author-Name: A. Viallefont
Author-X-Name-First: A.
Author-X-Name-Last: Viallefont
Title: Solving problems in parameter redundancy using computer algebra
Abstract:
A model, involving a particular set of parameters, is said to be
parameter redundant when the likelihood can be expressed in terms of a
smaller set of parameters. In many important cases, the parameter
redundancy of a model can be checked by evaluating the symbolic rank of a
derivative matrix. We describe the main results, and show how to construct
this matrix using the symbolic algebra package Maple. We apply the theory
to examples from the mark-recapture field. General code is given which can
be applied to other models.
Journal: Journal of Applied Statistics
Pages: 625-636
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108601
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108601
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:625-636
Template-Type: ReDIF-Article 1.0
Author-Name: Evan Cooch
Author-X-Name-First: Evan
Author-X-Name-Last: Cooch
Author-Name: Emmanuelle Cam
Author-X-Name-First: Emmanuelle
Author-X-Name-Last: Cam
Author-Name: William Link
Author-X-Name-First: William
Author-X-Name-Last: Link
Title: Occam's shadow: Levels of analysis in evolutionary ecology--where to next?
Abstract:
Evolutionary ecology is the study of evolutionary processes, and the
ecological conditions that influence them. A fundamental paradigm
underlying the study of evolution is natural selection. Although there are
a variety of operational definitions for natural selection in the
literature, perhaps the most general one is that which characterizes
selection as the process whereby heritable variation in fitness associated
with variation in one or more phenotypic traits leads to intergenerational
change in the frequency distribution of those traits. The past 20 years
have witnessed a marked increase in the precision and reliability of our
ability to estimate one or more components of fitness and characterize
natural selection in wild populations, owing particularly to significant
advances in methods for analysis of data from marked individuals. In this
paper, we focus on several issues that we believe are important
considerations for the application and development of these methods in the
context of addressing questions in evolutionary ecology. First, our
traditional approach to estimation often rests upon analysis of aggregates
of individuals, which in the wild may reflect increasingly non-random
(selected) samples with respect to the trait(s) of interest. In some
cases, analysis at the aggregate level, rather than the individual level,
may obscure important patterns. While there are a growing number of
analytical tools available to estimate parameters at the individual level,
and which can cope (to varying degrees) with progressive selection of the
sample, the advent of new methods does not reduce the need to consider
carefully the appropriate level of analysis in the first place. Estimation
should be motivated a priori by strong theoretical analysis. Doing so
provides clear guidance, in terms of both (i) assisting in the
identification of realistic and meaningful models to include in the
candidate model set, and (ii) providing the appropriate context under
which the results are interpreted. Second, while it is true that selection
(as defined) operates at the level of the individual, the selection
gradient is often (if not generally) conditional on the abundance of the
population. As such, it may be important to consider estimating transition
rates conditional on both the parameter values of the other individuals in
the population (or at least their distribution), and population abundance.
This will undoubtedly pose a considerable challenge, for both single- and
multi-strata applications. It will also require renewed consideration of
the estimation of abundance, especially for open populations. Third,
selection typically operates on dynamic, individually varying traits. Such
estimation may require characterizing fitness in terms of individual
plasticity in one or more state variables, constituting analysis of the
norms of reaction of individuals to variable environments. This can be
quite complex, especially for traits that are under facultative control.
Recent work has indicated that the pattern of selection on such traits is
conditional on the relative rates of movement among and frequency of
spatially heterogeneous habitats, suggesting analyses of evolution of life
histories in open populations can be misleading in some cases.
Journal: Journal of Applied Statistics
Pages: 19-48
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108421
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108421
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:19-48
Template-Type: ReDIF-Article 1.0
Author-Name: Brett Sandercock
Author-X-Name-First: Brett
Author-X-Name-Last: Sandercock
Author-Name: Steven Beissinger
Author-X-Name-First: Steven
Author-X-Name-Last: Beissinger
Title: Estimating rates of population change for a neotropical parrot with ratio, mark-recapture and matrix methods
Abstract:
Robust methods for estimating rates of population change ( λ ) are
necessary for applied and theoretical goals in conservation and
evolutionary biology. Traditionally, λ has been calculated from either
ratios of population counts (observed λ, or λ obs ), or population models
based on projection matrices (asymptotic λ, or λ asy ). New mark-recapture
methods permit calculation of λ from mark-resighting information alone
(realized λ, or λ rea ), but empirical comparisons with other methods are
rare. In this paper, rates of population change were calculated for a
population of green-rumped parrotlets ( Forpus passerinus ) that has been
studied for more than a decade in central Venezuela. First, a ratio method
based on counts of detected birds was used to calculate λ obs . Next, a
temporal symmetry method based on mark-recapture data (i.e. the λ
-parameterization introduced by Pradel, 1996) was used to calculate λ rea
. Finally, a stage-structured matrix model based on state-specific
estimates of fecundity, immigration, local survival, and transition rates
was used to calculate λ asy . Analyses were conducted separately for
females and males. Overall values of λ from the three methods were
consistent, and all indicated that the finite rate of population change
was not significantly different from 1. Annual values of λ from the three
methods were also in general agreement for a majority of years. However, λ
rea from the temporal symmetry method had the greatest precision, and
apparently better accuracy, than λ asy . Unrealistic annual values of λ
asy could have been due to poor estimates of the transitional probability
of becoming a breeder, or to a mismatch between the actual and the
asymptotic stable stage distribution. In this study, the trade-off between
biological realism and accuracy was better met by the temporal symmetry
method than the matrix method. Our results suggest that the temporal
symmetry models can be applied with confidence to populations where less
information may be available.
Journal: Journal of Applied Statistics
Pages: 589-607
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108818
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108818
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:589-607
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas Johnson
Author-X-Name-First: Douglas
Author-X-Name-Last: Johnson
Title: Discussion comments on 'Evaluation of some random effects methodology applicable to bird ringing data' by Burnham & White
Abstract:
Journal: Journal of Applied Statistics
Pages: 265-266
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108728
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108728
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:265-266
Template-Type: ReDIF-Article 1.0
Author-Name: Mijeom Joe
Author-X-Name-First: Mijeom
Author-X-Name-Last: Joe
Author-Name: Kenneth H. Pollock
Author-X-Name-First: Kenneth H.
Author-X-Name-Last: Pollock
Title: Separation of survival and movement rates in multi-state tag-return and capture-recapture models
Abstract:
There has been growing interest in the estimation of transition
probabilities among stages (Hestbeck et al. , 1991; Brownie et al. , 1993;
Schwarz et al. , 1993) in tag-return and capture-recapture models. This
has been driven by the increasing interest in meta-population models in
ecology and the need for parameter estimates to use in these models. These
transition probabilities are composed of survival and movement rates,
which can only be estimated separately when an additional assumption is
made (Brownie et al. , 1993). Brownie et al. (1993) assumed that movement
occurs at the end of the interval between time i and i + 1. We generalize
this work to allow different movement patterns in the interval for
multiple tag-recovery and capture-recapture experiments. The time of
movement is a random variable with a known distribution. The model
formulations can be viewed as matrix extensions to the model formulations
of single open population capture-recapture and tag-recovery experiments
(Jolly, 1965; Seber, 1965; Brownie et al. , 1985). We also present the
results of a small simulation study for the tag-return model when movement
time follows a beta distribution, and later another simulation study for
the capture-recapture model when movement time follows a uniform
distribution. The simulation studies use a modified program SURVIV (White,
1983). The Relative Standard Errors (RSEs) of estimates according to high
and low movement rates are presented. We show there are strong
correlations between movement and survival estimates in the case that the
movement rate is high. We also show that estimators of movement rates to
different areas and estimators of survival rates in different areas have
substantial correlations.
Journal: Journal of Applied Statistics
Pages: 373-384
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108836
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108836
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:373-384
Template-Type: ReDIF-Article 1.0
Author-Name: Marlina Nasution
Author-X-Name-First: Marlina
Author-X-Name-Last: Nasution
Author-Name: Cavell Brownie
Author-X-Name-First: Cavell
Author-X-Name-Last: Brownie
Author-Name: Kenneth Pollock
Author-X-Name-First: Kenneth
Author-X-Name-Last: Pollock
Title: Optimal allocation of sample sizes between regular banding and radio-tagging for estimating annual survival and emigration rates
Abstract:
Many authors have shown that a combined analysis of data from two or more
types of recapture survey brings advantages, such as the ability to
provide more information about parameters of interest. For example, a
combined analysis of annual resighting and monthly radio-telemetry data
allows separate estimates of true survival and emigration rates, whereas
only apparent survival can be estimated from the resighting data alone.
For studies involving more than one type of survey, biologists should
consider how to allocate the total budget to the surveys related to the
different types of marks so that they will gain optimal information from
the surveys. For example, since radio tags and subsequent monitoring are
very costly, while leg bands are cheap, the biologists should try to
balance costs with information obtained in deciding how many animals
should receive radios. Given a total budget and specific costs, it is
possible to determine the allocation of sample sizes to different types of
marks in order to minimize the variance of parameters of interest, such as
annual survival and emigration rates. In this paper, we propose a cost
function for a study where all birds receive leg bands and a subset
receives radio tags and all new releases occur at the start of the study.
Using this cost function, we obtain the allocation of sample sizes to the
two survey types that minimizes the standard error of survival rate
estimates or, alternatively, the standard error of emigration rates. Given
the proposed costs, we show that for high resighting probability, e.g.
0.6, tagging roughly 10-40% of birds with radios will give survival
estimates with standard errors within the minimum range. Lower resighting
rates will require a higher percentage of radioed birds. In addition, the
proposed costs require tagging the maximum possible percentage of radioed
birds to minimize the standard error of emigration estimates.
Journal: Journal of Applied Statistics
Pages: 443-457
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108863
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108863
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:443-457
Template-Type: ReDIF-Article 1.0
Author-Name: Thierry Boulinier
Author-X-Name-First: Thierry
Author-X-Name-Last: Boulinier
Author-Name: Nigel Yoccoz
Author-X-Name-First: Nigel
Author-X-Name-Last: Yoccoz
Author-Name: Karen McCoy
Author-X-Name-First: Karen
Author-X-Name-Last: McCoy
Author-Name: Kjell Einar Erikstad
Author-X-Name-First: Kjell Einar
Author-X-Name-Last: Erikstad
Author-Name: Torkild Tveraa
Author-X-Name-First: Torkild
Author-X-Name-Last: Tveraa
Title: Testing the effect of conspecific reproductive success on dispersal and recruitment decisions in a colonial bird: Design issues
Abstract:
Factors affecting dispersal and recruitment in animal populations will
play a prominent role in the dynamics of populations. This is particularly
the case for subdivided populations where the dispersal of individuals
among patches may lead to local extinction and 'rescue effects'. A
long-term observational study carried out in Brittany, France, and
involving colour-ringed Black-legged Kittiwakes (Rissa tridactyla)
suggested that the reproductive success of conspecifics (or some social
correlate) could be one important factor likely to affect dispersal and
recruitment. By dispersing from patches where the local reproductive
success was low and recruiting to patches where the local reproductive
success was high, individual birds could track spatio-temporal variations
in the quality of breeding patches (the quality of breeding patches can be
affected by different factors, such as food availability, the presence of
predators or ectoparasites, which can vary in space and time at different
scales). Such an observational study may nevertheless have confounded the
role of conspecific reproductive success with the effect of a correlated
factor (e.g. the local activities of a predator). In other words,
individuals may have been influenced directly by the factor responsible
for the low local reproductive success or indirectly by the low success of
their neighbours. Thus, an experimental approach was needed to address
this question. Estimates of demographic parameters (other than
reproductive success) and studies of the response of marked individuals to
changes in their environment usually face problems associated with
variability in the probability of detecting individuals and with
nonindependence among events occurring on a local scale. Further, very few
studies on dispersal have attempted to address the causal nature of
relationships by experimentally manipulating factors. Here we present an
experiment designed to test for an effect of local reproductive success of
conspecifics on behavioural decisions of individuals regarding dispersal
and recruitment. The experiment was carried out on Kittiwakes within a
large seabird colony in northern Norway. It involved (i) the colour
banding of several hundred birds; (ii) the manipulation
(increase/decrease) of the local reproductive success of breeding groups
on cliff patches; and (iii) the detailed survey of attendance and
activities of birds on these patches. It also involved the manipulation of
the nest content of marked individuals breeding within these patches
(individuals failing at the egg stage were expected to respond in terms of
dispersal to the success of their neighbours). This allowed us to test
whether a lower local reproductive success would lower (1) the attendance
of breeders at the end of the breeding season; (2) the presence of
prospecting birds; and (3) the proportion of failed breeders that came
back to breed on the same patch the year after. In this paper, we discuss
how we dealt with (I) the use of return rates to infer differences in
dispersal rates; (II) the trade-off between sample sizes and local
treatment levels; and (III) potential differences in detection
probabilities among locations. We also present some results to illustrate
the design and implementation of the experiment.
Journal: Journal of Applied Statistics
Pages: 509-520
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108566
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108566
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:509-520
Template-Type: ReDIF-Article 1.0
Author-Name: James Nichols
Author-X-Name-First: James
Author-X-Name-Last: Nichols
Author-Name: James Hines
Author-X-Name-First: James
Author-X-Name-Last: Hines
Title: Approaches for the direct estimation of λ, and demographic contributions to λ, using capture-recapture data
Abstract:
We first consider the estimation of the finite rate of population
increase, or population growth rate, λ i , using capture-recapture data
from open populations. We review estimation and modelling of λ i under
three main approaches to modelling open-population data: the classic
approach of Jolly (1965) and Seber (1965), the superpopulation approach of
Crosbie & Manly (1985) and Schwarz & Arnason (1996), and the temporal
symmetry approach of Pradel (1996). Next, we consider the contributions of
different demographic components to λ i using a probabilistic approach
based on the composition of the population at time i + 1 (Nichols et al.,
2000b). The parameters of interest are identical to the seniority
parameters, γ i , of Pradel (1996). We review estimation of γ i under the
classic, superpopulation, and temporal symmetry approaches. We then
compare these direct estimation approaches for λ i and γ i with analogues
computed using projection matrix asymptotics. We also discuss various
extensions of the estimation approaches to multistate applications and to
joint likelihoods involving multiple data types.
Journal: Journal of Applied Statistics
Pages: 539-568
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108809
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108809
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:539-568
Template-Type: ReDIF-Article 1.0
Author-Name: Nigel Yoccoz
Author-X-Name-First: Nigel
Author-X-Name-Last: Yoccoz
Author-Name: Kjell Erikstad
Author-X-Name-First: Kjell
Author-X-Name-Last: Erikstad
Author-Name: Jan Bustnes
Author-X-Name-First: Jan
Author-X-Name-Last: Bustnes
Author-Name: Sveinn Hanssen
Author-X-Name-First: Sveinn
Author-X-Name-Last: Hanssen
Author-Name: Torkild Tveraa
Author-X-Name-First: Torkild
Author-X-Name-Last: Tveraa
Title: Costs of reproduction in common eiders ( Somateria mollissima ): An assessment of relationships between reproductive effort and future survival and reproduction based on observational and experimental studies
Abstract:
The two traditional approaches to the study of costs of reproduction,
correlational and experimental, have been used in parallel in a breeding
colony of common eiders ( Somateria mollissima ) and were compared in this
paper. The analysis of the observational data was based on a two-strata
capture-recapture model, the strata being defined on the basis of the
clutch size laid by individual females in a given year. The best model
according to AIC C indicated substantial variation in survival, recapture
and transition rates, but overall a pattern emerged: females laying large
clutches have a somewhat higher survival and much higher capture rate than
females laying small clutches, and transition from large to small clutch
size occurs much more frequently than the reverse transition. The analysis
of the experimental data (adding/removing one egg) showed no clear
effect on either survival or transition rates. We conclude by
suggesting (1) that condition should be included in multi-strata models in
addition to reproductive effort; (2) that a specific study design for
estimating the proportion of non-breeding females should be implemented,
and (3) that non-breeding (a non-observable state in this study) may be
influenced by previous reproduction events.
Journal: Journal of Applied Statistics
Pages: 57-64
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108458
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108458
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:57-64
Template-Type: ReDIF-Article 1.0
Author-Name: S. E. Piper
Author-X-Name-First: S. E.
Author-X-Name-Last: Piper
Title: Survival of adult, territorial Longtailed Wagtails Motacilla clara : The effects of environmental factors and individual covariates
Abstract:
The Longtailed Wagtail is a non-migratory African passerine that is
confined exclusively to small, fast-flowing rivers in a largely arboreal
environment. The breeding adults hold permanent, life-long, linear
territories in their riverine habitat and this makes it easy to locate
colour-marked birds. They are confiding by nature and permit close
approach, often to less than 10 m, and this allows their unique
permutations of colour-rings to be read. Using data from the 21-year
period, 1 August 1978 to 31 July 1999, of a dozen territories it has been
shown that the breeding territories have not changed at all, even though
there has been a continual, but slow turnover of territory holders. A
total of 109 territorial adult birds were monitored over 1121
bird-quarters and survival was estimated for each of four quarters in a
year. The average survival rate is estimated at 68.8% per year (95%
confidence limits: 63.3% to 69.3%) and this is high for such a small bird
(approximately 20 g) and there have been some remarkably long-lived
individuals, e.g. 10 to 12 years. In this paper, a generalized linear
model is built of the survival of territorial adults. It is shown that
bigger birds have a higher survival rate and that there are seasonal
differences in survival that are ascribable to the cost of breeding and
possibly cost of moult. There is an underlying long-term quadratic trend
in survival that is related to increasing environmental degradation and
decreasing chemical pollution.
Journal: Journal of Applied Statistics
Pages: 107-124
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108485
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108485
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:107-124
Template-Type: ReDIF-Article 1.0
Author-Name: Evan Cooch
Author-X-Name-First: Evan
Author-X-Name-Last: Cooch
Title: Fledging size and survival in snow geese: Timing is everything (or is it?)
Abstract:
In many birds, body size at fledging is assumed to predict accurately the
probability of subsequent survival, and size at fledging is often used as
a proxy variable in analyses attempting to assess the pattern of natural
selection on body size. However, in some species, size at fledging can
vary significantly as a function of variation in the environmental
component of growth. Such developmental plasticity has been demonstrated
in several species of Arctic-breeding geese. In many cases, slower growth
and reduced size at fledging has been suggested as the most parsimonious
explanation for reduced post-fledging survival in goslings reared under
poor environmental conditions. However, simply quantifying a relationship
between mean size at fledging and mean survival rate (Francis et al .,
1992) may obscure the pattern of selection on the interaction of the
genetic and environmental components of growth. The hypothesis that
selection operates on the environmental component of body size at
fledging, rather than the genetic component of size per se, was tested
using data from the long-term study of Lesser Snow Geese ( Anser c.
caerulescens ) breeding at La Perouse Bay, Manitoba, Canada. Using data
from female goslings measured at fledging, post-fledging survival rates
were estimated using combined live encounter and dead recovery data
(Burnham, 1993). To control for the covariation between growth and
environmental factors, survival rates were constrained to be functions of
individual covariation of size at fledging, and various measures of the
timing of hatch; in all Arctic-breeding geese studied to date, late
hatching goslings grow significantly more slowly than do early hatching
goslings. The slower growth of late-hatching goslings has been
demonstrated to reflect systematic changes in the environmental component
of growth, and thus controlling for hatch date controls for a significant
proportion of variation in the environmental component of growth. The
relationship between size at fledging, hatch date and survival was found
to be significantly non-linear; among early hatching goslings, there was
little indication of significant differences in survival rate among large
and small goslings. However, with increasingly later hatch dates, there
was progressively greater mortality selection against smaller, slower
growing goslings in most years. This would appear to suggest that body
size matters, but not absolutely; small size leads to reduced survival for
late-hatching goslings only at La Perouse Bay. Since at least some of the
variation in size among goslings for a given hatch date reflects genetic
differences, this suggests selection may favour larger size at fledging,
albeit only among late-hatching goslings.
Journal: Journal of Applied Statistics
Pages: 143-162
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108494
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108494
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:143-162
Template-Type: ReDIF-Article 1.0
Author-Name: S. M. Slattery
Author-X-Name-First: S. M.
Author-X-Name-Last: Slattery
Author-Name: R. T. Alisauskas
Author-X-Name-First: R. T.
Author-X-Name-Last: Alisauskas
Title: Use of the Barker model in an experiment examining covariate effects on first-year survival in Ross's Geese ( Chen rossii ): A case study
Abstract:
The Barker model provides researchers with an opportunity to use three
types of data for mark-recapture analyses - recaptures, recoveries, and
resightings. This model structure maximizes use of encounter data and
increases the precision of parameter estimates, provided the researcher
has large amounts of resighting data. However, to our knowledge, this
model has not been used for any published ringing studies. Our objective
here is to report our use of the Barker model in covariate-dependent
analyses that we conducted in Program MARK. In particular, we wanted to
describe our experimental study design and discuss our analytical approach
plus some logistical constraints we encountered while conducting a study
of the effects of growth and parasites on survival of juvenile Ross's
Geese. Birds were marked just before fledging, alternately injected with
antiparasite drugs or a control, and then were re-encountered during
migration and breeding in following years. Although the Barker model
estimates seven parameters, our objectives focused on annual survival
only, thus we considered all other parameters as nuisance terms.
Therefore, we simplified our model structures by maintaining biological
complexity on survival, while retaining a very basic structure on nuisance
parameters. These analyses were conducted in a two-step approach where we
used the most parsimonious model from nuisance parameter analyses as our
starting model for analyses of covariate effects. This analytical approach
also allowed us to minimize the long CPU times associated with the use of
covariates in earlier versions of Program MARK. Resightings made up about
80% of our encounter history data, and simulations demonstrated that
precision and bias of parameter estimates were minimally affected by this
distribution. Overall, the main source of bias was that smaller goslings
were too small to retain neckbands, yet were the birds that we predicted
would have the lowest survival probability and highest probability for
parasite effects. Consequently, we considered our results conservative.
The largest constraint of our study design was the inability to partition
survival into biologically meaningful periods to provide insight into the
timing and mechanisms of mortality.
Journal: Journal of Applied Statistics
Pages: 497-508
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108548
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108548
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:497-508
Template-Type: ReDIF-Article 1.0
Author-Name: Gary White
Author-X-Name-First: Gary
Author-X-Name-Last: White
Title: Discussion comments on: The use of auxiliary variables in capture-recapture modelling. An overview
Abstract:
Journal: Journal of Applied Statistics
Pages: 103-106
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108476
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108476
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:103-106
Template-Type: ReDIF-Article 1.0
Author-Name: C. J. Schwarz
Author-X-Name-First: C. J.
Author-X-Name-Last: Schwarz
Title: Discussion comments on 'Prior distributions for stratified capture-recapture models'
Abstract:
Journal: Journal of Applied Statistics
Pages: 239-240
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108647
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108647
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:239-240
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Carlos Senar
Author-X-Name-First: Juan Carlos
Author-X-Name-Last: Senar
Author-Name: Michael Conroy
Author-X-Name-First: Michael
Author-X-Name-Last: Conroy
Author-Name: Antoni Borras
Author-X-Name-First: Antoni
Author-X-Name-Last: Borras
Title: Asymmetric exchange between populations differing in habitat quality: A metapopulation study on the citril finch
Abstract:
The citril finch ( Serinus citrinella ) is a Cardueline finch restricted
to the high mountains of western Europe. Since 1991 we have
captured-recaptured about 6000 birds in two contrasting subpopulations
located on the same mountain but separated by 5 km in distance. Citril
finches, at the north-facing locality (La Vansa), rely more on Pine trees
( Pinus uncinata ) as their main food source than do birds at the
south-facing locality (La Bofia), which rely more on herb seeds, which are
of lower energetic content. Birds at La Vansa had higher body mass and fat
score than those at La Bofia, suggesting that La Vansa was a site of
higher-quality than La Bofia. By the use of a metapopulation approach and
multistate models, we found that citril finches at the high-quality
locality (La Vansa) showed higher survival rates than those at the
low-quality one (La Bofia) (Vansa adults: φ = 0.42 ± 0.04,
juveniles: φ = 0.34 ± 0.05; Bofia adults: φ = 0.35 ± 0.04,
juveniles: φ = 0.28 ± 0.05). Dispersal was also asymmetric and
higher for juvenile birds, with movement rates for juvenile citril finches
from the low-quality to the higher-quality locality (Bofia to Vansa:
ψ = 0.38 ± 0.10) higher than the reverse (Vansa to Bofia:
ψ = 0.09 ± 0.03). We also investigated time-specific factors (e.g.
meteorological data and fructification rate of Pinus ) as potential
predictors of overall mortality and dispersal patterns. The results do not
allow strong conclusions regarding the impact of these factors on survival
and movement rates. Patterns of movement found in the Citril Finch between
localities document a new model for the dispersal of species from low to
high quality habitats, which we label 'sources and pools'. This
contrasts with currently accepted models of 'sources and sinks', in which
movement is from high to low quality habitats, and 'Ideal Free
Distributions', in which there is a balanced dispersal between habitats of
different quality.
Journal: Journal of Applied Statistics
Pages: 425-441
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108791
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108791
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:425-441
Template-Type: ReDIF-Article 1.0
Author-Name: Richard Barker
Author-X-Name-First: Richard
Author-X-Name-Last: Barker
Author-Name: Evan Cooch
Author-X-Name-First: Evan
Author-X-Name-Last: Cooch
Author-Name: Carl Schwarz
Author-X-Name-First: Carl
Author-X-Name-Last: Schwarz
Title: Discussion comments on: 'Approaches for the direct estimation of λ and demographic contributions to λ using capture-recapture data'
Abstract:
Journal: Journal of Applied Statistics
Pages: 569-572
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108610
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108610
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:569-572
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Dupuis
Author-X-Name-First: J. A.
Author-X-Name-Last: Dupuis
Title: Prior distributions for stratified capture-recapture models
Abstract:
We consider the Arnason-Schwarz model, usually used to estimate survival
and movement probabilities from capture-recapture data. A missing data
structure of this model is constructed which allows a clear separation of
information relative to capture and relative to movement. Extensions of
the Arnason-Schwarz model are considered. For example, we consider a model
that takes into account both the individual migration history and the
individual reproduction history. Biological assumptions of these
extensions are summarized via a directed graph. Owing to missing data, the
posterior distribution of parameters is numerically intractable. To
overcome those computational difficulties we advocate a Gibbs sampling
algorithm that takes advantage of the missing data structure inherent in
capture-recapture models. Prior information on survival, capture and
movement probabilities typically consists of a prior mean and a prior
95% credible interval. Dirichlet distributions are used to
incorporate some prior information on capture, survival probabilities, and
movement probabilities. Finally, the influence of the prior on the
Bayesian estimates of movement probabilities is examined.
Journal: Journal of Applied Statistics
Pages: 225-237
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108692
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108692
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:225-237
Template-Type: ReDIF-Article 1.0
Author-Name: Shirley Pledger
Author-X-Name-First: Shirley
Author-X-Name-Last: Pledger
Author-Name: Carl Schwarz
Author-X-Name-First: Carl
Author-X-Name-Last: Schwarz
Title: Modelling heterogeneity of survival in band-recovery data using mixtures
Abstract:
Finite mixture methods are applied to bird band-recovery studies to allow
for heterogeneity of survival. Birds are assumed to belong to one of
finitely many groups, each of which has its own survival rate (or set of
survival rates varying by time and/or age). The group to which a specific
animal belongs is not known, so its survival probability is a random
variable from a finite mixture. Heterogeneity is thus modelled as a latent
effect. This gives a wide selection of likelihood-based models, which may
be compared using likelihood ratio tests. These models are discussed with
reference to real and simulated data, and compared with previous models.
Journal: Journal of Applied Statistics
Pages: 315-327
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108737
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108737
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:315-327
Template-Type: ReDIF-Article 1.0
Author-Name: A. Neil Arnason
Author-X-Name-First: A. Neil
Author-X-Name-Last: Arnason
Author-Name: Carl Schwarz
Author-X-Name-First: Carl
Author-X-Name-Last: Schwarz
Title: POPAN-6: Exploring convergence and estimate properties with SIMULATE
Abstract:
We describe some developments in the POPAN system for the analysis of
mark-recapture data from Jolly-Seber (JS) type experiments. The latest
version, POPAN-6, adopts the Design Matrix approach for specifying
constraints and then uses it in the constrained maximization of the
likelihood. We describe how this is done and the difference it makes to
convergence and parameter identifiability over the constraint
contrast-equation methods used in POPAN-5. Then we show how the SIMULATE
capabilities of POPAN can be used to explore the properties of estimates,
including their identifiability, precision, and robustness to model
misspecification or capture heterogeneity.
Journal: Journal of Applied Statistics
Pages: 649-668
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108593
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108593
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:649-668
Template-Type: ReDIF-Article 1.0
Author-Name: Kenneth Pollock
Author-X-Name-First: Kenneth
Author-X-Name-Last: Pollock
Title: The use of auxiliary variables in capture-recapture modelling: An overview
Abstract:
I review the use of auxiliary variables in capture-recapture models for
estimation of demographic parameters (e.g. capture probability, population
size, survival probability, and recruitment, emigration and immigration
numbers). I focus on what has been done in current research and what still
needs to be done. Typically in the literature, covariate modelling has
made capture and survival probabilities functions of covariates, but there
are good reasons also to make other parameters functions of covariates as
well. The types of covariates considered include environmental covariates
that may vary by occasion but are constant over animals, and individual
animal covariates that are usually assumed constant over time. I also
discuss the difficulties of using time-dependent individual animal
covariates and some possible solutions. Covariates are usually assumed to
be measured without error, and that may not be realistic. For closed
populations, one approach to modelling heterogeneity in capture
probabilities uses observable individual covariates and is thus related to
the primary purpose of this paper. The now standard Huggins-Alho approach
conditions on the captured animals and then uses a generalized
Horvitz-Thompson estimator to estimate population size. This approach has
the advantage of simplicity in that one does not have to specify a
distribution for the covariates, and the disadvantage is that it does not
use the full likelihood to estimate population size. Alternatively, one could
specify a distribution for the covariates and implement a full likelihood
approach to inference to estimate the capture function, the covariate
probability distribution, and the population size. The general Jolly-Seber
open model enables one to estimate capture probability, population sizes,
survival rates, and birth numbers. Much of the focus on modelling
covariates in program MARK has been for survival and capture probability
in the Cormack-Jolly-Seber model and its generalizations (including
tag-return models). These models condition on the number of animals marked
and released. A related, but distinct, topic is radio telemetry survival
modelling that typically uses a modified Kaplan-Meier method and Cox
proportional hazards model for auxiliary variables. Recently there has
been an emphasis on integration of recruitment in the likelihood, and
research on how to implement covariate modelling for recruitment and
perhaps population size is needed. The combined open and closed 'robust'
design model can also benefit from covariate modelling and some important
options have already been implemented into MARK. Many models are usually
fitted to one data set. This has necessitated development of model
selection criteria based on the AIC (Akaike Information Criteria) and the
alternative of averaging over reasonable models. The special problems of
estimating over-dispersion when covariates are included in the model and
then adjusting for over-dispersion in model selection could benefit from
further research.
Journal: Journal of Applied Statistics
Pages: 85-102
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108430
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108430
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:85-102
Template-Type: ReDIF-Article 1.0
Author-Name: André A. Dhondt
Author-X-Name-First: André A.
Author-X-Name-Last: Dhondt
Title: Discussion comments 'Multistate recapture models: Modelling incomplete individual histories'--why are we doing all this?
Abstract:
Journal: Journal of Applied Statistics
Pages: 371-372
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108629
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108629
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:371-372
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Franklin
Author-X-Name-First: Alan
Author-X-Name-Last: Franklin
Author-Name: David Anderson
Author-X-Name-First: David
Author-X-Name-Last: Anderson
Author-Name: Kenneth Burnham
Author-X-Name-First: Kenneth
Author-X-Name-Last: Burnham
Title: Estimation of long-term trends and variation in avian survival probabilities using random effects models
Abstract:
We obtained banding and recovery data from the Bird Banding Laboratory
(operated by the Biological Resources Division of the US Geological
Survey) for adults from 129 avian species that had been continuously
banded for > 24 years. Data were partitioned by gender, banding
period (winter versus summer), and by states/provinces. Data sets were
initially screened for adequacy based on specific criteria (e.g. minimum
sample sizes). Fifty-nine data sets (11 waterfowl species, the Mourning
Dove and Common Grackle) met our criteria of adequacy for further
analysis. We estimated annual survival probabilities using the Brownie et
al. recovery model {S_t, f_t} in program MARK. Trends in annual survival and
temporal process variation were estimated using random effects models
based on shrinkage estimators. Waterfowl species had relatively little
variation in annual survival probabilities (mean CV = 8.7% and 10% for
males and females, respectively). The limited data for other species
suggested similar low temporal variation for males, but higher temporal
variation for females (CV = 40%). Evidence for long-term trends varied by
species, banding period and sex, with no obvious spatial patterns for
either positive or negative trends in survival probabilities. An exception
was Mourning Doves banded in Illinois/Missouri and Arizona/New Mexico
where both males (slope = -0.0122, se = 0.0019) and females (slope =
-0.0109 to -0.0128, se = 0.0018 to 0.0032) exhibited declining trends in
survival probabilities. We believe our approach has application for
large-scale monitoring. However, meaningful banding and recovery data for
species other than waterfowl are very limited in North America.
Journal: Journal of Applied Statistics
Pages: 267-287
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108719
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108719
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:267-287
Template-Type: ReDIF-Article 1.0
Author-Name: Blandine Doligez
Author-X-Name-First: Blandine
Author-X-Name-Last: Doligez
Author-Name: Jean Clobert
Author-X-Name-First: Jean
Author-X-Name-Last: Clobert
Author-Name: Richard Pettifor
Author-X-Name-First: Richard
Author-X-Name-Last: Pettifor
Author-Name: Marcus Rowcliffe
Author-X-Name-First: Marcus
Author-X-Name-Last: Rowcliffe
Author-Name: Lars Gustafsson
Author-X-Name-First: Lars
Author-X-Name-Last: Gustafsson
Author-Name: Christopher Perrins
Author-X-Name-First: Christopher
Author-X-Name-Last: Perrins
Author-Name: Robin McCleery
Author-X-Name-First: Robin
Author-X-Name-Last: McCleery
Title: Costs of reproduction: Assessing responses to brood size manipulation on life-history and behavioural traits using multi-state capture-recapture models
Abstract:
Costs of reproduction are fundamental trade-offs shaping the evolution of
life histories. There has been much interest, discussion and controversy
about the nature and type of reproductive costs. The manipulation of
reproductive effort (e.g. brood size manipulation) may alter not only
life-history traits such as future adult survival rate and future
reproductive effort, but also behavioural decisions affecting
recapture/resighting and dispersal probabilities. We argue that many
previous studies of the costs of reproduction may have erroneously
concluded the existence or non-existence of such costs because of their
use of local return rates to assess survival. In this paper, we take
advantage of the modern multistate capture-recapture methods to highlight
how the accurate assessment of the costs of reproduction requires
incorporating not only recapture probability, but also behavioural 'state'
variables, for example dispersal status and current reproductive
investment. The inclusion of state-dependent decisions can radically alter
the conclusions drawn regarding the costs of reproduction on future
survival or reproductive investment. We illustrate this point by
re-analysing data collected to address the question of the costs of
reproduction in the collared flycatcher and the great tit. We discuss in
some detail the methodological issues and implications of the analytical
techniques.
Journal: Journal of Applied Statistics
Pages: 407-423
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108845
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108845
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:407-423
Template-Type: ReDIF-Article 1.0
Author-Name: R. T. Alisauskas
Author-X-Name-First: R. T.
Author-X-Name-Last: Alisauskas
Author-Name: M. S. Lindberg
Author-X-Name-First: M. S.
Author-X-Name-Last: Lindberg
Title: Effects of neckbands on survival and fidelity of white-fronted and Canada geese captured as non-breeding adults
Abstract:
We conducted an experiment to examine the effect of neckbands,
controlling for differences in sex, species and year of study (1991-1997),
on probabilities of capture, survival, reporting, and fidelity in
non-breeding small Canada ( Branta canadensis hutchinsi ) and
white-fronted ( Anser albifrons frontalis ) geese. In Canada's central
arctic, we systematically double-marked about half of the individuals from
each species with neckbands and legbands, and we marked the other half
only with legbands. We considered 48 a priori models that included
combinations of sex, species, year, and neckband effects on the four
population parameters produced by Burnham's (1993) model, using AIC for
model selection. The four best approximating models each included a
negative effect of neckbands on survival, and effect size varied among
years. True survival probability of neckbanded birds annually ranged from
0.006 to 0.23 and 0.039 to 0.22 (Canada and white-fronted geese,
respectively) lower than for conspecifics without neckbands. Changes in
estimates of survival probability in neckbanded birds appeared to
attenuate more recently, particularly in Canada Geese, a result that we
suspect was related to lower retention rates of neckbands. We urge extreme
caution in use of neckbands for estimation of certain population
parameters, and discourage their use for estimation of unbiased survival
probability in these two species.
Journal: Journal of Applied Statistics
Pages: 521-537
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108575
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108575
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:521-537
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Conroy
Author-X-Name-First: Michael
Author-X-Name-Last: Conroy
Title: Real and quasi-experiments in capture-recapture studies: Suggestions for advancing the state of the art
Abstract:
Journal: Journal of Applied Statistics
Pages: 475-477
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108520
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108520
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:475-477
Template-Type: ReDIF-Article 1.0
Author-Name: Carl James Schwarz
Author-X-Name-First: Carl James
Author-X-Name-Last: Schwarz
Title: Real and quasi-experiments in capture-recapture studies
Abstract:
The three key elements of experimental design are randomization,
replication, and variance identification and control. Capture-recapture
experiments usually pay sufficient attention to the first two elements,
but often do not pay sufficient attention to sources of variation. These
include blocking factors and different sizes of experimental units. By
casting capture-recapture studies in an experimental design framework, the
various roles of these sources of variation become clear and the sources
that are pooled when these experiments are analysed using existing
software also become clear. This formulation also shows that care must be
taken with pseudo-replication and different sized experimental units.
Journal: Journal of Applied Statistics
Pages: 459-473
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108511
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108511
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:459-473
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Dupuis
Author-X-Name-First: J. A.
Author-X-Name-Last: Dupuis
Title: Response to Carl Schwarz
Abstract:
Journal: Journal of Applied Statistics
Pages: 241-244
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108656
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108656
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:241-244
Template-Type: ReDIF-Article 1.0
Author-Name: S. P. Brooks
Author-X-Name-First: S. P.
Author-X-Name-Last: Brooks
Author-Name: E. A. Catchpole
Author-X-Name-First: E. A.
Author-X-Name-Last: Catchpole
Author-Name: B. J. T. Morgan
Author-X-Name-First: B. J. T.
Author-X-Name-Last: Morgan
Author-Name: M. P. Harris
Author-X-Name-First: M. P.
Author-X-Name-Last: Harris
Title: Bayesian methods for analysing ringing data
Abstract:
A major recent development in statistics has been the use of fast
computational methods of Markov chain Monte Carlo. These procedures allow
Bayesian methods to be used in quite complex modelling situations. In this
paper, we shall use a range of real data examples involving lapwings,
shags, teal, dippers, and herring gulls, to illustrate the power and range
of Bayesian techniques. The topics include: prior sensitivity; the use of
reversible-jump MCMC for constructing model probabilities and comparing
models, with particular reference to models with random effects;
model-averaging; and the construction of Bayesian measures of
goodness-of-fit. Throughout, there will be discussion of the practical
aspects of the work - for instance explaining when and when not to use the
BUGS package.
Journal: Journal of Applied Statistics
Pages: 187-206
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108683
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:187-206
Template-Type: ReDIF-Article 1.0
Author-Name: Charles Francis
Author-X-Name-First: Charles
Author-X-Name-Last: Francis
Author-Name: Pertti Saurola
Author-X-Name-First: Pertti
Author-X-Name-Last: Saurola
Title: Estimating age-specific survival rates of tawny owls--recaptures versus recoveries
Abstract:
We compared estimates of annual survival rates of tawny owls (Strix
aluco) ringed in southern Finland from several different sampling
methods: recoveries of birds ringed as young; recaptures of birds ringed
as young; recoveries of birds ringed as adults as well as young; combined
recoveries and recaptures of birds ringed as young, and combined
recoveries and recaptures of birds ringed as adults and young. From 1979
to 1998, 18 040 young owls were ringed, of which 983 were recaptured as
breeders in subsequent years during this period, and 1764 were recovered
dead at various locations. In addition, 1751 owls were ringed as adults,
of which 612 were later recaptured and 199 were recovered dead. First-year
survival rates estimated using only recoveries of birds ringed as young
averaged 48%, while apparent survival rates estimated using only
recaptures from birds ringed as young averaged 10-13%. Use of combined
recapture-recovery models, or supplementary information from recoveries of
birds ringed as adults, produced survival estimates of 30-37%. Survival
estimates from young-recoveries-only models were biased high, because of
violation of the assumption of constant recovery rates with age: birds
dying in their first year were one-third less likely to be found and
reported than older birds. In contrast, recaptures-only models confounded
emigration with mortality. Despite these differences in mean values,
annual fluctuations in estimated first-year survival rates were similar
with all models. Estimates of adult survival rates were similar with all
models, while those for second-year birds were similar for all models
except recaptures-only. These results highlight the potential biases
associated with analysing either recaptures or recoveries alone of birds
ringed as young, and the benefits of using combined data.
Journal: Journal of Applied Statistics
Pages: 637-647
Issue: 1-4
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120108584
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120108584
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:1-4:p:637-647
Template-Type: ReDIF-Article 1.0
Author-Name: David Harvey
Author-X-Name-First: David
Author-X-Name-Last: Harvey
Author-Name: Terence Mills
Author-X-Name-First: Terence
Author-X-Name-Last: Mills
Title: Unit roots and double smooth transitions
Abstract:
Techniques for testing the null hypothesis of difference stationarity
against stationarity around some deterministic function have received much
attention. In particular, unit root tests where the alternative is
stationarity around a smooth transition in a linear trend have recently
been proposed to permit the possibility of non-instantaneous structural
change. In this paper we develop tests extending such an approach in order
to admit more than one structural change. The analysis is motivated by
time series that appear to undergo two smooth transitions in the linear
trend, and the application of the new tests to two such series (average
global temperature and US consumer prices) highlights the benefits of this
double transition extension.
Journal: Journal of Applied Statistics
Pages: 675-683
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098739
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098739
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:675-683
Template-Type: ReDIF-Article 1.0
Author-Name: M. P. Diaz
Author-X-Name-First: M. P.
Author-X-Name-Last: Diaz
Author-Name: A. H. Barchuk
Author-X-Name-First: A. H.
Author-X-Name-Last: Barchuk
Author-Name: S. Luque
Author-X-Name-First: S.
Author-X-Name-Last: Luque
Author-Name: C. Oviedo
Author-X-Name-First: C.
Author-X-Name-Last: Oviedo
Title: Generalized linear models to study spatial distribution of tree species in Argentinean arid Chaco
Abstract:
This work adapts some generalized linear models in order to study the
spatial pattern of an important tree species. The classical multivariate
Ising model, which incorporates the dependence on neighbour individuals in
a regular lattice, was adapted by setting a Poisson regression with an
extra variation parameter to fit over-dispersion. Because the spatial
pattern is only evident at a particular reference scale, plots were
sampled at two different scales. Two individual presence-absence matrices were
analysed for each case through over-dispersion Poisson regression and
log-linear models, including binary indicators for a neighbour in the four
directions in the linear predictor. The results showed that the species,
in the adult stage, has a spatial distribution in patches having no more
than two adult individuals.
Journal: Journal of Applied Statistics
Pages: 685-694
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098748
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098748
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:685-694
Template-Type: ReDIF-Article 1.0
Author-Name: M. Dharmalingam
Author-X-Name-First: M.
Author-X-Name-Last: Dharmalingam
Title: Construction of Partial Triallel Crosses based on Trojan Square Design
Abstract:
A systematic method of developing or raising the offspring of parents or
lines, which are then subjected to analysis to draw valid inferences about
the parents, is called a Mating Design (MD). A Mating Design represents only a
part of a genetic experiment. Diallel and the four North Carolina (NC)
designs, Triallel and Double Crosses are notable examples of mating
designs. In this paper, an attempt has been made to provide a systematic
method of construction of Partial Triallel Crosses (PTC) using Trojan
Square Design (TSD), which requires only a fraction of the number of
crosses to be made compared with Triallel Crosses.
Journal: Journal of Applied Statistics
Pages: 695-702
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098757
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098757
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:695-702
Template-Type: ReDIF-Article 1.0
Author-Name: Chul Ahn
Author-X-Name-First: Chul
Author-X-Name-Last: Ahn
Author-Name: Sin-Ho Jung
Author-X-Name-First: Sin-Ho
Author-X-Name-Last: Jung
Author-Name: Seung-Ho Kang
Author-X-Name-First: Seung-Ho
Author-X-Name-Last: Kang
Title: Modified regression coefficient analysis for repeated binary measurements
Abstract:
Myers & Broyles (2000a, 2000b) illustrate that regression coefficient
analysis (RCA) is a viable alternative to a generalized estimating
equation (GEE) in the analysis of correlated binomial data. Since the
regression coefficients (the b_i's) may have different precisions, we
modify RCA by weighting the b_i's by the inverses of their variances for
statistical optimality. We perform a simulation study to evaluate the
performance of RCA, modified RCA and GEE in terms of empirical type I
errors and empirical powers of the regression coefficients in repeated
binary measurement designs with and without dropouts. Two thousand data
sets are generated using autoregressive (AR(1)) and compound symmetry (CS)
correlation structures. We compare the type I errors and powers of RCA,
modified RCA and GEE for the analysis of repeated binary measurement data
as affected by different dropout mechanisms such as random dropouts and
treatment dependent dropouts.
Journal: Journal of Applied Statistics
Pages: 703-710
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098766
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098766
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:703-710
Template-Type: ReDIF-Article 1.0
Author-Name: Jorge Belaire-Franch
Author-X-Name-First: Jorge
Author-X-Name-Last: Belaire-Franch
Author-Name: Dulce Contreras-Bayarri
Author-X-Name-First: Dulce
Author-X-Name-Last: Contreras-Bayarri
Title: Improving cross-correlation tests through re-sampling techniques
Abstract:
In this paper, we show that type I and type II errors of the
cross-correlation test between two autocorrelated time series can be
reduced, in some cases, by means of tabulation of the empirical
distribution of the sample cross-correlation coefficient, using
alternative re-sampling techniques.
Journal: Journal of Applied Statistics
Pages: 711-720
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098775
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098775
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:711-720
Template-Type: ReDIF-Article 1.0
Author-Name: Moon Yul Huh
Author-X-Name-First: Moon Yul
Author-X-Name-Last: Huh
Author-Name: Kiyeol Kim
Author-X-Name-First: Kiyeol
Author-X-Name-Last: Kim
Title: Visualization of multidimensional data using modifications of the Grand Tour
Abstract:
Current implementations of Asimov's Grand Tour (for example in XLISP-STAT
by Tierney, 1990, or in XGobi by Buja et al., 1996) do not remember the
path of projections and show only the current state during the touring
process. We propose a modification of the Grand Tour, named Tracking Grand
Tour (TGT), that shows the trace of the touring process as small 'comet
trails' of the projected points. The usefulness of the TGT is demonstrated
with a simulated and a real data set.
Journal: Journal of Applied Statistics
Pages: 721-728
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098784
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098784
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:721-728
Template-Type: ReDIF-Article 1.0
Author-Name: Zbigniew Kominek
Author-X-Name-First: Zbigniew
Author-X-Name-Last: Kominek
Title: Minimum chi-squared estimation of stable distributions parameters: An application to the Warsaw Stock Exchange
Abstract:
This paper derives an application of the minimum chi-squared (MCS)
methodology to estimate the parameters of the unimodal symmetric stable
distribution. The proposed method is especially suitable for large data
sets, both regular and non-standard. Monte Carlo simulations are performed
to compare the efficiency of the MCS estimation with the efficiency of the
McCulloch quantile algorithm. In the case of grouped observations,
evidence in favour of the MCS method is reported. For the ungrouped data
the MCS estimation generally performs better than McCulloch's quantile
method for samples larger than 400 observations and for high alphas. The
relative advantage of the MCS over the McCulloch estimators increases for
larger samples. The empirical example analyses the highly irregular
distributions of returns on the selected securities from the Warsaw Stock
Exchange. The quantile and maximum likelihood estimates of characteristic
exponents are generally smaller than the MCS ones. This reflects the bias
in the traditional methods, which is due to a lack of adjustment for
censored and clustered observations, and shows the flexibility of the
proposed MCS approach.
Journal: Journal of Applied Statistics
Pages: 729-744
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098793
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098793
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:729-744
Template-Type: ReDIF-Article 1.0
Author-Name: Kelly Zou
Author-X-Name-First: Kelly
Author-X-Name-Last: Zou
Author-Name: W. J. Hall
Author-X-Name-First: W. J.
Author-X-Name-Last: Hall
Title: On estimating a transformation correlation coefficient
Abstract:
We consider a semiparametric and a parametric transformation-to-normality
model for bivariate data. After an unstructured or structured monotone
transformation of the measurement scales, the measurements are assumed to
have a bivariate normal distribution with correlation coefficient ρ,
here termed the 'transformation correlation coefficient'. Under the
semiparametric model with unstructured transformation, the principle of
invariance leads to basing inference on the marginal ranks. The resulting
rank-based likelihood function of ρ is maximized via a Monte Carlo
procedure. Under the parametric model, we consider Box-Cox type
transformations and maximize the likelihood of ρ along with the
nuisance parameters. Efficiencies of competing methods are reported, both
theoretically and by simulations. The methods are illustrated on a
real-data example.
Journal: Journal of Applied Statistics
Pages: 745-760
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098801
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098801
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:745-760
Template-Type: ReDIF-Article 1.0
Author-Name: James Kepner
Author-X-Name-First: James
Author-X-Name-Last: Kepner
Author-Name: Dennis Wackerly
Author-X-Name-First: Dennis
Author-X-Name-Last: Wackerly
Title: Observations on the effect of the prior distribution on the predictive distribution in Bayesian inferences
Abstract:
Typically, in the brief discussion of Bayesian inferential methods
presented at the beginning of calculus-based undergraduate or graduate
mathematical statistics courses, little attention is paid to the process
of choosing the parameter value(s) for the prior distribution. Even less
attention is paid to the impact of these choices on the predictive
distribution of the data. Reasons for this include that the posterior can
be found while ignoring the predictive distribution, thereby streamlining
its derivation, and/or that computer software can be used to
find the posterior distribution. In this paper, the binomial,
negative-binomial and Poisson distributions along with their conjugate
beta and gamma priors are utilized to obtain the resulting predictive
distributions. It is then demonstrated that specific choices of the
parameters of the priors can lead to predictive distributions with
properties that might be surprising to a non-expert user of Bayesian
methods.
Journal: Journal of Applied Statistics
Pages: 761-769
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098810
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098810
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:761-769
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Author-Name: Dong Wan Shin
Author-X-Name-First: Dong Wan
Author-X-Name-Last: Shin
Title: Bayesian model selection and parameter estimation for possibly asymmetric and non-stationary time series using a reversible jump Markov chain Monte Carlo approach
Abstract:
A Markov chain Monte Carlo (MCMC) approach, called a reversible jump
MCMC, is employed in model selection and parameter estimation for possibly
non-stationary and non-linear time series data. The non-linear structure
is modelled by the asymmetric momentum threshold autoregressive process
(MTAR) of Enders & Granger (1998) or by the asymmetric self-exciting
threshold autoregressive process (SETAR) of Tong (1990). The
non-stationary and non-linear feature is represented by the MTAR (or
SETAR) model in which one of the AR coefficients (ρ1) is greater
than one, and the other (ρ2) is smaller than one. The other
non-stationary and linear, stationary and non-linear, and stationary and
linear features, represented respectively by (ρ1 = ρ2 = 1),
(ρ1 ≠ ρ2 < 1) and (ρ1 = ρ2 < 1), are also considered as possible
models. The reversible jump MCMC provides estimates of posterior
probabilities for these four different models, as well as estimates of
the AR coefficients ρ1 and ρ2.
series of US interest rates in terms of model selection, parameter
estimation, and forecasting.
Journal: Journal of Applied Statistics
Pages: 771-789
Issue: 5
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760120098829
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760120098829
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:5:p:771-789
Template-Type: ReDIF-Article 1.0
Author-Name: Kelly Zou
Author-X-Name-First: Kelly
Author-X-Name-Last: Zou
Author-Name: W. J. Hall
Author-X-Name-First: W. J.
Author-X-Name-Last: Hall
Title: Semiparametric and parametric transformation models for comparing diagnostic markers with paired design
Abstract:
We develop semiparametric and parametric transformation models for
estimation and comparison of ROC curves derived from measurements from two
diagnostic tests on the same subjects. We assume the existence of
transformed measurement scales, one for each test, on which the paired
measurements have bivariate normal distributions. The resulting pair of
ROC curves are estimated by maximum likelihood algorithms, using joint
rank data in the semiparametric model with unspecified transformations and
using Box-Cox transformations in the parametric transformation case.
Several hypothesis tests for comparing the two ROC curves, or
characteristics of them, are developed. Two clinical examples are
presented and simulation results are provided.
Journal: Journal of Applied Statistics
Pages: 803-816
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136140
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136140
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:803-816
Template-Type: ReDIF-Article 1.0
Author-Name: Abdulnasser Hatemi-J
Author-X-Name-First: Abdulnasser
Author-X-Name-Last: Hatemi-J
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: Multivariate-based causality tests of twin deficits in the US
Abstract:
This paper provides an alternative methodology for testing the causality
direction between twin deficits in the US. Rao's multivariate F-test
combined with the bootstrap simulation technique has appealing properties,
especially when the data-generating process is characterized by unit
roots. In addition, the results show that the effects of structural breaks
are of paramount importance when the causality tests are conducted. In
much contemporary applied econometrics, all too little attention is given
to the possibility of changes in the economic process.
Journal: Journal of Applied Statistics
Pages: 817-824
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136159
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136159
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:817-824
Template-Type: ReDIF-Article 1.0
Author-Name: C. Agostinelli
Author-X-Name-First: C.
Author-X-Name-Last: Agostinelli
Title: Robust stepwise regression
Abstract:
The selection of an appropriate subset of explanatory variables to use in
a linear regression model is an important aspect of a statistical
analysis. Classical stepwise regression is often used with this aim but it
could be invalidated by a few outlying observations. In this paper, we
introduce a robust F-test and a robust stepwise regression procedure based
on weighted likelihood in order to achieve robustness against the presence
of outliers. The introduced methodology is asymptotically equivalent to
the classical one when no contamination is present. Some examples and
simulation results are presented.
Journal: Journal of Applied Statistics
Pages: 825-840
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136168
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136168
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:825-840
Template-Type: ReDIF-Article 1.0
Author-Name: M. D. Jimenez Gamero
Author-X-Name-First: M. D. Jimenez
Author-X-Name-Last: Gamero
Author-Name: J. M. Munoz Pichardo
Author-X-Name-First: J. M. Munoz
Author-X-Name-Last: Pichardo
Author-Name: J. Munoz Garcia
Author-X-Name-First: J. Munoz
Author-X-Name-Last: Garcia
Author-Name: A. Pascual Acosta
Author-X-Name-First: A. Pascual
Author-X-Name-Last: Acosta
Title: Rao distance as a measure of influence in the multivariate linear model
Abstract:
Several methods have been suggested to detect influential observations in
the linear regression model and a number of them have been extended for
the multivariate regression model. In this article we consider the
multivariate general linear model, Y = XB + ε, which contains the linear
regression model and the multivariate regression model as particular
cases. Assuming that the random disturbances are normally distributed, the
BLUE of v B is also normally distributed. Since the distribution of the
BLUE of v B differs from the distribution of the BLUE of v B in the model
with a set of observations omitted, we propose to measure the distance
between the two distributions in order to study the influence that a set
of observations has on the BLUE of v B. To do this, we use the Rao
distance.
Journal: Journal of Applied Statistics
Pages: 841-854
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136177
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136177
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:841-854
Template-Type: ReDIF-Article 1.0
Author-Name: E. Andres Houseman
Author-X-Name-First: E. Andres
Author-X-Name-Last: Houseman
Author-Name: Louise Ryan
Author-X-Name-First: Louise
Author-X-Name-Last: Ryan
Author-Name: Jonathan Levy
Author-X-Name-First: Jonathan
Author-X-Name-Last: Levy
Author-Name: John Spengler
Author-X-Name-First: John
Author-X-Name-Last: Spengler
Title: Autocorrelation in real-time continuous monitoring of microenvironments
Abstract:
Interpretation of continuous measurements in microenvironmental studies
and exposure assessments can be complicated by autocorrelation, the
implications of which are often not fully addressed. We discuss some
statistical issues that arose in the analysis of microenvironmental
particulate matter concentration data collected in 1998 by the Harvard
School of Public Health. We present a simulation study that suggests that
Generalized Estimating Equations, a technique often used to adjust for
autocorrelation, may produce inflated Type I errors when applied to
microenvironmental studies of small or moderate sample size, and that
Linear Mixed Effects models may be more appropriate in small-sample
settings. Environmental scientists often appeal to longer averaging times
to reduce autocorrelation. We explore the functional relationship between
averaging time, autocorrelation, and standard errors of both mean and
variance, showing that longer averaging times impair statistical
inferences about main effects. We conclude that, given widely available
techniques that adjust for autocorrelation, longer averaging times may be
inappropriate in microenvironmental studies.
Journal: Journal of Applied Statistics
Pages: 855-872
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136186A
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136186A
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:855-872
Template-Type: ReDIF-Article 1.0
Author-Name: Jixian Wang
Author-X-Name-First: Jixian
Author-X-Name-Last: Wang
Author-Name: Peter Donnan
Author-X-Name-First: Peter
Author-X-Name-Last: Donnan
Title: Adjusting for missing record linkage in outcome studies
Abstract:
Record linkage databases have been increasingly available and used in
pharmacoepidemiology, pharmacoeconomic and outcome studies, where the
relationship between drug exposure or intervention and outcome is the main
concern. Sometimes the linkage between outcome data and exposure data may
be missing so that only a proportion of patients in the outcome database
can be linked to other databases. This paper proposes maximum likelihood
(ML) and GEE procedures to obtain consistent estimates of parameters in
the model relating the outcome and risk factors. Asymptotic variances of
the estimates were derived for the situation where the missing rate is
estimated from the same dataset. We show that using the estimated missing
rate, rather than the known missing rate, may result in more accurate
estimates of the parameters. The confidence interval of the predicted
occurrence rate, when the missing rate was estimated, was derived.
Simulations for different scenarios were performed in order to explore the
small-sample behaviour of the ML procedure using the estimated missing
rate. The results confirmed the greater efficiency of using the estimated
missing rate instead of the true one for large sample sizes. However, this
may not be true for small samples. The ML procedure was applied to an
analysis of coronary artery bypass operations in patients with acute
coronary syndrome.
Journal: Journal of Applied Statistics
Pages: 873-884
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136186
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136186
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:873-884
Template-Type: ReDIF-Article 1.0
Author-Name: Sung Park
Author-X-Name-First: Sung
Author-X-Name-Last: Park
Author-Name: Kiho Kim
Author-X-Name-First: Kiho
Author-X-Name-Last: Kim
Title: Construction of central composite designs for balanced orthogonal blocks
Abstract:
Box & Hunter (1957) recommended a set of orthogonally blocked central
composite designs (CCD) when the region of interest is spherical. In order
to achieve rotatability along with orthogonal blocking, the block size for
those designs becomes unequal and it may not be attractive or practical to
use such unequally blocked designs in many practical situations. In this
paper, a construction method of orthogonally blocked CCD under the
assumption of equal block size is proposed and an index of block
orthogonality is introduced.
Journal: Journal of Applied Statistics
Pages: 885-893
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136195
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136195
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:885-893
Template-Type: ReDIF-Article 1.0
Author-Name: K. Hafeez
Author-X-Name-First: K.
Author-X-Name-Last: Hafeez
Author-Name: H. Rowlands
Author-X-Name-First: H.
Author-X-Name-Last: Rowlands
Author-Name: G. Kanji
Author-X-Name-First: G.
Author-X-Name-Last: Kanji
Author-Name: S. Iqbal
Author-X-Name-First: S.
Author-X-Name-Last: Iqbal
Title: Design optimization using ANOVA
Abstract:
This paper describes the design optimization of a robot sensor used for
locating 3-D objects employing the Taguchi method in a computer simulation
scenario. The location information from the sensor is to be utilized to
control the movements of an industrial robot in a 'pick-and-place' or
assembly operation. The Taguchi method, which is based on the
Analysis-of-Variance (ANOVA) approach, is utilized to improve the
performance of the sensor over a wider operating range. A review of the
Taguchi method is presented along with step-by-step implementation details
to identify and optimize the design parameters of the sensor. The method
allows us to gauge the impact of each of the interactions present in the
sensor system individually, and permits us to single out those factors that
have a dominant influence on the overall performance of the sensor. The
investigation suggests that the Taguchi method is a more structured and
efficient approach for achieving a robust design compared with the
classical full factorial design approach.
Journal: Journal of Applied Statistics
Pages: 895-906
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136203
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:895-906
Template-Type: ReDIF-Article 1.0
Author-Name: Herbert Buning
Author-X-Name-First: Herbert
Author-X-Name-Last: Buning
Title: Robustness and power of modified Lepage, Kolmogorov-Smirnov and Cramer-von Mises two-sample tests
Abstract:
For the two-sample problem with location and/or scale alternatives, as
well as different shapes, several statistical tests are presented: tests
of Kolmogorov-Smirnov and Cramer-von Mises type for the general
alternative, and tests of Lepage type for location and scale
alternatives. We compare these tests with the t-test and other location
tests, such as the Welch test, and also the Levene test for scale. It
turns out that there is, of course, no clear winner among the tests but,
for symmetric distributions with the same shape, tests of Lepage type are
the best ones whereas, for different shapes, Cramer-von Mises type tests
are preferred. For extremely right-skewed distributions, a modification of
the Kolmogorov-Smirnov test should be applied.
Journal: Journal of Applied Statistics
Pages: 907-924
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136212
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136212
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:907-924
Template-Type: ReDIF-Article 1.0
Author-Name: Beverley Causey
Author-X-Name-First: Beverley
Author-X-Name-Last: Causey
Title: Parametric estimation of the number of classes in a population
Abstract:
This paper deals with the well-studied problem of how best to estimate
the number of mutually exclusive and exhaustive classes in a population,
based on a sample from it. Haas & Stokes review and provide non-parametric
approaches, but there are associated difficulties especially for small
sampling fractions and/or widely varying population class sizes. Sichel
provided 'GIGP' methodology for this problem and for other purposes; this
paper utilizes the three-parameter GIGP distribution for this problem, and
also for the estimation of the number of classes of size 1, as an
alternative to the non-parametric approaches. Methodological and
computational issues are considered, and examples indicate the potential
for GIGP.
Journal: Journal of Applied Statistics
Pages: 925-934
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136221
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136221
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:925-934
Template-Type: ReDIF-Article 1.0
Author-Name: Vic Patrangenaru
Author-X-Name-First: Vic
Author-X-Name-Last: Patrangenaru
Author-Name: Kanti Mardia
Author-X-Name-First: Kanti
Author-X-Name-Last: Mardia
Title: A bootstrap approach to Pluto's origin
Abstract:
The solar nebula theory hypothesizes that planets are formed from an
accretion disk of material that, over time, condenses into dust, small
planetesimals, and that the planets should have, on average, coplanar,
nearly circular orbits. If the orbit of Pluto has a different origin to
the other planets in the solar system, then there will be tremendous
repercussions on modelling the spacecraft for a mission to Pluto. We test
here the nebula theory for Pluto, using both parametric and non-parametric
methods. We first develop asymptotic distributions of extrinsic means on a
manifold, and then derive bootstrap and large sample distributions of the
sample mean direction. Our parametric and non-parametric analyses provide
very strong evidence that the solar nebula theory does not hold for Pluto.
Journal: Journal of Applied Statistics
Pages: 935-943
Issue: 6
Volume: 29
Year: 2002
X-DOI: 10.1080/02664760220136230
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760220136230
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:6:p:935-943
Template-Type: ReDIF-Article 1.0
Author-Name: M. E. Ghitany
Author-X-Name-First: M. E.
Author-X-Name-Last: Ghitany
Author-Name: S. Al-Awadhi
Author-X-Name-First: S.
Author-X-Name-Last: Al-Awadhi
Title: Maximum likelihood estimation of Burr XII distribution parameters under random censoring
Abstract:
In this paper, we consider the maximum likelihood estimation of the
parameters of Burr XII distribution using randomly right censored data. We
provide necessary and sufficient conditions for the existence and
uniqueness of the maximum likelihood estimates. Under such conditions, it
is shown that the maximum likelihood estimates are strongly consistent for
the true values of the parameters and are asymptotically bivariate normal.
An application to leukemia free-survival times for allogeneic and
autologous transplant patients is given.
Journal: Journal of Applied Statistics
Pages: 955-965
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006667
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006667
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:955-965
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: M. Kalyanasundaram
Author-X-Name-First: M.
Author-X-Name-Last: Kalyanasundaram
Title: Construction of a generalized robust Taguchi capability index
Abstract:
In this paper, a generalized Taguchi capability index is proposed and the
construction method is also indicated.
Journal: Journal of Applied Statistics
Pages: 967-971
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006676
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:967-971
Template-Type: ReDIF-Article 1.0
Author-Name: E. Andersson
Author-X-Name-First: E.
Author-X-Name-Last: Andersson
Title: Monitoring cyclical processes. A non-parametric approach
Abstract:
Forecasting the turning points in business cycles is important to
economic and political decisions. Time series of business indicators often
exhibit cycles that cannot easily be modelled with a parametric function.
This article presents a method for monitoring time series with cycles in
order to detect the turning points. A non-parametric estimation procedure
that uses only monotonicity restrictions is used. The methodology of
statistical surveillance is used for developing a system for early
warnings of cycle turning points in monthly data. In monitoring, the
inference situation is one of repeated decisions. Measurements of the
performance of a method of surveillance are, for example, average run
length and expected delay to a correct alarm. The properties of the
proposed monitoring system are evaluated by means of a simulation study.
The false alarms are controlled by a fixed median run length to the first
false alarm. Results are given on the median delay time to a correct alarm
for two situations: a peak after two and three years, respectively.
Journal: Journal of Applied Statistics
Pages: 973-990
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006685
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006685
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:973-990
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Caudill
Author-X-Name-First: Steven
Author-X-Name-Last: Caudill
Author-Name: Norman Godwin
Author-X-Name-First: Norman
Author-X-Name-Last: Godwin
Title: Heterogeneous skewness in binary choice models: Predicting outcomes in the men's NCAA basketball tournament
Abstract:
Several authors have recently explored the estimation of binary choice
models based on asymmetric error structures. One such family of skewed
models is based on the exponential generalized beta type 2 (EGB2). One
model in this family is the skewed logit. Recently, McDonald (1996, 2000)
extended the work on the EGB2 family of skewed models to permit
heterogeneity in the scale parameter. The aim of this paper is to extend
the skewed logit model to allow for heterogeneity in the skewness
parameter. By this we mean that, in the model developed here, the skewness
parameter is permitted to vary from observation to observation by making
it a function of exogenous variables. To demonstrate the usefulness of our
model, we examine the issue of the predictive ability of sports seedings.
We find that we are able to obtain better probability predictions using
the skewed logit model with heterogeneous skewness than can be obtained
with logit, probit, or skewed logit.
Journal: Journal of Applied Statistics
Pages: 991-1001
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006694
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006694
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:991-1001
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Author-Name: Chao-Yu Chou
Author-X-Name-First: Chao-Yu
Author-X-Name-Last: Chou
Title: Economic design of continuous sampling plan under linear inspection cost
Abstract:
The article explores the problem of an economically based type I
continuous sampling plan (CSP-1 plan) under linear inspection cost. By
assuming that the per unit inspection cost is linearly proportional to the
average number of inspections per inspection cycle, and by solving the
modified model of Cassady et al., we not only achieve the required level of
product quality but also obtain the minimum total expected cost per unit
produced.
Journal: Journal of Applied Statistics
Pages: 1003-1009
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006702
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006702
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:1003-1009
Template-Type: ReDIF-Article 1.0
Author-Name: Claudio Verzilli
Author-X-Name-First: Claudio
Author-X-Name-Last: Verzilli
Author-Name: James Carpenter
Author-X-Name-First: James
Author-X-Name-Last: Carpenter
Title: A Monte Carlo EM algorithm for random-coefficient-based dropout models
Abstract:
Longitudinal studies of neurological disorders suffer almost inevitably
from non-compliance, which is likely to be non-ignorable. It is important
in these cases to model the response variable and the dropout mechanism
jointly. In this article we propose a Monte Carlo version of the EM
algorithm that can be used to fit random-coefficient-based dropout models.
A linear mixed model is assumed for the response variable and a
discrete-time proportional hazards model for the dropout mechanism; these
share a common set of random coefficients. The ideas are illustrated using
data from a five-year trial assessing the efficacy of two drugs in the
treatment of patients in the early stages of Parkinson's disease.
Journal: Journal of Applied Statistics
Pages: 1011-1021
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006711
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006711
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:1011-1021
Template-Type: ReDIF-Article 1.0
Author-Name: James Mays
Author-X-Name-First: James
Author-X-Name-Last: Mays
Author-Name: Jeffrey Birch
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Birch
Title: Smoothing for small samples with model misspecification: Nonparametric and semiparametric concerns
Abstract:
Our goal is to find a regression technique that can be used in a
small-sample situation with possible model misspecification. The
development of a new bandwidth selector allows nonparametric regression
(in conjunction with least squares) to be used in this small-sample
problem, where nonparametric procedures have previously proven to be
inadequate. Considered here are two new semiparametric (model-robust)
regression techniques that combine parametric and nonparametric techniques
when there is partial information present about the underlying model. A
general overview is given of how typical concerns for bandwidth selection
in nonparametric regression extend to the model-robust procedures. A new
penalized PRESS criterion (with a graphical selection strategy for
applications) is developed that overcomes these concerns and is able to
maintain the beneficial mean squared error properties of the new
model-robust methods. It is shown that this new selector outperforms
standard and recently improved bandwidth selectors. Comparisons of the
selectors are made via numerous generated data examples and a small
simulation study.
Journal: Journal of Applied Statistics
Pages: 1023-1045
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006720
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006720
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:1023-1045
Template-Type: ReDIF-Article 1.0
Author-Name: Daniela Climov
Author-X-Name-First: Daniela
Author-X-Name-Last: Climov
Author-Name: Michel Delecroix
Author-X-Name-First: Michel
Author-X-Name-Last: Delecroix
Author-Name: Leopold Simar
Author-X-Name-First: Leopold
Author-X-Name-Last: Simar
Title: Semiparametric estimation in single index Poisson regression: A practical approach
Abstract:
In a single index Poisson regression model with unknown link function,
the index parameter can be root-n consistently estimated by the method of
pseudo maximum likelihood. In this paper, we study, by simulation
arguments, the practical validity of the asymptotic behaviour of the
pseudo maximum likelihood index estimator and of some associated
cross-validation bandwidths. A robust practical rule for implementing the
pseudo maximum likelihood estimation method is suggested, which uses the
bootstrap for estimating the variance of the index estimator and a variant
of bagging for numerically stabilizing its variance. Our method gives
reasonable results even for moderate sized samples; thus, it can be used
for doing statistical inference in practical situations. The procedure is
illustrated through a real data example.
Journal: Journal of Applied Statistics
Pages: 1047-1070
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006739
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006739
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:1047-1070
Template-Type: ReDIF-Article 1.0
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Title: Robustness of interval estimation of the 90% effective dose: Bootstrap resampling and some large-sample parametric methods
Abstract:
Interval estimation of the xth effective dose (EDx), where x is a
prespecified percentage, has been the focus of interest of a number of
recent studies, the majority of which have considered the case in which a
logistic dose-response curve is correctly assumed. In this paper, we focus
our attention upon the 90% effective dose (ED90) and consider the
situation in which the assumption of a logistic dose-response curve is
incorrect. Specifically, we consider three classes of true model: the
probit, the cubic logistic and the asymmetric Aranda-Ordaz models. We
investigate the robustness of four large sample parametric methods of
interval construction and four methods based upon bootstrap resampling.
Journal: Journal of Applied Statistics
Pages: 1071-1081
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006748
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006748
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:1071-1081
Template-Type: ReDIF-Article 1.0
Author-Name: T. Antoniadou
Author-X-Name-First: T.
Author-X-Name-Last: Antoniadou
Author-Name: D. Wallach
Author-X-Name-First: D.
Author-X-Name-Last: Wallach
Title: Evaluating optimal fertilizer rates using plant measurements
Abstract:
Correctly adjusting the amount of nitrogen fertilizer to crop needs is
important for both economic and environmental reasons. A recent
development in nitrogen fertilization is the use of plant measurements to
indicate plant nitrogen status. We present a theoretical treatment of this
practice. We assume that yield response to nitrogen dose can be described
using a random parameter model. The lack of precise knowledge of the
parameter values leads to calculated nitrogen doses that are not optimal.
The plant measurement allows one to calculate a conditional distribution
of the parameter values, which leads to improved calculated nitrogen
doses. We apply the treatment to a data set for wheat in northern France.
It is shown that the use of a plant measurement, compared with no
measurement, has only a minor effect on net profit, but achieves this with
less nitrogen and, in particular, reduces the probability of large
excesses of nitrogen beyond crop needs.
Journal: Journal of Applied Statistics
Pages: 1083-1099
Issue: 7
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000006757
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000006757
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:7:p:1083-1099
Template-Type: ReDIF-Article 1.0
Author-Name: Andre Khuri
Author-X-Name-First: Andre
Author-X-Name-Last: Khuri
Title: Graphical evaluation of the adequacy of the method of unweighted means
Abstract:
A graphical technique is introduced to assess the adequacy of the method
of unweighted means in providing approximate F-tests for an unbalanced
random model. These tests are similar to those obtained under a balanced
ANOVA. The proposed technique is simple and can easily be used to
determine the effects of imbalance and values of the variance components
on the adequacy of the approximation. The one-way and two-way random
models are used to illustrate the proposed methodology. Extensions to
higher-order models are also mentioned.
Journal: Journal of Applied Statistics
Pages: 1107-1119
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011193
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011193
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1107-1119
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Bourke
Author-X-Name-First: Patrick
Author-X-Name-Last: Bourke
Title: A continuous sampling plan using CUSUMs
Abstract:
A Continuous Sampling Plan, CSP-CUSUM, is proposed based on the use of
Cumulative Sums (CUSUMs) for deciding when to switch between the phases of
sampling inspection and 100% inspection. The Geometric CUSUM, also termed
the Run-length CUSUM, is chosen for this purpose, and two separate CUSUMs
are to be operated, one for each inspection phase. The conventional
measures of performance for CSPs such as average outgoing quality, average
fraction inspected, and average proportion passed under sampling
inspection are evaluated for CSP-CUSUM, and comparisons with some standard
CSPs are presented. An additional performance-measure, Average Cycle
Length, is proposed. A table is provided to aid the choice of parameters
for the operation of CSP-CUSUM. It is recommended that a Geometric CUSUM
control chart be maintained in parallel with CSP-CUSUM to detect
significant upward shifts in the incoming fraction defective.
Journal: Journal of Applied Statistics
Pages: 1121-1133
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011201
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011201
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1121-1133
Template-Type: ReDIF-Article 1.0
Author-Name: N. David Yanez
Author-X-Name-First: N. David
Author-X-Name-Last: Yanez
Author-Name: Richard Kronmal
Author-X-Name-First: Richard
Author-X-Name-Last: Kronmal
Author-Name: Jennifer Nelson
Author-X-Name-First: Jennifer
Author-X-Name-Last: Nelson
Author-Name: Todd Alonzo
Author-X-Name-First: Todd
Author-X-Name-Last: Alonzo
Title: Analysing change in clinical trials using quasi-likelihoods
Abstract:
In clinical trials, investigations focus upon whether a treatment affects
a measured outcome. Data often collected include pre- and post-treatment
measurements on each patient and an analysis of the change in the outcome
is typically performed to determine treatment efficacy. Absolute change
and relative change are frequently selected as the outcome. In selecting
from these two measures, the analyst makes implicit assumptions regarding
the mean and variance-mean relationship of the data. Some have provided ad
hoc guidelines for selecting between the two measures. We present a more
rigorous means of investigating change using quasi-likelihoods. We show
that both absolute change and relative change are special cases of the
specified quasi-likelihood model. A cystic fibrosis example is provided.
Journal: Journal of Applied Statistics
Pages: 1135-1145
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011210
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011210
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1135-1145
Template-Type: ReDIF-Article 1.0
Author-Name: S. P. Singh
Author-X-Name-First: S. P.
Author-X-Name-Last: Singh
Author-Name: A. K. Srivastava
Author-X-Name-First: A. K.
Author-X-Name-Last: Srivastava
Author-Name: B. V. S. Sisodia
Author-X-Name-First: B. V. S.
Author-X-Name-Last: Sisodia
Title: Evaluation of a synthetic method of estimation for small areas
Abstract:
A synthetic estimator is one of the simplest estimators for a small area,
and it has several variants. In this paper, a ratio-synthetic estimator is
proposed and compared with the existing synthetic estimator (Ghangurde &
Singh, 1977) and it is observed that the gain due to stratification in the
case of a synthetic estimator decreases in proportion to the domain coverage.
Journal: Journal of Applied Statistics
Pages: 1147-1151
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011229
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011229
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1147-1151
Template-Type: ReDIF-Article 1.0
Author-Name: Loki Natarajan
Author-X-Name-First: Loki
Author-X-Name-Last: Natarajan
Author-Name: John O'Quigley
Author-X-Name-First: John
Author-X-Name-Last: O'Quigley
Title: Predictive capability of stratified proportional hazards models
Abstract:
Following on from the work of O'Quigley & Flandre (1994) and, more
recently, O'Quigley & Xu (2000), we develop a measure, R2, of the
predictive ability of a stratified proportional hazards regression model.
The extension of this earlier work to the stratified case is relatively
straightforward, both conceptually and in its practical implementation.
The extension is nonetheless important in that the stratified model is
making weaker assumptions than the full multivariate model. Formulae are
given that can be readily incorporated into standard software routines,
since the component parts of the calculations are routinely provided by
most packages. We give examples on the predictability of survival in
breast cancer data, modelled via proportional hazards and stratified
proportional hazards models, the latter being necessary in view of the
effects of a non-proportional hazards nature.
Journal: Journal of Applied Statistics
Pages: 1153-1163
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011238
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011238
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1153-1163
Template-Type: ReDIF-Article 1.0
Author-Name: Samuel Kotz
Author-X-Name-First: Samuel
Author-X-Name-Last: Kotz
Author-Name: J. Rene van Dorp
Author-X-Name-First: J. Rene
Author-X-Name-Last: van Dorp
Title: A versatile bivariate distribution on a bounded domain: Another look at the product moment correlation
Abstract:
The Farlie-Gumbel-Morgenstern (FGM) family has been investigated in
detail for various continuous marginals such as Cauchy, normal,
exponential, gamma, Weibull, lognormal and others. It has been a popular
model for the bivariate distribution with mild dependence. However,
bivariate FGMs with continuous marginals on a bounded support discussed in
the literature are only those with uniform or power marginals. In this
paper we study the bivariate FGM family with marginals given by the
recently proposed two-sided power (TSP) distribution. Since this family of
bounded continuous distributions is very flexible, the properties of the
FGM family with TSP marginals could serve as an indication of the
structure of the FGM distribution with arbitrary marginals defined on a
compact set. A remarkable stability of the correlation between the
marginals has been observed.
Journal: Journal of Applied Statistics
Pages: 1165-1179
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011247
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011247
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1165-1179
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Author-Name: Lauren Fishbein
Author-X-Name-First: Lauren
Author-X-Name-Last: Fishbein
Author-Name: Patricia O'Brien
Author-X-Name-First: Patricia
Author-X-Name-Last: O'Brien
Author-Name: Peter Stacpoole
Author-X-Name-First: Peter
Author-X-Name-Last: Stacpoole
Title: Accounting for plasma levels below detection limits in a one-compartment zero-order absorption pharmacokinetics model
Abstract:
We provide a simple method for fitting a one-compartment, zero-order
absorption pharmacokinetics model in the presence of observations below
the detection limit. This method may be extended to more complex
pharmacokinetics models. We demonstrate, using a small simulation study,
that the method provides accurate parameter estimates over a range of
detection limits and we compare it to an ad hoc midpoint method. An
applied example is provided from a pharmacokinetic investigation of a
nicotine nasal spray.
Journal: Journal of Applied Statistics
Pages: 1181-1190
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011256
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011256
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1181-1190
Template-Type: ReDIF-Article 1.0
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Author-Name: Filidor Vilca Labra
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca Labra
Title: Influence diagnostics for the structural errors-in-variables model under the Student-t distribution
Abstract:
The influence of observations on the parameter estimates for the simple
structural errors-in-variables model with no equation error, under the
Student-t distribution, is investigated using the local influence
approach. The main conclusion is that the Student-t model with small
degrees of freedom is able to incorporate possible outliers and
influential observations in the data. The likelihood displacement approach
is useful for outlier detection, especially when a masking phenomenon is
present and the degrees of freedom parameter is large. The diagnostics are
illustrated with two examples.
Journal: Journal of Applied Statistics
Pages: 1191-1204
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011265
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011265
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1191-1204
Template-Type: ReDIF-Article 1.0
Author-Name: Herwig Friedl
Author-X-Name-First: Herwig
Author-X-Name-Last: Friedl
Author-Name: Erwin Stampfer
Author-X-Name-First: Erwin
Author-X-Name-Last: Stampfer
Title: Estimating general variable acceptance sampling plans by bootstrap methods
Abstract:
We consider variable acceptance sampling plans that control the lot or
process fraction defective, where a specification limit defines acceptable
quality. The problem is to find a sampling plan that fulfils some
conditions, usually on the operation characteristic. Its calculation
heavily depends on distributional properties that, in practice, might be
doubtful. If prior data are already available, we propose to estimate the
sampling plan by means of bootstrap methods. The bias and standard error
of the estimated plan can be assessed easily by Monte Carlo approximation
to the respective bootstrap moments. This resampling approach does not
require strong assumptions and, furthermore, is a flexible method that can
be extended to any statistic that might be informative for the fraction
defective in a lot.
Journal: Journal of Applied Statistics
Pages: 1205-1217
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011274
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011274
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1205-1217
Template-Type: ReDIF-Article 1.0
Author-Name: Sangit Chatterjee
Author-X-Name-First: Sangit
Author-X-Name-Last: Chatterjee
Author-Name: Frederick Wiseman
Author-X-Name-First: Frederick
Author-X-Name-Last: Wiseman
Author-Name: Robert Perez
Author-X-Name-First: Robert
Author-X-Name-Last: Perez
Title: Studying improved performance in golf
Abstract:
The study of improved performances by athletes in both team and
individual sports has shown that each sport has its own unique set of
characteristics, and these have to be analysed accordingly. This paper
presents an extensive analysis of the nature and extent of improvement in
golf by analysing the performances of the top players in the Masters
tournament throughout the entire history of the event. The results
indicate that golfers are obtaining lower scores over time and that the
variation of the scores has declined. Further, the distributions of scores
are symmetric and display a monotonic reduction of peakedness (kurtosis).
These findings are indicative of rapid and improved performance and
increased competition.
Journal: Journal of Applied Statistics
Pages: 1219-1227
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011283
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011283
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1219-1227
Template-Type: ReDIF-Article 1.0
Author-Name: Avner Bar-Hen
Author-X-Name-First: Avner
Author-X-Name-Last: Bar-Hen
Title: Influence of missing data on compact designs for spacing experiments
Abstract:
Density optimization of a plantation is a classical task with important
practical consequences. In this article, we present an adaptation of
criss-cross design and an alternative analysis. If a tree is missing, the
spacing of neighbouring trees is altered and considerable information is
lost. We derive the estimate of the missing value that minimizes the
residual sum of squares and obtain the analytical solution of the EM
algorithm. The relationships between the two techniques are clarified. The
method is applied to data from a plantation of Eucalyptus in the Congo.
Journal: Journal of Applied Statistics
Pages: 1229-1240
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011292
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011292
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1229-1240
Template-Type: ReDIF-Article 1.0
Author-Name: George Box
Author-X-Name-First: George
Author-X-Name-Last: Box
Author-Name: Alberto Luceno
Author-X-Name-First: Alberto
Author-X-Name-Last: Luceno
Title: Feedforward as a supplement to feedback adjustment in allowing for feedstock changes
Abstract:
Many industrial processes must be adjusted from time to time to maintain
their mean continuously close to the target value. Compensations for
deviations of the process mean from the target may be accomplished by
feedback and/or by feedforward adjustment. Feedback adjustments are made
in reaction to errors at the output; feedforward adjustments are made to
compensate anticipated changes. This article considers the complementary
use of feedback and feedforward adjustments to compensate for anticipated
step changes in the process mean as may be necessary in a manufacturing
process each time a new batch of feedstock material is introduced. We
consider and compare five alternative control schemes: (1) feedforward
adjustment alone, (2) feedback adjustment alone, (3) feedback-feedforward
adjustment, (4) feedback and indirect feedforward to increase the
sensitivity of the feedback scheme, and (5) feedback with both direct and
indirect feedforward.
Journal: Journal of Applied Statistics
Pages: 1241-1254
Issue: 8
Volume: 29
Year: 2002
X-DOI: 10.1080/0266476022000011300
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000011300
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:29:y:2002:i:8:p:1241-1254
Template-Type: ReDIF-Article 1.0
Author-Name: Reiko Aoki
Author-X-Name-First: Reiko
Author-X-Name-Last: Aoki
Author-Name: Jorge Achcar
Author-X-Name-First: Jorge
Author-X-Name-Last: Achcar
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Author-Name: Julio Singer
Author-X-Name-First: Julio
Author-X-Name-Last: Singer
Title: Bayesian analysis of null intercept errors-in-variables regression for pretest/post-test data
Abstract:
This article discusses a Bayesian analysis of repeated measures
pretest/post-test data under null intercept errors-in-variables
regression models. For illustration we consider an example in the field of
dentistry involving the comparison of two types of toothbrushes with
respect to the efficacy in removing dental plaque. The proposed Bayesian
approach accommodates the correlated measurements and incorporates the
restriction that the slopes must lie in the [0,1] interval, a feature not
considered in the analysis conducted by Singer & Andrade (1997). The
observed values of the (repeated) response and explanatory variables are
assumed to follow a multivariate Student-t distribution. A Gibbs sampler
is used to perform the computations.
Journal: Journal of Applied Statistics
Pages: 3-12
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018466
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:3-12
Template-Type: ReDIF-Article 1.0
Author-Name: Maria Carapeto
Author-X-Name-First: Maria
Author-X-Name-Last: Carapeto
Author-Name: William Holt
Author-X-Name-First: William
Author-X-Name-Last: Holt
Title: Testing for heteroscedasticity in regression models
Abstract:
A new test for heteroscedasticity in regression models is presented based
on the Goldfeld-Quandt methodology. Its appeal derives from the fact that
no further regressions are required, enabling widespread use across all
types of regression models. The distribution of the test is computed using
the Imhof method and its power is assessed by performing a Monte Carlo
simulation. We compare our results with those of Griffiths & Surekha
(1986) and show that our test is more powerful than the wide range of
tests they examined. We introduce an estimation procedure using a neural
network to correct the heteroscedastic disturbances.
Journal: Journal of Applied Statistics
Pages: 13-20
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018475
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018475
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:13-20
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Panda
Author-X-Name-First: D. K.
Author-X-Name-Last: Panda
Author-Name: Rajender Parsad
Author-X-Name-First: Rajender
Author-X-Name-Last: Parsad
Author-Name: V. K. Sharma
Author-X-Name-First: V. K.
Author-X-Name-Last: Sharma
Title: Robustness of complete diallel crossing plans against exchange of one cross
Abstract:
The robustness aspects of block designs for complete diallel crossing
plans against the exchange of one cross using connectedness and efficiency
criteria have been investigated. The exchanged cross may have either no
line in common or one line in common with the original cross. It has been
found that randomized complete block (RCB) designs for complete diallel
crosses and binary balanced block designs for complete diallel crosses are
robust against the exchange of one cross in one observation. The RCB
designs for diallel crosses have been shown to be robust against the
exchange of one cross with another cross in all the blocks. The non-binary
balanced block designs obtainable from Family 5 of Das et al. (1998) have
also been found to be robust against the exchange of one cross.
Journal: Journal of Applied Statistics
Pages: 21-35
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018484
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018484
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:21-35
Template-Type: ReDIF-Article 1.0
Author-Name: José Eduardo Corrente
Author-X-Name-First: José Eduardo
Author-X-Name-Last: Corrente
Author-Name: Liciana Chalita
Author-X-Name-First: Liciana
Author-X-Name-Last: Chalita
Author-Name: Jeanete Alves Moreira
Author-X-Name-First: Jeanete Alves
Author-X-Name-Last: Moreira
Title: Choosing between Cox proportional hazards and logistic models for interval-censored data via bootstrap
Abstract:
This work develops a new methodology in order to discriminate models for
interval-censored data based on bootstrap residual simulation by
observing the deviance difference from one model in relation to another,
according to Hinde (1992). Generally, this sort of data can generate a
large number of tied observations and, in this case, survival time can be
regarded as discrete. Therefore, the Cox proportional hazards model for
grouped data (Prentice & Gloeckler, 1978) and the logistic model (Lawless,
1982) can be fitted by means of generalized linear models. Whitehead
(1989) considered censoring to be an indicative variable with a binomial
distribution and fitted the Cox proportional hazards model using
complementary log-log as a link function. In addition, a logistic model
can be fitted using logit as a link function. The proposed methodology
arises as an alternative to the score tests developed by Colosimo et al.
(2000), where such models can be obtained for discrete binary data as
particular cases of the Aranda-Ordaz asymmetric family of distributions.
These tests are developed on the basis of the link functions used to
generate such fits. The example that motivates this study was a dataset from an
experiment carried out on a flax cultivar planted on four substrata
susceptible to the pathogen Fusarium oxysporum. The response variable,
which is the time until blighting, was observed in intervals during 52
days. The results were compared with the model fit and the AIC values.
Journal: Journal of Applied Statistics
Pages: 37-47
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018493
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018493
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:37-47
Template-Type: ReDIF-Article 1.0
Author-Name: Penelope Vounatsou
Author-X-Name-First: Penelope
Author-X-Name-Last: Vounatsou
Author-Name: Tom Smith
Author-X-Name-First: Tom
Author-X-Name-Last: Smith
Author-Name: Alan Gelfand
Author-X-Name-First: Alan
Author-X-Name-Last: Gelfand
Title: Spatial modelling of gene frequencies in the presence of undetectable alleles
Abstract:
Bayesian hierarchical models are developed to estimate the frequencies of
the alleles at the HLA-C locus in the presence of non-identifiable alleles
and possible spatial correlations in a large but sparse, spatially defined
database from Papua New Guinea. Bayesian model selection methods are
applied to investigate the effects of altitude and language on the genetic
diversity of HLA-C alleles. The general model includes fixed altitudinal
effects, random language effects and random spatially structured location
effects. Conditional autoregressive priors are used to incorporate the
geographical structure of the map, and Markov chain Monte Carlo simulation
methods are applied for estimation and inference. The results show that
HLA-C allele frequencies are explained more by linguistic than altitudinal
differences, indicating that genetic diversity at this locus in Papua New
Guinea probably tracks population movements and is less influenced by
natural selection than is variation at HLA-A and HLA-B.
Journal: Journal of Applied Statistics
Pages: 49-62
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018501
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018501
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:49-62
Template-Type: ReDIF-Article 1.0
Author-Name: Dimitris Karlis
Author-X-Name-First: Dimitris
Author-X-Name-Last: Karlis
Title: An EM algorithm for multivariate Poisson distribution and related models
Abstract:
Multivariate extensions of the Poisson distribution are plausible models
for multivariate discrete data. The lack of estimation and inferential
procedures reduces the applicability of such models. In this paper, an EM
algorithm for Maximum Likelihood estimation of the parameters of the
Multivariate Poisson distribution is described. The algorithm is based on
the multivariate reduction technique that generates the Multivariate
Poisson distribution. Illustrative examples are also provided. Extension
to other models, generated via multivariate reduction, is discussed.
Journal: Journal of Applied Statistics
Pages: 63-77
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018510
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018510
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:63-77
Template-Type: ReDIF-Article 1.0
Author-Name: James Reed
Author-X-Name-First: James
Author-X-Name-Last: Reed
Title: Adjusting for bias in randomized cluster trials
Abstract:
The randomized cluster design is typical in studies where the unit of
randomization is a cluster of individuals rather than the individual.
Evaluating various intervention strategies across medical care providers
at either an institutional level or at a physician group practice level
fits the randomized cluster model. Clearly, the analytical approach to
such studies must take the unit of randomization and accompanying
intraclass correlation into consideration. We review alternative methods
to the typical Pearson's chi-square analysis and illustrate these
alternatives. We have written and tested a Fortran program that produces
the statistics outlined in this paper. The program, in an executable
format, is available from the author on request.
Journal: Journal of Applied Statistics
Pages: 79-85
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018529
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018529
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:79-85
Template-Type: ReDIF-Article 1.0
Author-Name: Håkon Tjelmeland
Author-X-Name-First: Håkon
Author-X-Name-Last: Tjelmeland
Author-Name: Kjetill Vassmo Lund
Author-X-Name-First: Kjetill Vassmo
Author-X-Name-Last: Lund
Title: Bayesian modelling of spatial compositional data
Abstract:
Compositional data are vectors of proportions, specifying fractions of a
whole. Aitchison (1986) defines logistic normal distributions for
compositional data by applying a logistic transformation and assuming the
transformed data to be multivariate normally distributed. In this paper we
generalize this idea to spatially varying logistic data and thereby define
logistic Gaussian fields. We consider the model in a Bayesian framework
and discuss appropriate prior distributions. We consider both complete
observations and observations of subcompositions or individual
proportions, and discuss the resulting posterior distributions. In
general, the posterior cannot be analytically handled, but the Gaussian
base of the model allows us to define efficient Markov chain Monte Carlo
algorithms. We use the model to analyse a data set of sediments in an
Arctic lake. These data have previously been considered, but without
taking the spatial aspect into account.
Journal: Journal of Applied Statistics
Pages: 87-100
Issue: 1
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000018547
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000018547
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:1:p:87-100
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Illert
Author-X-Name-First: Christopher
Author-X-Name-Last: Illert
Title: Lexigenesis in ancestral south-east Australian Aboriginal language
Abstract:
The 1/x frequency distribution is known to researchers ranging from
economists and biologists to electronic engineers. It is known to
linguists as Zipf's Law (Zipf, 1949) and has recently been shown not to be
a consequence of the Central Limit Theorem (Troll & Graben, 1998)--leaving
an 'unsolved problem' in information theory (Jones, 1999). This 1/x
distribution, associated with scale-invariant physical systems (Machlup &
Hoshiko, 1980), is a special case of the general power law x^λ arising
from the Lagrangian L(x, Ḟ(x)) = ½ x^(1-λ) Ḟ²(x) and, as λ need not be an integer,
some related research understandably involves fractals (Allison et al. ,
2001). The present paper generalizes this Lagrangian to include a van der
Waals effect. It is argued that ancestral Aboriginal language consisted of
root-morphemes that were built up into, and often condensed within,
subsequent words or lexemes. Using discrete-optimization techniques
pioneered elsewhere (Illert, 1987; Reverberi, 1985), and the new
morpho-statistics, this paper models lexeme-condensation in ancestral
south-east Australian Aboriginal language.
Journal: Journal of Applied Statistics
Pages: 113-143
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023703
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023703
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:113-143
Template-Type: ReDIF-Article 1.0
Author-Name: Kaushik Ghosh
Author-X-Name-First: Kaushik
Author-X-Name-Last: Ghosh
Author-Name: Rao Jammalamadaka
Author-X-Name-First: Rao
Author-X-Name-Last: Jammalamadaka
Author-Name: Ram Tiwari
Author-X-Name-First: Ram
Author-X-Name-Last: Tiwari
Title: Semiparametric Bayesian Techniques for Problems in Circular Data
Abstract:
In this paper, we consider the problems of prediction and tests of
hypotheses for directional data in a semiparametric Bayesian set-up.
Observations are assumed to be independently drawn from the von Mises
distribution and uncertainty in the location parameter is modelled by a
Dirichlet process. For the prediction problem, we present a method to
obtain the predictive density of a future observation, and, for the
testing problem, we present a method of computing the Bayes factor by
obtaining the posterior probabilities of the hypotheses under
consideration. The semiparametric model is seen to be flexible and robust
against prior misspecifications. While analytical expressions are
intractable, the methods are easily implemented using the Gibbs sampler.
We illustrate the methods with data from two real-life examples.
Journal: Journal of Applied Statistics
Pages: 145-161
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023712
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023712
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:145-161
Template-Type: ReDIF-Article 1.0
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Author-Name: Chun-Tao Chang
Author-X-Name-First: Chun-Tao
Author-X-Name-Last: Chang
Title: Inference in the Pareto distribution based on progressive Type II censoring with random removals
Abstract:
This study considers the estimation problem for the Pareto distribution
based on progressive Type II censoring with random removals. The number of
units removed at each failure time has a discrete uniform distribution. We
use the maximum likelihood method to obtain the estimator of the
parameter. The expectation and variance of the maximum likelihood
estimator will be derived. The expected time required to complete such an
experiment will be computed. Some numerical results for expected test times
are presented for this type of progressive censoring and other sampling
schemes.
Journal: Journal of Applied Statistics
Pages: 163-172
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023721
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023721
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:163-172
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Wan
Author-X-Name-First: Alan
Author-X-Name-Last: Wan
Author-Name: Anoop Chaturvedi
Author-X-Name-First: Anoop
Author-X-Name-Last: Chaturvedi
Author-Name: Guohua Zou
Author-X-Name-First: Guohua
Author-X-Name-Last: Zou
Title: Unbiased estimation of the MSE matrices of improved estimators in linear regression
Abstract:
Stein-rule and other improved estimators have scarcely been used in
empirical work. One major reason is that it is not easy to obtain
precision measures for these estimators. In this paper, we derive unbiased
estimators for both the mean squared error (MSE) and the scaled MSE
matrices of a class of Stein-type estimators. Our derivation provides the
basis for measuring the estimators' precision and constructing confidence
bands. Comparisons are made between these MSE estimators and the least
squares covariance estimator. For illustration, the methodology is applied
to data on energy consumption.
Journal: Journal of Applied Statistics
Pages: 173-189
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023730
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023730
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:173-189
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Author-Name: Jung Whan Choi
Author-X-Name-First: Jung Whan
Author-X-Name-Last: Choi
Author-Name: Dai-Gyoung Kim
Author-X-Name-First: Dai-Gyoung
Author-X-Name-Last: Kim
Title: Bayesian inference and model selection in latent class logit models with parameter constraints: An application to market segmentation
Abstract:
Latent class models have recently drawn considerable attention among many
researchers and practitioners as a class of useful tools for capturing
heterogeneity across different segments in a target market or population.
In this paper, we consider a latent class logit model with parameter
constraints and deal with two important issues in the latent class
models--parameter estimation and selection of an appropriate number of
classes--within a Bayesian framework. A simple Gibbs sampling algorithm is
proposed for sample generation from the posterior distribution of unknown
parameters. Using the Gibbs output, we propose a method for determining an
appropriate number of latent classes. A real-world marketing example
as an application for market segmentation is provided to illustrate the
proposed method.
Journal: Journal of Applied Statistics
Pages: 191-204
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023749
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023749
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:191-204
Template-Type: ReDIF-Article 1.0
Author-Name: Yuzhi Cai
Author-X-Name-First: Yuzhi
Author-X-Name-Last: Cai
Author-Name: Neville Davies
Author-X-Name-First: Neville
Author-X-Name-Last: Davies
Title: A simple diagnostic method of outlier detection for stationary Gaussian time series
Abstract:
In this paper we present a 'model-free' method of outlier detection
for Gaussian time series by using the autocorrelation structure of the
time series. We also present a graphic diagnostic method in order to
distinguish an additive outlier (AO) from an innovation outlier (IO). The
test statistic for detecting the outlier has a χ² distribution with
one degree of freedom. We show that this method works well when the time
series contains either one type of outlier or both additive and
innovation outliers, and it has the advantage that no time
series model needs to be estimated from the data. Simulation evidence
shows that different types of outliers can be graphically distinguished by
using the techniques proposed.
Journal: Journal of Applied Statistics
Pages: 205-223
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023758
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023758
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:205-223
Template-Type: ReDIF-Article 1.0
Author-Name: Kepher Makambi
Author-X-Name-First: Kepher
Author-X-Name-Last: Makambi
Title: Weighted inverse chi-square method for correlated significance tests
Abstract:
Fisher's inverse chi-square method for combining independent significance
tests is extended to cover cases of dependence among the individual tests.
A weighted version of the method and its approximate null distribution are
presented. To illustrate the use of the proposed method, two tests for the
overall treatment efficacy are combined, with the resulting test procedure
exhibiting good control of the type I error probability. Two examples from
clinical trials are given to illustrate the applicability of the
procedures to real-life situations.
Journal: Journal of Applied Statistics
Pages: 225-234
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023767
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023767
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:225-234
Template-Type: ReDIF-Article 1.0
Author-Name: Didier Renard
Author-X-Name-First: Didier
Author-X-Name-Last: Renard
Author-Name: Helena Geys
Author-X-Name-First: Helena
Author-X-Name-Last: Geys
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Tomasz Burzykowski
Author-X-Name-First: Tomasz
Author-X-Name-Last: Burzykowski
Author-Name: Marc Buyse
Author-X-Name-First: Marc
Author-X-Name-Last: Buyse
Author-Name: Tony Vangeneugden
Author-X-Name-First: Tony
Author-X-Name-Last: Vangeneugden
Author-Name: Luc Bijnens
Author-X-Name-First: Luc
Author-X-Name-Last: Bijnens
Title: Validation of a longitudinally measured surrogate marker for a time-to-event endpoint
Abstract:
The objective of this paper is to extend the surrogate endpoint
validation methodology proposed by Buyse et al. (2000) to the case of a
longitudinally measured surrogate marker when the endpoint of interest is
time to some key clinical event. A joint model for longitudinal and event
time data is required. To this end, the model formulation of Henderson et
al. (2000) is adopted. The methodology is applied to a set of two
randomized clinical trials in advanced prostate cancer to evaluate the
usefulness of prostate-specific antigen (PSA) level as a surrogate for
survival.
Journal: Journal of Applied Statistics
Pages: 235-247
Issue: 2
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000023776
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000023776
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:2:p:235-247
Template-Type: ReDIF-Article 1.0
Author-Name: Gang Zheng
Author-X-Name-First: Gang
Author-X-Name-Last: Zheng
Author-Name: Mohammad Al-Saleh
Author-X-Name-First: Mohammad
Author-X-Name-Last: Al-Saleh
Title: Improving the best linear unbiased estimator for the scale parameter of symmetric distributions by using the absolute value of ranked set samples
Abstract:
Ranked set sampling is a cost efficient sampling technique when actually
measuring sampling units is difficult but ranking them is relatively easy.
For a family of symmetric location-scale distributions with known location
parameter, we consider a best linear unbiased estimator for the scale
parameter. Instead of using original ranked set samples, we propose to use
the absolute deviations of the ranked set samples from the location
parameter. We demonstrate that this new estimator has smaller variance
than the best linear unbiased estimator using original ranked set samples.
Optimal allocation in the absolute value of ranked set samples is also
discussed for the estimation of the scale parameter when the location
parameter is known. Finally, we perform some sensitivity analyses for this
new estimator when the location parameter is unknown but estimated using
ranked set samples and when the ranking of sampling units is imperfect.
Journal: Journal of Applied Statistics
Pages: 253-265
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030039
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030039
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:253-265
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Cook
Author-X-Name-First: Steven
Author-X-Name-Last: Cook
Title: The stylized approach to unit root testing: Neglected contributions and the cost of simplicity
Abstract:
Following Dickey & Fuller (1979) (DF), a stylized approach to the testing
of the unit root hypothesis has emerged. Based upon the combined use of
the DF test in its augmented t -ratio form and MacKinnon (1991) critical
values, the approach has received widespread adoption due to the ease with
which it can be applied. In this paper a number of departures from this
'stylized approach', which do not significantly reduce its ease of
application, are examined. The results obtained from an empirical
application to UK industrial production and Monte Carlo experimentation
have clear methodological implications, showing that routine application
of the stylized approach can lead to misleading inferences concerning the
integrated nature of economic time series.
Journal: Journal of Applied Statistics
Pages: 267-272
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030048
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030048
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:267-272
Template-Type: ReDIF-Article 1.0
Author-Name: Tirthankar Dasgupta
Author-X-Name-First: Tirthankar
Author-X-Name-Last: Dasgupta
Title: An economic inspection interval for control of defective items in a hot rolling mill
Abstract:
The article addresses a real-life problem on determining the optimum
sampling interval for control of defective items in a hot rolling mill.
Having observed that the pattern of appearance of mill defects indicates a
geometric process failure mechanism, an economic model is developed in
line with the method suggested by Taguchi and critically examined by
Nayebpour & Woodall. An expression for the expected loss per product as a
function of the sampling interval is derived and the optimum interval is
obtained by minimizing this loss function. The practical issues involved
in this exercise, such as estimation of various cost components, are also
discussed and the effect of erroneous estimation of cost components is
studied through a sensitivity analysis.
Journal: Journal of Applied Statistics
Pages: 273-282
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030057
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030057
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:273-282
Template-Type: ReDIF-Article 1.0
Author-Name: Jacobo De Uña-Álvarez
Author-X-Name-First: Jacobo
Author-X-Name-Last: De Uña-Álvarez
Author-Name: M. Soledad Otero-Giráldez
Author-X-Name-First: M. Soledad
Author-X-Name-Last: Otero-Giráldez
Author-Name: Gema Álvarez-Llorente
Author-X-Name-First: Gema
Author-X-Name-Last: Álvarez-Llorente
Title: Estimation under length-bias and right-censoring: An application to unemployment duration analysis for married women
Abstract:
In this work we analyse unemployment duration for married women in Spain,
using the Labour Force Survey (LFS) of the Spanish Institute for
Statistics, 1987-1997. Consistent non-parametric estimation of the
unemployment survival function is provided. Since the available data are
length-biased, a suitable correction of the Kaplan-Meier product-limit
estimator is motivated and used for this analysis. The accuracy of
parametric models is checked by means of goodness-of-fit plots--a
graphical tool that requires preliminary estimation of the survival
function. Structural features of the associated hazard (such as
monotonicity and unimodality) are explored.
Journal: Journal of Applied Statistics
Pages: 283-291
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030066
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030066
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:283-291
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Gustafson
Author-X-Name-First: Paul
Author-X-Name-Last: Gustafson
Author-Name: Lawrence Walker
Author-X-Name-First: Lawrence
Author-X-Name-Last: Walker
Title: An extension of the Dirichlet prior for the analysis of longitudinal multinomial data
Abstract:
Studies producing longitudinal multinomial data arise in several subject
areas. This article suggests a Bayesian approach to the analysis of such
data. Rather than infusing a latent model structure, we develop a prior
distribution for the multinomial parameters which reflects the
longitudinal nature of the observations. This distribution is constructed
by modifying the prior that posits independent Dirichlet distributions for
the multinomial parameters across time. Posterior analysis, which is
implemented using Monte Carlo methods, can then be used to assess the
temporal behaviour of the multinomial parameters underlying the observed
data. We test this methodology on simulated data, opinion polling data,
and data from a study concerning the development of moral reasoning.
Journal: Journal of Applied Statistics
Pages: 293-310
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030075
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030075
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:293-310
Template-Type: ReDIF-Article 1.0
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Author-Name: Samuel Kotz
Author-X-Name-First: Samuel
Author-X-Name-Last: Kotz
Title: Moments of some J-shaped distributions
Abstract:
This paper concerns a family of univariate distributions suggested by
Topp & Leone in 1955. Topp & Leone provided no motivation for this new
family and by way of properties they derived only the first four
integer-order moments, i.e. E(X^n) for n = 1, 2, 3, 4. In this paper
we provide a motivation for the family of distributions and derive
explicit algebraic expressions for: (1) the hazard rate function; (2) E(X^n)
when n ≥ 1 is any integer; (3) E(X^n) for n = 1, 2, …, 10; and (4)
E[{X-E(X)}^n] for n = 2, 3, 4. We also give an expression
for the characteristic function and discuss issues on estimation and
simulation. The main calculations of this paper use properties of the
Gauss hypergeometric function.
Journal: Journal of Applied Statistics
Pages: 311-317
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030084
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030084
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:311-317
Template-Type: ReDIF-Article 1.0
Author-Name: Camino González
Author-X-Name-First: Camino
Author-X-Name-Last: González
Author-Name: Gabriel Palomo
Author-X-Name-First: Gabriel
Author-X-Name-Last: Palomo
Title: Bayesian acceptance sampling plans following economic criteria: An application to paper pulp manufacturing
Abstract:
The main purposes of this paper are to derive Bayesian acceptance
sampling plans regarding the number of defects per unit of product, and to
illustrate how to apply the methodology to the paper pulp industry. The
sampling plans are obtained following an economic criterion: minimize the
expected total cost of quality. It has been assumed that the number of
defects per unit of product follows a Poisson distribution with process
average λ, whose prior information is described either by a gamma or by
a non-informative distribution. The expected total cost of quality is
composed of three independent components: inspection, acceptance and
rejection. Both quadratic and step-loss functions have been used to
quantify the cost incurred for the acceptance of a lot containing units
with defects. Combining the prior information on λ with the loss
functions, four different sampling plans are obtained. When the
quadratic-loss function is used, an analytical relation between the
optimum settings of the sample size and the acceptance number is derived.
The robustness analysis indicates that the sampling plans obtained are
robust with respect to the prior distribution of the process average as
well as to the misspecification of its mean and variance.
Journal: Journal of Applied Statistics
Pages: 319-333
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030093
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030093
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:319-333
Template-Type: ReDIF-Article 1.0
Author-Name: Sueli Mingoti
Author-X-Name-First: Sueli
Author-X-Name-Last: Mingoti
Author-Name: Otaviano Neves
Author-X-Name-First: Otaviano
Author-X-Name-Last: Neves
Title: A note on the Zhang omnibus test for normality based on the Q statistic
Abstract:
A discussion about the estimators proposed by Zhang (1999) for the true
standard deviation σ of a normal distribution is presented. Those
estimators, called by Zhang q1 and q2, are functions of the expected
values of the order statistics from a standard normal distribution, and
they were the basis of the Q statistic used in the derivation of a new
test for normality proposed by Zhang. Although the type I error and the
power of the test were discussed by Zhang, no study was performed to test
the reliability of q1 and q2 as estimators of σ. In this paper, it is
shown that q1 is a very poor estimator for σ, especially when σ is large.
On the other hand, the estimator q2 has a performance very similar to the
well-known sample standard deviation S. When some correlation is
introduced among the sample units, it can be seen that the estimator q1 is
much more affected than the estimators q2 and S.
Journal: Journal of Applied Statistics
Pages: 335-341
Issue: 3
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476022000030101
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476022000030101
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:3:p:335-341
Template-Type: ReDIF-Article 1.0
Author-Name: Yoshinori Kawasaki
Author-X-Name-First: Yoshinori
Author-X-Name-Last: Kawasaki
Author-Name: Philip Hans Franses
Author-X-Name-First: Philip Hans
Author-X-Name-Last: Franses
Title: Detecting seasonal unit roots in a structural time series model
Abstract:
In this paper, we propose to detect seasonal unit roots within the
context of a structural time series model. Such a model is often found to
be useful in practice. Using Monte Carlo simulations, we show that our
method works well. We illustrate our approach for several quarterly
macroeconomic time series variables.
Journal: Journal of Applied Statistics
Pages: 373-387
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035412
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035412
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:373-387
Template-Type: ReDIF-Article 1.0
Author-Name: Thaddeus Tarpey
Author-X-Name-First: Thaddeus
Author-X-Name-Last: Tarpey
Title: Estimating the average slope
Abstract:
The slope is usually the parameter of primary importance in a simple
linear regression. If the straight line model gives a poor fit to the
data, one can consider the average slope of the non-linear response. In
this paper, we show that if the response is quadratic, then the average
slope can be obtained by simply using the slope from a straight line fit.
In fact, if the slope of the best fitting line to a smooth non-linear
function equals the average slope of the function over an arbitrary
interval, then the function must be quadratic. This paper illustrates the
case where intentionally fitting a wrong model (in this case, a straight
line) gives the correct result (the average slope). The example which
motivated this study is used to illustrate the results.
Journal: Journal of Applied Statistics
Pages: 389-395
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035412a
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035412a
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:389-395
Template-Type: ReDIF-Article 1.0
Author-Name: Liming Xiang
Author-X-Name-First: Liming
Author-X-Name-Last: Xiang
Author-Name: Andy Lee
Author-X-Name-First: Andy
Author-X-Name-Last: Lee
Author-Name: Siu-Keung Tse
Author-X-Name-First: Siu-Keung
Author-X-Name-Last: Tse
Title: Assessing local cluster influence in generalized linear mixed models
Abstract:
This paper investigates local influence measures for assessing cluster
influence in generalized linear mixed models. Several cluster-specific
perturbation schemes are considered. The proposed local influence
diagnostics are applied to analyse maternity length of inpatient stay data
where individual observations are nested within hospitals and the relative
performance of hospitals is of interest.
Journal: Journal of Applied Statistics
Pages: 349-359
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035395
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035395
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:349-359
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Barrie
Author-X-Name-First: Patrick
Author-X-Name-Last: Barrie
Title: A new sports ratings system: The tiddlywinks world ratings
Abstract:
A system for calculating relative playing strengths of tiddlywinks
players is described. The method can also be used for other sports. It is
specifically designed to handle cases where the number of games played in
a season varies greatly between players, and thus the confidence that one
can have in an assigned rating also varies greatly between players. In
addition, the method is designed to handle situations in which some games
in the tournament are played as individuals ('singles'), while others
are played with a partner ('pairs'). These factors make some
statistical treatments, such as the Elo rating system used in
chess, difficult to apply. The new method characterizes each player's
ability by a numerical rating together with an associated uncertainty in
that player's rating. After each tournament, a 'tournament rating' is
calculated for each player based on how many points the player achieved
and the relative strength of partner(s) and opponent(s). Statistical
analysis is then used to estimate the likely error in the calculated
tournament rating. Both the tournament rating and its estimated error are
used in the calculation of new ratings. The method has been applied to
calculate tiddlywinks world ratings based on over 13,000 national
tournament games in Britain and the USA going back to 1985.
Journal: Journal of Applied Statistics
Pages: 361-372
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035403
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035403
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:361-372
Template-Type: ReDIF-Article 1.0
Author-Name: Hajaj Al-Oraini
Author-X-Name-First: Hajaj
Author-X-Name-Last: Al-Oraini
Author-Name: M. A. Rahim
Author-X-Name-First: M. A.
Author-X-Name-Last: Rahim
Title: Economic statistical design of x-bar control charts for systems with gamma(λ, 2) in-control times
Abstract:
In this paper, the gamma(λ, 2) distribution is considered as a failure
model for the economic statistical design of x-bar control charts. The
study shows that the statistical performance of control charts can be
improved significantly, with only a slight increase in the cost, by adding
constraints to the optimization problem. The use of an economic
statistical design instead of an economic design results in control charts
that may be less expensive to implement, that have lower false alarm
rates, and that have a higher probability of detecting process shifts.
Numerical examples are presented to support this proposition. The results
of economic statistical design are compared with those of a pure economic
design. The effects of adding constraints for statistical performance
measures, such as Type I error rate and the power of the chart, are
extensively investigated.
Journal: Journal of Applied Statistics
Pages: 397-409
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035430
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035430
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:397-409
Template-Type: ReDIF-Article 1.0
Author-Name: Getachew Asfaw Dagne
Author-X-Name-First: Getachew Asfaw
Author-X-Name-Last: Dagne
Title: The use of power transformations in small area estimation
Abstract:
Sample surveys are usually designed and analysed to produce estimates for
larger areas. Nevertheless, sample sizes are often not large enough to
give adequate precision for small area estimates of interest. To overcome
such difficulties, borrowing strength from related small areas via
modelling becomes essential. In line with this, we propose components of
variance models with power transformations for small area estimation. This
paper reports the results of a study aimed at incorporating the power
transformation in small area estimation for improving the quality of small
area predictions. The proposed methods are demonstrated on satellite data
in conjunction with survey data to estimate mean acreage under a specified
crop for counties in Iowa.
Journal: Journal of Applied Statistics
Pages: 411-423
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035449
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:411-423
Template-Type: ReDIF-Article 1.0
Author-Name: José Eduardo Corrente
Author-X-Name-First: José Eduardo
Author-X-Name-Last: Corrente
Author-Name: Maria Del Pilar Díaz
Author-X-Name-First: Maria Del Pilar
Author-X-Name-Last: Díaz
Title: Ordinal models and generalized estimating equations to evaluate disease severity
Abstract:
Many assays have been carried out in Capsicum spp. in order to evaluate
its resistance to Phytophthora capsici , which causes blight and
considerable yield loss. An assay aiming at the selection of resistant
pepper and bell pepper genotypes to P. capsici was jointly performed in
the laboratory of the Phytopathological Clinic of Entomology,
Phytopathology and Agricultural Zoology and in the experimental area of
the Plant Production Department, both located at ESALQ, University of Sao
Paulo, Brazil. The data set for this assay comes from ordinal categorized
random variables, whose analysis does not generally take into account the
ordinal nature of the responses, but instead, builds indexes, among other
measures, in order to evaluate the resistance of the studied genotypes.
This work presents ordinal generalized linear model fits in order to
evaluate blight severity, as well as to identify and select new sources
of resistance to the pathogen. It also analyses the estimating equations proposed by Liang &
Zeger (1986a, b) in order to obtain an infection pattern for the disease.
From the fit of the cumulative logit models, valuable genotypes are
identified for genetic breeding programs.
Journal: Journal of Applied Statistics
Pages: 425-439
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035458
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035458
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:425-439
Template-Type: ReDIF-Article 1.0
Author-Name: Shakir Hussain
Author-X-Name-First: Shakir
Author-X-Name-Last: Hussain
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: Testing for autocorrelation in non-stationary dynamic systems of equations
Abstract:
Using Monte Carlo methods, the properties of systemwise generalizations
of the Breusch-Godfrey test for autocorrelated errors are studied in
integrated cointegrated systems of equations. Our analysis of the size of
the test reveals that the corrected LR tests perform satisfactorily even
in cases when the exogenous variables follow a unit root process, whilst
the commonly used TR2 test behaves badly even in single equations. All
tests perform badly, however, when the number of
equations increases and the exogenous variables are highly autocorrelated.
Journal: Journal of Applied Statistics
Pages: 441-454
Issue: 4
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000035467
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000035467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:4:p:441-454
Template-Type: ReDIF-Article 1.0
Author-Name: Steen Magnussen
Author-X-Name-First: Steen
Author-X-Name-Last: Magnussen
Title: Stepwise estimators for three-phase sampling of categorical variables
Abstract:
Three-phase sampling can be a very effective design for the estimation of
regional and national forest cover type frequencies. Simultaneous
estimation of frequencies and sampling variances requires estimation of a
large number of parameters; often so many that the consistency and robustness
of results become an issue. A new stepwise estimation model, in which
bias in phase one and two is corrected sequentially instead of
simultaneously, requires fewer parameters. Simulated three-phase sampling
tested the new model with 144 settings of sample sizes, the number of
classes and classification accuracy. Relative mean absolute deviations and
root mean square errors were, in most cases, about 8% lower with the
stepwise method than with a simultaneous approach. Differences were a
function of design parameters. Average expected relative root mean square
errors, derived from the assumption of a Dirichlet distribution of
cover-type frequencies, tracked the empirical root mean square errors
obtained from repeated sampling to within ±10%. Resampling results indicate
that the relative bias of the most frequent cover types was slightly
inflated by the stepwise method. For the least common cover type, the
simultaneous method produced the largest relative bias.
Journal: Journal of Applied Statistics
Pages: 461-475
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053628
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053628
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:461-475
Template-Type: ReDIF-Article 1.0
Author-Name: Youngjo Lee
Author-X-Name-First: Youngjo
Author-X-Name-Last: Lee
Author-Name: John Nelder
Author-X-Name-First: John
Author-X-Name-Last: Nelder
Title: False parsimony and its detection with GLMs
Abstract:
A search for a good parsimonious model is often required in data
analysis. However, unfortunately we may end up with a falsely parsimonious
model. Misspecification of the variance structure causes a loss of
efficiency in regression estimation and this can lead to large
standard-error estimates, producing possibly false parsimony. With
generalized linear models (GLMs) we can keep the link function fixed while
changing the variance function, thus allowing us to recognize false
parsimony caused by such increased standard errors. With data
transformation, any change of transformation automatically changes the
scale for additivity, making false parsimony hard to recognize.
Journal: Journal of Applied Statistics
Pages: 477-483
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053637
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053637
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:477-483
Template-Type: ReDIF-Article 1.0
Author-Name: Sifa Mvoi
Author-X-Name-First: Sifa
Author-X-Name-Last: Mvoi
Author-Name: Yan-Xia Lin
Author-X-Name-First: Yan-Xia
Author-X-Name-Last: Lin
Title: Analysis of experiments using the asymptotic quasi-likelihood approach
Abstract:
Comparison of treatment effects in an experiment is usually done through
analysis of variance under the assumption that the errors are normally and
independently distributed with zero mean and constant variance. The
traditional approach in dealing with non-constant variance is to apply a
variance stabilizing transformation and then run the analysis on the
transformed data. In this approach, the conclusions of analysis of
variance apply only to the transformed population. In this paper, the
asymptotic quasi-likelihood method is introduced to the analysis of
experimental designs. The weak assumptions of the asymptotic
quasi-likelihood method make it possible to draw conclusions on
heterogeneous populations without transforming them. This paper
demonstrates how to apply the asymptotic quasi-likelihood technique to
three commonly used models. This gives a possible way to analyse data
given a complex experimental design.
Journal: Journal of Applied Statistics
Pages: 485-505
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053646
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053646
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:485-505
Template-Type: ReDIF-Article 1.0
Author-Name: Arthur Yeh
Author-X-Name-First: Arthur
Author-X-Name-Last: Yeh
Author-Name: Dennis Lin
Author-X-Name-First: Dennis
Author-X-Name-Last: Lin
Author-Name: Honghong Zhou
Author-X-Name-First: Honghong
Author-X-Name-Last: Zhou
Author-Name: Chandramouliswaran Venkataramani
Author-X-Name-First: Chandramouliswaran
Author-X-Name-Last: Venkataramani
Title: A multivariate exponentially weighted moving average control chart for monitoring process variability
Abstract:
This paper introduces a new multivariate exponentially weighted moving
average (EWMA) control chart. The proposed control chart, called an EWMA
V-chart, is designed to detect small changes in the variability of
correlated multivariate quality characteristics. Through examples and
simulations, it is demonstrated that the EWMA V-chart is superior to the
|S|-chart in detecting small changes in process variability.
Furthermore, a counterpart of the EWMA V-chart for monitoring the process
mean, called the EWMA M-chart, is proposed. In detecting small changes in
process variability, the combination of the EWMA M-chart and the EWMA
V-chart is a better alternative to the combination of the MEWMA control
chart (Lowry et al., 1992) and the |S|-chart. Furthermore, the EWMA M-chart and V-chart can
be plotted in one single figure. As for monitoring both process mean and
process variability, the combined MEWMA and EWMA V-charts provide the best
control procedure.
Journal: Journal of Applied Statistics
Pages: 507-536
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053655
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053655
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:507-536
Template-Type: ReDIF-Article 1.0
Author-Name: H. Wong
Author-X-Name-First: H.
Author-X-Name-Last: Wong
Author-Name: Wai-Cheung Ip
Author-X-Name-First: Wai-Cheung
Author-X-Name-Last: Ip
Author-Name: Zhongjie Xie
Author-X-Name-First: Zhongjie
Author-X-Name-Last: Xie
Author-Name: Xueli Lui
Author-X-Name-First: Xueli
Author-X-Name-Last: Lui
Title: Modelling and forecasting by wavelets, and the application to exchange rates
Abstract:
This paper investigates the modelling and forecasting method for
non-stationary time series. Using wavelets, the authors propose a
modelling procedure that decomposes the series as the sum of three
separate components, namely trend, harmonic and irregular components. The
estimates suggested in this paper are all consistent. This method has been
used for modelling US dollar against DM exchange rate data, and
ten-step-ahead (2 weeks) forecasts are compared with those of several other methods.
Under the Average Percentage of forecasting Error (APE) criterion, the
wavelet approach is the best one. The results suggest that forecasting
based on wavelets is a viable alternative to existing methods.
Journal: Journal of Applied Statistics
Pages: 537-553
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053664
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:537-553
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Weigand
Author-X-Name-First: Christopher
Author-X-Name-Last: Weigand
Title: Economically optimal inspection policy with geometric adaptation
Abstract:
A process is considered whose quality deteriorates according to a
constant failure intensity λ. As, in practice, it can be difficult to
estimate the true value of λ, the purpose of this paper is to present a
strategy that can be applied without knowing λ. In order to maximize
profit per item, perfect inspections and renewals are performed. The
length of the inspection interval is described by a geometric sequence and
changes in time, depending on perceived assignable causes. Optimal
adaptive control plans provide nearly the same profit per item as in the
case when λ is known.
Journal: Journal of Applied Statistics
Pages: 555-569
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053673
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:555-569
Template-Type: ReDIF-Article 1.0
Author-Name: Abdullah Almasri
Author-X-Name-First: Abdullah
Author-X-Name-Last: Almasri
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: An illustration of the causality relation between government spending and revenue using wavelet analysis on Finnish data
Abstract:
Quarterly data for the period 1960:1 to 1997:2, conventional tests, a
bootstrap simulation approach and a multivariate Rao's F-test have been
used to investigate if the causality between government spending and
revenue in Finland was changed at the beginning of 1990 due to future
plans to create the European Monetary Union (EMU). The results indicate
that during the period before 1990, the government revenue Granger-caused
spending, while the opposite happened after 1990, which agrees better with
Barro's tax smoothing hypothesis. However, when using monthly data instead
of quarterly data for almost the same sample period, totally different
results have been noted. The general conclusion is that the relationship
between spending and revenue in Finland is still not completely
understood. The ambiguity of these results may well be due to the fact
that there are several time scales involved in the relationship, and that
the conventional analyses may be inadequate to separate out the time scale
structured relationships between these variables. Therefore, to
investigate empirically the relation between these variables we attempt to
use the wavelets analysis that enables us to separate out different time
scales of variation in the data. We find that time scale decomposition is
important for analysing these economic variables.
Journal: Journal of Applied Statistics
Pages: 571-584
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053682
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053682
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:571-584
Template-Type: ReDIF-Article 1.0
Author-Name: Stephen Leary
Author-X-Name-First: Stephen
Author-X-Name-Last: Leary
Author-Name: Atul Bhaskar
Author-X-Name-First: Atul
Author-X-Name-Last: Bhaskar
Author-Name: Andy Keane
Author-X-Name-First: Andy
Author-X-Name-Last: Keane
Title: Optimal orthogonal-array-based latin hypercubes
Abstract:
The use of optimal orthogonal array latin hypercube designs is proposed.
Orthogonal arrays were proposed for constructing latin hypercube designs
by Tang (1993). Such designs generally have better space filling
properties than random latin hypercube designs. Even so, these designs do
not necessarily fill the space particularly well. As a result, we consider
orthogonal-array-based latin hypercube designs that try to achieve
optimality in some sense. Optimization is performed by adapting strategies
found in Morris & Mitchell (1995) and Ye et al. (2000). The strategies
here search only orthogonal-array-based latin hypercube designs and, as a
result, optimal designs are found in a more efficient fashion. The designs
found are in general agreement with existing optimal designs reported
elsewhere.
Journal: Journal of Applied Statistics
Pages: 585-598
Issue: 5
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053691
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053691
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:5:p:585-598
Template-Type: ReDIF-Article 1.0
Author-Name: R. G. Aykroyd
Author-X-Name-First: R. G.
Author-X-Name-Last: Aykroyd
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Title: A wavelet approach to shape analysis for spinal curves
Abstract:
We present a new method to describe shape change and shape differences in
curves, by constructing a deformation function in terms of a wavelet
decomposition. Wavelets form an orthonormal basis which allows
representations at multiple resolutions. The deformation function is
estimated, in a fully Bayesian framework, using a Markov chain Monte Carlo
algorithm. This Bayesian formulation incorporates prior information about
the wavelets and the deformation function. The flexibility of the MCMC
approach allows estimation of complex but clinically important summary
statistics, such as curvature in our case, as well as estimates of
deformation functions with variance estimates, and allows thorough
investigation of the posterior distribution. This work is motivated by
multi-disciplinary research involving a large-scale longitudinal study of
idiopathic scoliosis in UK children. This paper provides novel statistical
tools to study this spinal deformity, from which 5% of UK children suffer.
Using the data we consider statistical inference for shape differences
between normals, scoliotics and developers of scoliosis, in particular for
spinal curvature, and look at longitudinal deformations to describe shape
changes with time.
Journal: Journal of Applied Statistics
Pages: 605-623
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053718
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053718
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:605-623
Template-Type: ReDIF-Article 1.0
Author-Name: Ren-Fen Lee
Author-X-Name-First: Ren-Fen
Author-X-Name-Last: Lee
Author-Name: Deng-Yuan Huang
Author-X-Name-First: Deng-Yuan
Author-X-Name-Last: Huang
Title: On some data oriented robust estimation procedures for means
Abstract:
Data-oriented estimation of means is very important for large data sets.
Since outliers usually occur, the trimmed mean is a robust estimator of
location. After building a reasonable linear model to explain the
relationship between the suitably transformed symmetric data and the
approximately standardized normal statistics, we find the trimmed
proportion based on the smallest variance of trimmed means. The related
statistical inference is also discussed. An empirical study based on an
annual survey about inbound visitors in the Taiwan area is used to achieve
our goal in deciding the trimmed proportion. In this study, we propose a
complete procedure to attain the goal.
Journal: Journal of Applied Statistics
Pages: 625-634
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053727
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053727
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:625-634
Template-Type: ReDIF-Article 1.0
Author-Name: K. D. Patterson
Author-X-Name-First: K. D.
Author-X-Name-Last: Patterson
Author-Name: S. M. Heravi
Author-X-Name-First: S. M.
Author-X-Name-Last: Heravi
Title: The impact of fat-tailed distributions on some leading unit roots tests
Abstract:
There is substantial evidence that many time series associated with
financial and insurance claim data are fat-tailed, with a (much) higher
probability of 'outliers' compared with the normal distribution.
However, standard tests, or variants of them, for the presence of unit
roots assume a normal distribution for the innovations driving the series.
Application of the former to the latter therefore involves an
inconsistency. We assess the impact of this inconsistency on inference
when innovations are drawn from the Cauchy and a sequence of t(v)
distributions. A simple prediction that fat
tails will uniformly lead to over-sizing of standard tests (because the
fatness in the tail translates to the test distribution) turns out to be
incorrect: we find that some tests are over-sized but some are
under-sized. We also consider size retention and the power of the
Dickey-Fuller pivotal and normalized bias test statistics and weighted
symmetric versions of these tests. To make the unit root testing procedure
feasible, we develop an entropy-based test for some fat-tailed
distributions and apply it to share prices from the FTSE100.
Journal: Journal of Applied Statistics
Pages: 635-667
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053736
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053736
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:635-667
Template-Type: ReDIF-Article 1.0
Author-Name: S. Robin
Author-X-Name-First: S.
Author-X-Name-Last: Robin
Author-Name: M. Lecomte
Author-X-Name-First: M.
Author-X-Name-Last: Lecomte
Author-Name: H. Hofte
Author-X-Name-First: H.
Author-X-Name-Last: Hofte
Author-Name: G. Mouille
Author-X-Name-First: G.
Author-X-Name-Last: Mouille
Title: A procedure for the clustering of cell wall mutants in the model plant Arabidopsis based on Fourier-transform infrared (FT-IR) spectroscopy
Abstract:
FT-IR microspectroscopy can be used to study the global composition and
architecture of plant cell walls and it allows cell wall mutants to be
identified. Our aim is to define a distance between cell wall mutants in
the model species Arabidopsis based on FT-IR spectra. Since the number of
data points that constitute a spectrum exceeds the number of samples
analysed, it is essential first to reduce the dimension of the dataset. We
present a comparison of several compression methods, including linear
discriminant analysis using a non-canonical covariance matrix. The
calculated distances were used to define clusters of mutants that appeared
to be biologically meaningful.
Journal: Journal of Applied Statistics
Pages: 669-681
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053745
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053745
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:669-681
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Wang
Author-X-Name-First: Daniel
Author-X-Name-Last: Wang
Author-Name: Michael Conerly
Author-X-Name-First: Michael
Author-X-Name-Last: Conerly
Title: Evaluation of three lack of fit tests in linear regression models
Abstract:
A key diagnostic in the analysis of linear regression models is whether
the fitted model is appropriate for the observed data. The classical lack
of fit test is used for testing the adequacy of a linear regression model
when replicates are available. While many efforts have been made in
finding alternative lack of fit tests for models without replicates, this
paper focuses on studying the efficacy of three tests: the classical lack
of fit test, Utts' (1982) test, and Burn & Ryan's (1983) test. The powers of
these tests are computed for a variety of situations. Comments and
conclusions on the overall performance of these tests are made, including
recommendations for future studies.
Journal: Journal of Applied Statistics
Pages: 683-696
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053763
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053763
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:683-696
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Ghosh
Author-X-Name-First: D. K.
Author-X-Name-Last: Ghosh
Author-Name: P. C. Biswas
Author-X-Name-First: P. C.
Author-X-Name-Last: Biswas
Title: Complete diallel crosses plans through balanced incomplete block designs
Abstract:
The present investigation concerns methods of constructing
complete diallel cross (CDC) plans using balanced incomplete block (BIB)
designs. Furthermore, the analysis of complete diallel cross plans is
carried out to estimate the general combining ability (GCA) of the ith line
(i=1, 2, …, v), where the intra-block analysis of the
adjusted sum of squares for GCA and the unadjusted block sum of squares
are also obtained; thereafter, the relationship between the estimates of the
BIB design and the estimates of the GCA effects of the CDC plan is
established. Moreover, it is also shown that complete diallel
cross designs obtained through two BIB designs satisfying v1=b1=4λ1+3=v2=b2,
r1=2λ1+1=r2=k1=k2 and λ1=λ2 are universally optimum.
These results are further supported by a suitable example of each.
The aim of this study is to show that the analysis of the CDC
plan is reducible to the analysis of the generating BIB design.
Journal: Journal of Applied Statistics
Pages: 697-708
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053772
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053772
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:697-708
Template-Type: ReDIF-Article 1.0
Author-Name: Siu-Keung Tse
Author-X-Name-First: Siu-Keung
Author-X-Name-Last: Tse
Author-Name: Chunyan Yang
Author-X-Name-First: Chunyan
Author-X-Name-Last: Yang
Title: Reliability sampling plans for the Weibull distribution under Type II progressive censoring with binomial removals
Abstract:
This paper presents reliability sampling plans for the Weibull
distribution under Type II progressive censoring with random removals
(PCR), where the number of units removed at each failure time follows a
binomial distribution. To construct the sampling plans, the sample size n
and the acceptance constant k are determined based on asymptotic
distribution theory. The resulting sampling plans are tabulated for
selected specifications under the proposed censoring scheme. Furthermore,
a Monte Carlo simulation is conducted to validate the true probability of
acceptance for the designed sampling plans.
Journal: Journal of Applied Statistics
Pages: 709-718
Issue: 6
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000053781
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000053781
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:6:p:709-718
Template-Type: ReDIF-Article 1.0
Author-Name: Dimitrios Vougas
Author-X-Name-First: Dimitrios
Author-X-Name-Last: Vougas
Title: Reconsidering LM unit root testing
Abstract:
Non-rejection of a unit root hypothesis by usual Dickey & Fuller (1979)
(DF, hereafter) or Phillips & Perron (1988) (hereafter PP) tests should
not be taken as strong evidence in favour of unit root presence. There are
less popular, but more powerful, unit root tests that should be employed
instead of DF-PP tests. A prime example of an alternative test is the LM
unit root test developed by Schmidt & Phillips (1992) (hereafter SP) and
Schmidt & Lee (1991) (hereafter SL). LM unit root tests are easy to
calculate and invariant (similar); they employ optimal detrending and are
more powerful than usual DF-PP tests. Asymptotic theory and finite sample
critical values (with inaccuracies that we correct in this paper) are
available for SP-SL tests. However, the usefulness of LM tests is not
fully understood, due to ambiguity over test type recommendation, as well
as potentially inefficient derivation of the test that might confuse
applied researchers. In this paper, we reconsider LM unit root testing in
a model with linear trend. We derive asymptotic distribution theory (in a
new fashion), as well as accurate appropriate critical values. We
undertake Monte Carlo investigation of finite sample properties of SP-SL
LM tests, along with applications to the Nelson & Plosser (1982) time
series and real quarterly UK GDP.
Journal: Journal of Applied Statistics
Pages: 727-741
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076010
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076010
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:727-741
Template-Type: ReDIF-Article 1.0
Author-Name: W. J. Krzanowski
Author-X-Name-First: W. J.
Author-X-Name-Last: Krzanowski
Title: Non-parametric estimation of distance between groups
Abstract:
A numerical procedure is outlined for obtaining the distance between
samples from two populations. First, the probability densities in the two
populations are estimated by kernel methods, and then the distance is
derived by numerical integration of a suitable function of these
densities. Various such functions have been proposed in the past; they are
all implemented and compared with each other and with Mahalanobis D² on
several real and simulated data sets. The results show the method to be
viable, and to perform well against the Mahalanobis D² standard.
Journal: Journal of Applied Statistics
Pages: 743-750
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076029
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076029
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:743-750
Template-Type: ReDIF-Article 1.0
Author-Name: J. Kowalski
Author-X-Name-First: J.
Author-X-Name-Last: Kowalski
Author-Name: X. M. Tu
Author-X-Name-First: X. M.
Author-X-Name-Last: Tu
Title: Trend analysis with response incompatible formats and measurement error
Abstract:
The increasing popularity of longitudinal studies, along with the rapid
advances in science and technology, has created a potential
incompatibility between data formats, which leads to an inference problem
when applying conventional statistical methods. This inference problem is
further compounded by measurement error, since incompatible data formats
often arise in the context of measuring latent constructs. Without a
systematic study of the impact of scale differences, ad-hoc approaches
generally lead to inconsistent estimates and thus, invalid statistical
inferences. In this paper, we examine the asymptotic properties and
identify conditions that guarantee consistent estimation within the
context of a trend analysis with response incompatible formats and
measurement error. For model estimation, we introduce two competing
methods that use a generalized estimating equation approach to provide
inferences for the parameters of interest, and highlight the relative
strengths of each method. The approach is illustrated by data obtained
from a multi-centre AIDS cohort study (MACS), where a trend analysis of an
immunologic marker of HIV infection is of interest.
Journal: Journal of Applied Statistics
Pages: 751-770
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076038
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076038
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:751-770
Template-Type: ReDIF-Article 1.0
Author-Name: Jian-Xin Pan
Author-X-Name-First: Jian-Xin
Author-X-Name-Last: Pan
Author-Name: Peng Bai
Author-X-Name-First: Peng
Author-X-Name-Last: Bai
Title: Local influence analysis in the growth curve model with Rao's simple covariance structure
Abstract:
In this paper we discuss the likelihood-based local influence in a growth
curve model with Rao's simple covariance structure. Under an abstract
perturbation, the Hessian matrix is provided in which the eigenvector
corresponding to the maximum absolute eigenvalue is used to assess the
influence of observations. Specifically, we employ covariance-weighted
perturbation to demonstrate the use of the proposed approach. A practical
example is analysed using the proposed local influence approach.
Journal: Journal of Applied Statistics
Pages: 771-781
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076047
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076047
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:771-781
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Tai Lin
Author-X-Name-First: Chien-Tai
Author-X-Name-Last: Lin
Author-Name: N. Balakrishnan
Author-X-Name-First: N.
Author-X-Name-Last: Balakrishnan
Title: Exact prediction intervals for exponential distributions based on doubly Type-II censored samples
Abstract:
In this paper, we make use of an algorithm of Huffer & Lin (2001) in
order to develop exact prediction intervals for failure times from
one-parameter and two-parameter exponential distributions based on doubly
Type-II censored samples. We show that this method yields the same results
as those of Lawless (1971, 1977) and Likes (1974) in the case when the
available sample is Type-II right censored. We present a computational
algorithm for the determination of the exact percentage points of the
pivotal quantities used in the construction of these prediction intervals.
We also present some tables of these percentage points for the prediction
of order statistics in a sample of size n for both one- and
two-parameter exponential distributions, assuming that the available
sample is doubly Type-II censored. Finally, we present two examples to
illustrate the methods of inference developed here.
Journal: Journal of Applied Statistics
Pages: 783-801
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076056
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:783-801
Template-Type: ReDIF-Article 1.0
Author-Name: Nicolaus Tideman
Author-X-Name-First: Nicolaus
Author-X-Name-Last: Tideman
Author-Name: Reza Kheirandish
Author-X-Name-First: Reza
Author-X-Name-Last: Kheirandish
Title: Structurally consistent probabilities of selecting answers
Abstract:
This paper offers a procedure for specifying probabilities for students
to select answers on a multiple-choice test that, unlike previous
procedures, satisfies all three of the following structural consistency
conditions: (1) for any student, the sum over questions of the
probabilities that the student will use the correct answers is the
student's score on the test; (2) for any student, the sum over possible
answers of the probabilities of using the answers is 1.0; and (3) for any
answer to any question, the sum over students of the probabilities of
using that answer is the number of students who used that answer. When
applied to an exam, these fully consistent probabilities had the same
power to identify cheaters as the probabilities proposed by Wesolowsky,
and noticeably better power than the probabilities suggested by Frary et
al.
Journal: Journal of Applied Statistics
Pages: 803-811
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076065
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076065
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:803-811
Template-Type: ReDIF-Article 1.0
Author-Name: Duolao Wang
Author-X-Name-First: Duolao
Author-X-Name-Last: Wang
Author-Name: Panuwat Lertsithichai
Author-X-Name-First: Panuwat
Author-X-Name-Last: Lertsithichai
Author-Name: Kiran Nanchahal
Author-X-Name-First: Kiran
Author-X-Name-Last: Nanchahal
Author-Name: Mohammed Yousufuddin
Author-X-Name-First: Mohammed
Author-X-Name-Last: Yousufuddin
Title: Risk factors of coronary heart disease: A Bayesian model averaging approach
Abstract:
To analyse the risk factors of coronary heart disease (CHD), we apply the
Bayesian model averaging approach that formalizes the model selection
process and deals with model uncertainty in a discrete-time survival model
to the data from the Framingham Heart Study. We also use the Alternating
Conditional Expectation algorithm to transform the risk factors, such that
their relationships with CHD are best described, overcoming the problem of
coding such variables subjectively. For the Framingham Study, the Bayesian
model averaging approach, which makes inferences about the effects of
covariates on CHD based on an average of the posterior distributions of
the set of identified models, outperforms the stepwise method in
predictive performance. We also show that age, cholesterol, and smoking
are nonlinearly associated with the occurrence of CHD and that P-values
from models selected from stepwise methods tend to overestimate the
evidence for the predictive value of a risk factor and ignore model
uncertainty.
Journal: Journal of Applied Statistics
Pages: 813-826
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076074
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076074
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:813-826
Template-Type: ReDIF-Article 1.0
Author-Name: A. H. M. Rahmatullah Imon
Author-X-Name-First: A. H. M. Rahmatullah
Author-X-Name-Last: Imon
Title: Residuals from deletion in added variable plots
Abstract:
An added variable plot is a commonly used plot in regression diagnostics.
The rationale for this plot is to provide information about the addition
of a further explanatory variable to the model. In addition, an added
variable plot is most often used for detecting high leverage points and
influential data. So far as we know, this type of plot involves the least
squares residuals which, we suspect, could produce a confusing picture
when a group of unusual cases is present in the data. In this situation,
added variable plots may not only fail to detect the unusual cases but
also may fail to focus on the need for adding a further regressor to the
model. We suggest that residuals from deletion should be more convincing
and reliable in this type of plot. The usefulness of an added variable
plot based on residuals from deletion is investigated through a few
examples and a Monte Carlo simulation experiment in a variety of
situations.
Journal: Journal of Applied Statistics
Pages: 827-841
Issue: 7
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076083
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076083
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:7:p:827-841
Template-Type: ReDIF-Article 1.0
Author-Name: Youngjo Lee
Author-X-Name-First: Youngjo
Author-X-Name-Last: Lee
Author-Name: John Nelder
Author-X-Name-First: John
Author-X-Name-Last: Nelder
Title: Extended-REML estimators
Abstract:
Restricted likelihood was originally introduced as the criterion for the
estimation of dispersion components in normal mixed linear models. Lee &
Nelder (2001a) showed that it can be extended to a much wider class of
models via double extended quasi-likelihood. We give a detailed
description of the new method and show that it gives an efficient
estimation procedure for dispersion components.
Journal: Journal of Applied Statistics
Pages: 845-856
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075930
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075930
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:845-856
Template-Type: ReDIF-Article 1.0
Author-Name: Kelvin Yau
Author-X-Name-First: Kelvin
Author-X-Name-Last: Yau
Author-Name: Karen Yip
Author-X-Name-First: Karen
Author-X-Name-Last: Yip
Author-Name: H. K. Yuen
Author-X-Name-First: H. K.
Author-X-Name-Last: Yuen
Title: Modelling repeated insurance claim frequency data using the generalized linear mixed model
Abstract:
Most of the methods used to estimate claim frequency rates in general
insurance have assumed that data are independent. However, it is not
uncommon for information stored in the database of an insurance company to
contain previous years' claim data from each policyholder. We consider the
application of the generalized linear mixed model approach to the analysis
of repeated insurance claim frequency data in which a conditionally fixed
random effect vector is incorporated explicitly into the linear predictor
to model the inherent correlation. A motor insurance data set is used as
the basis for simulation to demonstrate the advantages of the method.
Ignoring the underlying association for observations within the same
policyholder results in an underestimation of the standard error of the
parameter estimates and a remarkable reduction in the prediction accuracy.
The method provides a viable alternative for incorporating repeated claim
experience that enables the revision of rates in general insurance.
Journal: Journal of Applied Statistics
Pages: 857-865
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075949
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075949
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:857-865
Template-Type: ReDIF-Article 1.0
Author-Name: V. K. Sharma
Author-X-Name-First: V. K.
Author-X-Name-Last: Sharma
Author-Name: Seema Jaggi
Author-X-Name-First: Seema
Author-X-Name-Last: Jaggi
Author-Name: Cini Varghese
Author-X-Name-First: Cini
Author-X-Name-Last: Varghese
Title: Minimal balanced repeated measurements designs
Abstract:
Experimental designs in which treatments are applied to the experimental
units, one at a time, in sequences over a number of periods, have been
used in several scientific investigations and are known as repeated
measurements designs. Besides direct effects, these designs allow
estimation of residual effects of treatments along with adjustment for
them. Assuming the existence of first-order residual effects of
treatments, Hedayat & Afsarinejad (1975) gave a method of constructing
minimal balanced repeated measurements [RM(v,n,p)] design for v treatments
using n=2v experimental units for p [=(v+1)/2] periods when v is a prime
or prime power. Here, a general method of construction of these designs
for all odd v has been given along with an outline for their analysis. In
terms of variances of estimated elementary contrasts between treatment
effects (direct and residual), these designs are seen to be partially
variance balanced based on the circular association scheme.
Journal: Journal of Applied Statistics
Pages: 867-872
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075958
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075958
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:867-872
Template-Type: ReDIF-Article 1.0
Author-Name: Sueli Mingoti
Author-X-Name-First: Sueli
Author-X-Name-Last: Mingoti
Title: A note on the sample size required in sequential tests for the generalized binomial distribution
Abstract:
In this paper, we discuss the sample size needed to perform Wald's
sequential statistical test for the proportion of non-conforming items
generated by a process when the results of the inspections are correlated
and the generalized binomial distribution proposed by Madsen (1993) is
used. It will be shown that, in the presence of correlation, the sample
size increases as the value of the coefficient of correlation
increases, being much higher for processes with small failure rates.
Journal: Journal of Applied Statistics
Pages: 873-879
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075967
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075967
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:873-879
Template-Type: ReDIF-Article 1.0
Author-Name: Hyun Sook Oh
Author-X-Name-First: Hyun Sook
Author-X-Name-Last: Oh
Author-Name: Seoung-gon Ko
Author-X-Name-First: Seoung-gon
Author-X-Name-Last: Ko
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Title: A Bayesian approach to assessing population bioequivalence in a 2×2 crossover design
Abstract:
A Bayesian testing procedure is proposed for assessment of the
bioequivalence in both mean and variance, which ensures population
bioequivalence under the normality assumption. We derive the joint
posterior distribution of the means and variances in a standard 2×2
crossover experimental design and propose a Bayesian testing procedure for
bioequivalence based on a Markov chain Monte Carlo method. The proposed
method is applied to a real data set.
Journal: Journal of Applied Statistics
Pages: 881-891
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000117131
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000117131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:881-891
Template-Type: ReDIF-Article 1.0
Author-Name: R. Bellio
Author-X-Name-First: R.
Author-X-Name-Last: Bellio
Author-Name: E. Gori
Author-X-Name-First: E.
Author-X-Name-Last: Gori
Title: Impact evaluation of job training programmes: Selection bias in multilevel models
Abstract:
This paper focuses on the evaluation of a job training programme composed
of several different courses. The aim is to evaluate the impact of the
programme for the participants with respect to non-participants, paying
attention to possible differences in the effectiveness between the
courses. The analysis is based on discrete data with a hierarchical
structure. Multilevel modelling is the natural choice in this setting, but
the results may be severely affected by selection bias. We propose a
two-step procedure, which suits both the hierarchical structure and the
observational nature of data. The method selects the appropriate control
group, using standard results of the propensity score methodology. A
suitable multilevel model is formulated, and the dependence of the results
on the amount of non-random sample selection is analysed within a
likelihood-based framework. As a result, rankings for comparative
performances are obtained, adjusted for the amount of plausible selection
bias. The procedure is illustrated with reference to a data set about a
job training programme organized in Italy in the late 1990s.
Journal: Journal of Applied Statistics
Pages: 893-907
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075976
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075976
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:893-907
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Austin
Author-X-Name-First: Peter
Author-X-Name-Last: Austin
Author-Name: Michael Escobar
Author-X-Name-First: Michael
Author-X-Name-Last: Escobar
Title: The use of finite mixture models to estimate the distribution of the health utilities index in the presence of a ceiling effect
Abstract:
Finite mixture models are flexible parametric models that allow one to
describe complex probability distributions as a mixture of a small number
of simple probability distributions. Measures of health status are often
used to reflect a person's overall health. Such measures may be subject to
a ceiling effect, in that the measure is unable to discern gradations in
health status above the ceiling. The purpose of this paper is to
illustrate the use of finite mixture models to describe the probability
distribution of the Health Utilities Index, under the assumption that the
HUI is subject to a ceiling effect. Mixture models with two through six
components are fit to the HUI. Bayes factors were used to compare the
evidence that the Canadian population of non-institutionalized residents
is composed of four distinct subpopulations, and that a mixture of six
Normal components is required to describe these four subpopulations.
Journal: Journal of Applied Statistics
Pages: 909-923
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075985
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075985
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:909-923
Template-Type: ReDIF-Article 1.0
Author-Name: Rhonda Magel
Author-X-Name-First: Rhonda
Author-X-Name-Last: Magel
Author-Name: Li Qin
Author-X-Name-First: Li
Author-X-Name-Last: Qin
Title: A non-parametric test for umbrella alternatives based on ranked-set sampling
Abstract:
A test is proposed that extends the Chen-Wolfe (1990) test for umbrella
alternatives with an unknown peak for use with ranked-set sample data.
This follows from ideas in Bohn & Wolfe (1992), Magel (1994), and Hartlaub
& Wolfe (1999). Critical values are simulated for the proposed test based
on ranked-set samples of size 2 for 3, 4 and 5 populations. A power study
is conducted comparing the proposed test using ranked-set samples with the
Chen-Wolfe and Mack-Wolfe tests using simple random samples. Results are
given.
Journal: Journal of Applied Statistics
Pages: 925-937
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000075994
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000075994
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:925-937
Template-Type: ReDIF-Article 1.0
Author-Name: Goran Arnoldsson
Author-X-Name-First: Goran
Author-X-Name-Last: Arnoldsson
Title: Optimal designs for beta-binomial logistic regression models
Abstract:
Optimal designs for a logistic regression model with over-dispersion
introduced by a beta-binomial distribution are characterized. Designs are
defined by a set of design points and design weights as usual but, in
addition, the experimenter must also make a choice of a sub-sampling
design specifying the distribution of observations on sample sizes. In an
earlier work it has been shown that Ds-optimal sampling designs for
estimation of the parameters of the beta-binomial distribution are
supported on at most two design points. This admits a simplified approach
using single sample sizes. Linear predictor values for Ds-optimal designs
using a common sample size are tabulated for different levels of
over-dispersion and choice of subsets of parameters.
Journal: Journal of Applied Statistics
Pages: 939-951
Issue: 8
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076001
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076001
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:8:p:939-951
Template-Type: ReDIF-Article 1.0
Author-Name: Vernon Farewell
Author-X-Name-First: Vernon
Author-X-Name-Last: Farewell
Author-Name: Agnes Herzberg
Author-X-Name-First: Agnes
Author-X-Name-Last: Herzberg
Title: Plaid designs for the evaluation of training for medical practitioners
Abstract:
The training of medical practitioners to improve the practitioner/patient
relationship may be difficult, as limitations often exist on the choice of
patients included in the study. A specific study of this type of training
is given. It is proposed that a simple modification and generalization of
Yates' plaid-square designs be used. It is shown that a replicated
plaid-design incorporates as a special case the criss-cross or strip-plot
design. The usefulness of these designs in studies of the training of
medical practitioners is illustrated. The basic characteristics of their
analysis are outlined.
Journal: Journal of Applied Statistics
Pages: 957-965
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076092
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076092
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:957-965
Template-Type: ReDIF-Article 1.0
Author-Name: Richard Stevens
Author-X-Name-First: Richard
Author-X-Name-Last: Stevens
Title: Evaluation of methods for interval estimation of model outputs, with application to survival models
Abstract:
When a published statistical model is also distributed as computer
software, it will usually be desirable to present the outputs as interval,
as well as point, estimates. The present paper compares three methods for
approximate interval estimation about a model output, for use when the
model form does not permit an exact interval estimate. The methods
considered are first-order asymptotics, using second derivatives of the
log-likelihood to estimate variance information; higher-order asymptotics
based on the signed-root transformation; and the non-parametric bootstrap.
The signed-root method is Bayesian, and uses an approximation for
posterior moments that has not previously been tested in a real-world
application. Use of the three methods is illustrated with reference to a
software project arising in medical decision-making, the UKPDS Risk
Engine. Intervals from the first-order and signed-root methods are
near-identical, and typically 1% wider to 7% narrower than those from the
non-parametric bootstrap. The asymptotic methods are markedly faster than
the bootstrap method.
Journal: Journal of Applied Statistics
Pages: 967-981
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076100
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076100
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:967-981
Template-Type: ReDIF-Article 1.0
Author-Name: Yuzhi Cai
Author-X-Name-First: Yuzhi
Author-X-Name-Last: Cai
Author-Name: Neville Davies
Author-X-Name-First: Neville
Author-X-Name-Last: Davies
Title: Monitoring the parameter changes in general ARIMA time series models
Abstract:
We propose methods for monitoring the residuals of a fitted ARIMA or an
autoregressive fractionally integrated moving average (ARFIMA) model in
order to detect changes of the parameters in that model. We extend the
procedures of Box & Ramirez (1992) and Ramirez (1992) and allow the
differencing parameter, d, to be fractional or integer. Test statistics are
approximated by Wiener processes. We carry out simulations and also apply
our method to several real time series. The results show that our method
is effective for monitoring all parameters in ARFIMA models.
Journal: Journal of Applied Statistics
Pages: 983-1001
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076119
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076119
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:983-1001
Template-Type: ReDIF-Article 1.0
Author-Name: David Thomson
Author-X-Name-First: David
Author-X-Name-Last: Thomson
Author-Name: Arie van Noordwijk
Author-X-Name-First: Arie
Author-X-Name-Last: van Noordwijk
Author-Name: Ward Hagemeijer
Author-X-Name-First: Ward
Author-X-Name-Last: Hagemeijer
Title: Estimating avian dispersal distances from data on ringed birds
Abstract:
Data from birds ringed as chicks and recaptured during subsequent
breeding seasons provide information on avian natal dispersal distances.
However, national patterns of ring reports are influenced by recapture
rates as well as by dispersal rates. While an extensive methodology has
been developed to study survival rates using models that correct for
recapture rates, the same is not true for dispersal. Here, we present such
a method, showing how corrections for spatial heterogeneity in recapture
rate can be built into estimates of dispersal rates if detailed atlas data
and ringing totals can be combined with extensive data on birds ringed as
chicks and recaptured as breeding adults. We show how the method can be
implemented in the software package SURVIV (White, 1992).
Journal: Journal of Applied Statistics
Pages: 1003-1008
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076128
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076128
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:1003-1008
Template-Type: ReDIF-Article 1.0
Author-Name: Seppo Laaksonen
Author-X-Name-First: Seppo
Author-X-Name-Last: Laaksonen
Title: Alternative imputation techniques for complex metric variables
Abstract:
This paper deals with imputation techniques and strategies. Usually,
imputation truly commences after the first data editing, but many
preceding operations are needed before that. In this editing step, the
missing or deficient items are to be recognized and coded, and then it is
decided which of these, if any, should be substituted by imputation. There
are a number of imputation methods, each with its own specifications.
Consequently, it is not clear which method should finally be chosen,
especially when one imputation method may be best in one respect, and
another method in the other. In this paper, we consider these questions through the following
four imputation methods: (i) random hot decking, (ii) logistic regression
imputation, (iii) linear regression imputation, and (iv) regression-based
nearest neighbour hot decking. The last two methods are applied with
two different specifications. Two metric variables are used in the
empirical tests. The first is very complex, but the second is more
ordinary, and thus easier to handle. The empirical examples are based on
simulations, which clearly show the biases of the various methods and
their specifications. In general, it seems that method (iv) is
recommendable although the results from it are not perfect either.
Journal: Journal of Applied Statistics
Pages: 1009-1020
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076137
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076137
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:1009-1020
Template-Type: ReDIF-Article 1.0
Author-Name: Luis Gil-Alana
Author-X-Name-First: Luis
Author-X-Name-Last: Gil-Alana
Title: Estimation of the degree of dependence in the temperatures in the northern hemisphere using semi-parametric techniques
Abstract:
We are concerned in this article with the estimation of the degree of
dependence between the observations of the monthly temperatures in the
northern hemisphere from 1854 to 1989 by means of fractionally
integrated semi-parametric techniques. We use several estimation
procedures proposed by P. M. Robinson in a number of papers, and the
results indicate that the order of integration of the series is around
0.37, implying that the time series is stationary but with long memory
behaviour. Separating the data into two subsamples (1854-1921 and 1922-89),
the results show that there has been an increase in the degree of
dependence across time by about 0.05-0.10, the order of integration
oscillating around 0.3 (0.35) for the time period 1854-1921, and around
0.35 (0.40) for the period 1922-89.
Journal: Journal of Applied Statistics
Pages: 1021-1031
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076146
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076146
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:1021-1031
Template-Type: ReDIF-Article 1.0
Author-Name: J. F. Walhin
Author-X-Name-First: J. F.
Author-X-Name-Last: Walhin
Title: Bivariate Hofmann distributions
Abstract:
The aim of this paper is to develop some bivariate generalizations of the
Hofmann distribution. The Hofmann distribution is known to give nice fits
for overdispersed data sets. Two bivariate models are proposed. Recursive
formulae are given for the evaluation of the probability function.
Moments, conditional distributions and marginal distributions are studied.
Two data sets are fitted based on the proposed models. Parameters are
estimated by maximum likelihood.
Journal: Journal of Applied Statistics
Pages: 1033-1046
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076155
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076155
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:1033-1046
Template-Type: ReDIF-Article 1.0
Author-Name: H. Oztay Ayhan
Author-X-Name-First: H. Oztay
Author-X-Name-Last: Ayhan
Title: Models of response error components in supervised interview-reinterview surveys
Abstract:
The current work deals with modelling of response error components in
supervised interview-reinterview surveys. The model considers several
stages of an interactive process to obtain and record a response. The
response process is evaluated as a
controller-interviewer-respondent-interviewer-controller interaction
setting under a supervised interviewing process. The allocation of
controllers, interviewers and respondents is made by a hierarchical design
for the interview-reinterview process. In addition, a coder error
component is added to the proposed model. The model
operates under two major sub-models, namely an error detection model and
a response model.
Journal: Journal of Applied Statistics
Pages: 1047-1054
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076164
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076164
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:1047-1054
Template-Type: ReDIF-Article 1.0
Author-Name: Hassen Muttlak
Author-X-Name-First: Hassen
Author-X-Name-Last: Muttlak
Author-Name: Walid Al-Sabah
Author-X-Name-First: Walid
Author-X-Name-Last: Al-Sabah
Title: Statistical quality control based on ranked set sampling
Abstract:
Different quality control charts for the sample mean are developed using
ranked set sampling (RSS), and two of its modifications, namely median
ranked set sampling (MRSS) and extreme ranked set sampling (ERSS). These
new charts are compared to the usual control charts based on simple random
sampling (SRS) data. The charts based on RSS or one of its modifications
are shown to have smaller average run length (ARL) than the classical
chart when there is a sustained shift in the process mean. The MRSS and
ERSS methods are compared with RSS and SRS data; it turns out that MRSS
dominates all other methods in terms of the out-of-control ARL
performance. Real data are collected using the RSS, MRSS, and ERSS in
cases of perfect and imperfect ranking. These data sets are used to
construct the corresponding control charts. These charts are compared to
the usual SRS chart. Throughout this study, we assume that the underlying
distribution is normal. A check of the normality for our example data set
indicated that the normality assumption is reasonable.
Journal: Journal of Applied Statistics
Pages: 1055-1078
Issue: 9
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000076173
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000076173
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:9:p:1055-1078
Template-Type: ReDIF-Article 1.0
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Author-Name: Kosto Mitov
Author-X-Name-First: Kosto
Author-X-Name-Last: Mitov
Author-Name: Samuel Kotz
Author-X-Name-First: Samuel
Author-X-Name-Last: Kotz
Title: Local dependence functions for extreme value distributions
Abstract:
Kotz & Nadarajah (2002) introduced a measure of local dependence which is
a localized version of Pearson's correlation coefficient. In this
paper we provide detailed analyses (both algebraic and numerical) of the
form of the measure for the class of bivariate extreme value
distributions. We consider, in particular, five families of bivariate
extreme value distributions. We also discuss two applications of the new
measure. In the first application we introduce an overall measure of
correlation and produce evidence to suggest that it is superior to the
usual Pearson's correlation coefficient. The second application introduces
two new concepts for ordering of bivariate dependence.
Journal: Journal of Applied Statistics
Pages: 1081-1100
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107123
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107123
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1081-1100
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Greenacre
Author-X-Name-First: Michael
Author-X-Name-Last: Greenacre
Title: Singular value decomposition of matched matrices
Abstract:
We consider the joint analysis of two matched matrices which have common
rows and columns, for example multivariate data observed at two time
points or split according to a dichotomous variable. Methods of interest
include principal components analysis for interval-scaled data,
correspondence analysis for frequency data, log-ratio analysis of
compositional data and linear biplots in general, all of which depend on
the singular value decomposition. A simple result in matrix algebra shows
that by setting up two matched matrices in a particular block format,
matrix sum and difference components can be analysed using a single
application of the singular value decomposition algorithm. The methodology
is applied to data from the International Social Survey Program comparing
male and female attitudes on working wives across eight countries. The
resulting biplots optimally display the overall cross-cultural differences
as well as the male-female differences. The case of more than two matched
matrices is also discussed.
Journal: Journal of Applied Statistics
Pages: 1101-1113
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107132
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1101-1113
Template-Type: ReDIF-Article 1.0
Author-Name: Christian Sonesson
Author-X-Name-First: Christian
Author-X-Name-Last: Sonesson
Title: Evaluations of some Exponentially Weighted Moving Average methods
Abstract:
The need for statistical surveillance has been noted in many different
areas, and examples of applications include the detection of an increased
incidence of a disease, the detection of an increased radiation level and
the detection of a turning point in a leading index for a business cycle.
In all cases, preventive actions are possible if the alarm is made early.
Several versions of the EWMA (Exponentially Weighted Moving Average)
method for monitoring a process with the aim of detecting a shift in the
mean are studied both for the one-sided and the two-sided case. The
effects of using barriers for the one-sided alarm statistic are also
studied. One important issue is the effect of different types of alarm
limits. Different measures of evaluation, suitable in different types of
applications, are considered, such as the expected delay, the ARL1,
the probability of successful detection and the predictive value of an
alarm, to give a broad picture of the features of the methods. Results
from a large-scale simulation study are presented both for a fixed ARL0
and a fixed probability of a false alarm. It appears that important
differences from an inferential point of view exist between the one- and
two-sided versions of the methods. It is demonstrated that the method,
usually considered a convenient approximation, is to be preferred over
the exact version in the overwhelming majority of applications.
Journal: Journal of Applied Statistics
Pages: 1115-1133
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107141
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107141
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1115-1133
Template-Type: ReDIF-Article 1.0
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Title: Selection of number of dose levels and its robustness for binary response data
Abstract:
Muller & Schmitt (1990) have considered the question of how to choose the
number of doses for estimating the median effective dose (ED50) when a
probit dose-response curve is correctly assumed. However, they restricted
their investigation to designs with doses symmetrical about the true ED50.
In this paper, we investigate how the conclusions of Muller & Schmitt may
change as the dose designs become slightly asymmetric about the true ED50.
In addition, we address the question of the robustness of the number of
doses chosen for an incorrectly assumed logistic model, when the dose
designs are asymmetric about the assumed ED50. The underlying true
dose-response curves considered here include the probit, cubic logistic
and Aranda-Ordaz asymmetric models. The simulation results show that, for
various underlying true dose-response curves and the uniform design
density with doses spaced asymmetrically around the assumed ED50, the
choice of as many doses as possible is almost optimal. This agrees with
the results obtained for a correctly assumed probit or logistic
dose-response curve when the dose designs are symmetric or slightly
asymmetric about the ED50.
Journal: Journal of Applied Statistics
Pages: 1135-1146
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107150
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107150
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1135-1146
Template-Type: ReDIF-Article 1.0
Author-Name: L. A. Gil-Alana
Author-X-Name-First: L. A.
Author-X-Name-Last: Gil-Alana
Title: A fractional integration analysis of the population in some OECD countries
Abstract:
In this article we examine the degree of persistence of the population
series in 19 OECD countries during the period 1948-2000 by means of
fractionally integrated techniques. We use a parametric procedure due to
Robinson (1994) that permits us to test I(d) statistical models. The
results show that the order of integration of the series varies
substantially across countries and also depends on how we specify the I(0)
disturbances. Overall, Germany and Portugal present the smallest degrees
of integration while population in Japan appears as the most
non-stationary series.
Journal: Journal of Applied Statistics
Pages: 1147-1159
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107169
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107169
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1147-1159
Template-Type: ReDIF-Article 1.0
Author-Name: Manuel Salvador
Author-X-Name-First: Manuel
Author-X-Name-Last: Salvador
Author-Name: Pilar Gargallo
Author-X-Name-First: Pilar
Author-X-Name-Last: Gargallo
Title: Automatic selective intervention in dynamic linear models
Abstract:
In this paper we propose an algorithm to carry out the monitoring and
retrospective intervention process in Dynamic Linear Models, both
selectively and automatically. The algorithm is illustrated by analysing
several series taken from the literature, in which the proposed procedure
is shown to perform better than the scheme proposed by West & Harrison
(1997, Chapter 11).
Journal: Journal of Applied Statistics
Pages: 1161-1184
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107178
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107178
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1161-1184
Template-Type: ReDIF-Article 1.0
Author-Name: Herve Cardot
Author-X-Name-First: Herve
Author-X-Name-Last: Cardot
Author-Name: Robert Faivre
Author-X-Name-First: Robert
Author-X-Name-Last: Faivre
Author-Name: Michel Goulard
Author-X-Name-First: Michel
Author-X-Name-Last: Goulard
Title: Functional approaches for predicting land use with the temporal evolution of coarse resolution remote sensing data
Abstract:
The sensor SPOT 4/Vegetation gives every day satellite images of Europe
with medium spatial resolution, each pixel corresponding to an area of
1 km × 1 km. Such data are useful to characterize the development of the
vegetation at a large scale. The pixels, named 'mixed' pixels,
aggregate information of different crops and thus different themes of
interest (wheat, corn, forest, …). We aim at estimating the land
use when observing the temporal evolution of reflectances of mixed pixels.
The statistical problem is to predict proportions with longitudinal
covariates. We compared two functional approaches. The first relies on
varying-time regression models and the second is an extension of the
multilogit model for functional data. The comparison is achieved on a
small area on which the land use is known. Satellite data were collected
between March and August 1998. The functional multilogit model gives
better predictions, and the use of a composite vegetation index is more
efficient.
Journal: Journal of Applied Statistics
Pages: 1185-1199
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107187
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107187
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1185-1199
Template-Type: ReDIF-Article 1.0
Author-Name: Robert Till
Author-X-Name-First: Robert
Author-X-Name-Last: Till
Author-Name: David Hand
Author-X-Name-First: David
Author-X-Name-Last: Hand
Title: Behavioural models of credit card usage
Abstract:
Behavioural models characterize the way customers behave in their use of
a credit product. In this paper, we examine repayment and transaction
behaviour with credit cards. In particular, we describe the development of
Markov chain models for late repayment, investigate the extent to which
there are different classes of behaviour pattern, and explore the extent
to which distinct behaviours can be predicted. We also develop overall
models for transaction time distributions. Once such models have been
built to summarize the data, they can be used to predict likely future
behaviour, and can also serve as the basis of predictions of what one
might expect when economic circumstances change.
Journal: Journal of Applied Statistics
Pages: 1201-1220
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107196
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107196
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1201-1220
Template-Type: ReDIF-Article 1.0
Author-Name: Dejian Lai
Author-X-Name-First: Dejian
Author-X-Name-Last: Lai
Author-Name: Rakesh Sharma
Author-X-Name-First: Rakesh
Author-X-Name-Last: Sharma
Author-Name: Jerry Wolinsky
Author-X-Name-First: Jerry
Author-X-Name-Last: Wolinsky
Author-Name: Ponnada Narayana
Author-X-Name-First: Ponnada
Author-X-Name-Last: Narayana
Title: A comparative study of correlation coefficients in spatially MRSI-observed neurochemicals from multiple sclerosis patients
Abstract:
In measuring the association between magnetic resonance spectroscopic
imaging (MRSI)-observed neurochemicals from multiple sclerosis (MS)
patients, the classic correlation coefficients such as Pearson's r,
Spearman's ρ and Kendall's τ do not take into account the spatial
dependence of the observations. This paper reports a comparative study on
these classic correlation coefficients (Pearson's r, Spearman's ρ and
Kendall's τ) and some more recent correlation coefficients (Tjostheim's
A, the modified t) that take into account the spatial dependence of the
intensities of the concentrations of several neurochemicals in MS
patients. Our study indicates that the use of the classic correlation
coefficients that ignore the spatial dependence of the observations may
overestimate the statistical significance of the results.
Journal: Journal of Applied Statistics
Pages: 1221-1229
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000107204
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000107204
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1221-1229
Template-Type: ReDIF-Article 1.0
Author-Name: Rand Wilcox
Author-X-Name-First: Rand
Author-X-Name-Last: Wilcox
Title: Multiple comparisons based on a modified one-step M-estimator
Abstract:
Although many methods are available for performing multiple comparisons
based on some measure of location, most can be unsatisfactory in at least
some situations, in simulations when sample sizes are small, say less than
or equal to twenty. That is, the actual Type I error probability can
substantially exceed the nominal level, and for some methods the actual
Type I error probability can be well below the nominal level, suggesting
that power might be relatively poor. In addition, all methods based on
means can have relatively low power under arbitrarily small departures
from normality. Currently, a method based on 20% trimmed means and a
percentile bootstrap method performs relatively well (Wilcox, in press).
However, symmetric trimming was used, even when sampling from a highly
skewed distribution, and a rigid adherence to 20% trimming can result in
low efficiency when a distribution is sufficiently heavy-tailed. Robust
M-estimators are more flexible but they can be unsatisfactory in terms of
Type I errors when sample sizes are small. This paper describes an
alternative approach based on a modified one-step M-estimator that
introduces more flexibility than a trimmed mean but provides better
control over Type I error probabilities compared with using a one-step
M-estimator.
Journal: Journal of Applied Statistics
Pages: 1231-1241
Issue: 10
Volume: 30
Year: 2003
X-DOI: 10.1080/0266476032000137463
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000137463
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:30:y:2003:i:10:p:1231-1241
Template-Type: ReDIF-Article 1.0
Author-Name: L. Di Scala
Author-X-Name-First: L. Di
Author-X-Name-Last: Scala
Author-Name: L. La Rocca
Author-X-Name-First: L.
Author-X-Name-Last: La Rocca
Author-Name: G. Consonni
Author-X-Name-First: G.
Author-X-Name-Last: Consonni
Title: A Bayesian Hierarchical Model for the Evaluation of a Website
Abstract:
Consider a website and the surfers visiting its pages. A typical issue of
interest, for example while monitoring an advertising campaign, concerns
whether a specific page has been designed successfully, i.e. is able to
attract surfers or address them to other pages within the site. We assume
that the surfing behaviour is fully described by the transition
probabilities from one page to another, so that a clickstream (sequence of
consecutively visited pages) can be viewed as a finite-state-space Markov
chain. We then implement a variety of hierarchical prior distributions on
the multivariate logits of the transition probabilities and define, for
each page, a content effect and a link effect. The former measures the
attractiveness of the page due to its contents, while the latter signals
its ability to suggest further interesting links within the site.
Moreover, we define an additional effect, representing overall page
success, which incorporates both effects previously described. Using
WinBUGS, we provide estimates and credible intervals for each of the above
effects and rank pages accordingly.
Journal: Journal of Applied Statistics
Pages: 15-27
Issue: 1
Volume: 31
Year: 2004
Keywords: Clickstream analysis, multilevel models, multivariate logits, ranking, transition counts,
X-DOI: 10.1080/0266476032000148920
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148920
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:1:p:15-27
Template-Type: ReDIF-Article 1.0
Author-Name: L. Pace
Author-X-Name-First: L.
Author-X-Name-Last: Pace
Author-Name: A. Salvan
Author-X-Name-First: A.
Author-X-Name-Last: Salvan
Author-Name: L. Ventura
Author-X-Name-First: L.
Author-X-Name-Last: Ventura
Title: The Effects of Rounding on Likelihood Procedures
Abstract:
The aim of this paper is to investigate the robustness properties of
likelihood inference with respect to rounding effects. Attention is
focused on exponential families and on inference about a scalar parameter
of interest, also in the presence of nuisance parameters. A summary value
of the influence function of a given statistic, the local-shift
sensitivity, is considered. It accounts for small fluctuations in the
observations. The main result is that the local-shift sensitivity is
bounded for the usual likelihood-based statistics, i.e. the directed
likelihood, the Wald and score statistics. It is also bounded for the
modified directed likelihood, which is a higher-order adjustment of the
directed likelihood. The practical implication is that likelihood
inference is expected to be robust with respect to rounding effects.
Theoretical analysis is supplemented and confirmed by a number of Monte
Carlo studies, performed to assess the coverage probabilities of
confidence intervals based on likelihood procedures when data are rounded.
In addition, simulations indicate that the directed likelihood is less
sensitive to rounding effects than the Wald and score statistics. This
provides another criterion for choosing among first-order equivalent
likelihood procedures. The modified directed likelihood shows the same
robustness as the directed likelihood, so that its gain in inferential
accuracy does not come at the price of an increase in instability with
respect to rounding.
Journal: Journal of Applied Statistics
Pages: 29-48
Issue: 1
Volume: 31
Year: 2004
Keywords: Directed likelihood, exponential family, higher-order asymptotics, influence function, maximum likelihood estimator, modified directed likelihood, robustness, Wald test,
X-DOI: 10.1080/0266476032000148939
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:1:p:29-48
Template-Type: ReDIF-Article 1.0
Author-Name: A. Martin Andres
Author-X-Name-First: A. Martin
Author-X-Name-Last: Andres
Author-Name: I. Herranz Tejedor
Author-X-Name-First: I. Herranz
Author-X-Name-Last: Tejedor
Title: The Equivalence of Two Proportions Revisited
Abstract:
The classic conditional test for checking that the difference between two
independent proportions is not null may not be appropriate in many
circumstances. Dunnett & Gent (1977) showed that in clinical trials, in
studies of drugs, etc., the aim is to prove the practical equality
(equivalence) of both proportions. On other occasions the aim may be the
opposite: i.e. to prove that the two proportions are substantially
different (biologically significant). Both cases are usually solved by two
one-sided tests (TOST). In this article, this procedure is shown to
be conservative and two true two-sided tests for each case are proposed.
Journal: Journal of Applied Statistics
Pages: 61-72
Issue: 1
Volume: 31
Year: 2004
Keywords: Confidence intervals, comparisons of two proportions, conditional test, equivalence test, Z test, χ² test, 2×2 tables,
X-DOI: 10.1080/0266476032000148957
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148957
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:1:p:61-72
Template-Type: ReDIF-Article 1.0
Author-Name: C. Illert
Author-X-Name-First: C.
Author-X-Name-Last: Illert
Author-Name: A. Allison
Author-X-Name-First: A.
Author-X-Name-Last: Allison
Title: Phono-genesis and the Origin of Accusative Syntax in Proto-Australian Language
Abstract:
It is claimed that a set of 62 known (Illert, 2003) ancient Aboriginal
words constitute a representative sample of the original proto-Australian
lexicon whose maximum likelihood (Fisher, 1912) 'power law signature' is
determined and shown to precisely fit genetically related 'modern'
lexicons from south-eastern-Australia. This measure of 'sameness' builds
the confidence required to justify inter-lexicon diachronic
word-frequency comparisons, which provide a powerful new statistical tool
capable of revealing important features of ancestral grammar. This paper
supplies the first ever published modern translations of authentic
traditional language documented in obscure literary and archival sources
which have, until recently, been lost (Dawes, 1790b; Wood, 1924; Troy,
1992) or overlooked (Everitt et al., 1900; Illert, 2001) for centuries.
These newly found examples of accusative syntax supported by word-
frequency data may come as quite a surprise to some linguists (Dixon,
1980; Osmond, 1989; Troy, 1992; Nichols, 1993) who, in the absence of
adequate evidence, seem to have long imagined that language from this
region—if not the entire continent—simply had to be
inherently and at the core ergative. On the contrary, we find that changing
word-frequencies, from proto-Australian to modern times, supply
overwhelming evidence of the emergence of ancient accusative prefixes
which have even survived into recent centuries in the Sydney region.
Additionally, it is found that, over millennia, words die off in a lexicon,
replaced by others, according to the famous 'mortality law' of
Gompertz (1825) which also describes the likelihood of death of biological
organisms within populations and is the basis for modern actuarial science
(Bowers et al., 1997). Just as disease and epidemics can wipe out entire
cohorts of creatures from a population, so too can syntactic change
annihilate word-classes in an evolving lexicon.
Journal: Journal of Applied Statistics
Pages: 73-104
Issue: 1
Volume: 31
Year: 2004
Keywords: Maximum-likelihood, frequency analysis, Gompertz law, morpho-statistics, linguistics,
X-DOI: 10.1080/0266476032000148966
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148966
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:1:p:73-104
Template-Type: ReDIF-Article 1.0
Author-Name: T. Nummi
Author-X-Name-First: T.
Author-X-Name-Last: Nummi
Author-Name: J. Mottonen
Author-X-Name-First: J.
Author-X-Name-Last: Mottonen
Title: Prediction of Stem Measurements of Scots Pine
Abstract:
The aim of this study was to investigate prediction of stem measurements
of Scots pine (Pinus sylvestris L.) for a modern computerized forest
harvester. We are interested in the prediction of stem curve measurements
when measurements of stems already processed and a short section of the
stem under process are known. The techniques presented here are based on
cubic smoothing splines and on multivariate regression models. One
advantage of these methods is that they do not assume any special
functional form of the stem curve. They can also be applied to the
prediction of branch limits and stem height of pine stems.
Journal: Journal of Applied Statistics
Pages: 105-114
Issue: 1
Volume: 31
Year: 2004
Keywords: Cubic smoothing splines, forest harvesting, mixed models,
X-DOI: 10.1080/0266476032000148975
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148975
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:1:p:105-114
Template-Type: ReDIF-Article 1.0
Author-Name: Pedro Gouveia
Author-X-Name-First: Pedro
Author-X-Name-Last: Gouveia
Author-Name: Paulo Rodrigues
Author-X-Name-First: Paulo
Author-X-Name-Last: Rodrigues
Title: Threshold Cointegration and the PPP Hypothesis
Abstract:
Self-Exciting Threshold Autoregressive (SETAR) models are a non-linear
variant of conventional linear Autoregressive (AR) models. One advantage
of SETAR models over conventional AR models lies in their flexibility in
dealing with possible asymmetric behaviour of economic variables. The
concept of threshold cointegration implies that the Error Correction
Mechanism (ECM) is inactive within a particular interval, as a result of
adjustment costs, and active when deviations from equilibrium exceed
certain thresholds. For instance, the presence of adjustment costs can, in
many circumstances, explain why economic agents intervene to recalibrate
back to a tolerable limit only when the benefits of adjustment outweigh
its costs. We introduce an approach that
accounts for potential asymmetry and we investigate the presence of the
relative version of the purchasing power parity (PPP) hypothesis for 14
countries. Based on a threshold cointegration adaptation of the unit root
test procedure suggested by Caner & Hansen (2001), we find evidence of an
asymmetric adjustment for the relative version of PPP for eight pairs of
countries.
Journal: Journal of Applied Statistics
Pages: 115-127
Issue: 1
Volume: 31
Year: 2004
Keywords: Nonlinearity, cointegration, Setar models,
X-DOI: 10.1080/0266476032000148984
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148984
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:1:p:115-127
Template-Type: ReDIF-Article 1.0
Author-Name: Rand Wilcox
Author-X-Name-First: Rand
Author-X-Name-Last: Wilcox
Title: Inferences Based on a Skipped Correlation Coefficient
Abstract:
The most popular method for trying to detect an association between two
random variables is to test H0 : ρ=0, the hypothesis that Pearson's
correlation is equal to zero. It is well known, however, that Pearson's
correlation is not robust, roughly meaning that small changes in any
distribution, including any bivariate normal distribution as a special
case, can alter its value. Moreover, the usual estimate of ρ, r, is
sensitive to even a few outliers, which can mask a true association. A
simple alternative to testing H0 : ρ =0 is to switch to a measure of
association that guards against outliers among the marginal distributions
such as Kendall's tau, Spearman's rho, a Winsorized correlation, or a
so-called percentage bend correlation. But it is known that these methods
fail to take into account the overall structure of the data. Many measures
of association that do take into account the overall structure of the data
have been proposed, but it seems that nothing is known about how they
might be used to detect dependence. One such measure of association is
selected, which is designed so that under bivariate normality, its
estimator gives a reasonably accurate estimate of ρ. Then methods
for testing the hypothesis of a zero correlation are studied.
Journal: Journal of Applied Statistics
Pages: 131-143
Issue: 2
Volume: 31
Year: 2004
Keywords: Skipped correlation coefficient, inferences, random variables, Pearson's correlation,
X-DOI: 10.1080/0266476032000148821
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148821
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:2:p:131-143
Template-Type: ReDIF-Article 1.0
Author-Name: Markus Neuhauser
Author-X-Name-First: Markus
Author-X-Name-Last: Neuhauser
Author-Name: Herbert Buning
Author-X-Name-First: Herbert
Author-X-Name-Last: Buning
Author-Name: Ludwig Hothorn
Author-X-Name-First: Ludwig
Author-X-Name-Last: Hothorn
Title: Maximum Test versus Adaptive Tests for the Two-Sample Location Problem
Abstract:
For the non-parametric two-sample location problem, adaptive tests based
on a selector statistic are compared with a maximum test and a sum test.
When the class of all continuous distributions is not
restricted, the sum test is not a robust test, i.e. it does not have a
relatively high power across the different possible distributions.
However, according to our simulation results, the adaptive tests as well
as the maximum test are robust. For a small sample size, the maximum test
is preferable, whereas for a large sample size the comparison between the
adaptive tests and the maximum test does not show a clear winner.
Consequently, one may argue in favour of the maximum test since it is a
useful test for all sample sizes. Furthermore, it does not need a selector
and the specification of which test is to be performed for which values of
the selector. When the family of possible distributions is restricted, the
maximin efficiency robust test may be a further robust alternative.
However, for the family of t distributions this test is not as powerful as
the corresponding maximum test.
Journal: Journal of Applied Statistics
Pages: 215-227
Issue: 2
Volume: 31
Year: 2004
Keywords: Location-shift model, measures of skewness and tailweight, maximin efficiency robust test, non-parametric tests, two-sample location problem, selector statistic,
X-DOI: 10.1080/0266476032000148876
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148876
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:2:p:215-227
Template-Type: ReDIF-Article 1.0
Author-Name: Theodore Karrison
Author-X-Name-First: Theodore
Author-X-Name-Last: Karrison
Author-Name: Peter O'Brien
Author-X-Name-First: Peter
Author-X-Name-Last: O'Brien
Title: A Rank-Sum-Type Test for Paired Data with Multiple Endpoints
Abstract:
Clinical trials and other types of studies often examine the effects of a
particular treatment or experimental condition on a number of different
response variables. Although the usual approach for analysing such data is
to examine each variable separately, this can increase the chance of false
positive findings. Bonferroni's inequality or Hotelling's T2 statistic can
be employed to control the overall type I error rate, but these tests
generally lack power for alternatives in which the treatment improves the
outcome on most or all of the endpoints. For the comparison of independent
groups, O'Brien (1984) developed a rank-sum type test that has greater
power than the Bonferroni and T2 procedures when one treatment is
uniformly better (i.e. for all endpoints) than the other treatment(s). In
this paper we adapt the rank-sum test to studies involving paired data and
demonstrate that it, too, has power advantages for such alternatives.
Simulation results are described, and an example from a study measuring
the effects of sleep loss on glucose metabolism is presented to illustrate
the methodology.
Journal: Journal of Applied Statistics
Pages: 229-238
Issue: 2
Volume: 31
Year: 2004
Keywords: Multiple endpoints, paired data, non-parametric tests, statistical power,
X-DOI: 10.1080/0266476032000148885
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148885
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:2:p:229-238
Template-Type: ReDIF-Article 1.0
Author-Name: Tom Benton
Author-X-Name-First: Tom
Author-X-Name-Last: Benton
Author-Name: David Hand
Author-X-Name-First: David
Author-X-Name-Last: Hand
Author-Name: Martin Crowder
Author-X-Name-First: Martin
Author-X-Name-Last: Crowder
Title: Two zs are Better than One
Abstract:
Given only a random sample of observations, the usual estimator for the
population mean is the sample mean. If additional information is provided
it might be possible in some situations to obtain a better estimator. The
situation considered here is when the variable whose mean is sought is
composed of factors that are themselves observable. In the basic case, the
variable can be expressed as the product of two independent, more basic
variables, but we also consider the case of more than two, the effect of
correlation, and when there are observation costs.
Journal: Journal of Applied Statistics
Pages: 239-247
Issue: 2
Volume: 31
Year: 2004
Keywords: Sample mean, component observations, product estimators,
X-DOI: 10.1080/0266476032000148894
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476032000148894
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:2:p:239-247
Template-Type: ReDIF-Article 1.0
Author-Name: Guillermo Miro-Quesada
Author-X-Name-First: Guillermo
Author-X-Name-Last: Miro-Quesada
Author-Name: Enrique Del Castillo
Author-X-Name-First: Enrique
Author-X-Name-Last: Del Castillo
Author-Name: John Peterson
Author-X-Name-First: John
Author-X-Name-Last: Peterson
Title: A Bayesian Approach for Multiple Response Surface Optimization in the Presence of Noise Variables
Abstract:
An approach for the multiple response robust parameter design problem
based on a methodology by Peterson (2000) is presented. The approach is
Bayesian, and consists of maximizing the posterior predictive probability
that the process satisfies a set of constraints on the responses. In order
to find a solution robust to variation in the noise variables, the
predictive density is integrated not only with respect to the response
variables but also with respect to the assumed distribution of the noise
variables. The maximization problem involves repeated Monte Carlo
integrations, and two different methods to solve it are evaluated.
Matlab code was written that rapidly finds an optimal (robust) solution
when one exists. Two examples taken from the literature are used to
illustrate the proposed method.
Journal: Journal of Applied Statistics
Pages: 251-270
Issue: 3
Volume: 31
Year: 2004
Keywords: Response surface methodology, robust parameter design, Bayesian statistics, Monte Carlo integration,
X-DOI: 10.1080/0266476042000184019
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184019
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:251-270
Template-Type: ReDIF-Article 1.0
Author-Name: Sadao Tomizawa
Author-X-Name-First: Sadao
Author-X-Name-Last: Tomizawa
Author-Name: Nobuko Miyamoto
Author-X-Name-First: Nobuko
Author-X-Name-Last: Miyamoto
Author-Name: Ryo Funato
Author-X-Name-First: Ryo
Author-X-Name-Last: Funato
Title: Conditional Difference Asymmetry Model for Square Contingency Tables with Nominal Categories
Abstract:
This paper proposes a model, an extension of the symmetry model, for
square contingency tables with the same nominal row and column
classifications. The model states that the absolute value of the
difference between the conditional probability that an observation will
fall in cell (i, j), given that it falls in cell (i, j) or (j, i), and
the conditional probability that it falls in cell (j, i) under the same
condition, is constant for every i≠j. The model describes a
structure of asymmetry (not symmetry), and it is applied to the data on a
nominal scale. An example is given.
Journal: Journal of Applied Statistics
Pages: 271-277
Issue: 3
Volume: 31
Year: 2004
Keywords: Asymmetry, conditional distribution, nominal category, model, square table, symmetry,
X-DOI: 10.1080/0266476042000184028
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184028
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:271-277
Template-Type: ReDIF-Article 1.0
Author-Name: H. Bayo Lawal
Author-X-Name-First: H. Bayo
Author-X-Name-Last: Lawal
Title: Using a GLM to Decompose the Symmetry Model in Square Contingency Tables with Ordered Categories
Abstract:
In this paper, we are employing the generalized linear model (GLM) in the
form lij=Xλ to decompose the symmetry model into the class of models
discussed in Tomizawa (1992). In this formulation, the random component
would be the observed counts fij with an underlying Poisson distribution.
This approach utilizes the non-standard log-linear model and our focus in
this paper therefore relates to models that are decompositions of the
complete symmetry model. That is, models that are implied by the symmetry
models. We develop factor and regression variables required for the
implementation of these models in SAS PROC GENMOD and SPSS PROC GENLOG. We
apply this methodology to analyse three 4×4 contingency tables,
one of which is the Japanese unaided distance vision data. Results
obtained in this study are consistent with the extensive literature
on the subject. We further extend our applications to the
6×6 Brazilian social mobility data. We found that both the quasi
linear diagonal-parameters symmetry (QLDPS) and the quasi 2-ratios
parameter symmetry (Q2RPS) models fit the Brazilian data very well.
The most parsimonious models are the QLDPS and the quasi-conditional
symmetry (QCS) models. The SAS and SPSS programs for implementing the models
discussed in this paper are presented in Appendices A, B and C.
Journal: Journal of Applied Statistics
Pages: 279-303
Issue: 3
Volume: 31
Year: 2004
Keywords: Poisson, factor, regression, quasi-diagonal symmetry model,
X-DOI: 10.1080/0266476042000184037
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184037
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:279-303
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Berger
Author-X-Name-First: Yves
Author-X-Name-Last: Berger
Title: A Simple Variance Estimator for Unequal Probability Sampling without Replacement
Abstract:
Survey sampling textbooks often refer to the Sen-Yates-Grundy variance
estimator for use with without-replacement unequal probability designs.
This estimator is rarely implemented because of the complexity of
determining joint inclusion probabilities. In practice, the variance is
usually estimated by simpler variance estimators such as the
Hansen-Hurwitz with-replacement variance estimator, which often leads to
overestimation of the variance for the large sampling fractions that are
common in business surveys. We will consider an alternative estimator: the
Hajek (1964) variance estimator, which depends on the first-order inclusion
probabilities only and is usually more accurate than the Hansen-Hurwitz
estimator. We review this estimator and show its practical value. We
propose an alternative expression that is as simple as the Hansen-Hurwitz
estimator. We also show how the Hajek estimator can be easily
implemented with standard statistical packages.
Journal: Journal of Applied Statistics
Pages: 305-315
Issue: 3
Volume: 31
Year: 2004
Keywords: Design-based inference, Hansen-Hurwitz variance estimator, inclusion probabilities, π-estimator, Sen-Yates-Grundy variance estimator,
X-DOI: 10.1080/0266476042000184046
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184046
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:305-315
Template-Type: ReDIF-Article 1.0
Author-Name: A. A. Kalgonda
Author-X-Name-First: A. A.
Author-X-Name-Last: Kalgonda
Author-Name: S. R. Kulkarni
Author-X-Name-First: S. R.
Author-X-Name-Last: Kulkarni
Title: Multivariate Quality Control Chart for Autocorrelated Processes
Abstract:
Traditional multivariate statistical process control (SPC) techniques are
based on the assumption that the successive observation vectors are
independent. In recent years, due to automation of measurement and data
collection systems, a process can be sampled at higher rates, which
ultimately leads to autocorrelation. Consequently, when the
autocorrelation is present in the data, it can have a serious impact on
the performance of classical control charts. This paper considers the
problem of monitoring the mean vector of a process in which observations
can be modelled as a first-order vector autoregressive VAR (1) process. We
propose a control chart called Z-chart which is based on the single step
finite intersection test (Timm, 1996). An important feature of the
proposed method is that it not only detects an out-of-control status but
also helps in identifying the variable(s) responsible for the out-of-control
situation. The proposed method is illustrated with suitable examples.
Journal: Journal of Applied Statistics
Pages: 317-327
Issue: 3
Volume: 31
Year: 2004
Keywords: Multivariate statistical process control, autocorrelation,
X-DOI: 10.1080/0266476042000184000
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184000
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:317-327
Template-Type: ReDIF-Article 1.0
Author-Name: R. Deardon
Author-X-Name-First: R.
Author-X-Name-Last: Deardon
Author-Name: S. G. Gilmour
Author-X-Name-First: S. G.
Author-X-Name-Last: Gilmour
Author-Name: N. A. Butler
Author-X-Name-First: N. A.
Author-X-Name-Last: Butler
Author-Name: K. Phelps
Author-X-Name-First: K.
Author-X-Name-Last: Phelps
Author-Name: R. Kennedy
Author-X-Name-First: R.
Author-X-Name-Last: Kennedy
Title: A Method for Ascertaining and Controlling Representation Bias in Field Trials for Airborne Plant Pathogens
Abstract:
The basic premise of running a field trial is that the estimates of
treatment effects obtained are representative of how the different
treatments will perform in the field. The disparities between the
treatment effects observed experimentally, and those that would be
observed were the treatments applied to the field, we term 'representation
bias'. When looking at field trials testing the efficacies of treatment
sprays on plant pathogens, representation bias can be caused by positive
and negative inter-plot interference. The potential for such effects will
be greatest when looking at pathogens that are dispersed by wind. In this
paper, a computer program that simulates plant disease dispersal under
such conditions is described. This program is used to quantify the amount
of representation bias occurring in various experimental situations.
Through this, the relationships between field design parameters and
representation bias are explored, and the importance of plot dimension and
spacing, as well as treatment to plot allocation, emphasized.
Journal: Journal of Applied Statistics
Pages: 329-343
Issue: 3
Volume: 31
Year: 2004
Keywords: Inter-plot interference, experimental design, plant pathology, simulation of plant disease dispersal,
X-DOI: 10.1080/0266476042000184073
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184073
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:329-343
Template-Type: ReDIF-Article 1.0
Author-Name: Willem Albers
Author-X-Name-First: Willem
Author-X-Name-Last: Albers
Author-Name: Wilbert Kallenberg
Author-X-Name-First: Wilbert
Author-X-Name-Last: Kallenberg
Title: Empirical Non-Parametric Control Charts: Estimation Effects and Corrections
Abstract:
Owing to the extreme quantiles involved, standard control charts are very
sensitive to the effects of parameter estimation and non-normality. More
general parametric charts have been devised to deal with the latter
complication and corrections have been derived to compensate for the
estimation step, both under normal and parametric models. The resulting
procedures offer a satisfactory solution over a broad range of underlying
distributions. However, situations do occur where even such a large model
is inadequate and nothing remains but to consider non-parametric charts.
In principle, these form ideal solutions, but the problem is that huge
sample sizes are required for the estimation step. Otherwise the resulting
stochastic error is so large that the chart is very unstable, a
disadvantage that seems to outweigh the advantage of avoiding the model
error of the parametric case. Here we analyse under what conditions
non-parametric charts actually become feasible alternatives to their
parametric counterparts. In particular, corrected versions are suggested
for which a possible change point is reached at sample sizes that are
markedly less huge (but still larger than the customary range). These
corrections serve to control the behaviour during in-control (markedly
wrong outcomes of the estimates only occur sufficiently rarely). The price
for this protection will clearly be some loss of detection power during
out-of-control. A change point comes in view as soon as this loss can be
made sufficiently small.
Journal: Journal of Applied Statistics
Pages: 345-360
Issue: 3
Volume: 31
Year: 2004
Keywords: Statistical process control, Phase II control limits, exceedance probability, empirical quantiles,
X-DOI: 10.1080/0266476042000184055
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184055
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:345-360
Template-Type: ReDIF-Article 1.0
Author-Name: Thaddeus Tarpey
Author-X-Name-First: Thaddeus
Author-X-Name-Last: Tarpey
Author-Name: Richard Sanders
Author-X-Name-First: Richard
Author-X-Name-Last: Sanders
Title: Linear Conditional Expectation for Discretized Distributions
Abstract:
Many statistical methods for continuous distributions assume a linear
conditional expectation. Components of multivariate distributions are
often measured on a discrete ordinal scale based on a discretization of an
underlying continuous latent variable. The results in this paper show that
common examples of discretized bivariate and trivariate distributions will
have a linear conditional expectation. Examples and simulations are
provided to illustrate the results.
Journal: Journal of Applied Statistics
Pages: 361-372
Issue: 3
Volume: 31
Year: 2004
Keywords: Conditional expectation, biserial/polyserial correlations, elliptical distributions, polychoric correlations, tetrachoric correlations,
X-DOI: 10.1080/0266476042000184064
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000184064
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:3:p:361-372
Template-Type: ReDIF-Article 1.0
Author-Name: H. E. T. Holgersson
Author-X-Name-First: H. E. T.
Author-X-Name-Last: Holgersson
Title: Testing for Multivariate Autocorrelation
Abstract:
This paper concerns the problem of assessing autocorrelation of
multivariate (i.e. systemwise) models. It is well known that systemwise
diagnostic tests for autocorrelation often suffer from poor small-sample
properties, in the sense that the true size overstates the nominal size.
The failure to keep control of the size usually stems from the fact
that the critical values (used to decide the rejection area) originate
from the slowly converging asymptotic null distribution. Another drawback
of existing tests is that the power may be rather low if the deviation
from the null is not symmetrical over the marginal models. In this paper
we consider four quite different test techniques for autocorrelation.
These are (i) Pillai's trace, (ii) Roy's largest root, (iii) the maximum
F-statistic and (iv) the maximum t2 test. We show how to obtain control of
the size of the tests, and then examine the true (small sample) size and
power properties by means of Monte Carlo simulations.
Journal: Journal of Applied Statistics
Pages: 379-395
Issue: 4
Volume: 31
Year: 2004
Keywords: Autocorrelation Test, Multivariate Analysis, Linear Hypothesis, Residuals,
X-DOI: 10.1080/02664760410001681693
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681693
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:379-395
Template-Type: ReDIF-Article 1.0
Author-Name: Xia Pan
Author-X-Name-First: Xia
Author-X-Name-Last: Pan
Author-Name: Jeffrey Jarrett
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Jarrett
Title: Applying State Space to SPC: Monitoring Multivariate Time Series
Abstract:
Monitoring cross-sectional and serially interdependent processes has
become a new issue in statistical process control (SPC). In up-to-date SPC
literature, Kalman filtering was reported to monitor univariate
autocorrelated processes. This paper applies a Kalman filter or
state-space method for SPC to monitoring multivariate time series. We use
Aoki's approach to estimate the parameter matrices of a state-space model.
Multivariate Hotelling T2 control charts are employed to monitor the
residuals of the state-space model. Examples of this approach are illustrated.
Journal: Journal of Applied Statistics
Pages: 397-418
Issue: 4
Volume: 31
Year: 2004
Keywords: Quality Control Charts, SPC, State-space, Multivariate Time Series, Aoki's Approach,
X-DOI: 10.1080/02664760410001681701
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681701
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:397-418
Template-Type: ReDIF-Article 1.0
Author-Name: Ping Sa
Author-X-Name-First: Ping
Author-X-Name-Last: Sa
Author-Name: Luminita Razaila
Author-X-Name-First: Luminita
Author-X-Name-Last: Razaila
Title: One-sided Continuous Tolerance Limits and their Accompanying Sample Size Problem
Abstract:
Tolerance limits are those limits that contain a certain proportion of
the distribution of a characteristic with a given probability. 'They are
used to make sure that the production will not be outside of
specifications' (Amin & Lee, 1999). Usually, tolerance limits are
constructed at the beginning of the monitoring of the process. Since they
are calculated just one time, these tolerance limits cannot reflect
changes of tolerance level over the lifetime of the process. This research
proposes an algorithm to construct tolerance limits continuously over time
for any given distribution. This algorithm makes use of the exponentially
weighted moving average (EWMA) technique. It can be observed that the
sample size required by this method is reduced over time.
Journal: Journal of Applied Statistics
Pages: 419-434
Issue: 4
Volume: 31
Year: 2004
Keywords: Tolerance Limits, Exponentially Weighted Moving Average Technique, Order Statistics,
X-DOI: 10.1080/02664760410001681710
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681710
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:419-434
Template-Type: ReDIF-Article 1.0
Author-Name: Sanjoy Roy Chowdhury
Author-X-Name-First: Sanjoy Roy
Author-X-Name-Last: Chowdhury
Title: Catalogue of Group Structures for Three-level Fractional Factorial Designs
Abstract:
Taguchi (1959) introduced the concept of split-unit design to sort the
factors into different groups depending upon the difficulties involved in
changing the levels of factors. Li et al. (1991) renamed it as split-plot
design. Chen et al. (1993) have given a catalogue of small designs for
two- and three-level fractional factorial designs pertaining to a single
type of factors. Aggarwal et al. (1997) have given a catalogue of group
structure for two-level fractional factorial designs developed under the
concept of split-plot design. In this paper, an algorithm is developed
for generating group structures and possible allocations for
various 3^(n-k) fractional factorial designs.
Journal: Journal of Applied Statistics
Pages: 435-444
Issue: 4
Volume: 31
Year: 2004
Keywords: Interaction Graphs, Orthogonal Arrays, Split Plot Designs, Group Structures, Word Length Patterns,
X-DOI: 10.1080/02664760410001681729
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681729
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:435-444
Template-Type: ReDIF-Article 1.0
Author-Name: Petros Maravelakis
Author-X-Name-First: Petros
Author-X-Name-Last: Maravelakis
Author-Name: John Panaretos
Author-X-Name-First: John
Author-X-Name-Last: Panaretos
Author-Name: Stelios Psarakis
Author-X-Name-First: Stelios
Author-X-Name-Last: Psarakis
Title: EWMA Chart and Measurement Error
Abstract:
Measurement error is a commonly encountered distortion factor in real-world
applications that influences the outcome of a process. In this paper, we
examine the effect of measurement error on the ability of the EWMA control
chart to detect out-of-control situations. The model used is the one
involving linear covariates. We investigate the ability of the EWMA chart
in the case of a shift in mean. The effect of taking multiple measurements
on each sampled unit and the case of linearly increasing variance are also
examined. We prove that, in the case of measurement error, the performance
of the chart regarding the mean is significantly affected.
Journal: Journal of Applied Statistics
Pages: 445-455
Issue: 4
Volume: 31
Year: 2004
Keywords: Exponentially Weighted Moving Average Control Chart, Average Run Length, Average Time To Signal, Measurement Error, Markov Chain, Statistical Process Control,
X-DOI: 10.1080/02664760410001681738
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681738
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:445-455
Template-Type: ReDIF-Article 1.0
Author-Name: Rose Baker
Author-X-Name-First: Rose
Author-X-Name-Last: Baker
Title: A Modified Knox Test of Space-Time Clustering
Abstract:
Tests of space-time clustering such as the Knox test are used by
epidemiologists in the preliminary analysis of datasets where an
infectious aetiology is suspected. The Knox test statistic is the number
of cases close in both space and time to another case. The test statistic
proposed here is the excess number of such cases over that expected under
H0 of no infection. It is argued that this modified test is more powerful
than the Knox test, because the test statistic is not heavily tied, as
the Knox test statistic is. The use of the test is illustrated with examples.
Journal: Journal of Applied Statistics
Pages: 457-463
Issue: 4
Volume: 31
Year: 2004
Keywords: Epidemiology, Permutation Test, Monte Carlo, Fortran95 Program,
X-DOI: 10.1080/02664760410001681747
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681747
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:457-463
Template-Type: ReDIF-Article 1.0
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Author-Name: J. Kirkbride
Author-X-Name-First: J.
Author-X-Name-Last: Kirkbride
Author-Name: F. L. Bookstein
Author-X-Name-First: F. L.
Author-X-Name-Last: Bookstein
Title: Statistics of Shape, Direction and Cylindrical Variables
Abstract:
In statistical shape analysis, the shape of an object is understood to be
what remains after the effects of location, scale and rotation are
removed. We consider the distributional problem of triangular shape and an
associated direction, motivated by a data set of microscopic fossils. We
begin by constructing a parallel transport system such that the data
transform onto the space S^2 × S^1. A joint shape distribution on
S^2 × S^1 is proposed based on Jupp & Mardia's bivariate distribution on
S^2 × S^1. For concentrated data, an approximation to the distribution on
S^2 × S^1 is given by a distribution on R^2 × S^1, and we explore
a distribution on this space by extending Mardia & Sutton's distribution
on R^2 × S^1. In this distribution, the expected edgel direction
varies linearly in the shape coordinates. This is found to be a useful
model for the microfossil data.
Journal: Journal of Applied Statistics
Pages: 465-479
Issue: 4
Volume: 31
Year: 2004
Keywords: Bookstein Coordinates, Edgel, Fisher Distribution, Kendall Coordinates, Microfossil Data, Triangle Shape, Von Mises Distribution,
X-DOI: 10.1080/02664760410001681756
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681756
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:465-479
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: K. Subramani
Author-X-Name-First: K.
Author-X-Name-Last: Subramani
Title: Modified CSP-C Continuous Sampling Plan for Consumer Protection
Abstract:
Dodge (1943) introduced a single level attribute continuous sampling plan
designated as CSP-1 for the application of continuous production
processes. Govindaraju & Kandasamy (2000) developed a new single level
continuous sampling plan whose sampling inspection phase is characterized
by a maximum allowable number of non-conforming units c, and a constant
sampling rate f and was designated as CSP-C. In this paper, a modification
is proposed on the CSP-C continuous sampling plan. In this modified plan,
sampling inspection is continued until the occurrence of c+1
non-conforming units, provided the first m sampled units have been found
conforming during the sampling phase. Using a Markov chain model,
expressions for the performance measures of the modified CSP-C plan are
derived. The main advantage of the modified plan is that it is possible to
lower the average outgoing quality limit.
Journal: Journal of Applied Statistics
Pages: 481-494
Issue: 4
Volume: 31
Year: 2004
Keywords: CSP-C Continuous Sampling, Production Processes, Consumer Protection,
X-DOI: 10.1080/02664760410001681765A
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681765A
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:4:p:481-494
Template-Type: ReDIF-Article 1.0
Author-Name: B. M. Colosimo
Author-X-Name-First: B. M.
Author-X-Name-Last: Colosimo
Author-Name: R. Pan
Author-X-Name-First: R.
Author-X-Name-Last: Pan
Author-Name: E. del Castillo
Author-X-Name-First: E. del
Author-X-Name-Last: Castillo
Title: A Sequential Markov Chain Monte Carlo Approach to Set-up Adjustment of a Process over a Set of Lots
Abstract:
We consider the problem of adjusting a machine that manufactures parts in
batches or lots and experiences random offsets or shifts whenever a set-up
operation takes place between lots. The existing procedures for adjusting
set-up errors in a production process over a set of lots are based on the
assumption of known process parameters. In practice, these parameters are
usually unknown, especially in short-run production. Due to this lack of
knowledge, adjustment procedures such as Grubbs' (1954, 1983) rules and
discrete integral controllers (also called EWMA controllers) aimed at
adjusting for the initial offset in each single lot, are typically used.
This paper presents an approach for adjusting the initial machine offset
over a set of lots when the process parameters are unknown and are
iteratively estimated using Markov Chain Monte Carlo (MCMC). As each
observation becomes available, a Gibbs Sampler is run to estimate the
parameters of a hierarchical normal means model given the observations up
to that point in time. The current lot mean estimate is then used for
adjustment. If used over a series of lots, the proposed method allows one
eventually to start adjusting the offset before producing the first part
in each lot. The method is illustrated with application to two examples
reported in the literature. It is shown how the proposed MCMC adjusting
procedure can outperform existing rules based on a quadratic off-target
criterion.
Journal: Journal of Applied Statistics
Pages: 499-520
Issue: 5
Volume: 31
Year: 2004
Keywords: Process Adjustment, Gibbs Sampling, Bayesian Hierarchical Models, Random Effects Model, Normal Means Model, Process Control,
X-DOI: 10.1080/02664760410001681765
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681765
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:499-520
Template-Type: ReDIF-Article 1.0
Author-Name: Darryl Holden
Author-X-Name-First: Darryl
Author-X-Name-Last: Holden
Title: Testing the Normality Assumption in the Tobit Model
Abstract:
This paper examines a number of statistics that have been proposed to
test the normality assumption in the tobit (censored regression) model. It
argues that a number of commonly proposed statistics can be interpreted as
different versions of the Lagrange multiplier, or score, test for a common
null hypothesis. This observation is useful in examining the Monte Carlo
results presented in the paper. The Monte Carlo results suggest that the
computational convenience of a number of statistics is obtained at the
cost of poor finite sample performance under the null hypothesis.
Journal: Journal of Applied Statistics
Pages: 521-532
Issue: 5
Volume: 31
Year: 2004
Keywords: Tobit (Censored Regression) And Probit Models, Normality, Lagrange Multiplier (Score) Tests, Hours Of Work Equations,
X-DOI: 10.1080/02664760410001681783
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:521-532
Template-Type: ReDIF-Article 1.0
Author-Name: Kui-Yin Cheung
Author-X-Name-First: Kui-Yin
Author-X-Name-Last: Cheung
Author-Name: Elspeth Thomson
Author-X-Name-First: Elspeth
Author-X-Name-Last: Thomson
Title: The Demand for Gasoline in China: A Cointegration Analysis
Abstract:
The economic reforms in China since 1979 and consequent increases in
disposable income have caused total gasoline consumption to soar nearly
240% between 1980 and 1999. As the growth rate of gasoline consumption is
expected to be high due to the increased economic activity resulting from
China's re-accession to the WTO, the government must understand the
implications for economic growth and balance of payments. Using
cointegration techniques, it was found that, between 1980 and 1999, demand
for gasoline was relatively inelastic to price changes, both in the short
and long terms. The long-run income elasticity was 0.97, implying that the
future growth rate of gasoline consumption will be close to the growth
rate of the economy, which is predicted to be about 7% per annum from 2001
to 2005, and 5-6% over the decade thereafter.
Journal: Journal of Applied Statistics
Pages: 533-544
Issue: 5
Volume: 31
Year: 2004
Keywords: Gasoline Consumption, Price And Income Elasticities, Cointegration Analysis,
X-DOI: 10.1080/02664760410001681837
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681837
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:533-544
Template-Type: ReDIF-Article 1.0
Author-Name: K. G. Russell
Author-X-Name-First: K. G.
Author-X-Name-Last: Russell
Author-Name: S. M. Lewis
Author-X-Name-First: S. M.
Author-X-Name-Last: Lewis
Author-Name: A. Dean
Author-X-Name-First: A.
Author-X-Name-Last: Dean
Title: Fractional Factorial Designs for the Detection of Interactions between Design and Noise Factors
Abstract:
In industrial experiments on both design (control) factors and noise
factors aimed at improving the quality of manufactured products, designs
are needed which afford independent estimation of all design×noise
interactions in as few runs as possible, while allowing aliasing between
those factorial effects of less interest. An algorithm for generating
orthogonal fractional factorial designs of this type is described for
factors at two levels. The generated designs are appropriate for
experimenting on individual factors or for experimentation involving group
screening of factors.
Journal: Journal of Applied Statistics
Pages: 545-552
Issue: 5
Volume: 31
Year: 2004
Keywords: Algorithms, Alias Count Vectors, Regular Fractional Factorials, Tables Of Designs,
X-DOI: 10.1080/02664760410001681800
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681800
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:545-552
Template-Type: ReDIF-Article 1.0
Author-Name: P. H. Chau
Author-X-Name-First: P. H.
Author-X-Name-Last: Chau
Author-Name: Paul Yip
Author-X-Name-First: Paul
Author-X-Name-Last: Yip
Title: Non-parametric Back-projection of HIV Positive Tests using Multinomial and Poisson Settings
Abstract:
Back-projection is a commonly used method in reconstructing HIV
incidence. Instead of using AIDS incidence data in back-projection, this
paper uses HIV positive tests data. Both multinomial and Poisson settings
are used. The two settings give similar results when a parametric form or
step function is assumed for the infection curve. However, this may not be
true when the HIV infection in each year is characterized by a different
parameter. This paper attempts to use simulation studies to compare these
two settings by constructing various scenarios for the infection curve.
Results show that both methods give approximately the same estimates of
the number of HIV infections in the past, whilst the estimates for HIV
infections in the recent past differ considerably. The multinomial setting always
gives a levelling-off pattern for the recent past, while the Poisson
setting is more sensitive to the change in the shape of the HIV infection
curve. Nonetheless, the multinomial setting gives a relatively narrower
point-wise probability interval. When the size of the epidemic is large,
the narrow probability interval may be under-estimating the true
underlying variation.
Journal: Journal of Applied Statistics
Pages: 553-564
Issue: 5
Volume: 31
Year: 2004
Keywords: Back-calculation, Back-projection, Diagnoses, HIV/AIDS, Hong Kong, Incidence, Multinomial, Poisson, Simulation,
X-DOI: 10.1080/02664760410001681792
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:553-564
Template-Type: ReDIF-Article 1.0
Author-Name: Horng-Jinh Chang
Author-X-Name-First: Horng-Jinh
Author-X-Name-Last: Chang
Author-Name: Chih-Li Wang
Author-X-Name-First: Chih-Li
Author-X-Name-Last: Wang
Author-Name: Kuo-Chung Huang
Author-X-Name-First: Kuo-Chung
Author-X-Name-Last: Huang
Title: Using Randomized Response to Estimate the Proportion and Truthful Reporting Probability in a Dichotomous Finite Population
Abstract:
In this paper, an alternative randomized response procedure is given that
allows us to estimate the population proportion in addition to the
probability of providing a truthful answer. It overcomes a difficulty
associated with traditional randomized response techniques. Properties of
the proposed estimators as well as sample size allocations are studied. In
addition, an efficiency comparison is carried out to investigate the
performance of the proposed technique.
Journal: Journal of Applied Statistics
Pages: 565-573
Issue: 5
Volume: 31
Year: 2004
Keywords: Binomial Distribution, Estimation Of Proportion, Randomized Response,
X-DOI: 10.1080/02664760410001681819
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681819
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:565-573
Template-Type: ReDIF-Article 1.0
Author-Name: Arthur Pewsey
Author-X-Name-First: Arthur
Author-X-Name-Last: Pewsey
Title: Testing for Circular Reflective Symmetry about a Known Median Axis
Abstract:
Circular data arise in many contexts, a particularly rich source being
animal orientation experiments. Often, in the analysis of such data, a
fundamental question of scientific interest is whether the underlying
distribution is reflectively symmetric about some specific axis. In this
paper, the situation in which the axis of interest is known to be a median
axis is considered and a simple, asymptotically distribution-free test
for circular reflective symmetry against skew alternatives is developed.
The results from a simulation study lead to a testing strategy
incorporating the new test and the circular analogue of the modified runs
test of Modarres & Gastwirth (1996). The application of the testing
strategy is illustrated using circular data arising from two animal
orientation experiments.
Journal: Journal of Applied Statistics
Pages: 575-585
Issue: 5
Volume: 31
Year: 2004
Keywords: Circular Data, Hybrid Testing Strategy, Modified Runs Test, Skew Alternatives,
X-DOI: 10.1080/02664760410001681828
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681828
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:575-585
Template-Type: ReDIF-Article 1.0
Author-Name: Key-Il Shin
Author-X-Name-First: Key-Il
Author-X-Name-Last: Shin
Title: A Multivariate Unit Root Test Based on the Modified Weighted Symmetric Estimator for VAR(p)
Abstract:
Multivariate unit root tests for the VAR model are commonly used in
time series analysis, and several such tests have been developed. Most
estimators of the coefficient matrices in the VAR model are obtained
using ordinary least squares. In this paper, we suggest a
multivariate unit root test based on a modified weighted symmetric
estimator. Using a limited Monte Carlo simulation, we compare the powers
of the new test statistic and the test statistic suggested in Fuller
(1996).
Journal: Journal of Applied Statistics
Pages: 587-596
Issue: 5
Volume: 31
Year: 2004
Keywords: Vector Autoregressive Process, Cointegration,
X-DOI: 10.1080/02664760410001681774
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760410001681774
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:5:p:587-596
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Congdon
Author-X-Name-First: Peter
Author-X-Name-Last: Congdon
Title: Modelling Trends and Inequality in Small Area Mortality
Abstract:
This paper considers the modelling of mortality rates classified by age,
time, and small area with a view to developing life table parameters
relevant to assessing trends in inequalities in life chances. In
particular, using a fully Bayes perspective, one may assess the stochastic
variation in small area life table parameters, such as life expectancies,
and also formally assess whether trends in indices of inequality in
mortality are significant. Modelling questions include choice between
random walk priors for age and time effects as against non-linear
regression functions, questions of identifiability when several random
effects are present in the death rates model, and the choice of model when
both within and out-of-sample performance may be important. A case study
application involves 44 small areas in North East London and mortality in
five sub-periods (1986-88, 1989-91, 1992-94, 1995-97, 1998-2000) between
1986 and 2000, with the final period used for assessing out-of-sample
performance.
Journal: Journal of Applied Statistics
Pages: 603-622
Issue: 6
Volume: 31
Year: 2004
Keywords: APC Models, Mortality, Life Tables, Random Effects Model, Cohort, Bayesian,
X-DOI: 10.1080/1478881042000214695
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214695
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:603-622
Template-Type: ReDIF-Article 1.0
Author-Name: Wei-Ming Luh
Author-X-Name-First: Wei-Ming
Author-X-Name-Last: Luh
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Title: Improved Robust Test Statistic Based on Trimmed Means and Hall's Transformation for Two-way ANOVA Models Under Non-normality
Abstract:
For the two-way fixed effects ANOVA, under assumption violations, the
present study employs trimmed means and Hall's transformation to correct
asymmetry, and an approximate test, such as the Alexander-Govern or
Welch-James test, to correct heterogeneity. The unweighted as well as
weighted means analyses of omnibus effects in unbalanced designs were
considered. A simulated data set was presented and computer simulations
were performed to investigate the small-sample properties of the methods.
The simulation results show that the proposed technique is valid and
powerful compared with the conventional methods.
Journal: Journal of Applied Statistics
Pages: 623-643
Issue: 6
Volume: 31
Year: 2004
Keywords: Alexander-Govern Test, Computer Simulation, Non-orthogonal, Robustness, Welch-James Type, Winsorized Variance,
X-DOI: 10.1080/1478881042000214622
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214622
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:623-643
Template-Type: ReDIF-Article 1.0
Author-Name: D. K. Al-Mutairi
Author-X-Name-First: D. K.
Author-X-Name-Last: Al-Mutairi
Title: Bayesian Computations for Random Environment Models
Abstract:
This paper deals with the analysis of reliability data from a Bayesian
perspective for Random Environment (RE) models. We give an overview of
current literature on RE models. We also study the computational problems
associated with the implementations of RE models in a Bayesian setting.
Then, we present the Markov chain Monte Carlo technique to solve such
problems, which arise in posterior and predictive analysis and in
computing relevant quantities such as the mean, variance, and median. The
suggested methodology is illustrated with an example.
Journal: Journal of Applied Statistics
Pages: 645-659
Issue: 6
Volume: 31
Year: 2004
Keywords: Bayesian Computation, Bayesian Inference, Gibbs Sampling, Joint Prior Distribution, Random Environment,
X-DOI: 10.1080/1478881042000214631
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214631
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:645-659
Template-Type: ReDIF-Article 1.0
Author-Name: Gang Zheng
Author-X-Name-First: Gang
Author-X-Name-Last: Zheng
Title: Maximizing a Family of Optimal Statistics over a Nuisance Parameter with Applications to Genetic Data Analysis
Abstract:
In this article, a simple algorithm is used to maximize a family of
optimal statistics for hypothesis testing with a nuisance parameter not
defined under the null hypothesis. This arises from genetic linkage and
association studies and other hypothesis testing problems. The maximum of
optimal statistics over the nuisance parameter space can be used as a
robust test in this situation. Here, we use the maximum and minimum
statistics to examine the sensitivity of testing results with respect to
the unknown nuisance parameter. Examples from genetic linkage analysis
using affected sib pairs and a candidate-gene association study in a
case-parents trio design are studied.
Journal: Journal of Applied Statistics
Pages: 661-671
Issue: 6
Volume: 31
Year: 2004
Keywords: Genetic Analysis, Maximal Statistics, Nuisance Parameter, Robust Test,
X-DOI: 10.1080/1478881042000214640
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214640
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:661-671
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Title: Utilizing the Flexibility of the Epsilon-Skew-Normal Distribution for Common Regression Problems
Abstract:
In this paper we illustrate the properties of the epsilon-skew-normal
(ESN) distribution with respect to developing more flexible regression
models. The ESN model is a simple one-parameter extension of the standard
normal model. The additional parameter ε corresponds to the degree of
skewness in the model. In the fitting process we take advantage of
relatively new powerful routines that are now available in standard
software packages such as SAS. It is illustrated that even if the true
underlying error distribution is exactly normal, there is no practical loss
in power with respect to testing for non-zero regression coefficients. If
the true underlying error distribution is slightly skewed, the ESN model
is superior in terms of statistical power for tests about the regression
coefficient. This model has good asymptotic properties for samples of size
n>50.
Journal: Journal of Applied Statistics
Pages: 673-683
Issue: 6
Volume: 31
Year: 2004
Keywords: Robust Regression, Epsilon-skew-normal Distribution,
X-DOI: 10.1080/1478881042000214659
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214659
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:673-683
Template-Type: ReDIF-Article 1.0
Author-Name: Cynthia Tojeiro
Author-X-Name-First: Cynthia
Author-X-Name-Last: Tojeiro
Author-Name: Francisco Louzada-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada-Neto
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: A Bayesian Analysis for Accelerated Lifetime Tests Under an Exponential Power Law Model with Threshold Stress
Abstract:
In this paper, we present a Bayesian methodology for modelling
accelerated lifetime tests under a stress response relationship with a
threshold stress. Both Laplace and MCMC methods are considered. The
methodology is described in detail for the case when an exponential
distribution is assumed to express the behaviour of lifetimes, and a power
law model with a threshold stress is assumed as the stress response
relationship. We assume vague but proper priors for the parameters of
interest. The methodology is illustrated by an accelerated failure test on
an electrical insulation film.
Journal: Journal of Applied Statistics
Pages: 685-691
Issue: 6
Volume: 31
Year: 2004
Keywords: Accelerated Life Tests, Threshold Stress, Bayesian Approach, MCMC, Laplace Approximation,
X-DOI: 10.1080/1478881042000214668
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214668
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:685-691
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Title: Determining the Optimum Process Mean of a One-sided Specification Limit with the Linear Quality Loss Function of Product
Abstract:
Wen & Mergen (1999) proposed a method for setting the optimal process
mean when a process was not capable of meeting specifications in the short
term. However, they neglected to consider the quality loss for a product
within specifications in the model. Chen & Chou (2002) presented a
modification of Wen & Mergen's (1999) model that includes the quadratic
quality loss function for a one-sided specification limit. In this paper,
we propose a modified Wen & Mergen (1999) cost model that includes the
linear quality loss function of a product for determining the optimal
process mean of a
one-sided specification limit.
Journal: Journal of Applied Statistics
Pages: 693-703
Issue: 6
Volume: 31
Year: 2004
Keywords: Quality Loss Function, Specification Limits, Process Mean, Process Standard Deviation,
X-DOI: 10.1080/1478881042000214677
File-URL: http://www.tandfonline.com/doi/abs/10.1080/1478881042000214677
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:6:p:693-703
Template-Type: ReDIF-Article 1.0
Author-Name: Seung-Hoon Yoo
Author-X-Name-First: Seung-Hoon
Author-X-Name-Last: Yoo
Title: A Note on an Approximation of the Mobile Communications Expenditures Distribution Function Using a Mixture Model
Abstract:
Approximating the distribution of mobile communications expenditures
(MCE) is complicated by zero observations in the sample. To deal with the
zero observations by allowing a point mass at zero, a mixture model of MCE
distributions is proposed and applied. The MCE distribution is specified
as a mixture of two distributions, one with a point mass at zero and the
other with full support on the positive half of the real line. The model
is empirically verified for individual MCE survey data collected in Seoul,
Korea. The mixture model can easily capture the common bimodality feature
of the MCE distribution. In addition, when covariates were added to the
model, it was found that the probability that an individual has
non-expenditure significantly varies with some variables. Finally, the
goodness-of-fit test suggests that the data are well represented by the
mixture model.
Journal: Journal of Applied Statistics
Pages: 747-752
Issue: 7
Volume: 31
Year: 2004
Keywords: Mobile Communications Expenditures, Zero Observations, Mixture Model, Weibull Distribution,
X-DOI: 10.1080/0266476042000214475
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214475
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:747-752
Template-Type: ReDIF-Article 1.0
Author-Name: Haobo Ren
Author-X-Name-First: Haobo
Author-X-Name-Last: Ren
Author-Name: Xiao-Hua Zhou
Author-X-Name-First: Xiao-Hua
Author-X-Name-Last: Zhou
Author-Name: Hua Liang
Author-X-Name-First: Hua
Author-X-Name-Last: Liang
Title: A Flexible Method for Estimating the ROC Curve
Abstract:
In this paper we propose a flexible method for estimating a receiver
operating characteristic (ROC) curve that is based on a continuous-scale
test. The approach is easy to understand and efficient to compute, and is
robust to smoothing parameter selection, which requires intensive
computation when local polynomial and smoothing spline techniques are used.
The results from our simulation experiment indicate that the
moderate-sample numerical performance of our estimator is better than that
of the empirical ROC curve estimator and comparable to that of the local
linear estimator. The ease of implementation is also illustrated by
our simulation. We apply the proposed method to two real data sets.
Journal: Journal of Applied Statistics
Pages: 773-784
Issue: 7
Volume: 31
Year: 2004
Keywords: Penalized Spline, Kernel Smoothing, Local Polynomial, Bandwidth Selection,
X-DOI: 10.1080/0266476042000214493
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214493
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:773-784
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Shively
Author-X-Name-First: Philip
Author-X-Name-Last: Shively
Title: Testing for a Unit Root in ARIMA Processes
Abstract:
A unit root has important long-run implications for many time series in
economics and finance. This paper develops a unit-root test of an
ARIMA(p-1, 1, q) with drift null process against a trend-stationary
ARMA(p, q) alternative process, where the order of the time series is
assumed known through previous statistical testing or relevant theory.
This test uses a point-optimal test statistic, but it estimates the null
and alternative variance-covariance matrices that are used in the test
statistic. Consequently, this test approximates a point-optimal test.
Simulations show that its small-sample size is close to the nominal test
level for a variety of unit-root processes, that it has a robust power
curve against a variety of stationary alternatives, that its combined
small-sample size and power properties are highly competitive with
previous unit-root tests, and that it is robust to conditional
heteroskedasticity. An application to post-Second World War real per
capita gross domestic product is provided.
Journal: Journal of Applied Statistics
Pages: 785-798
Issue: 7
Volume: 31
Year: 2004
Keywords: Point-optimal, Invariant, Unit Root, ARIMA, Gross Domestic Product,
X-DOI: 10.1080/0266476042000214547
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214547
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:785-798
Template-Type: ReDIF-Article 1.0
Author-Name: Silvia Ferrari
Author-X-Name-First: Silvia
Author-X-Name-Last: Ferrari
Author-Name: Francisco Cribari-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Cribari-Neto
Title: Beta Regression for Modelling Rates and Proportions
Abstract:
This paper proposes a regression model where the response is beta
distributed using a parameterization of the beta law that is indexed by
mean and dispersion parameters. The proposed model is useful for
situations where the variable of interest is continuous and restricted to
the interval (0, 1) and is related to other variables through a regression
structure. The regression parameters of the beta regression model are
interpretable in terms of the mean of the response and, when the logit
link is used, of an odds ratio, unlike the parameters of a linear
regression that employs a transformed response. Estimation is performed by
maximum likelihood. We provide closed-form expressions for the score
function, for Fisher's information matrix and its inverse. Hypothesis
testing is performed using approximations obtained from the asymptotic
normality of the maximum likelihood estimator. Some diagnostic measures
are introduced. Finally, practical applications that employ real data are
presented and discussed.
Journal: Journal of Applied Statistics
Pages: 799-815
Issue: 7
Volume: 31
Year: 2004
Keywords: Beta Distribution, Maximum Likelihood Estimation, Leverage, Proportions, Residuals,
X-DOI: 10.1080/0266476042000214501
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214501
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:799-815
Template-Type: ReDIF-Article 1.0
Author-Name: Rosalba Miceli
Author-X-Name-First: Rosalba
Author-X-Name-Last: Miceli
Author-Name: Lara Lusa
Author-X-Name-First: Lara
Author-X-Name-Last: Lusa
Author-Name: Luigi Mariani
Author-X-Name-First: Luigi
Author-X-Name-Last: Mariani
Title: Revising a Prognostic Index Developed for Classification Purposes: An Application to Gastric Cancer Data
Abstract:
A prognostic index (PI) is usually derived from a regression model as a
weighted mean of the covariates, with weights (partial scores)
proportional to the parameter estimates. When a PI is applied to patients
other than those considered for its development, the issue of assessing
its validity on the new case series is crucial. For this purpose, Van
Houwelingen (2000) proposed a method of validation by calibration, which
limits overfitting by embedding the original model into a new one, so that
only a few parameters will have to be estimated. Here we address the
problem of PI validation and revision with the above approach when the PI
has classification purposes and it represents the linear predictor of a
Weibull model, derived from an accelerated failure time parameterization
instead of a proportional hazards one, as originally described by Van
Houwelingen. We show that the Van Houwelingen method can be applied in a
straightforward manner, provided that the parameterization originally used
in the PI model is appropriately taken into account. We also show that
model validation and revision can be carried out by modifying the cut-off
values used for prognostic grouping without affecting the partial scores
of the original PI. This procedure can be applied to simplify the
clinician's use of an established PI for classification purposes.
Journal: Journal of Applied Statistics
Pages: 817-830
Issue: 7
Volume: 31
Year: 2004
Keywords: Prognostic Index, Survival Analysis, Weibull Model, Gastric Cancer,
X-DOI: 10.1080/0266476042000214510
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214510
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:817-830
Template-Type: ReDIF-Article 1.0
Author-Name: James Reed
Author-X-Name-First: James
Author-X-Name-Last: Reed
Author-Name: David Stark
Author-X-Name-First: David
Author-X-Name-Last: Stark
Title: Robust Two-Sample Statistics for Testing Equality of Means: A Simulation Study
Abstract:
When testing the equality of the means from two independent normally
distributed populations given that the variances of the two populations
are unknown but assumed equal, the classical two-sample t-test is
recommended. If the underlying population distributions are normal with
unequal and unknown variances, either Welch's t-statistic or
Satterthwaite's Approximate F-test is suggested. However, Welch's
procedure is non-robust under most non-normal distributions. There is a
variable tolerance level around the strict assumptions of data
independence, homogeneity of variances and normality of the distributions.
Few textbooks offer alternatives when one or more of the underlying
assumptions are not defensible.
Journal: Journal of Applied Statistics
Pages: 831-854
Issue: 7
Volume: 31
Year: 2004
Keywords: Behrens-Fisher Problem, Two Sample t-tests, Adaptive Two-sample Robust Tests,
X-DOI: 10.1080/0266476042000214529
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214529
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:831-854
Template-Type: ReDIF-Article 1.0
Author-Name: G. J. M. Rosa
Author-X-Name-First: G. J. M.
Author-X-Name-Last: Rosa
Author-Name: D. Gianola
Author-X-Name-First: D.
Author-X-Name-Last: Gianola
Author-Name: C. R. Padovani
Author-X-Name-First: C. R.
Author-X-Name-Last: Padovani
Title: Bayesian Longitudinal Data Analysis with Mixed Models and Thick-tailed Distributions using MCMC
Abstract:
Linear mixed effects models are frequently used to analyse longitudinal
data, due to their flexibility in modelling the covariance structure
between and within observations. Further, it is easy to deal with
unbalanced data, either with respect to the number of observations per
subject or per time period, and with varying time intervals between
observations. In most applications of mixed models to biological sciences,
a normal distribution is assumed both for the random effects and for the
residuals. This, however, makes inferences vulnerable to the presence of
outliers. Here, linear mixed models employing thick-tailed distributions
for robust inferences in longitudinal data analysis are described.
Specific distributions discussed include the Student-t, the slash and the
contaminated normal. A Bayesian framework is adopted, and the Gibbs
sampler and the Metropolis-Hastings algorithms are used to carry out the
posterior analyses. An example with data on orthodontic distance growth in
children is discussed to illustrate the methodology. Analyses based on
either the Student-t distribution or on the usual Gaussian assumption are
contrasted. The thick-tailed distributions provide an appealing robust
alternative to the Gaussian process for modelling distributions of the
random effects and of residuals in linear mixed models, and the MCMC
implementation allows the computations to be performed in a flexible
manner.
Journal: Journal of Applied Statistics
Pages: 855-873
Issue: 7
Volume: 31
Year: 2004
Keywords: Robust-inference, Longitudinal Study, Mixed Model, Thick-tailed Distribution, Heteroscedasticity, Bayesian Inference,
X-DOI: 10.1080/0266476042000214538
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000214538
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:7:p:855-873
Template-Type: ReDIF-Article 1.0
Author-Name: David Hand
Author-X-Name-First: David
Author-X-Name-Last: Hand
Title: Pattern Discovery
Abstract:
This article does not have an abstract
Journal: Journal of Applied Statistics
Pages: 883-884
Issue: 8
Volume: 31
Year: 2004
X-DOI: 10.1080/0266476042000270509
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270509
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:883-884
Template-Type: ReDIF-Article 1.0
Author-Name: David Hand
Author-X-Name-First: David
Author-X-Name-Last: Hand
Author-Name: Richard Bolton
Author-X-Name-First: Richard
Author-X-Name-Last: Bolton
Title: Pattern Discovery and Detection: A Unified Statistical Methodology
Abstract:
Modern statistical data analysis is predominantly model-driven, seeking
to decompose an observed data distribution in terms of major underlying
descriptive features modified by some stochastic variation. A large part
of data mining is also concerned with this exercise. However, another
fundamental part of data mining is concerned with detecting anomalies
amongst the vast mass of the data: the small deviations, unusual
observations, unexpected clusters of observations, or surprising blips in
the data, which the model does not explain. We call such anomalies
patterns. For sound reasons, which are outlined in the paper, the data
mining community has tended to focus on the algorithmic aspects of pattern
discovery, and has not developed any general underlying theoretical base.
However, such a base is important for any technology: it helps to steer
the direction in which the technology develops, as well as serving to
provide a basis from which algorithms can be compared, and to indicate
which problems are the important ones waiting to be solved. This paper
attempts to provide such a theoretical base, linking the ideas to
statistical work in spatial epidemiology, scan statistics, outlier
detection, and other areas. One of the striking characteristics of work on
pattern discovery is that the ideas have been developed in several
theoretical arenas, and also in several application domains, with little
apparent awareness of the fundamentally common nature of the problem. Like
model building, pattern discovery is fundamentally an inferential
activity, and is an area in which statisticians can make very significant
contributions.
Journal: Journal of Applied Statistics
Pages: 885-924
Issue: 8
Volume: 31
Year: 2004
Keywords: Patterns, pattern discovery, data mining, association analysis, bioinformatics, technical analysis, market basket analysis, configural frequency analysis, scan statistics, spatial epidemiology,
X-DOI: 10.1080/0266476042000270518
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270518
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:885-924
Template-Type: ReDIF-Article 1.0
Author-Name: Gonzalo Navarro
Author-X-Name-First: Gonzalo
Author-X-Name-Last: Navarro
Title: Pattern Matching
Abstract:
An important subtask of the pattern discovery process is pattern
matching, where the pattern sought is already known and we want to
determine how often and where it occurs in a sequence. In this paper we
review the most practical techniques to find patterns of different kinds.
We show how regular expressions can be searched for with general
techniques, and how simpler patterns can be dealt with more simply and
efficiently. We consider exact as well as approximate pattern matching.
Also we cover both sequential searching, where the sequence cannot be
preprocessed, and indexed searching, where we have a data structure built
over the sequence to speed up the search.
Journal: Journal of Applied Statistics
Pages: 925-949
Issue: 8
Volume: 31
Year: 2004
Keywords: Regular expressions, automata, back tracking, suffix trees and arrays, approximate string matching, bit parallelism,
X-DOI: 10.1080/0266476042000270527
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:925-949
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Lawson
Author-X-Name-First: Andrew
Author-X-Name-Last: Lawson
Author-Name: Allan Clark
Author-X-Name-First: Allan
Author-X-Name-Last: Clark
Author-Name: Carmen Vidal Rodeiro
Author-X-Name-First: Carmen Vidal
Author-X-Name-Last: Rodeiro
Title: Developments in General and Syndromic Surveillance for Small Area Health Data
Abstract:
In this paper we examine a range of issues related to the analysis of
health surveillance data when it is spatially-referenced. The importance
of considering alarm functions derived from likelihood or Bayesian models
is stressed. In addition, we focus on some new developments in predictive
distribution residuals in the analysis.
Journal: Journal of Applied Statistics
Pages: 951-966
Issue: 8
Volume: 31
Year: 2004
Keywords: Syndromic, surveillance, statistics, small area, health,
X-DOI: 10.1080/0266476042000270568
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270568
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:951-966
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph Glaz
Author-X-Name-First: Joseph
Author-X-Name-Last: Glaz
Author-Name: Zhenkui Zhang
Author-X-Name-First: Zhenkui
Author-X-Name-Last: Zhang
Title: Multiple Window Discrete Scan Statistics
Abstract:
In this article, multiple scan statistics of variable window sizes are
derived for independent and identically distributed 0-1 Bernoulli trials.
Both one- and two-dimensional, as well as conditional and unconditional,
cases are treated. The advantage in using multiple scan statistics, as
opposed to single fixed window scan statistics, is that they are more
sensitive in detecting a change in the underlying distribution of the
observed data. We show how to derive simple approximations for the
significance level of these testing procedures and present numerical
results to evaluate their performance.
Journal: Journal of Applied Statistics
Pages: 967-980
Issue: 8
Volume: 31
Year: 2004
Keywords: Combining test statistics, one-dimensional scan statistics, p-values, two-dimensional scan statistics, variable windows,
X-DOI: 10.1080/0266476042000270536
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270536
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:967-980
Template-Type: ReDIF-Article 1.0
Author-Name: Alexander Von Eye
Author-X-Name-First: Alexander
Author-X-Name-Last: Von Eye
Author-Name: Eduardo Gutierrez Pena
Author-X-Name-First: Eduardo Gutierrez
Author-X-Name-Last: Pena
Title: Configural Frequency Analysis: The Search for Extreme Cells
Abstract:
Configural Frequency Analysis (CFA) asks whether a cell in a
cross-classification contains more or fewer cases than expected with
respect to some base model. This base model is specified such that cells
with more cases than expected (also called types) can be interpreted from
a substantive perspective. The same applies to cells with fewer cases than
expected (antitypes). This article gives an introduction to both
frequentist and Bayesian approaches to CFA. Specification of base models,
testing, and protection are discussed. In an example, Prediction CFA and
two-sample CFA are illustrated. The discussion focuses on the differences
between CFA and modelling.
Journal: Journal of Applied Statistics
Pages: 981-997
Issue: 8
Volume: 31
Year: 2004
Keywords: Configural frequency analysis (CFA), extreme cells, types, antitypes, base models, protection, frequentist CFA, Bayesian CFA, Dirichlet distribution, contingency table,
X-DOI: 10.1080/0266476042000270545
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270545
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:981-997
Template-Type: ReDIF-Article 1.0
Author-Name: Cheolwoo Park
Author-X-Name-First: Cheolwoo
Author-X-Name-Last: Park
Author-Name: J. S. Marron
Author-X-Name-First: J. S.
Author-X-Name-Last: Marron
Author-Name: Vitaliana Rondonotti
Author-X-Name-First: Vitaliana
Author-X-Name-Last: Rondonotti
Title: Dependent SiZer: Goodness-of-Fit Tests for Time Series Models
Abstract:
In this paper, we extend SiZer (SIgnificant ZERo crossing of the
derivatives) to dependent data for the purpose of goodness-of-fit tests
for time series models. Dependent SiZer compares the observed data with a
specific null model being tested by adjusting the statistical inference
using an assumed autocovariance function. This new approach uses a SiZer
type visualization to flag statistically significant differences between
the data and a given null model. The power of this approach is
demonstrated through some examples of time series of Internet traffic
data. It is seen that such time series can have even more burstiness than
is predicted by the popular, long-range dependent, Fractional Gaussian
Noise model.
Journal: Journal of Applied Statistics
Pages: 999-1017
Issue: 8
Volume: 31
Year: 2004
Keywords: Autocovariance function, dependent SiZer, fractional Gaussian noise, Internet traffic data, goodness-of-fit test, SiZer, time series,
X-DOI: 10.1080/0266476042000270554
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270554
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:999-1017
Template-Type: ReDIF-Article 1.0
Author-Name: Balaji Padmanabhan
Author-X-Name-First: Balaji
Author-X-Name-Last: Padmanabhan
Title: The Interestingness Paradox in Pattern Discovery
Abstract:
Noting that several rule discovery algorithms in data mining can produce
a large number of irrelevant or obvious rules from data, there has been
substantial research in data mining that addressed the issue of what makes
rules truly 'interesting'. This resulted in the development of a number of
interestingness measures and algorithms that find all interesting rules
from data. However, these approaches have the drawback that many of the
discovered rules, while supposed to be interesting by definition, may
actually (1) be obvious in that they logically follow from other
discovered rules or (2) be expected given some of the other discovered
rules and some simple distributional assumptions. In this paper we argue
that this is a paradox since rules that are supposed to be interesting, in
reality are uninteresting for the above reason. We show that this paradox
exists for various popular interestingness measures and present an
abstract characterization of an approach to alleviate the paradox. We
finally discuss existing work in data mining that addresses this issue and
show how these approaches can be viewed with respect to the
characterization presented here.
Journal: Journal of Applied Statistics
Pages: 1019-1035
Issue: 8
Volume: 31
Year: 2004
Keywords: Interestingness measures, rule discovery, minimality,
X-DOI: 10.1080/0266476042000270563
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000270563
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:8:p:1019-1035
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Gilmour
Author-X-Name-First: Steven
Author-X-Name-Last: Gilmour
Title: Irregular Four-level Response Surface Designs
Abstract:
Four-level response surface designs based on regular two-level fractional
factorial designs were introduced by Edmondson (1991). Here, the methods
are extended to include designs based on irregular two-level fractional
factorials. These designs allow orthogonal blocking and require fewer
experimental units than the regular designs.
Journal: Journal of Applied Statistics
Pages: 1043-1048
Issue: 9
Volume: 31
Year: 2004
Keywords: Agricultural experimentation, experimental design, polynomial regression, pseudo-factors,
X-DOI: 10.1080/0266476042000280391
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000280391
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:9:p:1043-1048
Template-Type: ReDIF-Article 1.0
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Author-Name: Victor Leiva-Sanchez
Author-X-Name-First: Victor
Author-X-Name-Last: Leiva-Sanchez
Author-Name: Gilberto Paula
Author-X-Name-First: Gilberto
Author-X-Name-Last: Paula
Title: Influence Diagnostics in log-Birnbaum-Saunders Regression Models
Abstract:
In this paper we present various diagnostic methods for a linear
regression model under a logarithmic Birnbaum-Saunders distribution for
the errors, which may be applied for accelerated life testing or to
compare the median lives of several populations. Some influence methods,
such as the local influence, total local influence of an individual and
generalized leverage are derived, analysed and discussed. We also present
a connection between the local influence and generalized leverage methods.
A discussion of the computation of the likelihood displacement as well as
the normal curvature in the local influence method are presented. Finally,
an example with real data is given for illustration.
Journal: Journal of Applied Statistics
Pages: 1049-1064
Issue: 9
Volume: 31
Year: 2004
Keywords: Birnbaum-Saunders distribution, life distributions, sinh-normal distribution, fatigue life, log-linear models, influence diagnostic, generalized leverage, local influence, maximum likelihood estimator,
X-DOI: 10.1080/0266476042000280409
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000280409
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:9:p:1049-1064
Template-Type: ReDIF-Article 1.0
Author-Name: M. L. Aggarwal
Author-X-Name-First: M. L.
Author-X-Name-Last: Aggarwal
Author-Name: Lih-Yuan Deng
Author-X-Name-First: Lih-Yuan
Author-X-Name-Last: Deng
Author-Name: Mithilesh Kumar Jha
Author-X-Name-First: Mithilesh Kumar
Author-X-Name-Last: Jha
Title: Some New Residual Treatment Effects Designs for Comparing Test Treatments with a Control
Abstract:
Pigeon & Raghavarao (1987) introduced control balanced residual treatment
effects designs for the situation where one treatment is a control or
standard and is to be compared with the v test treatments, and they have
also given methods of construction of control balanced residual treatment
effects designs and have investigated their efficiencies. In this paper we
have developed some new families of control balanced residual treatment
effects designs, which are Schur-optimal.
Journal: Journal of Applied Statistics
Pages: 1065-1081
Issue: 9
Volume: 31
Year: 2004
Keywords: Residual treatment effects designs, control treatment, test treatments, Schur-optimality,
X-DOI: 10.1080/0266476042000280382
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000280382
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:9:p:1065-1081
Template-Type: ReDIF-Article 1.0
Author-Name: Hafzullah Aksoy
Author-X-Name-First: Hafzullah
Author-X-Name-Last: Aksoy
Title: Using Markov Chains for Non-perennial Daily Streamflow Data Generation
Abstract:
The use of Markov chains to simulate non-perennial streamflow data is
considered. A non-perennial stream may be thought of as having three states,
namely zero flow, increasing flow and decreasing flow, for which a
three-state Markov chain can be constructed. Alternatively, two two-state
Markov chains can be used, the first of which represents the existence and
non-existence of flow, whereas the second deals with the increment and
decrement in the flow for periods with flow. Probabilistic relationships
between the two alternatives are derived. Their performances in simulating
the state of the stream are compared on the basis of data from two
different geographical regions in Turkey. It is concluded that both
alternatives are capable of simulating the state of the stream.
Journal: Journal of Applied Statistics
Pages: 1083-1094
Issue: 9
Volume: 31
Year: 2004
Keywords: Daily streamflow, data generation, Markov chain, simulation,
X-DOI: 10.1080/0266476042000280418
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000280418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:9:p:1083-1094
Template-Type: ReDIF-Article 1.0
Author-Name: M. L. Menendez
Author-X-Name-First: M. L.
Author-X-Name-Last: Menendez
Author-Name: J. A. Pardo
Author-X-Name-First: J. A.
Author-X-Name-Last: Pardo
Author-Name: L. Pardo
Author-X-Name-First: L.
Author-X-Name-Last: Pardo
Title: Tests of Symmetry in Three-dimensional Contingency Tables Based on Phi-divergence Statistics
Abstract:
In this paper we introduce a family of test statistics for testing
complete symmetry in three-dimensional contingency tables based on phi-
divergence families. These test statistics yield the likelihood ratio test
and the Pearson test statistics as special cases. Asymptotic distribution
for the new test statistics are derived under both the null and the
alternative hypotheses. A simulation study is presented to show that some
new statistics offer an attractive alternative to the classical Pearson
and likelihood ratio test statistics for this problem of complete
symmetry.
Journal: Journal of Applied Statistics
Pages: 1095-1114
Issue: 9
Volume: 31
Year: 2004
Keywords: Three dimensional contingency table, complete symmetry, φ-divergence statistic,
X-DOI: 10.1080/0266476042000280373
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000280373
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:9:p:1095-1114
Template-Type: ReDIF-Article 1.0
Author-Name: W. L. Pearn
Author-X-Name-First: W. L.
Author-X-Name-Last: Pearn
Author-Name: Y. C. Chang
Author-X-Name-First: Y. C.
Author-X-Name-Last: Chang
Author-Name: Chien-Wei Wu
Author-X-Name-First: Chien-Wei
Author-X-Name-Last: Wu
Title: Distributional and Inferential Properties of the Process Loss Indices
Abstract:
Johnson (1992) developed the process loss index Le, which is defined as
the ratio of the expected quadratic loss to the square of half
specification width. Tsui (1997) expressed the index Le as Le=Lpe+Lot, which
provides an uncontaminated separation between information concerning the
potential relative expected loss (Lpe) and the relative off-target squared
(Lot), as the ratio of the process variance and the square of the half
specification width, and the square of the ratio of the deviation of mean
from the target and the half specification width, respectively. In this
paper, we consider these three loss function indices, and investigate the
statistical properties of their natural estimators. For the three indices,
we obtain their UMVUEs and MLEs, and compare the reliability of the two
estimators based on the relative mean squared errors. In addition, we
construct 90%, 95%, and 99% upper confidence limits, and the maximum
values of L^e for which the process is capable, 90%, 95%, and 99% of the
time. The results obtained in this paper are useful to the practitioners
in choosing good estimators and making reliable decisions on judging
process capability.
Journal: Journal of Applied Statistics
Pages: 1115-1135
Issue: 9
Volume: 31
Year: 2004
Keywords: MLE, potential relative expected loss, relative expected loss, relative mean squared error, relative off-target squared, UMVUE,
X-DOI: 10.1080/0266476042000280364
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000280364
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:9:p:1115-1135
Template-Type: ReDIF-Article 1.0
Author-Name: Genming Shi
Author-X-Name-First: Genming
Author-X-Name-Last: Shi
Author-Name: N. Rao Chaganty
Author-X-Name-First: N. Rao
Author-X-Name-Last: Chaganty
Title: Application of Quasi-Least Squares to Analyse Replicated Autoregressive Time Series Regression Models
Abstract:
Time series regression models have been widely studied in the literature
by several authors. However, statistical analysis of replicated time
series regression models has received little attention. In this paper, we
study the application of the quasi-least squares method to estimate the
parameters in a replicated time series model with errors that follow an
autoregressive process of order p. We also discuss two other established
methods for estimating the parameters: maximum likelihood assuming
normality and the Yule-Walker method. When the number of repeated
measurements is bounded and the number of replications n goes to infinity,
the regression and the autocorrelation parameters are consistent and
asymptotically normal for all three methods of estimation. Basically, the
three methods estimate the regression parameter efficiently and differ in
how they estimate the autocorrelation. When p=2, for normal data we use
simulations to show that the quasi-least squares estimate of the
autocorrelation is undoubtedly better than the Yule-Walker estimate. And
the former estimate is as good as the maximum likelihood estimate almost
over the entire parameter space.
Journal: Journal of Applied Statistics
Pages: 1147-1156
Issue: 10
Volume: 31
Year: 2004
Keywords: Autoregression, quasi-least squares, relative efficiency, repeated measurements, time series regression models,
X-DOI: 10.1080/0266476042000285530
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285530
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1147-1156
Template-Type: ReDIF-Article 1.0
Author-Name: Jin-Guan Lin
Author-X-Name-First: Jin-Guan
Author-X-Name-Last: Lin
Author-Name: Bo-Cheng Wei
Author-X-Name-First: Bo-Cheng
Author-X-Name-Last: Wei
Author-Name: Nan-Song Zhang
Author-X-Name-First: Nan-Song
Author-X-Name-Last: Zhang
Title: Varying Dispersion Diagnostics for Inverse Gaussian Regression Models
Abstract:
Homogeneity of dispersion parameters is a standard assumption in inverse
Gaussian regression analysis. However, this assumption is not necessarily
appropriate. This paper is devoted to the test for varying dispersion in
general inverse Gaussian linear regression models. Based on the modified
profile likelihood (Cox & Reid, 1987), the adjusted score test for varying
dispersion is developed and illustrated with Consumer-Product Sales data
(Whitmore, 1986) and Gas vapour data (Weisberg, 1985). The effectiveness
of orthogonality transformation and the properties of a score statistic
and its adjustment are investigated through Monte Carlo simulations.
Journal: Journal of Applied Statistics
Pages: 1157-1170
Issue: 10
Volume: 31
Year: 2004
Keywords: Adjusted score test, dispersion parameter, inverse Gaussian models, orthogonality transformation, simulation study, varying dispersion,
X-DOI: 10.1080/0266476042000285512
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285512
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1157-1170
Template-Type: ReDIF-Article 1.0
Author-Name: A. F. B. Costa
Author-X-Name-First: A. F. B.
Author-X-Name-Last: Costa
Author-Name: M. A. Rahim
Author-X-Name-First: M. A.
Author-X-Name-Last: Rahim
Title: Monitoring Process Mean and Variability with One Non-central Chi-square Chart
Abstract:
Traditionally, an X-chart is used to control the process mean and an
R-chart to control the process variance. However, these charts are not
sensitive to small changes in process parameters. A good alternative to
these charts is the exponentially weighted moving average (EWMA) control
chart for controlling the process mean and variability, which is very
effective in detecting small process disturbances. In this paper, we
propose a single chart that is based on the non-central chi-square
statistic, which is more effective than the joint X and R charts in
detecting assignable cause(s) that change the process mean and/or increase
variability. It is also shown that the EWMA control chart based on a
non-central chi-square statistic is more effective in detecting both
increases and decreases in mean and/or variability.
Journal: Journal of Applied Statistics
Pages: 1171-1183
Issue: 10
Volume: 31
Year: 2004
Keywords: Monitoring process mean and variance, X chart, EWMA chart, non-central chi-square chart,
X-DOI: 10.1080/0266476042000285503
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285503
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1171-1183
Template-Type: ReDIF-Article 1.0
Author-Name: Mu'azu Abujiya
Author-X-Name-First: Mu'azu
Author-X-Name-Last: Abujiya
Author-Name: Hassen Muttlak
Author-X-Name-First: Hassen
Author-X-Name-Last: Muttlak
Title: Quality Control Chart for the Mean using Double Ranked Set Sampling
Abstract:
In this paper, an attempt is made to develop Quality Control Charts for
monitoring the process mean based on Double Ranked Set Sampling (DRSS)
rather than the traditional Simple Random Sampling (SRS). Considering a
normal population and several shift values, the performance of the Average
Run Length (ARL) of these new charts was compared with the control charts
based on Ranked Set Sampling (RSS) and SRS with the same number of
observations. It is shown that the new charts do a better job of detecting
changes in process mean compared with SRS and RSS.
Journal: Journal of Applied Statistics
Pages: 1185-1201
Issue: 10
Volume: 31
Year: 2004
Keywords: Average run length, double median ranked set sampling, lower control limit, median double ranked set sampling, median ranked set sampling, ranked set sampling and upper control limit,
X-DOI: 10.1080/0266476042000285549
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285549
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1185-1201
Template-Type: ReDIF-Article 1.0
Author-Name: Seung-Hoon Yoo
Author-X-Name-First: Seung-Hoon
Author-X-Name-Last: Yoo
Title: A Note on a Bayesian Approach to a Dichotomous Choice Environmental Valuation Model
Abstract:
As an alternative to the classical approach for analysing dichotomous
choice environmental valuation data, this note develops a Bayesian
approach by using the idea of Gibbs sampling and data augmentation. A
by-product from the approach is a welfare measure, such as the mean
willingness to pay, and its confidence interval, which can be used for
policy analysis.
Journal: Journal of Applied Statistics
Pages: 1203-1209
Issue: 10
Volume: 31
Year: 2004
Keywords: Bayesian approach, dichotomous choice environmental valuation, Gibbs sampling, data augmentation,
X-DOI: 10.1080/0266476042000285558
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285558
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1203-1209
Template-Type: ReDIF-Article 1.0
Author-Name: David Reineke
Author-X-Name-First: David
Author-X-Name-Last: Reineke
Author-Name: John Crown
Author-X-Name-First: John
Author-X-Name-Last: Crown
Title: Estimation of Hazard, Density and Survivor Functions for Randomly Censored Data
Abstract:
Maximum likelihood estimation and goodness-of-fit techniques are used
within a competing risks framework to obtain maximum likelihood estimates
of hazard, density, and survivor functions for randomly right-censored
variables. Goodness-of-fit techniques are used to fit distributions to
the crude lifetimes, which are used to obtain an estimate of the hazard
function, which, in turn, is used to construct the survivor and density
functions of the net lifetime of the variable of interest. If only one of
the crude lifetimes can be adequately characterized by a parametric model,
then semi-parametric estimates may be obtained using a maximum likelihood
estimate of one crude lifetime and the empirical distribution function of
the other. Simulation studies show that the survivor function estimates
from crude lifetimes compare favourably with those given by the
product-limit estimator when crude lifetimes are chosen correctly. Other
advantages are discussed.
Journal: Journal of Applied Statistics
Pages: 1211-1225
Issue: 10
Volume: 31
Year: 2004
Keywords: Randomly censored data, competing risks, net and crude lifetimes, maximum likelihood estimation, goodness-of-fit, semi-parametric models,
X-DOI: 10.1080/0266476042000285521
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285521
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1211-1225
Template-Type: ReDIF-Article 1.0
Author-Name: Clive Granger
Author-X-Name-First: Clive
Author-X-Name-Last: Granger
Author-Name: Yongil Jeon
Author-X-Name-First: Yongil
Author-X-Name-Last: Jeon
Title: Forecasting Performance of Information Criteria with Many Macro Series
Abstract:
Stock & Watson (1999) consider the relative quality of different
univariate forecasting techniques. This paper extends their study on
forecasting practice, comparing the forecasting performance of two popular
model selection procedures, the Akaike information criterion (AIC) and the
Bayesian information criterion (BIC). This paper considers several topics:
how AIC and BIC choose lags in autoregressive models on actual series, how
models so selected forecast relative to an AR(4) model, the effect of
using a maximum lag on model selection, and the forecasting performance of
combining AR(4), AIC, and BIC models with an equal weight.
Journal: Journal of Applied Statistics
Pages: 1227-1240
Issue: 10
Volume: 31
Year: 2004
Keywords: Large macro model, information criterion, AIC, BIC,
X-DOI: 10.1080/0266476042000285495
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285495
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1227-1240
Template-Type: ReDIF-Article 1.0
Author-Name: Gordana Derado
Author-X-Name-First: Gordana
Author-X-Name-Last: Derado
Author-Name: Kanti Mardia
Author-X-Name-First: Kanti
Author-X-Name-Last: Mardia
Author-Name: Vic Patrangenaru
Author-X-Name-First: Vic
Author-X-Name-Last: Patrangenaru
Author-Name: Hilary Thompson
Author-X-Name-First: Hilary
Author-X-Name-Last: Thompson
Title: A Shape-based Glaucoma Index for Tomographic Images
Abstract:
We examine the use of Confocal Laser Tomographic images for detecting
glaucoma. From the clinical aspect, the optic nerve head's (ONH) area
contains all the relevant information on glaucoma. The shape of ONH is
approximately a skewed cup. We summarize its shape by three biological
landmarks on the neural-rim and the fourth landmark as the point of the
maximum depth, which is approximately the point where the optic nerve
enters the eye cup. These four landmarks are extracted from images of
Rhesus monkeys taken before and after inducing glaucoma.
Previous analysis on Bookstein shape coordinates of these four landmarks
revealed only marginally significant findings. From clinical experience,
it is believed that the ratio of depth to diameter of the eye cup provides a
useful measure of the shape change. We consider the bootstrap distribution
of this normalized 'depth' (G) and give evidence that it provides an
appropriate measure of the shape change. This measure G is labelled as the
glaucoma index. Further experiments are in progress to validate its use
for glaucoma in humans.
Journal: Journal of Applied Statistics
Pages: 1241-1248
Issue: 10
Volume: 31
Year: 2004
Keywords: Glaucoma index, medical imaging, high level image analysis, anatomical landmarks, non-parametric bootstrap,
X-DOI: 10.1080/0266476042000285486
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000285486
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:31:y:2004:i:10:p:1241-1248
Template-Type: ReDIF-Article 1.0
Author-Name: Don Amos
Author-X-Name-First: Don
Author-X-Name-Last: Amos
Author-Name: T. Randolph Beard
Author-X-Name-First: T. Randolph
Author-X-Name-Last: Beard
Author-Name: Steven Caudill
Author-X-Name-First: Steven
Author-X-Name-Last: Caudill
Title: A statistical analysis of the handling characteristics of certain sporting arms: frontier regression, the moment of inertia, and the radius of gyration
Abstract:
This article applies composed error frontier regression techniques to
estimate the minimal moments of inertia and radii of gyration for a unique
and varied sample of shotguns. We find that minimum inertia depends on
weight, center of gravity, length of pull, and barrel length, but not on
gauge, action type, or number of barrels. Curiously, the minimal radius of
gyration does not depend on barrel length, suggesting that the constraints
on these two related but non-identical measures of handling are
significantly different despite their high correlation. We also provide
evidence in support of G. T. Garwood's claim that a lower inertia, other
things equal, is a market-validated characteristic associated with
quality.
Journal: Journal of Applied Statistics
Pages: 3-16
Issue: 1
Volume: 32
Year: 2005
Keywords: Stochastic frontier, best practice, moment of inertia,
X-DOI: 10.1080/0266476042000305113
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000305113
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:3-16
Template-Type: ReDIF-Article 1.0
Author-Name: Neil Marks
Author-X-Name-First: Neil
Author-X-Name-Last: Marks
Title: Estimation of Weibull parameters from common percentiles
Abstract:
Estimation of Weibull distribution shape and scale parameters is
accomplished through use of symmetrically located percentiles from a
sample. The process requires algebraic solution of two equations derived
from the cumulative distribution function. Three alternatives examined are
compared for precision and variability with maximum likelihood (MLE) and
least squares (LS) estimators. The best percentile estimator (using the
10th and 90th percentiles) is inferior to MLE in variability and, to a
small degree, to one least squares estimator in accuracy and variability. However,
application of a correction factor related to sample size improves the
percentile estimator substantially, making it more accurate than LS.
Journal: Journal of Applied Statistics
Pages: 17-24
Issue: 1
Volume: 32
Year: 2005
Keywords: Parameter estimation, Weibull distribution, percentiles,
X-DOI: 10.1080/0266476042000305122
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000305122
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:17-24
Template-Type: ReDIF-Article 1.0
Author-Name: Chao-Yu Chou
Author-X-Name-First: Chao-Yu
Author-X-Name-Last: Chou
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Author-Name: Hui-Rong Liu
Author-X-Name-First: Hui-Rong
Author-X-Name-Last: Liu
Title: Acceptance control charts for non-normal data
Abstract:
Control charts are one of the most important methods in industrial
process control. The acceptance control chart is generally applied in
situations when an X-bar chart is used to control the fraction of
conforming units produced by the process and where the 6-sigma spread of
the process is smaller than the spread in the specification limits.
Traditionally, when designing control charts, one usually assumes that the
data or measurements are normally distributed. However, this assumption
may not be true in some processes. In this paper, we use the Burr
distribution, which is employed to represent various non-normal
distributions, to determine the appropriate control limits or sample size
for the acceptance control chart under non-normality. Some numerical
examples are given for illustration. From the presented examples, ignoring
the effect of non-normality in the data leads to a higher type I or type
II error probability.
Journal: Journal of Applied Statistics
Pages: 25-36
Issue: 1
Volume: 32
Year: 2005
Keywords: Control chart, non-normality, skewness, kurtosis, the Burr distribution,
X-DOI: 10.1080/0266476042000305131
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000305131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:25-36
Template-Type: ReDIF-Article 1.0
Author-Name: James Reed
Author-X-Name-First: James
Author-X-Name-Last: Reed
Title: Contributions to two-sample statistics
Abstract:
When testing the equality of the means from two independent normally
distributed populations given that the variances of the two populations
are unknown but assumed equal, the classical Student's two-sample t-test
is recommended. If the underlying population distributions are normal with
unequal and unknown variances, either Welch's t-statistic or
Satterthwaite's approximate F test is suggested. However, Welch's
procedure is non-robust under most non-normal distributions. There is a
variable tolerance level around the strict assumptions of data
independence, homogeneity of variances, and identical and normal
distributions. Few textbooks offer alternatives when one or more of the
underlying assumptions are not defensible. While there are more than a few
non-parametric (rank) procedures that provide alternatives to Student's
t-test, we restrict this review to the promising alternatives to Student's
two-sample t-test in non-normal models.
Journal: Journal of Applied Statistics
Pages: 37-44
Issue: 1
Volume: 32
Year: 2005
Keywords: Robust two-sample t-tests, symmetric trimmed means, asymmetric trimmed means, linear rank statistics, transformation statistics,
X-DOI: 10.1080/0266476042000305140
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000305140
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:37-44
Template-Type: ReDIF-Article 1.0
Author-Name: George Halkos
Author-X-Name-First: George
Author-X-Name-Last: Halkos
Author-Name: Ilias Kevork
Author-X-Name-First: Ilias
Author-X-Name-Last: Kevork
Title: A comparison of alternative unit root tests
Abstract:
In this paper we evaluate the performance of three methods for testing
the existence of a unit root in a time series, when the models under
consideration in the null hypothesis do not display autocorrelation in the
error term. In such cases, simple versions of the Dickey-Fuller test
should be used as the most appropriate ones instead of the known augmented
Dickey-Fuller or Phillips-Perron tests. Through Monte Carlo simulations we
show that, apart from a few cases, testing for the existence of a unit
root yields actual type I error and power very close to their nominal levels.
Additionally, when the random walk null hypothesis is true, by gradually
increasing the sample size, we observe that p-values for the drift in the
unrestricted model fluctuate at low levels with small variance and the
Durbin-Watson (DW) statistic is approaching 2 in both the unrestricted and
restricted models. If, however, the null hypothesis of a random walk is
false, taking a larger sample, the DW statistic in the restricted model
starts to deviate from 2 while in the unrestricted model it continues to
approach 2. It is also shown that the probability of not rejecting the
hypothesis that the errors are uncorrelated, when they are indeed
uncorrelated, is higher when the DW test is applied at the 1% nominal
level of significance.
Journal: Journal of Applied Statistics
Pages: 45-60
Issue: 1
Volume: 32
Year: 2005
Keywords: Unit root tests, type I error, power of the test, Monte Carlo simulations,
X-DOI: 10.1080/0266476052000330286
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476052000330286
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:45-60
Template-Type: ReDIF-Article 1.0
Author-Name: Alex Riba
Author-X-Name-First: Alex
Author-X-Name-Last: Riba
Author-Name: Josep Ginebra
Author-X-Name-First: Josep
Author-X-Name-Last: Ginebra
Title: Change-point estimation in a multinomial sequence and homogeneity of literary style
Abstract:
To help settle the debate around the authorship of Tirant lo Blanc, all
the words in each chapter of that book are categorized according to their
length and the appearance of various words is counted. The graphical
exploration of the sequences of multinomial observations obtained reveals
a clear single sudden change point that is consistently estimated to be
between chapters 371 and 382 and might indicate a switch of author.
Correspondence analysis indicates that at the end of the book the words
tend to be longer and the frequency of various words changes
significantly. By doing a cluster analysis of the multinomial
observations, the evidence in favor of the existence of that stylistic
boundary is strengthened, because the two clusters obtained match very
closely the before and after change-point groups; only a few chapters at
the end of the book appear to be misclassified by the change point.
Journal: Journal of Applied Statistics
Pages: 61-74
Issue: 1
Volume: 32
Year: 2005
Keywords: Correspondence analysis, multinomial cluster analysis, stylometry, word length,
X-DOI: 10.1080/0266476052000330295
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476052000330295
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:61-74
Template-Type: ReDIF-Article 1.0
Author-Name: Jens Nielsen
Author-X-Name-First: Jens
Author-X-Name-Last: Nielsen
Author-Name: Henrik Jensen
Author-X-Name-First: Henrik
Author-X-Name-Last: Jensen
Author-Name: Per Andersen
Author-X-Name-First: Per
Author-X-Name-Last: Andersen
Title: Creating a reference to a complex emergency situation using time series methods: war in Guinea-Bissau 1998-1999
Abstract:
Impacts of complex emergencies or relief interventions have often been
evaluated by absolute mortality compared to international standardized
mortality rates. A better evaluation would be to compare with local
baseline mortality of the affected populations. A projection of
population-based survival data into time of emergency or intervention
based on information from before the emergency may create a local baseline
reference. We find a log-transformed Gaussian time series model where
standard errors of the estimated rates are included in the variance to
have the best forecasting capacity. However, if time-at-risk during the
forecasted period is known then forecasting might be done using a Poisson
time series model with overdispersion. In either case, the standard error of the
estimated rates must be included in the variance of the model either in an
additive form in a Gaussian model or in a multiplicative form by
overdispersion in a Poisson model. Data on which the forecasting is based
must be modelled carefully concerning not only calendar-time trends but
also periods with excessive frequency of events (epidemics) and seasonal
variations to eliminate residual autocorrelation and to make a proper
reference for comparison, reflecting changes over time during the
emergency. Hence, when modelled properly it is possible to predict a
reference to an emergency-affected population based on local conditions.
We predicted childhood mortality during the war in Guinea-Bissau
1998-1999. We found an increased mortality in the first half-year of the
war and a mortality corresponding to the expected one in the last
half-year of the war.
Journal: Journal of Applied Statistics
Pages: 75-86
Issue: 1
Volume: 32
Year: 2005
Keywords: Time series, forecasting, Poisson regression, mixed models, complex emergency, mortality,
X-DOI: 10.1080/0266476042000305168
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000305168
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:75-86
Template-Type: ReDIF-Article 1.0
Author-Name: Jitendra Singh Tomar
Author-X-Name-First: Jitendra Singh
Author-X-Name-Last: Tomar
Author-Name: Seema Jaggi
Author-X-Name-First: Seema
Author-X-Name-Last: Jaggi
Author-Name: Cini Varghese
Author-X-Name-First: Cini
Author-X-Name-Last: Varghese
Title: On totally balanced block designs for competition effects
Abstract:
Competition between neighbouring units in field experiments is a serious
source of bias. Studying such competition requires constructing an
environment in which it can occur, with the competing units appearing in
a predetermined pattern. This paper describes methods of
constructing incomplete block designs balanced for neighbouring
competition effects. The designs obtained are totally balanced in the
sense that all the effects, direct and neighbours, are estimated with the
same variance. The efficiency of these designs has been computed as
compared to a complete block design balanced for neighbours and a
catalogue has also been prepared.
Journal: Journal of Applied Statistics
Pages: 87-97
Issue: 1
Volume: 32
Year: 2005
Keywords: Competition effects, circular design, totally balanced design, MOLS,
X-DOI: 10.1080/0266476042000305177
File-URL: http://www.tandfonline.com/doi/abs/10.1080/0266476042000305177
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:1:p:87-97
Template-Type: ReDIF-Article 1.0
Author-Name: Athanasios Micheas
Author-X-Name-First: Athanasios
Author-X-Name-Last: Micheas
Author-Name: Dipak Dey
Author-X-Name-First: Dipak
Author-X-Name-Last: Dey
Title: Assessing shape differences in populations of shapes using the complex Watson shape distribution
Abstract:
This paper presents a novel Bayesian method based on the complex Watson
shape distribution that is used in detecting shape differences between the
second thoracic vertebrae for two groups of mice, small and large,
categorized according to their body weight. Considering the data provided
in Johnson et al. (1988), we provide Bayesian methods of estimation as
well as highest posterior density (HPD) estimates for modal vertebrae
shapes within each group. Finally, we present a classification procedure
that can be used in any shape classification experiment, and apply it for
categorizing new vertebrae shapes in small or large groups.
Journal: Journal of Applied Statistics
Pages: 105-116
Issue: 2
Volume: 32
Year: 2005
Keywords: Average shape difference, complex Watson shape distribution, HPD credible set, modal shape,
X-DOI: 10.1080/02664760500054137
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054137
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:2:p:105-116
Template-Type: ReDIF-Article 1.0
Author-Name: Matthew Hall
Author-X-Name-First: Matthew
Author-X-Name-Last: Hall
Author-Name: Matthew Mayo
Author-X-Name-First: Matthew
Author-X-Name-Last: Mayo
Title: The impact of correlated readings on the estimation of the average area under readers' ROC curves
Abstract:
Receiver operating characteristic (ROC) analysis has been used in a
variety of settings since it was first declassified by the United States
government over 60 years ago. One venue in which it has received
particular attention is in the field of radiology. In radiology, as in
other areas of application, ROC analysis is used to assess the ability of
a diagnostic test to distinguish between two opposing states. One useful
descriptor in ROC analysis is the area under the ROC curve. At times, it
is useful and insightful to average ROC curves in order to create a single
curve that summarizes all of the data from multiple readers. In this
paper, we investigate the impact of correlated readings on the average
area under two readers' ROC curves using several common averaging
strategies, and then apply the results to a radiologic study.
Journal: Journal of Applied Statistics
Pages: 117-125
Issue: 2
Volume: 32
Year: 2005
Keywords: Receiver operating characteristic curve, correlated data,
X-DOI: 10.1080/02664760500054152
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054152
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:2:p:117-125
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Amaral
Author-X-Name-First: J. A.
Author-X-Name-Last: Amaral
Author-Name: E. P. Pereira
Author-X-Name-First: E. P.
Author-X-Name-Last: Pereira
Author-Name: M. T. Paixao
Author-X-Name-First: M. T.
Author-X-Name-Last: Paixao
Title: Data and projections of HIV/AIDS cases in Portugal: an unstoppable epidemic?
Abstract:
The size of the affected population with HIV/AIDS is a vital question
asked by healthcare providers. A statistical procedure called
Back-calculation has been the most widely used method to answer that
question. Recent discussions suggest that this method is gradually
becoming less appropriate for reliable incidence and prevalence estimates,
as it does not take into account the effect of treatment. In spite of
this, in the current paper that method and a worst-case scenario are used
to assess the quality of previous projections and obtain new ones. The
first problem faced was the need to account for reporting delays, no
reporting and underreporting. The adjusted AIDS incidence data were then
used to obtain lower bounds on the size of the AIDS epidemic, using the
back-calculation methodology. Weibull and Gamma distributions were
considered for the latency period distribution. The EM algorithm was
applied to obtain maximum likelihood estimates of the HIV incidence. The
density of infection times was parameterized as a step function. The
methodology is applied to AIDS incidence in Portugal for four different
transmission categories (injecting drug users, heterosexual, homo/bisexual
and other) to obtain short-term projections (2002-2005) and an estimate of
the minimum size of the epidemic.
Journal: Journal of Applied Statistics
Pages: 127-140
Issue: 2
Volume: 32
Year: 2005
Keywords: HIV/AIDS, back-calculation, projections, Portugal,
X-DOI: 10.1080/02664760500054160
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054160
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:2:p:127-140
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Zhang
Author-X-Name-First: Paul
Author-X-Name-Last: Zhang
Title: Multiple imputation of missing data with ante-dependence covariance structure
Abstract:
A controlled clinical trial was conducted to investigate the efficacy
effect of a chemical compound in the treatment of Premenstrual Dysphoric
Disorder (PMDD). The data from the trial showed a non-monotone pattern of
missing data and an ante-dependence covariance structure. A new analytical
method for imputing the missing data with the ante-dependence covariance
is proposed. The PMDD data are analysed by the non-imputation method and
two imputation methods: the proposed method and the MCMC method.
Journal: Journal of Applied Statistics
Pages: 141-155
Issue: 2
Volume: 32
Year: 2005
Keywords: Missing data, multiple imputation, ante-dependence covariance,
X-DOI: 10.1080/02664760500054178
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054178
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:2:p:141-155
Template-Type: ReDIF-Article 1.0
Author-Name: Victor Guerrero
Author-X-Name-First: Victor
Author-X-Name-Last: Guerrero
Title: Restricted estimation of an adjusted time series: application to Mexico's industrial production index
Abstract:
The inclusion of linear deterministic effects in a time series model is
important to get an appropriate specification. Such effects may be due to
calendar variation, outlying observations or interventions. This article
proposes a two-step method for estimating an adjusted time series and the
parameters of its linear deterministic effects simultaneously. Although
the main goal when applying this method in practice might only be to
estimate the adjusted series, an important by-product is a substantial
increase in efficiency in the estimates of the deterministic effects. Some
theoretical examples are presented to demonstrate the intuitive appeal of
this proposal. Then the methodology is applied to two real datasets. One
of these applications investigates the importance of the 1995 economic
crisis on Mexico's industrial production index.
Journal: Journal of Applied Statistics
Pages: 157-177
Issue: 2
Volume: 32
Year: 2005
Keywords: Deterministic effects, intervention analysis, minimum mean square error, precision share,
X-DOI: 10.1080/02664760500054186
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054186
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:2:p:157-177
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Graffelman
Author-X-Name-First: Jan
Author-X-Name-Last: Graffelman
Title: Enriched biplots for canonical correlation analysis
Abstract:
This paper discusses biplots of the between-set correlation matrix
obtained by canonical correlation analysis. It is shown that these biplots
can be enriched with the representation of the cases of the original data
matrices. A representation of the cases that is optimal in the generalized
least squares sense is obtained by the superposition of a scatterplot of
the canonical variates on the biplot of the between-set correlation
matrix. Goodness of fit statistics for all correlation and data matrices
involved in canonical correlation analysis are discussed. It is shown that
adequacy and redundancy coefficients are in fact statistics that express
the goodness of fit of the original data matrices in the biplot. The
within-set correlation matrix that is represented in standard coordinates
always has a better goodness of fit than the within-set correlation matrix
that is represented in principal coordinates. Given certain scalings, the
scalar products between variable vectors approximate correlations better
than the cosines of angles between variable vectors. Several data sets are
used to illustrate the results.
Journal: Journal of Applied Statistics
Pages: 173-188
Issue: 2
Volume: 32
Year: 2005
Keywords: Canonical weights, canonical loadings, supplementary variables, generalized least squares, goodness of fit,
X-DOI: 10.1080/02664760500054202
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054202
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:2:p:173-188
Template-Type: ReDIF-Article 1.0
Author-Name: S. Hussain
Author-X-Name-First: S.
Author-X-Name-Last: Hussain
Author-Name: R. Harrison
Author-X-Name-First: R.
Author-X-Name-Last: Harrison
Author-Name: J. Ayres
Author-X-Name-First: J.
Author-X-Name-Last: Ayres
Author-Name: S. Walter
Author-X-Name-First: S.
Author-X-Name-Last: Walter
Author-Name: J. Hawker
Author-X-Name-First: J.
Author-X-Name-Last: Hawker
Author-Name: R. Wilson
Author-X-Name-First: R.
Author-X-Name-Last: Wilson
Author-Name: G. Shukur
Author-X-Name-First: G.
Author-X-Name-Last: Shukur
Title: Estimation and forecasting hospital admissions due to Influenza: Planning for winter pressure. The case of the West Midlands, UK
Abstract:
Winters are a difficult period for the National Health Service (NHS) in
the United Kingdom (UK), due to the combination of cold weather and the
increased likelihood of respiratory infections, especially influenza. In
this article we present a proper statistical time series approach for
modelling and analysing weekly hospital admissions in the West Midlands in
the UK during the period week 15/1990 to week 14/1999. We consider three
variables, namely, hospital admissions, general practitioner consultations,
and minimum temperature. The autocorrelations of each series are shown to
decay hyperbolically. The correlations of hospital admission and the lag
of other series also decay hyperbolically but with different speed and
directions. One of the main objectives of this paper is to show that each
of the three series can be represented by a fractional differenced
autoregressive integrated moving average (FDA) model. Further, the
hospital admission winter and summer residuals show significant
interdependency, which may be interpreted as hidden periodicities within
the last 10-year time interval. The short-range (8 weeks) forecasting of
hospital admission of the FDA model and a fourth-order AutoRegressive
AR(4) model are quite similar. However, our results reveal that the
long-range forecasting of FDA is more realistic. This implies that, using
the FDA approach, the respective authority can plan for winter pressure
properly.
Journal: Journal of Applied Statistics
Pages: 191-205
Issue: 3
Volume: 32
Year: 2005
Keywords: Hospital admissions, long-range dependence, periodicity, fractional forecasting,
X-DOI: 10.1080/02664760500054384
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054384
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:191-205
Template-Type: ReDIF-Article 1.0
Author-Name: Sung Park
Author-X-Name-First: Sung
Author-X-Name-Last: Park
Author-Name: Bum Lee
Author-X-Name-First: Bum
Author-X-Name-Last: Lee
Author-Name: Hyang Jung
Author-X-Name-First: Hyang
Author-X-Name-Last: Jung
Title: Joint impact of multiple observations on a subset of variables in multiple linear regression
Abstract:
In multiple linear regression analysis, each observation affects the
fitted regression equation differently and has varying influences on the
regression coefficients of the different variables. Chatterjee & Hadi
(1988) have proposed some measures such as DSSEij (Impact on Residual Sum
of Squares of simultaneously omitting the ith observation and the jth
variable), Fj (Partial F-test for the jth variable) and Fj(i) (Partial
F-test for the jth variable omitting the ith observation) to show the
joint impact and the interrelationship that exists among a variable and an
observation. In this paper we have proposed a more extended form of those
measures DSSEIJ, FJ and FJ(I) to deal with the interrelationships that
exist among the multiple observations and a subset of variables by
monitoring the effects of the simultaneous omission of multiple variables
and multiple observations.
Journal: Journal of Applied Statistics
Pages: 207-219
Issue: 3
Volume: 32
Year: 2005
Keywords: Subset of variables, multiple linear regression, joint impact, regression diagnostics,
X-DOI: 10.1080/02664760500054418
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:207-219
Template-Type: ReDIF-Article 1.0
Author-Name: Zhang Wu
Author-X-Name-First: Zhang
Author-X-Name-Last: Wu
Author-Name: Yu Tian
Author-X-Name-First: Yu
Author-X-Name-Last: Tian
Author-Name: Sheng Zhang
Author-X-Name-First: Sheng
Author-X-Name-Last: Zhang
Title: Adjusted-loss-function charts with variable sample sizes and sampling intervals
Abstract:
Recent research has shown that the control charts with adaptive features
are quicker than the traditional static Shewhart charts in detecting
process shifts. This article presents the design and implementation of a
control chart based on Adjusted Loss Function (AL) with Variable Sample
Sizes and Sampling Intervals (VSSI). This single chart (called the VSSI AL
chart) is able to monitor the process shifts in mean and variance
simultaneously. Our studies show that the VSSI AL chart is not only easier
to design and implement than the VSSI X & S (or X & R) charts, but is also
10% more effective than the latter in detecting the process shifts from an
overall viewpoint.
Journal: Journal of Applied Statistics
Pages: 221-242
Issue: 3
Volume: 32
Year: 2005
Keywords: Statistical Process Control, loss function, adaptive control chart,
X-DOI: 10.1080/02664760500054475
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054475
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:221-242
Template-Type: ReDIF-Article 1.0
Author-Name: Duolao Wang
Author-X-Name-First: Duolao
Author-X-Name-Last: Wang
Author-Name: Michael Murphy
Author-X-Name-First: Michael
Author-X-Name-Last: Murphy
Title: Identifying Nonlinear Relationships in Regression using the ACE Algorithm
Abstract:
This paper introduces an alternating conditional expectation (ACE)
algorithm: a non-parametric approach for estimating the transformations
that lead to the maximal multiple correlation of a response and a set of
independent variables in regression and correlation analysis. These
transformations can give the data analyst insight into the relationships
between these variables so that these can be best described and non-linear
relationships uncovered. Using the Bayesian information criterion (BIC),
we show how to find the best closed-form approximations for the optimal
ACE transformations. By means of ACE and BIC, the model fit can be
considerably improved compared with the conventional linear model as
demonstrated in the two simulated and two real datasets in this paper.
Journal: Journal of Applied Statistics
Pages: 243-258
Issue: 3
Volume: 32
Year: 2005
Keywords: Alternating Conditional Expectation (ACE) algorithm, transformation, non-parametric regression, smoothing, Bayesian Information Criterion (BIC),
X-DOI: 10.1080/02664760500054517
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:243-258
Template-Type: ReDIF-Article 1.0
Author-Name: Ulysses Brown
Author-X-Name-First: Ulysses
Author-X-Name-Last: Brown
Author-Name: Dharam Rana
Author-X-Name-First: Dharam
Author-X-Name-Last: Rana
Title: Generalized exchange and propensity for military service: The moderating effect of prior military exposure
Abstract:
The propensity for military service (PMS) of young Americans is an
important issue for our Armed Forces. Since the 1990s, the PMS of young
Americans has steadily declined. Over time, a declining PMS may cause
military mission degradation, lowering of military recruitment standards,
base closures, and reinstatement of the unpopular military draft system.
This paper investigates the moderator effect of prior military service on
the Generalized Exchange-PMS relationship. Generalized exchange is when
indirect benefits such as preserving freedom and the American way of life
accrue to the larger society because of an individual's military service.
This paper uses a structural equation modelling approach to analyse the
moderating effect of prior military exposure on prospective recruits
regarding their PMS. Findings indicate that the group of prospective
recruits with prior military exposure had higher levels of PMS than the
group without such exposure, that is, the young people with prior military
exposure are more likely to enlist in the military than the young
Americans with no prior military exposure.
Journal: Journal of Applied Statistics
Pages: 259-270
Issue: 3
Volume: 32
Year: 2005
Keywords: Propensity, structural equation modelling, military, exchange theory,
X-DOI: 10.1080/02664760500054590
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:259-270
Template-Type: ReDIF-Article 1.0
Author-Name: Ronald Tracy
Author-X-Name-First: Ronald
Author-X-Name-Last: Tracy
Author-Name: David Doane
Author-X-Name-First: David
Author-X-Name-Last: Doane
Title: Using the studentized range to assess kurtosis
Abstract:
Because it is easy to compute from three common statistics (minimum,
maximum, standard deviation), the studentized range is a useful test for
non-normality when the original data are unavailable. For samples from
symmetric populations, the studentized range allows an assessment of
kurtosis with Type I and II error rates similar to those obtained from the
moment coefficients.
Journal: Journal of Applied Statistics
Pages: 271-280
Issue: 3
Volume: 32
Year: 2005
Keywords: EDA, Studentized range, kurtosis, skewness, normality,
X-DOI: 10.1080/02664760500054632
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054632
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:271-280
Template-Type: ReDIF-Article 1.0
Author-Name: Andres Alonso
Author-X-Name-First: Andres
Author-X-Name-Last: Alonso
Author-Name: Juan Romo
Author-X-Name-First: Juan
Author-X-Name-Last: Romo
Title: Forecast of the expected non-epidemic morbidity of acute diseases using resampling methods
Abstract:
In epidemiological surveillance it is important that any unusual increase
of reported cases be detected as rapidly as possible. Reliable forecasting
based on a suitable time series model for an epidemiological indicator is
necessary for estimating the expected non-epidemic indicator and to
elaborate an alert threshold. Time series analyses of acute diseases often
use Gaussian autoregressive integrated moving average models. However,
these approaches can be adversely affected by departures from the true
underlying distribution. The objective of this paper is to introduce a
bootstrap procedure for obtaining prediction intervals in linear models in
order to avoid the normality assumption. We present a Monte Carlo study
comparing the finite sample properties of bootstrap prediction intervals
with those of alternative methods. Finally, we illustrate the performance
of the proposed method with a meningococcal disease incidence series.
Journal: Journal of Applied Statistics
Pages: 281-295
Issue: 3
Volume: 32
Year: 2005
Keywords: Morbidity prediction, epidemiological time series, sieve bootstrap,
X-DOI: 10.1080/02664760500054780
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:281-295
Template-Type: ReDIF-Article 1.0
Author-Name: Guillermo De Leon Adams
Author-X-Name-First: Guillermo De Leon
Author-X-Name-Last: Adams
Author-Name: Pere Grima Cintas
Author-X-Name-First: Pere Grima
Author-X-Name-Last: Cintas
Author-Name: Xavier Tort-Martorell Llabres
Author-X-Name-First: Xavier Tort-Martorell
Author-X-Name-Last: Llabres
Title: Experimentation order in factorial designs with 8 or 16 runs
Abstract:
Randomizing the order of experimentation in a factorial design does not
always achieve the desired effect of neutralizing the influence of unknown
factors. In fact, with some very reasonable assumptions, an important
proportion of random orders afford the same degree of protection as that
obtained by experimenting in the design matrix standard order. In
addition, randomization can induce a large number of changes in factor
levels and thus make experimentation expensive and difficult. This paper
discusses this subject and suggests experimentation orders for designs
with 8 or 16 runs that combine an excellent level of protection against
the influence of unknown factors, with the minimum number of changes in
factor levels.
Journal: Journal of Applied Statistics
Pages: 297-313
Issue: 3
Volume: 32
Year: 2005
Keywords: Randomization, experimentation order, factorial design, bias protection, minimum number of level changes,
X-DOI: 10.1080/02664760500054731
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500054731
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:3:p:297-313
Template-Type: ReDIF-Article 1.0
Author-Name: Belen Zalba
Author-X-Name-First: Belen
Author-X-Name-Last: Zalba
Author-Name: Belen Sanchez-valverde
Author-X-Name-First: Belen
Author-X-Name-Last: Sanchez-valverde
Author-Name: Jose Marin
Author-X-Name-First: Jose
Author-X-Name-Last: Marin
Title: An experimental study of thermal energy storage with phase change materials by design of experiments
Abstract:
Accurate theoretical modelling and simulation of thermal energy storage
(TES) by means of phase change materials (PCM) is very complex and its
results are not close enough to experimental values. This paper presents
the empirical study of a thermal storage unit operating with a commercial
PCM called RT25. The study is carried out by means of the statistical
procedure, Design of Experiments. This methodology has rarely been used in
the analysis of heat transfer problems. The present study has allowed us
to investigate the phenomena involved and to design an actual system. We
show the whole procedure followed in order to design the set-up, to run
the experiments with a 2^3 factorial design, to compare its results with a
numerical simulation and to get the empirical model by regression. Its
results have been used to design actual installations aimed at
free-cooling or maintaining the temperature constant in rooms where
thermal security is necessary.
Journal: Journal of Applied Statistics
Pages: 321-332
Issue: 4
Volume: 32
Year: 2005
Keywords: Factorial design, simulation, thermal energy storage, phase change materials,
X-DOI: 10.1080/02664760500078920
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500078920
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:321-332
Template-Type: ReDIF-Article 1.0
Author-Name: Rolf Aaberge
Author-X-Name-First: Rolf
Author-X-Name-Last: Aaberge
Author-Name: Li-chun Zhang
Author-X-Name-First: Li-chun
Author-X-Name-Last: Zhang
Title: A class of exact UMP unbiased tests for conditional symmetry in small-sample square contingency tables
Abstract:
Testing conditional symmetry against various alternative
diagonals-parameter symmetry models often provides a point of departure in
studies of square contingency tables with ordered categories. Typically,
chi-square or likelihood-ratio tests are used for such purposes. Since
these tests depend on the validity of asymptotic approximation, they may
be inappropriate in small-sample situations where exact tests are
required. In this paper, we apply the theory of UMP unbiased tests to
develop a class of exact tests for conditional symmetry in small samples.
Oesophageal cancer and longitudinal income data are used to illustrate the
approach.
Journal: Journal of Applied Statistics
Pages: 333-340
Issue: 4
Volume: 32
Year: 2005
Keywords: Multinomial distribution, conditional symmetry, diagonals-parameter symmetry,
X-DOI: 10.1080/02664760500078953
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500078953
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:333-340
Template-Type: ReDIF-Article 1.0
Author-Name: Francois Husson
Author-X-Name-First: Francois
Author-X-Name-Last: Husson
Author-Name: Jerome Pages
Author-X-Name-First: Jerome
Author-X-Name-Last: Pages
Title: Scatter Plot and Additional Variables
Abstract:
We often want to complete the interpretation of the usual graphs (x, y)
with additional quantitative variables. The Prefmap method (vectorial
model) proposes a representation of these additional variables but this
representation has some drawbacks when the variables x and y are
correlated. To solve this problem, we propose to substitute the
coefficients of the linear regression by the coefficients of the PLS
regression in the Prefmap method. The graph obtained is made operational
thanks to contour lines of quality of representation and it becomes richer
than the Prefmap one.
Journal: Journal of Applied Statistics
Pages: 341-350
Issue: 4
Volume: 32
Year: 2005
Keywords: Scatter plot, Prefmap, Pls, additional variables,
X-DOI: 10.1080/02664760500079043
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079043
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:341-350
Template-Type: ReDIF-Article 1.0
Author-Name: Natasha Yakovchuk
Author-X-Name-First: Natasha
Author-X-Name-Last: Yakovchuk
Author-Name: Thomas Willemain
Author-X-Name-First: Thomas
Author-X-Name-Last: Willemain
Title: Monte Carlo comparison of estimation methods for additive two-way tables
Abstract:
We considered the problem of estimating effects in the following linear
model for data arranged in a two-way table: Response = Common effect + Row
effect + Column effect + Residual. This work was occasioned by a project
to analyse Federal Aviation Administration (FAA) data on daily temporal
deviations from flight plans for commercial US flights, with rows and
columns representing origin and destination airports, respectively. We
conducted a large Monte Carlo study comparing the accuracy of three
methods of estimation: classical least squares, median polish and least
absolute deviations (LAD). The experiments included a wide spectrum of
tables of different sizes and shapes, with different levels of
non-linearity, noise variance, and percentages of empty cells and
outliers. We based our comparison on the accuracy of the estimates and on
computational speed. We identified factors that significantly affect
accuracy and speed, and compared the methods based on their sensitivity to
these factors. We concluded that there is no dominant method of estimation
and identified conditions under which each method is most attractive.
Journal: Journal of Applied Statistics
Pages: 351-374
Issue: 4
Volume: 32
Year: 2005
Keywords: Additive model, least squares, least absolute deviations, Monte Carlo, robust estimation, two-way tables,
X-DOI: 10.1080/02664760500079118
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079118
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:351-374
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Zeifman
Author-X-Name-First: Michael
Author-X-Name-Last: Zeifman
Author-Name: Dov Ingman
Author-X-Name-First: Dov
Author-X-Name-Last: Ingman
Title: Modelling of unexpected shift in SPC
Abstract:
Optimal statistical process control (SPC) requires models of both
in-control and out-of-control process states. Whereas a normal
distribution is the generally accepted model for the in-control state,
there is a doubt as to the existence of reliable models for out-of-control
cases. Various process models, available in the literature, for discrete
manufacturing systems (parts industry) can be treated as bounded
discrete-space Markov chains, completely characterized by the original
in-control state and a transition matrix for shifts to an out-of-control
state. The present work extends these models by using a continuous-state
Markov chain, incorporating non-random corrective actions. These actions
are to be realized according to the SPC technique and should substantially
affect the model. The developed stochastic model yields a Laplace
distribution of a process mean. An alternative approach, based on the
Information theory, also results in a Laplace distribution. Real-data
tests confirm the applicability of a Laplace distribution for the parts
industry and show that the distribution parameter is mainly controlled by
the SPC sample size.
Journal: Journal of Applied Statistics
Pages: 375-386
Issue: 4
Volume: 32
Year: 2005
Keywords: Control charts, Markov chain, mixture distribution, information distance,
X-DOI: 10.1080/02664760500079175
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079175
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:375-386
Template-Type: ReDIF-Article 1.0
Author-Name: Jadran Dobric
Author-X-Name-First: Jadran
Author-X-Name-Last: Dobric
Author-Name: Friedrich Schmid
Author-X-Name-First: Friedrich
Author-X-Name-Last: Schmid
Title: Nonparametric estimation of the lower tail dependence λL in bivariate copulas
Abstract:
The lower tail dependence λL is a measure that characterizes the
tendency of extreme co-movements in the lower tails of a bivariate
distribution. It is invariant with respect to strictly increasing
transformations of the marginal distribution and is therefore a function
of the copula of the bivariate distribution. λL plays an important
role in modelling aggregate financial risk with copulas. This paper
introduces three non-parametric estimators for λL. They are weakly
consistent under mild regularity conditions on the copula and under the
assumption that the number k = k(n) of observations in the lower tail,
used for estimation, is asymptotically k ≈ √n. The finite
sample properties of the estimators are investigated using a Monte Carlo
simulation in special cases. It turns out that these estimators are
biased, where amount and sign of the bias depend on the underlying copula,
on the sample size n, on k, and on the true value of λL.
Journal: Journal of Applied Statistics
Pages: 387-407
Issue: 4
Volume: 32
Year: 2005
Keywords: Copula, lower tail dependence, non-parametric estimation, empirical copula process, consistency of estimators, small sample properties of estimators,
X-DOI: 10.1080/02664760500079217
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079217
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:387-407
Template-Type: ReDIF-Article 1.0
Author-Name: Dogan Argac
Author-X-Name-First: Dogan
Author-X-Name-Last: Argac
Title: Testing hypotheses on coefficients of variation from a series of two-armed experiments
Abstract:
We consider the problem of testing hypotheses on the difference of the
coefficients of variation from several two-armed experiments with normally
distributed outcomes. In particular, we deal with testing the homogeneity
of the difference of the coefficients of variation and testing the
equality of the difference of the coefficients of variation to a specified
value. The test statistics proposed are derived in a limiting one-way
classification with fixed effects and heteroscedastic error variances,
using results from analysis of variance. By way of simulation, the
performance of these test statistics is compared for both testing problems
considered.
Journal: Journal of Applied Statistics
Pages: 409-419
Issue: 4
Volume: 32
Year: 2005
Keywords: Analysis of variance, heteroscedastic variances, homogeneity, one-way classification, two-armed experiments,
X-DOI: 10.1080/02664760500079225
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:409-419
Template-Type: ReDIF-Article 1.0
Author-Name: Takafumi Isogai
Author-X-Name-First: Takafumi
Author-X-Name-Last: Isogai
Title: Applications of a new power normal family
Abstract:
The main purpose of this paper is to give an algorithm to attain joint
normality of non-normal multivariate observations through a new power
normal family introduced by the author (Isogai, 1999). The algorithm tries
to transform each marginal variable simultaneously to joint normality, but
due to a large number of parameters it repeats a maximization process with
respect to the conditional normal density of one transformed variable
given the other transformed variables. A non-normal data set is used to
examine performance of the algorithm, and the degree of achievement of
joint normality is evaluated by measures of multivariate skewness and
kurtosis. Besides the above topic, making use of properties of our power
normal family, we discuss not only a normal approximation formula of
non-central F distributions in the frame of regression analysis but also
some decomposition formulas of a power parameter, which appear in a
Wilson-Hilferty power transformation setting.
Journal: Journal of Applied Statistics
Pages: 421-436
Issue: 4
Volume: 32
Year: 2005
Keywords: Power normal family, non-normality, joint normality, measures of multivariate skewness and kurtosis,
X-DOI: 10.1080/02664760500079233
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079233
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:4:p:421-436
Template-Type: ReDIF-Article 1.0
Author-Name: Diego Kuonen
Author-X-Name-First: Diego
Author-X-Name-Last: Kuonen
Title: Studentized bootstrap confidence intervals based on M-estimates
Abstract:
This article reviews and applies saddlepoint approximations to
studentized confidence intervals based on robust M-estimates. The latter
are known to be very accurate without needing standard theory assumptions.
As examples, the classical studentized statistic, the studentized versions
of Huber's M-estimate of location, of its initially MAD scaled version and
of Huber's proposal 2 are considered. The aim is to know whether the
studentized statistics yield robust confidence intervals with coverages
close to nominal, with short intervals. The results of an extensive
simulation study and the recommendations for practical use given in this
article may fill gaps in the current literature and stimulate further
discussion and research.
Journal: Journal of Applied Statistics
Pages: 443-460
Issue: 5
Volume: 32
Year: 2005
Keywords: Bootstrap, confidence interval, M-estimation, resampling, robust inference, saddlepoint, studentized bootstrap,
X-DOI: 10.1080/02664760500079340
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079340
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:5:p:443-460
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Ryan
Author-X-Name-First: Thomas
Author-X-Name-Last: Ryan
Author-Name: William Woodall
Author-X-Name-First: William
Author-X-Name-Last: Woodall
Title: The most-cited statistical papers
Abstract:
We attempt to identify the 25 most-cited statistical papers, providing
some brief commentary on each paper on our list. This list consists, to a
great extent, of papers that are on non-parametric methods, have
applications in the life sciences, or deal with the multiple comparisons
problem. We also list the most-cited papers published in 1993 or later. In
contrast to the overall most-cited papers, these are predominately papers
on Bayesian methods and wavelets. We briefly discuss some of the issues
involved in the use of citation counts.
Journal: Journal of Applied Statistics
Pages: 461-474
Issue: 5
Volume: 32
Year: 2005
Keywords: Citations, history of statistics,
X-DOI: 10.1080/02664760500079373
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079373
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:5:p:461-474
Template-Type: ReDIF-Article 1.0
Author-Name: Mike Nicholson
Author-X-Name-First: Mike
Author-X-Name-Last: Nicholson
Author-Name: Jon Barry
Author-X-Name-First: Jon
Author-X-Name-Last: Barry
Title: Target detection from a classical and a Bayesian viewpoint
Abstract:
Two approaches have been used for designing spatial surveys to detect a
target. The classical approach controls the probability of missing a
target that exists; a Bayesian approach controls the probability that a
target exists given that none was seen. In both cases, information about
the likely size of the target can reduce sampling requirements. In this
paper, previous results are summarized and then used to assess the risk
that Roman remains could be present at sites scheduled for development in
Greater London.
Journal: Journal of Applied Statistics
Pages: 475-482
Issue: 5
Volume: 32
Year: 2005
Keywords: Target detection, statistical archaeology, spatial surveys,
X-DOI: 10.1080/02664760500079407
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079407
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:5:p:475-482
Template-Type: ReDIF-Article 1.0
Author-Name: Diane Dancer
Author-X-Name-First: Diane
Author-X-Name-Last: Dancer
Author-Name: Andrew Tremayne
Author-X-Name-First: Andrew
Author-X-Name-Last: Tremayne
Title: R-squared and prediction in regression with ordered quantitative response
Abstract:
This paper is concerned with the use of regression methods to predict
values of a response variable when that variable is naturally ordered. An
application to the prediction of student examination performance is
provided and it is argued that, although individual scores are unlikely to
be well predicted at the extremes of the range using the conditional mean,
conditional on covariates, it is possible to usefully predict where an
individual is likely to feature in the rank order of performance.
Journal: Journal of Applied Statistics
Pages: 483-493
Issue: 5
Volume: 32
Year: 2005
Keywords: Regression prediction, prediction error, rank correlation,
X-DOI: 10.1080/02664760500079423
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079423
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:5:p:483-493
Template-Type: ReDIF-Article 1.0
Author-Name: Rand Wilcox
Author-X-Name-First: Rand
Author-X-Name-Last: Wilcox
Title: Estimating the conditional variance of Y, given X, in a simple regression model
Abstract:
Consider the regression model [image omitted] . In a variety of
situations, an estimate of VAR(Y | X) = λ(X) is needed. The
paper compares the small-sample accuracy of five estimators of
λ(X). The results suggest that the optimal estimator is a somewhat complex
function of the underlying distributions. In terms of mean squared error,
one of the estimators, which is based in part on a non-robust version of
Cleveland's smoother, performed about as well as a bagged version of the
so-called running interval smoother, but the running interval smoother was
found to be preferable in terms of bias. A modification of Cleveland's
smoother, stemming from Ruppert et al. (1997), achieves its intended goal
of reducing bias when the error term is homoscedastic, but under
heteroscedasticity, bias can be high, and in terms of mean squared error
it does not compete well with the kernel method considered in the paper.
When ε has a heavy-tailed distribution, a robust version of
Cleveland's smoother performed particularly well except in some situations
where X has a heavy-tailed distribution as well. A negative feature of
using Cleveland's robust smoother is relatively high bias, and when there
is heteroscedasticity and X has a heavy-tailed distribution, a kernel-type
method and the running interval smoother give superior results in terms of
both mean squared error and bias.
Journal: Journal of Applied Statistics
Pages: 495-502
Issue: 5
Volume: 32
Year: 2005
Keywords: Strength of association, heteroscedasticity,
X-DOI: 10.1080/02664760500079480
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079480
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:5:p:495-502
Template-Type: ReDIF-Article 1.0
Author-Name: Gabriel Huerta
Author-X-Name-First: Gabriel
Author-X-Name-Last: Huerta
Title: Multivariate Bayes Wavelet shrinkage and applications
Abstract:
In recent years, wavelet shrinkage has become a very appealing method for
data de-noising and density function estimation. In particular, Bayesian
modelling via hierarchical priors has introduced novel approaches to
wavelet analysis that have become very popular and are very competitive
with standard hard or soft thresholding rules. In this spirit, this paper
proposes a hierarchical prior that is elicited on the model parameters
describing the wavelet coefficients after applying a discrete wavelet
transformation (DWT). In contrast to other approaches, the prior
proposes a multivariate normal distribution with a covariance matrix that
allows for correlations among wavelet coefficients corresponding to the
same level of detail. In addition, an extra scale parameter is
incorporated that permits an additional level of shrinkage over the
coefficients. The posterior distribution for this shrinkage procedure is
not available in closed form, but it is easily sampled through Markov chain
Monte Carlo (MCMC) methods. Applications to a set of test signals and two
noisy signals are presented.
Journal: Journal of Applied Statistics
Pages: 529-542
Issue: 5
Volume: 32
Year: 2005
Keywords: Bayes shrinkage, wavelets, discrete wavelet transformation, data de-noising, MCMC methods,
X-DOI: 10.1080/02664760500079662
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079662
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:5:p:529-542
Template-Type: ReDIF-Article 1.0
Author-Name: Jason Dietrich
Author-X-Name-First: Jason
Author-X-Name-Last: Dietrich
Title: The effects of sampling strategies on the small sample properties of the logit estimator
Abstract:
Empirical researchers face a trade-off between the lower resource costs
associated with smaller samples and the increased confidence in the
results gained from larger samples. Choice of sampling strategy is one
tool researchers can use to reduce costs yet still attain desired
confidence levels. This study uses Monte Carlo simulation to examine the
impact of nine sampling strategies on the finite sample performance of the
maximum likelihood logit estimator. The results show stratified random
sampling with balanced strata sizes and a bias correction for choice-based
sampling outperforms all other sampling strategies with respect to four
small-sample performance measures.
Journal: Journal of Applied Statistics
Pages: 543-554
Issue: 6
Volume: 32
Year: 2005
Keywords: Sampling, Logit, Monte Carlo,
X-DOI: 10.1080/02664760500078888
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500078888
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:543-554
Template-Type: ReDIF-Article 1.0
Author-Name: Paulo Rodrigues
Author-X-Name-First: Paulo
Author-X-Name-Last: Rodrigues
Author-Name: Philip Hans Franses
Author-X-Name-First: Philip Hans
Author-X-Name-Last: Franses
Title: A sequential approach to testing seasonal unit roots in high frequency data
Abstract:
In this paper we introduce a sequential seasonal unit root testing
approach which explicitly addresses its application to high frequency
data. The main idea is to see which unit roots in the higher-frequency
data can also be found in temporally aggregated data. We illustrate our
procedure with the analysis of monthly data, and we find, upon analysing
the aggregated quarterly data, that a smaller number of test statistics
can sometimes be considered. Monte Carlo simulations and empirical
illustrations emphasize the practical relevance of our method.
Journal: Journal of Applied Statistics
Pages: 555-569
Issue: 6
Volume: 32
Year: 2005
Keywords: Seasonal unit roots, temporal aggregation,
X-DOI: 10.1080/02664760500078912
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500078912
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:555-569
Template-Type: ReDIF-Article 1.0
Author-Name: John Zhang
Author-X-Name-First: John
Author-X-Name-Last: Zhang
Author-Name: Mahmud Ibrahim
Author-X-Name-First: Mahmud
Author-X-Name-Last: Ibrahim
Title: A simulation study on SPSS ridge regression and ordinary least squares regression procedures for multicollinearity data
Abstract:
This study compares the SPSS ordinary least squares (OLS) regression and
ridge regression procedures in dealing with multicollinear data. The OLS
regression method is one of the most frequently applied statistical
procedures. It is well documented that the OLS method is
extremely unreliable in parameter estimation when the independent
variables are dependent (the multicollinearity problem). The ridge
regression procedure deals with the multicollinearity problem by
introducing a small bias into the parameter estimates. The application of
ridge regression involves the selection of a bias parameter, and it is not
clear how well it works in applications. This study uses a Monte Carlo
method to compare the results of the OLS procedure with the ridge
regression procedure in SPSS.
Journal: Journal of Applied Statistics
Pages: 571-588
Issue: 6
Volume: 32
Year: 2005
Keywords: Ridge regression, least squares regression, eigenvalues, eigenvectors, simulation,
X-DOI: 10.1080/02664760500078946
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500078946
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:571-588
Template-Type: ReDIF-Article 1.0
Author-Name: E. Schrevens
Author-X-Name-First: E.
Author-X-Name-Last: Schrevens
Author-Name: H. Coppenolle
Author-X-Name-First: H.
Author-X-Name-Last: Coppenolle
Author-Name: K. M. Portier
Author-X-Name-First: K. M.
Author-X-Name-Last: Portier
Title: A comparative study between latent class binomial segmentation and mixed-effects logistic regression to explore between-respondent variability in visual preference for horticultural products
Abstract:
A methodological concept is proposed to study between-respondent
variability in visual preference for horticultural products using
quantitative imaging techniques. Chicory, a typical Belgian vegetable,
serves as a model product. Eight image sequences of high-quality chicory,
each representing a different combination of two levels of the factors
length, width and ovality, were constructed to satisfy a 2^3 factorial
design using quantitative imaging techniques. The image sequences were pair-wise
visualized using a computer-based image system to study visual preference.
Twenty respondents chose which of two samples was preferred in all 28
pair-wise combinations of the eight constructed image sequences. The
consistency of the respondents and the agreement between respondents were
evaluated. The poor fit of a traditional binomial logit model relating
preference to quality descriptors was due to the low agreement in
preference between respondents. Therefore, latent class binomial
segmentation is compared to mixed-effects logistic regression. Both
approaches relax the traditional assumption that the same model holds for
all respondents by recognizing the typical between-respondent variability
inherent in preference studies. Whereas the latent class model
simultaneously estimates different logit models for different consumer
segments, the mixed-effects model recognizes between-respondent
variability by incorporating random effects that vary by respondent in
the model formulation.
Journal: Journal of Applied Statistics
Pages: 589-605
Issue: 6
Volume: 32
Year: 2005
Keywords: Chicory, latent class, logistic, mixed model, pair-wise comparison, variability,
X-DOI: 10.1080/02664760500078987
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500078987
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:589-605
Template-Type: ReDIF-Article 1.0
Author-Name: Mark Greer
Author-X-Name-First: Mark
Author-X-Name-Last: Greer
Title: Combination forecasting for directional accuracy: An application to survey interest rate forecasts
Abstract:
Using published interest rate forecasts issued by professional
economists, two combination forecasts designed to improve the directional
accuracy of interest rate forecasting are constructed. The first
combination forecast takes a weighted average of the individual
forecasters' predictions. The more successful the forecaster was in past
forecasts at predicting the direction of change in interest rates, the
greater the weight given to his/her current forecast. The second
combination forecast is simply the forecast issued by the forecaster who
had the greatest success rate at predicting the direction of change in
interest rates in previous forecasts. In cases where two or more
forecasters tie for the best historic directional accuracy track record,
the arithmetic mean of their forecasts is used. The study finds that neither
combination forecasting method performs better than coin-flipping at
predicting the direction of change in interest rates. Nor does either
method beat the simple arithmetic mean of the predictions of all the
forecasters surveyed at predicting the direction of change in interest
rates.
Journal: Journal of Applied Statistics
Pages: 607-615
Issue: 6
Volume: 32
Year: 2005
Keywords: Forecasting, directional accuracy, combination forecasting, interest rates,
X-DOI: 10.1080/02664760500079027
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079027
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:607-615
Template-Type: ReDIF-Article 1.0
Author-Name: Man Lai Tang
Author-X-Name-First: Man Lai
Author-X-Name-Last: Tang
Author-Name: Ka Ho Wu
Author-X-Name-First: Ka Ho
Author-X-Name-Last: Wu
Title: A unified sequential test procedure for simultaneous testing the equality of several binomial proportions to a specified standard
Abstract:
In this article, we propose a unified sequentially rejective test
procedure for testing simultaneously the equality of several independent
binomial proportions to a specified standard. The proposed test procedure
is general enough to include some well-known multiple testing procedures
such as the Ordinary Bonferroni procedure, Hochberg procedure and Rom
procedure. It involves multiple tests of significance based on the simple
binomial tests (exact or approximate) which can be easily found in many
elementary statistics textbooks. Unlike the traditional
chi-square test of the overall hypothesis, the procedure can identify the
subset of binomial proportions that differ from the
prespecified standard while controlling the familywise type I error
rate. Moreover, the power computation of the procedure is provided and the
procedure is illustrated by two real examples from an ecological study and
a carcinogenicity study.
Journal: Journal of Applied Statistics
Pages: 617-624
Issue: 6
Volume: 32
Year: 2005
Keywords: Multiple test procedure, Binomial proportions,
X-DOI: 10.1080/02664760500079100
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079100
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:617-624
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Illert
Author-X-Name-First: Christopher
Author-X-Name-Last: Illert
Title: Origins of linguistic zonation in the Australian Alps. Part 1 - Huygens' principle
Abstract:
The hitherto poorly recorded boundaries of extinct traditional
south-east-Australian Aboriginal languages can now be redetermined with
greatly improved precision using an entropy-maximizing phonetic-signature
calculated from existing data sources, including old word-lists and census
forms, that have, until now, largely been considered informationally
worthless. Having thus determined traditional Aboriginal language zones to
a previously unimaginable degree of geographical precision, it is argued
that these boundaries should not be viewed merely as a static 'snapshot'
but, instead, as the end-product of a knowable dynamic process (Gillieron
wave propagation) governed by well-known physical rules (such as Huygens'
principle and Snell's Law) and operating over 'deep' time-scales more
familiar to the archaeologist than the linguist. Although this initial
study is limited to south-eastern Australia, the new methodology provides
the first real hope of obtaining a detailed understanding of language
dispersal throughout the entire continent over the past 60,000 years.
Journal: Journal of Applied Statistics
Pages: 625-659
Issue: 6
Volume: 32
Year: 2005
Keywords: Lexical signature, deep linguistics, Gillieron wave propagation, Huygens' principle,
X-DOI: 10.1080/02664760500079258
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079258
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:625-659
Template-Type: ReDIF-Article 1.0
Author-Name: Mariam Mahfouz
Author-X-Name-First: Mariam
Author-X-Name-Last: Mahfouz
Author-Name: Pascale Giraudet
Author-X-Name-First: Pascale
Author-X-Name-Last: Giraudet
Author-Name: Michel Chaput
Author-X-Name-First: Michel
Author-X-Name-Last: Chaput
Title: Strategy for a statistical analysis of odour influence on the mammalian olfactory bulb responsiveness
Abstract:
This paper proposes a global strategy for statistical analysis of odour
influence on the responsiveness of the mammalian olfactory bulb, the first
relay of the olfactory pathway. Experiments were performed on 86 mitral
cells recorded in 17 anaesthetized freely breathing rats. Five pure odours
and their binary mixtures were used. The spontaneous activity and
odour-evoked responses of the cells were characterized by their temporal
distribution of activity along the respiratory cycle, i.e. by
cycle-triggered histograms. Several statistical analyses were performed to
describe the influence of binary odour mixtures and, especially, to detect
a possible dominance of one component of the mixture.
Journal: Journal of Applied Statistics
Pages: 661-679
Issue: 6
Volume: 32
Year: 2005
Keywords: Olfaction, odour responses, statistical analyses, temporal patterns, response comparison,
X-DOI: 10.1080/02664760500079357
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079357
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:6:p:661-679
Template-Type: ReDIF-Article 1.0
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Title: A generalized normal distribution
Abstract:
Undoubtedly, the normal distribution is the most popular distribution in
statistics. In this paper, we introduce a natural generalization of the
normal distribution and provide a comprehensive treatment of its
mathematical properties. We derive expressions for the nth moment, the nth
central moment, variance, skewness, kurtosis, mean deviation about the
mean, mean deviation about the median, Renyi entropy, Shannon entropy, and
the asymptotic distribution of the extreme order statistics. We also
discuss estimation by the methods of moments and maximum likelihood and
provide an expression for the Fisher information matrix.
Journal: Journal of Applied Statistics
Pages: 685-694
Issue: 7
Volume: 32
Year: 2005
Keywords: Estimation, entropy, generalized normal distribution, moments, normal distribution, order statistics,
X-DOI: 10.1080/02664760500079464
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:685-694
Template-Type: ReDIF-Article 1.0
Author-Name: Xia Pan
Author-X-Name-First: Xia
Author-X-Name-Last: Pan
Title: An alternative approach to multivariate EWMA control chart
Abstract:
This paper proposes a multivariate EWMA scheme that is an alternative to
the traditional EWMA-M. The distribution of the chart statistic is derived
from the Box quadratic form, and the sensitivity of the chart is examined.
The average run lengths of the M-EWMA scheme are numerically computed with
the integral equation method. An exponential weight of 0.2 is found to be
the optimal choice for a sensitive chart to detect assignable causes in
the mean vector of processes.
Journal: Journal of Applied Statistics
Pages: 695-705
Issue: 7
Volume: 32
Year: 2005
Keywords: Quality control, multivariate EWMA chart, Box quadratic form, ARL, integral equation for numerical analysis,
X-DOI: 10.1080/02664760500079522
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:695-705
Template-Type: ReDIF-Article 1.0
Author-Name: Norah Al-Ballaa
Author-X-Name-First: Norah
Author-X-Name-Last: Al-Ballaa
Title: Test for cointegration based on two-stage least squares
Abstract:
A residual-based test for cointegration is proposed. The method of
two-stage least squares is used to estimate the cointegration model
parameters. The residuals are then tested for the existence of a unit root
using the augmented Dickey-Fuller test.
Journal: Journal of Applied Statistics
Pages: 707-713
Issue: 7
Volume: 32
Year: 2005
Keywords: Single-equation approach, residual-based test, two-stage least squares, Monte Carlo,
X-DOI: 10.1080/02664760500079571
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:707-713
Template-Type: ReDIF-Article 1.0
Author-Name: M. E. Ghitany
Author-X-Name-First: M. E.
Author-X-Name-Last: Ghitany
Author-Name: S. Kotz
Author-X-Name-First: S.
Author-X-Name-Last: Kotz
Author-Name: M. Xie
Author-X-Name-First: M.
Author-X-Name-Last: Xie
Title: On some reliability measures and their stochastic orderings for the Topp-Leone distribution
Abstract:
The recently rediscovered Topp-Leone distribution is a continuous
unimodal distribution with bounded support which is useful for modelling
life-time phenomena. In this paper we study some reliability measures of
this distribution, such as the hazard rate, mean residual life, reversed
hazard rate and expected inactivity time, and their stochastic orderings.
Journal: Journal of Applied Statistics
Pages: 715-722
Issue: 7
Volume: 32
Year: 2005
Keywords: Expected inactivity time, hazard rate, mean residual life, reversed hazard rate, stochastic orders, Topp-Leone distribution,
X-DOI: 10.1080/02664760500079613
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:715-722
Template-Type: ReDIF-Article 1.0
Author-Name: Filidor Labra
Author-X-Name-First: Filidor
Author-X-Name-Last: Labra
Author-Name: Reiko Aoki
Author-X-Name-First: Reiko
Author-X-Name-Last: Aoki
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: Local influence in null intercept measurement error regression under a Student-t model
Abstract:
In this paper we discuss the application of local influence in a
measurement error regression model with null intercepts under a Student-t
model with dependent populations. The Student-t distribution is a robust
alternative for modelling data sets involving errors with longer-than-normal
tails. We derive the appropriate matrices for assessing the local
influence under different perturbation schemes and use real data as an
illustration of the usefulness of the application.
Journal: Journal of Applied Statistics
Pages: 723-740
Issue: 7
Volume: 32
Year: 2005
Keywords: Influence diagnostic, Student-t model, likelihood displacement, pretest/post-test data, measurement error models,
X-DOI: 10.1080/02664760500079639
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079639
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:723-740
Template-Type: ReDIF-Article 1.0
Author-Name: M. A. Alkhamisi
Author-X-Name-First: M. A.
Author-X-Name-Last: Alkhamisi
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: Bayesian analysis of a linear mixed model with AR(p) errors via MCMC
Abstract:
We develop Bayesian procedures to make inference about parameters of a
statistical design with autocorrelated error terms. Modelling treatment
effects can be complex in the presence of other factors such as time; for
example in longitudinal data. In this paper, Markov chain Monte Carlo
(MCMC) methods, namely the Metropolis-Hastings algorithm and the Gibbs
sampler, are used to facilitate the Bayesian analysis of real-life data
when the error structure can be expressed as an autoregressive model of order p. We
illustrate our analysis with real data.
Journal: Journal of Applied Statistics
Pages: 741-755
Issue: 7
Volume: 32
Year: 2005
Keywords: Linear mixed model, autoregressive process, Metropolis-Hastings algorithm, Gibbs sampling, Bayesian statistics, autocorrelation, repeated measurement designs,
X-DOI: 10.1080/02664760500079688
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079688
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:741-755
Template-Type: ReDIF-Article 1.0
Author-Name: Donna Mohr
Author-X-Name-First: Donna
Author-X-Name-Last: Mohr
Title: Confidence limits for estimates of totals from stratified samples, with application to Medicare Part B overpayment audits
Abstract:
Superpopulation models are proposed that should be appropriate for
modelling sample-based audits of Medicare payments and other overpayment
situations. Simulations are used to estimate the coverage probabilities of
confidence intervals formed using the standard Stratified Expansion and
Combined Ratio estimators of the total. Despite severe departures from the
usual model of normal deviations, these methods have actual coverage
probabilities reasonably close to the nominal level specified by the US
government's sampling guidelines. An exception occurs when all claims from
a single sampling unit are either completely allowed, or completely
denied, and for this situation an alternative is explored. A balanced
sampling design is also examined, but shown to make no improvement over
ordinary stratified samples used in conjunction with ratio estimates.
Journal: Journal of Applied Statistics
Pages: 757-769
Issue: 7
Volume: 32
Year: 2005
Keywords: Stratified samples, ratio estimators, stratified expansion estimators, coverage probability, audit, overpayment,
X-DOI: 10.1080/02664760500079712
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079712
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:757-769
Template-Type: ReDIF-Article 1.0
Author-Name: Emilio Gomez-Deniz
Author-X-Name-First: Emilio
Author-X-Name-Last: Gomez-Deniz
Author-Name: Francisco Vazquez-Polo
Author-X-Name-First: Francisco
Author-X-Name-Last: Vazquez-Polo
Title: Modelling uncertainty in insurance Bonus-Malus premium principles by using a Bayesian robustness approach
Abstract:
When Bayesian models are implemented for a Bonus-Malus System (BMS), a
parametric structure, π0 (λ), is normally included in the
insurer's portfolio. Following Bayesian sensitivity analysis, it is
possible to model the structure function by specifying a class Γ of
priors instead of a single prior. This paper examines the ranges of the
relativities of the form [image omitted]. Standard and robust
Bayesian tools are combined to show how the choice of the prior can affect
the relative premiums. As an extension of the paper by Gomez et al.
(2002b), our model is extended to the variance premium principle, and the
class of prior densities is broadened to ones that are more realistic in
an actuarial setting, i.e. classes of generalized moment conditions. The
proposed method is illustrated with data from Lemaire (1979). The main aim
of the paper is to demonstrate an appropriate methodology to perform a
Bayesian sensitivity analysis of the Bonus-Malus of loaded premiums.
Journal: Journal of Applied Statistics
Pages: 771-784
Issue: 7
Volume: 32
Year: 2005
Keywords: Bonus-malus, Bayesian robustness, ε-contamination class, generalized moment conditions,
X-DOI: 10.1080/02664760500079746
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079746
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:7:p:771-784
Template-Type: ReDIF-Article 1.0
Author-Name: Tsung-Shan Tsou
Author-X-Name-First: Tsung-Shan
Author-X-Name-Last: Tsou
Title: Inferences of variance function - a parametric robust way
Abstract:
Tsou (2003a) proposed a parametric procedure for making robust inference
for mean regression parameters in the context of generalized linear
models. This robust procedure is extended to model variance heterogeneity.
The normal working model is adjusted to become asymptotically robust for
inference about regression parameters of the variance function for
practically all continuous response variables. The connection between the
novel robust variance regression model and the estimating equations
approach is also provided.
Journal: Journal of Applied Statistics
Pages: 785-796
Issue: 8
Volume: 32
Year: 2005
Keywords: Generalized linear models, variance function, robust profile likelihood, normal regression,
X-DOI: 10.1080/02664760500079803
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079803
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:785-796
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Miguel Marin
Author-X-Name-First: Juan Miguel
Author-X-Name-Last: Marin
Author-Name: Lluis Pla
Author-X-Name-First: Lluis
Author-X-Name-Last: Pla
Author-Name: David Rios-Insua
Author-X-Name-First: David
Author-X-Name-Last: Rios-Insua
Title: Forecasting for some stochastic process models related to sow farm management
Abstract:
Sow farm management requires appropriate methods to forecast the evolution
of the sow population structure. We describe two models for this purpose.
The first is a semi-Markov process model, used for long-term predictions
and strategic management. The second is a state-space model for continuous
proportions, used for short-term predictions and operational management.
Journal: Journal of Applied Statistics
Pages: 797-812
Issue: 8
Volume: 32
Year: 2005
Keywords: Sow herd management, semi-Markov models, dynamic linear models, Bayesian inference and forecasting,
X-DOI: 10.1080/02664760500079845
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500079845
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:797-812
Template-Type: ReDIF-Article 1.0
Author-Name: Yuehjen Shao
Author-X-Name-First: Yuehjen
Author-X-Name-Last: Shao
Author-Name: John Fowler
Author-X-Name-First: John
Author-X-Name-Last: Fowler
Author-Name: George Runger
Author-X-Name-First: George
Author-X-Name-Last: Runger
Title: A note on determining an optimal target by considering the dependence of holding costs and the quality characteristics
Abstract:
Products that do not meet the specification criteria of an intended buyer
represent a challenge to the producer in maximizing profits. To understand
the value of the optimal process target (OPT) set at a profit-maximizing
level, a model was developed by Shao et al. (1999) involving multiple
markets and finished products having holding costs independent of their
quality. Previous investigations have treated holding costs as a fixed
amount or as a normal random variable independent of the quality
characteristic (QC) of the product. This study considers more general
cases in which the holding cost (HC) can be a truncated normal random
variable that is dependent on the QC of the product.
Journal: Journal of Applied Statistics
Pages: 813-822
Issue: 8
Volume: 32
Year: 2005
Keywords: Optimal process target, dependence, profit function, quality characteristics,
X-DOI: 10.1080/02664760500080066
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500080066
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:813-822
Template-Type: ReDIF-Article 1.0
Author-Name: Arturo Fernandez
Author-X-Name-First: Arturo
Author-X-Name-Last: Fernandez
Title: Progressively censored variables sampling plans for two-parameter exponential distributions
Abstract:
Progressive censoring is quite useful in many practical situations where
budget constraints are in place or there is a demand for rapid testing.
Balasooriya & Saw (1998) present reliability sampling plans for the
two-parameter exponential distribution, based on progressively censored
samples. However, the operating characteristic (OC) curve derived in their
paper does not depend on the sample size. This seems to be because, in
their computations, they fail to account for the proportion of uncensored
data, which also has an important influence on the subsequent
developments. Consequently, their OC curve is only valid when there is
no censoring. In this paper, some modifications are proposed. These are
needed to obtain a proper design of the above sampling plan. Whenever at
least two uncensored observations are available, the OC curve is derived
in closed form and a procedure for determining progressively censored
reliability sampling plans is also presented. Finally, the example
considered by Balasooriya & Saw is used to illustrate the results
developed in this paper for several censoring levels.
Journal: Journal of Applied Statistics
Pages: 823-829
Issue: 8
Volume: 32
Year: 2005
Keywords: Reliability sampling plans, operating characteristic curve, acceptable and rejectable quality levels,
X-DOI: 10.1080/02664760500080074
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500080074
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:823-829
Template-Type: ReDIF-Article 1.0
Author-Name: C. Mante
Author-X-Name-First: C.
Author-X-Name-Last: Mante
Author-Name: J. P. Durbec
Author-X-Name-First: J. P.
Author-X-Name-Last: Durbec
Author-Name: J. C. Dauvin
Author-X-Name-First: J. C.
Author-X-Name-Last: Dauvin
Title: A functional data-analytic approach to the classification of species according to their spatial dispersion. Application to a marine macrobenthic community from the Bay of Morlaix (Western English Channel)
Abstract:
We investigate with multivariate methods the behaviour of species
collected in a sequence of ecological surveys. The behaviour of the s-th
species is first characterized by a classical dispersion index. Under the
hypothesis (H) of spatial randomness, the probability distribution v_s of
this index obeys a reference law μ. All the sampled species are then
compared through a Principal Components Analysis whose metric structure
depends on μ. More precisely, the distance between two species s and
s' is an approximation of the μ-centred chi-square distance
(Benzecri, 1976) between v_s and v_s'. Thus, while Correspondence Analysis
displays departures from independence or homogeneity, the proposed
analysis displays departures of the species from (H). As an application, a
macrobenthic data time series is analysed, and the obtained species
typology is described and discussed. The method enabled us to separate
rare species from random ones, even though the two could easily be
confused. All the aggregated species were common (or even
dominant), and most random ones were moderately abundant. Finally, a group
of 23 species showed a mixed random-aggregated behaviour. The repulsive
(uniform) behaviour was extremely rare.
Journal: Journal of Applied Statistics
Pages: 831-840
Issue: 8
Volume: 32
Year: 2005
Keywords: Index of dispersion, quadrat method, principal components analysis, density approximation, marine ecology,
X-DOI: 10.1080/02664760500080124
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500080124
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:831-840
Template-Type: ReDIF-Article 1.0
Author-Name: Yoon Young Jung
Author-X-Name-First: Yoon Young
Author-X-Name-Last: Jung
Author-Name: Dong Wan Shin
Author-X-Name-First: Dong Wan
Author-X-Name-Last: Shin
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Title: Bayesian analysis of panel data using an MTAR model
Abstract:
Bayesian analysis of panel data using a class of momentum threshold
autoregressive (MTAR) models is considered. Posterior estimation of
parameters of the MTAR models is done by using a simple Markov Chain Monte
Carlo (MCMC) algorithm. Selection of appropriate differenced variables,
test for asymmetry and unit roots are recast as model selections and a
simple way of computing posterior probabilities of the candidate models is
proposed. The proposed method is applied to the yearly unemployment rates
of 51 US states and the results show strong evidence of stationarity and
asymmetry.
Journal: Journal of Applied Statistics
Pages: 841-854
Issue: 8
Volume: 32
Year: 2005
Keywords: MTAR, panel data, MCMC, model selection,
X-DOI: 10.1080/02664760500080132
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500080132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:841-854
Template-Type: ReDIF-Article 1.0
Author-Name: Dejian Lai
Author-X-Name-First: Dejian
Author-X-Name-Last: Lai
Author-Name: Shyang-Yun Pamela Shiao
Author-X-Name-First: Shyang-Yun Pamela
Author-X-Name-Last: Shiao
Title: Comparing two clinical measurements: a linear mixed model approach
Abstract:
In this article, we extended the widely used Bland-Altman graphical
technique of comparing two measurements in clinical studies to include an
analytical approach using a linear mixed model. The proposed statistical
inferences can be conducted easily by commercially available statistical
software such as SAS. The linear mixed model approach was illustrated
using a real example in a clinical nursing study of oxygen saturation
measurements, when functional oxygen saturation was compared against
fractional oxy-hemoglobin.
Journal: Journal of Applied Statistics
Pages: 855-860
Issue: 8
Volume: 32
Year: 2005
Keywords: Accuracy, Bland-Altman method, linear mixed model, oxygen saturation,
X-DOI: 10.1080/02664760500080157
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500080157
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:855-860
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph Brian Adams
Author-X-Name-First: Joseph Brian
Author-X-Name-Last: Adams
Author-Name: Yijin Wert
Author-X-Name-First: Yijin
Author-X-Name-Last: Wert
Title: Logistic and neural network models for predicting a hospital admission
Abstract:
Feedforward neural networks are often used in a similar manner as
logistic regression models; that is, to estimate the probability of the
occurrence of an event. In this paper, a probabilistic model is developed
for the purpose of estimating the probability that a patient who has been
admitted to the hospital with a medical back diagnosis will be released
after only a short stay or will remain hospitalized for a longer period of
time. As the purpose of the analysis is to determine if hospital
characteristics influence the decision to retain a patient, the inputs to
this model are a set of demographic variables that describe the various
hospitals. The output is the probability of either a short or long term
hospital stay. In order to compare the ability of each method to model the
data, a hypothesis test is performed to test for an improvement resulting
from the use of the neural network model.
Journal: Journal of Applied Statistics
Pages: 861-869
Issue: 8
Volume: 32
Year: 2005
Keywords: Neural networks, logistic regression, prediction, hospital admissions, medical informatics,
X-DOI: 10.1080/02664760500080207
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500080207
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:8:p:861-869
Template-Type: ReDIF-Article 1.0
Author-Name: Duolao Wang
Author-X-Name-First: Duolao
Author-X-Name-Last: Wang
Author-Name: Pengjun Lu
Author-X-Name-First: Pengjun
Author-X-Name-Last: Lu
Title: Modelling and forecasting mortality distributions in England and Wales using the Lee-Carter model
Abstract:
Lee and Carter proposed in 1992 a non-linear model m_{x,t} = exp(a_x +
b_x k_t + ε_{x,t}) for fitting and forecasting age-specific mortality rates at
age x and time t. For the model parameter estimation, they employed the
singular value decomposition method to find a least squares solution.
However, the singular value decomposition algorithm does not provide the
standard errors of estimated parameters, making it impossible to assess
the accuracy of model parameters. This article describes the Lee-Carter
model and the technical procedures to fit and extrapolate this model. To
estimate the precision of the parameter estimates of the Lee-Carter model,
we propose a binomial framework, whose parameter point estimates can be
obtained by the maximum likelihood approach and interval estimates by a
bootstrap approach. This model is used to fit mortality data in England
and Wales from 1951 to 1990 and to forecast mortality change from 1991 to
2020. The Lee-Carter model fits these mortality data very well, with
R^2 = 0.9980. The estimated overall age pattern of mortality a_x is very
robust, whereas there is considerable uncertainty in b_x (changes in the age
pattern over time) and k_t (overall change in mortality). The fitted log
age-specific mortality rates have been declining linearly from 1951 to
1990 at different paces, and the rates are projected to continue declining
in the same way over the 30-year prediction period.
Journal: Journal of Applied Statistics
Pages: 873-885
Issue: 9
Volume: 32
Year: 2005
Keywords: Lee-Carter model, singular value decomposition, binomial distribution, bootstrap, mortality forecasting,
X-DOI: 10.1080/02664760500163441
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163441
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:873-885
Template-Type: ReDIF-Article 1.0
Author-Name: Andre Khuri
Author-X-Name-First: Andre
Author-X-Name-Last: Khuri
Title: Slack-variable models versus Scheffe's mixture models
Abstract:
Slack-variable models are compared against Scheffe's polynomial model for
mixture experiments. The notion of model equivalence and the use of
various diagnostic measures provide effective tools in making such
comparisons, particularly when the experimental region is highly
constrained. It is demonstrated that the choice of the best fitting model,
through variable selection, depends on which mixture component is selected
as a slack variable, and on the size of the fitted model. In addition, the
equivalence of two well-known representations of a complete mixture model
is shown to be valid. Two numerical examples are presented.
Journal: Journal of Applied Statistics
Pages: 887-908
Issue: 9
Volume: 32
Year: 2005
Keywords: Collinearity, column space, condition number, constrained mixture region, mixture components, model equivalence, L-pseudocomponents, variable selection, variance-decomposition proportions, variance inflation factors, well-formulated model,
X-DOI: 10.1080/02664760500163466
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:887-908
Template-Type: ReDIF-Article 1.0
Author-Name: Yoeng-Kuan Chang
Author-X-Name-First: Yoeng-Kuan
Author-X-Name-Last: Chang
Author-Name: Deng-Yuan Huang
Author-X-Name-First: Deng-Yuan
Author-X-Name-Last: Huang
Title: On some robust estimation procedures for quantiles based on data
Abstract:
This paper introduces some robust estimation procedures to estimate
quantiles of a continuous random variable based on data, without any other
assumptions of probability distribution. We construct a reasonable linear
regression model to connect the relationship between a suitable symmetric
data transformation and the approximate standard normal statistics.
Statistical properties of this linear regression model and its
applications are studied, including estimators of quantiles, quartile
mean, quartile deviation, correlation coefficient of quantiles and
standard errors of these estimators. We give some empirical examples to
illustrate the statistical properties and apply our estimators to grouping
data.
Journal: Journal of Applied Statistics
Pages: 909-927
Issue: 9
Volume: 32
Year: 2005
Keywords: Quantile estimation, symmetric data transformations, quartile mean, quartile deviation, correlation coefficient of quantiles, grouping data,
X-DOI: 10.1080/02664760500163532
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163532
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:909-927
Template-Type: ReDIF-Article 1.0
Author-Name: A. H. M. Rahmatullah Imon
Author-X-Name-First: A. H. M. Rahmatullah
Author-X-Name-Last: Imon
Title: Identifying multiple influential observations in linear regression
Abstract:
The identification of influential observations has drawn a great deal of
attention in regression diagnostics. Most of these identification
techniques are based on single case deletion and among them DFFITS has
become very popular with the statisticians. But this technique along with
all other single case diagnostics may be ineffective in the presence of
multiple influential observations. In this paper we develop a generalized
version of DFFITS based on group deletion and then propose a new technique
to identify multiple influential observations using this. The advantage of
using the proposed method in the identification of multiple influential
cases is then investigated through several well-referred data sets.
Journal: Journal of Applied Statistics
Pages: 929-946
Issue: 9
Volume: 32
Year: 2005
Keywords: Influential observations, high leverage points, outliers, masking, swamping, group deletion, generalized Cook's distance, generalized DFFITS, index plot,
X-DOI: 10.1080/02664760500163599
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163599
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:929-946
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Benasseni
Author-X-Name-First: Jacques
Author-X-Name-Last: Benasseni
Title: A concentration study of principal components
Abstract:
Influence functions are commonly used as diagnostic tools in order to
investigate sensitivity aspects in principal component analysis. This
paper suggests a practical alternative for the eigenvalues by introducing
a sensitivity measure derived from the classical Lorenz curve and
associated Gini index. The results are illustrated by analysing an
example.
Journal: Journal of Applied Statistics
Pages: 947-957
Issue: 9
Volume: 32
Year: 2005
Keywords: Gini index of concentration, influence function, Lorenz curve, principal component analysis, sensitivity,
X-DOI: 10.1080/02664760500163664
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:947-957
Template-Type: ReDIF-Article 1.0
Author-Name: Edward Bedrick
Author-X-Name-First: Edward
Author-X-Name-Last: Bedrick
Title: Graphical modelling and the Mahalanobis distance
Abstract:
I consider the problem of estimating the Mahalanobis distance between
multivariate normal populations when the population covariance matrix
satisfies a graphical model. In addition to providing a clear
understanding of the dependencies in a multivariate data set, the use of
graphical models can reduce the variability of the estimated distances and
improve inferences. I derive the asymptotic distribution of the estimated
Mahalanobis distance under a general covariance model, which includes
graphical models as a special case. Two examples are discussed.
Journal: Journal of Applied Statistics
Pages: 959-967
Issue: 9
Volume: 32
Year: 2005
Keywords: Discriminant analysis, distance between populations,
X-DOI: 10.1080/02664760500163680
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163680
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:959-967
Template-Type: ReDIF-Article 1.0
Author-Name: Hugh Chipman
Author-X-Name-First: Hugh
Author-X-Name-Last: Chipman
Author-Name: Hong Gu
Author-X-Name-First: Hong
Author-X-Name-Last: Gu
Title: Interpretable dimension reduction
Abstract:
The analysis of high-dimensional data often begins with the
identification of lower dimensional subspaces. Principal component
analysis is a dimension reduction technique that identifies linear
combinations of variables along which most variation occurs or which best
“reconstruct” the original variables. For example, many
temperature readings may be taken in a production process when in fact
there are just a few underlying variables driving the process. A problem
with principal components is that the linear combinations can seem quite
arbitrary. To make them more interpretable, we introduce two classes of
constraints. In the first, coefficients are constrained to equal a small
number of values (homogeneity constraint). The second constraint attempts
to set as many coefficients to zero as possible (sparsity constraint). The
resultant interpretable directions are either calculated to be close to
the original principal component directions, or calculated in a stepwise
manner that may make the components more orthogonal. A small dataset on
characteristics of cars is used to introduce the techniques. A more
substantial data mining application is also given, illustrating the
ability of the procedure to scale to a very large number of variables.
Journal: Journal of Applied Statistics
Pages: 969-987
Issue: 9
Volume: 32
Year: 2005
Keywords: Principal component, interpretable, homogeneity, sparsity, stepwise algorithm, dimension reduction, data mining,
X-DOI: 10.1080/02664760500168648
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500168648
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:9:p:969-987
Template-Type: ReDIF-Article 1.0
Author-Name: Gabriel Nunez-Antonio
Author-X-Name-First: Gabriel
Author-X-Name-Last: Nunez-Antonio
Author-Name: Eduardo Gutierrez-Pena
Author-X-Name-First: Eduardo
Author-X-Name-Last: Gutierrez-Pena
Title: A Bayesian analysis of directional data using the projected normal distribution
Abstract:
This paper presents a Bayesian analysis of the projected normal
distribution, which is a flexible and useful distribution for the analysis
of directional data. We obtain samples from the posterior distribution
using the Gibbs sampler after the introduction of suitably chosen latent
variables. The procedure is illustrated using simulated data as well as a
real data set previously analysed in the literature.
Journal: Journal of Applied Statistics
Pages: 995-1001
Issue: 10
Volume: 32
Year: 2005
Keywords: Circular data, Gibbs sampler, latent variables, radial projection, spherical data,
X-DOI: 10.1080/02664760500164886
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500164886
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:995-1001
Template-Type: ReDIF-Article 1.0
Author-Name: W. L. Pearn
Author-X-Name-First: W. L.
Author-X-Name-Last: Pearn
Author-Name: M. H. Shu
Author-X-Name-First: M. H.
Author-X-Name-Last: Shu
Author-Name: B. M. Hsu
Author-X-Name-First: B. M.
Author-X-Name-Last: Hsu
Title: Testing process capability based on Cpm in the presence of random measurement errors
Abstract:
Process capability indices have been widely used in the manufacturing
industry, providing numerical measures of process performance. The index Cp
measures process precision (or product consistency). The index
Cpm, sometimes called the Taguchi index, reflects process centring
ability and process loss. Most research work related to Cp and Cpm assumes
no gauge measurement errors. This assumption insufficiently reflects real
situations even with highly advanced measuring instruments. Conclusions
drawn from process capability analysis are therefore unreliable and
misleading. In this paper, we conduct sensitivity investigation on process
capability Cp and Cpm in the presence of gauge measurement errors. Due to
the randomness of variations in the data, we consider capability testing
for Cp and Cpm to obtain lower confidence bounds and critical values for
true process capability when gauge measurement errors are unavoidable. The
results show that the estimator with sample data contaminated by the
measurement errors severely underestimates the true capability, resulting
in imperceptibly smaller test power. To obtain the true process
capability, adjusted confidence bounds and critical values are presented
to practitioners for their factory applications.
Journal: Journal of Applied Statistics
Pages: 1003-1024
Issue: 10
Volume: 32
Year: 2005
Keywords: Gauge measurement error, lower confidence bound, critical value, process capability analysis,
X-DOI: 10.1080/02664760500164951
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500164951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1003-1024
Template-Type: ReDIF-Article 1.0
Author-Name: M. E. Ghitany
Author-X-Name-First: M. E.
Author-X-Name-Last: Ghitany
Author-Name: E. K. Al-Hussaini
Author-X-Name-First: E. K.
Author-X-Name-Last: Al-Hussaini
Author-Name: R. A. Al-Jarallah
Author-X-Name-First: R. A.
Author-X-Name-Last: Al-Jarallah
Title: Marshall-Olkin extended Weibull distribution and its application to censored data
Abstract:
In this paper we show that the Marshall-Olkin extended Weibull
distribution can be obtained as a compound distribution with an
exponential mixing distribution. In addition, we provide simple sufficient
conditions for the shape of the hazard rate function of the distribution.
Moreover, we extend the considered distribution to accommodate randomly
right censored data. Finally, application of the extended distribution to
a data set representing the remission times of bladder cancer patients is
given and its goodness-of-fit is demonstrated.
Journal: Journal of Applied Statistics
Pages: 1025-1034
Issue: 10
Volume: 32
Year: 2005
Keywords: Akaike information criterion, Bayesian information criterion, censored data, compound distribution, hazard rate, likelihood ratio test, maximum likelihood, Weibull distribution,
X-DOI: 10.1080/02664760500165008
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500165008
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1025-1034
Template-Type: ReDIF-Article 1.0
Author-Name: Marc Callens
Author-X-Name-First: Marc
Author-X-Name-Last: Callens
Author-Name: Christophe Croux
Author-X-Name-First: Christophe
Author-X-Name-Last: Croux
Title: The impact of education on third births. A multilevel discrete-time hazard analysis
Abstract:
We propose to use multilevel discrete-time hazard models to assess the
impact of societal and individual level covariates on the timing and
occurrence of third births. We focus mainly on the impact of educational
attainment on third births across 15 European countries. From the analysis
in this paper, the effect of education on the propensity to have a third
child is found to be negative. This education effect is not significantly
weaker in the Nordic countries, but living in Scandinavia does increase
the hazard of a third birth.
Journal: Journal of Applied Statistics
Pages: 1035-1050
Issue: 10
Volume: 32
Year: 2005
Keywords: Multilevel analysis, discrete-time hazard analysis, multilevel hazard analysis, life course events,
X-DOI: 10.1080/02664760500165040
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500165040
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1035-1050
Template-Type: ReDIF-Article 1.0
Author-Name: Birdal Senoğlu
Author-X-Name-First: Birdal
Author-X-Name-Last: Senoğlu
Title: Robust 2^k factorial design with Weibull error distributions
Abstract:
It is well known that the least squares method is optimal only if the
error distributions are normally distributed. However, in practice,
non-normal distributions are more prevalent. If the error terms have a
non-normal distribution, then the efficiency of least squares estimates
and tests is very low. In this paper, we consider the 2k factorial design
when the distribution of error terms are Weibull W(p,σ). From the
methodology of modified likelihood, we develop robust and efficient
estimators for the parameters in 2k factorial design. F statistics based
on modified maximum likelihood estimators (MMLE) for testing the main
effects and interaction are defined. They are shown to have high powers
and better robustness properties as compared to the normal theory
solutions. A real data set is analysed.
Journal: Journal of Applied Statistics
Pages: 1051-1066
Issue: 10
Volume: 32
Year: 2005
Keywords: Least squares, modified maximum likelihood, robustness, experimental design, Weibull distribution,
X-DOI: 10.1080/02664760500165099
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500165099
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1051-1066
Template-Type: ReDIF-Article 1.0
Author-Name: D. T. Shirke
Author-X-Name-First: D. T.
Author-X-Name-Last: Shirke
Author-Name: R. R. Kumbhar
Author-X-Name-First: R. R.
Author-X-Name-Last: Kumbhar
Author-Name: D. Kundu
Author-X-Name-First: D.
Author-X-Name-Last: Kundu
Title: Tolerance intervals for exponentiated scale family of distributions
Abstract:
In this article we provide an asymptotic upper β-expectation and
β-content γ-level tolerance intervals for a new family of
distributions, namely the Exponentiated Scale family of distributions.
Expected coverage of a proposed β-expectation Tolerance Interval is
obtained. Bootstrap-based tolerance limits are obtained for data arising
from an exponentiated exponential distribution.
Journal: Journal of Applied Statistics
Pages: 1067-1074
Issue: 10
Volume: 32
Year: 2005
Keywords: β-expectation tolerance interval, β-content γ-level tolerance interval, expected coverage, exponentiated scale family, exponentiated exponential distribution,
X-DOI: 10.1080/02664760500165297
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500165297
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1067-1074
Template-Type: ReDIF-Article 1.0
Author-Name: J. Lopez-Fidalgo
Author-X-Name-First: J.
Author-X-Name-Last: Lopez-Fidalgo
Author-Name: J. M. Rodriguez-Diaz
Author-X-Name-First: J. M.
Author-X-Name-Last: Rodriguez-Diaz
Author-Name: G. Sanchez
Author-X-Name-First: G.
Author-X-Name-Last: Sanchez
Author-Name: M. T. Santos-Martin
Author-X-Name-First: M. T.
Author-X-Name-Last: Santos-Martin
Title: Optimal designs for compartmental models with correlated observations
Abstract:
The flow of internally deposited radioisotope particles inside the body
of people exposed through inhalation, ingestion, injection or other routes
is usually evaluated using compartmental models (see Sanchez &
Lopez-Fidalgo, 2003, and Lopez-Fidalgo & Sanchez, 2005). The International Commission on
Radiological Protection (ICRP, 1994) describes the model of the human
respiratory tract, represented by two main regions. One of these, the
thoracic region (lungs) is divided into different compartments. The
retention in the lungs is given by a large combination of ratios of
exponential sums depending on time. The aim of this work is to provide
optimal times for making bioassays when there has been an accidental
radioactivity intake and there is interest in estimating it. In this
paper, a large two-parameter model is studied and a simplified model is
proposed in order to obtain optimal designs in a more suitable way. Local
c-optimal designs for the main parameters are obtained using the results
of Lopez-Fidalgo & Rodriguez-Diaz (2004). Efficiencies for all the
computed designs are provided and compared.
Journal: Journal of Applied Statistics
Pages: 1075-1088
Issue: 10
Volume: 32
Year: 2005
Keywords: Bioassays, biokinetic models, design efficiencies, initial deposition factors, radioactivity retention,
X-DOI: 10.1080/02664760500165313
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500165313
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1075-1088
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas Bonett
Author-X-Name-First: Douglas
Author-X-Name-Last: Bonett
Title: Robust confidence interval for a residual standard deviation
Abstract:
The residual standard deviation of a general linear model provides
information about predictive accuracy that is not revealed by the multiple
correlation or regression coefficients. The classic confidence interval
for a residual standard deviation is hypersensitive to minor violations of
the normality assumption and its robustness does not improve with
increasing sample size. An approximate confidence interval for the
residual standard deviation is proposed and shown to be robust to moderate
violations of the normality assumption with robustness to extreme
non-normality that improves with increasing sample size.
Journal: Journal of Applied Statistics
Pages: 1089-1094
Issue: 10
Volume: 32
Year: 2005
Keywords: Dispersion, regression, model fit,
X-DOI: 10.1080/02664760500165339
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500165339
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:32:y:2005:i:10:p:1089-1094
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Bourke
Author-X-Name-First: Patrick
Author-X-Name-Last: Bourke
Title: The RL2 chart versus the np chart for detecting upward shifts in fraction defective
Abstract:
The use of the np chart for monitoring fraction-defective is
well-established, but there are a number of relatively simple alternatives
based on run-lengths of conforming items. Here, the RL2 chart, based on
the moving sum of two successive conforming run-lengths, is investigated
in order to provide SPC practitioners with clear-cut guidance on the
comparative performance of these competing charts. Both sampling
inspection and 100% inspection are considered here, and it is shown that
the RL2 chart can often be considerably more efficient than the np chart,
but the comparative performance depends on the false-alarm rate used for
the comparison. Graphs to aid parameter-choice for the RL2 chart are also
provided.
Journal: Journal of Applied Statistics
Pages: 1-15
Issue: 1
Volume: 33
Year: 2006
Keywords: Conforming run-length, control chart, statistical process control,
X-DOI: 10.1080/02664760500389400
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389400
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:1-15
Template-Type: ReDIF-Article 1.0
Author-Name: Yoshio Komori
Author-X-Name-First: Yoshio
Author-X-Name-Last: Komori
Title: Properties of the Weibull cumulative exposure model
Abstract:
This article is aimed at the investigation of some properties of the
Weibull cumulative exposure model on multiple-step step-stress accelerated
life test data. Although the model includes a probabilistic idea of
Miner's rule in order to express the effect of cumulative damage in
fatigue, our result shows that this alone is not sufficient to express
the degradation of specimens and that the shape parameter
must be larger than 1. For a random variable obeying the model, its
average and standard deviation are investigated for various sets of
parameter values. In addition, a way of checking the validity of the model
is illustrated through an example of the maximum likelihood estimation on
an actual data set, which is about time to breakdown of cross-linked
polyethylene-insulated cables.
Journal: Journal of Applied Statistics
Pages: 17-34
Issue: 1
Volume: 33
Year: 2006
Keywords: Residual lifetime estimation, step-stress accelerated life test, maximum likelihood estimation,
X-DOI: 10.1080/02664760500389475
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389475
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:17-34
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Hunt
Author-X-Name-First: Daniel
Author-X-Name-Last: Hunt
Author-Name: Dale Bowman
Author-X-Name-First: Dale
Author-X-Name-Last: Bowman
Title: Modeling developmental data using U-shaped threshold dose-response curves
Abstract:
This paper develops threshold models for developmental toxicity data. The
distinguishing feature of these threshold models is their flexibility in
modeling data below threshold with a U-shaped function if the data
warrants. The method is applied to actual data from a developmental study
which exhibits U-shaped behavior in early dose groups. Results from a
simulation study demonstrate the flexibility of the threshold model to
pick up on U-shaped trends in the data. In addition, the simulation study
reveals important considerations in design of developmental studies.
Journal: Journal of Applied Statistics
Pages: 35-47
Issue: 1
Volume: 33
Year: 2006
Keywords: Beta-binomial, dose-response curves, threshold models, U-shaped models,
X-DOI: 10.1080/02664760500389525
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:35-47
Template-Type: ReDIF-Article 1.0
Author-Name: Lee-Ing Tong
Author-X-Name-First: Lee-Ing
Author-X-Name-Last: Tong
Author-Name: Chien-Hui Yang
Author-X-Name-First: Chien-Hui
Author-X-Name-Last: Yang
Title: Analyzing type II censored data obtained from repetitious experiments
Abstract:
Experimental design and Taguchi's parameter design are widely employed by
industry to optimize the process/product. However, censored data are often
observed in product lifetime testing during the experiments. After
implementing a repetitious experiment with type II censored data, the
censored data are usually estimated by establishing a complex statistical
model. However, using the incomplete data to fit a model may not
accurately estimate the censored data. Moreover, the model fitting
process is complicated for a practitioner who has only limited statistical
training. This study proposes a less complex approach to analyzing censored
data, using the least squares estimation method and Torres's analysis of
unreplicated factorials with possible abnormalities. This study also
presents an effective method for analyzing the censored data from Taguchi's
parameter design using the least squares estimation method. Finally, examples
are given to illustrate the effectiveness of the proposed methods.
Journal: Journal of Applied Statistics
Pages: 49-63
Issue: 1
Volume: 33
Year: 2006
Keywords: Type II censored data, least squares estimation, Torres's method, experimental design, Taguchi's parameter design,
X-DOI: 10.1080/02664760500389673
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:49-63
Template-Type: ReDIF-Article 1.0
Author-Name: Li-Ping Zhu
Author-X-Name-First: Li-Ping
Author-X-Name-Last: Zhu
Author-Name: Li-Xing Zhu
Author-X-Name-First: Li-Xing
Author-X-Name-Last: Zhu
Author-Name: Shi-Song Mao
Author-X-Name-First: Shi-Song
Author-X-Name-Last: Mao
Title: A non-iterative approach to estimating parameters in a linear structural equation model
Abstract:
The research described herein was motivated by a study of the
relationship between the performance of students in senior high schools
and at universities in China. A special linear structural equation model
is established, in which some parameters are known and both the responses
and the covariables are measured with errors. To explore the relationship
between the true responses and latent covariables and to estimate the
parameters, we suggest a non-iterative estimation approach that can
account for the external dependence between the true responses and latent
covariables. This approach can also deal with the collinearity problem
because the use of dimension-reduction techniques can remove redundant
variables. Combining further with the information that some of the parameters
are given, we can perform estimation for the other unknown parameters. An
easily implemented algorithm is provided. A simulation is carried out to
provide evidence of the performance of the approach and to compare it with
existing methods. The approach is applied to the education example for
illustration, and it can be readily extended to more general models.
Journal: Journal of Applied Statistics
Pages: 65-78
Issue: 1
Volume: 33
Year: 2006
Keywords: Linear structural equation model, collinearity, canonical correlation analysis, partial least squares,
X-DOI: 10.1080/02664760500389723
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389723
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:65-78
Template-Type: ReDIF-Article 1.0
Author-Name: Markus Neuhauser
Author-X-Name-First: Markus
Author-X-Name-Last: Neuhauser
Title: An exact test for trend among binomial proportions based on a modified Baumgartner-Weiss-Schindler statistic
Abstract:
The Cochran-Armitage test is the most frequently used test for trend
among binomial proportions. This test can be performed based on the
asymptotic normality of its test statistic or based on an exact null
distribution. As an alternative, a recently introduced modification of the
Baumgartner-Weiss-Schindler statistic, a novel nonparametric statistic,
can be used. Simulation results indicate that the exact test based on this
modification is preferable to the Cochran-Armitage test. This exact test
is less conservative and more powerful than the exact Cochran-Armitage
test. The power comparison to the asymptotic Cochran-Armitage test does
not show a clear winner, but the difference in power is usually small. The
exact test based on the modification is recommended here because, in
contrast to the asymptotic Cochran-Armitage test, it guarantees a type I
error rate less than or equal to the significance level. Moreover, an
exact test is often more appropriate than an asymptotic test because
randomization rather than random sampling is the norm, for example in
biomedical research. The methods are illustrated with an example data set.
Journal: Journal of Applied Statistics
Pages: 79-88
Issue: 1
Volume: 33
Year: 2006
Keywords: Binomial data, Cochran-Armitage test, exact conditional test, randomization model,
X-DOI: 10.1080/02664760500389756
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389756
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:79-88
Template-Type: ReDIF-Article 1.0
Author-Name: Enrique De Alba
Author-X-Name-First: Enrique
Author-X-Name-Last: De Alba
Author-Name: Juan Fernandez-Duran
Author-X-Name-First: Juan
Author-X-Name-Last: Fernandez-Duran
Author-Name: M. Mercedes Gregorio-Dominguez
Author-X-Name-First: M. Mercedes
Author-X-Name-Last: Gregorio-Dominguez
Title: Bayesian inference for the mean and standard deviation of a normal population when only the sample size, mean and range are observed
Abstract:
Consider a random sample X1, X2,…, Xn, from a normal population
with unknown mean and standard deviation. Only the sample size, mean and
range are recorded and it is necessary to estimate the unknown population
mean and standard deviation. In this paper the estimation of the mean and
standard deviation is made from a Bayesian perspective by using a Markov
Chain Monte Carlo (MCMC) algorithm to simulate samples from the
intractable joint posterior distribution of the mean and standard
deviation. The proposed methodology is applied to simulated and real data.
The real data refer to the sugar content (°Brix level) of orange juice
produced in different countries.
Journal: Journal of Applied Statistics
Pages: 89-99
Issue: 1
Volume: 33
Year: 2006
Keywords: Bayesian estimation, range, order statistics, MCMC,
X-DOI: 10.1080/02664760500389913
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389913
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:89-99
Template-Type: ReDIF-Article 1.0
Author-Name: D. Muthuraj
Author-X-Name-First: D.
Author-X-Name-Last: Muthuraj
Author-Name: D. Senthilkumar
Author-X-Name-First: D.
Author-X-Name-Last: Senthilkumar
Title: Designing and construction of tightened-normal-tightened variables sampling scheme
Abstract:
The tightened-normal-tightened (TNT) attributes sampling scheme was
devised by Calvin (1977). In this paper, a TNT Scheme with variables
sampling plan as the reference plan, designated as TNTVSS (nσ; kT,
kN) is introduced, where nσ is the sample size under the reference
plan, and kT and kN are the acceptance constants corresponding to
tightened and normal plans respectively. The behaviour of OC curves of the
TNTVSS (nσ; kT, kN) is studied. The efficiency of TNTVSS (nσ;
kT, kN) with respect to smaller sample sizes has been established over the
attributes scheme. The TNTVSS is matched with the TNT (n; cN, cT) of
Vijayaraghavan and Soundararajan (1996), for the specified points on the
OC curves, namely (p1, α) and (p2, β) and it is shown that the
sample size of the variables scheme is much smaller than that of the
attributes scheme. The TNT scheme with an unknown σ variables plan
as the reference plan is also introduced along with the procedure of
selection of the parameters. The method of designing the scheme based on
the given AQL (Acceptable Quality level), α (producer's risk), LQL
(Limiting Quality Level) and β (consumer's risk) is indicated. Among
the class of TNTVSS that exists for a given (p1, α) and (p2,
β), the scheme that has a steeper OC curve than that of
any other scheme is identified and given.
Journal: Journal of Applied Statistics
Pages: 101-111
Issue: 1
Volume: 33
Year: 2006
Keywords: Variables plan, tightened-normal-tightened plan, AQL, LQL, switching rules, producer's risk, consumer's risk,
X-DOI: 10.1080/02664760500389582
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500389582
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:1:p:101-111
Template-Type: ReDIF-Article 1.0
Author-Name: Teo Jasic
Author-X-Name-First: Teo
Author-X-Name-Last: Jasic
Author-Name: Douglas Wood
Author-X-Name-First: Douglas
Author-X-Name-Last: Wood
Title: Testing for efficiency and non-linearity in market and natural time series
Abstract:
Time series in traded markets such as currencies and securities involve
supply/demand interaction, so they might be expected to contain
distinctive and identifiable structures in comparison with data based on
natural phenomena such as river flows or sunspots. This paper tests this
proposition using standard econometric tests including variance ratios,
modified rescaled range (R/S) ratios and BDS statistics together with
non-linear prediction models. Four time series of each type (market or
natural) are subject to a battery of tests for random walk and non-linear
dependence. Surprisingly, the tests provide no reliable discrimination
between the two types of series or reveal any embedded specification
differences.
Journal: Journal of Applied Statistics
Pages: 113-138
Issue: 2
Volume: 33
Year: 2006
Keywords: Efficiency tests, non-linearity, neural networks, market versus natural data,
X-DOI: 10.1080/02664760500250370
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500250370
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:113-138
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Title: Average outgoing quality of CSP-C continuous sampling plan under short run production processes
Abstract:
A CSP-C continuous sampling plan is a new single-level continuous
sampling procedure developed by Govindaraju & Kandasamy (2000) by
incorporating the concept of an acceptance number into the CSP-1 plan for
application to continuous production processes. In this new plan, the
sampling inspection phase is characterized by a maximum allowable number
of non-conforming units, c, and a constant sampling rate, f. Govindaraju &
Kandasamy (2000) derived the performance measures such as average outgoing
quality (AOQ), average fraction inspected (AFI), etc., of the CSP-C plan
using a Markov chain model for long run production processes. Yang (1983)
has observed that the AOQ and AFI, being long run average measures, are
not satisfactory measures of performance for short run production
processes. Hence, formulas are derived in this paper using the renewal
theory approach, enabling one to compute AOQ and AFI for both long run and
short run production processes. Numerical illustrations are also given, and
the accuracy of the short run measures is studied by simulation.
Journal: Journal of Applied Statistics
Pages: 139-154
Issue: 2
Volume: 33
Year: 2006
Keywords: Average outgoing quality, renewal theory, short run production process,
X-DOI: 10.1080/02664760500250537
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500250537
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:139-154
Template-Type: ReDIF-Article 1.0
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Author-Name: Hulin Wu
Author-X-Name-First: Hulin
Author-X-Name-Last: Wu
Title: A Bayesian approach for estimating antiviral efficacy in HIV dynamic models
Abstract:
The study of HIV dynamics is one of the most important developments in
recent AIDS research. It has led to a new understanding of the
pathogenesis of HIV infection. Although important findings in HIV dynamics
have been published in prestigious scientific journals, the statistical
methods for parameter estimation and model-fitting used in those papers
appear surprisingly crude and have not been studied in more detail. For
example, the unidentifiable parameters were simply imputed by mean
estimates from previous studies, and important pharmacological/clinical
factors were not considered in the modelling. In this paper, a viral
dynamic model is developed to evaluate the effect of pharmacokinetic
variation, drug resistance and adherence on antiviral responses. In the
context of this model, we investigate a Bayesian modelling approach under
a non-linear mixed-effects (NLME) model framework. In particular, our
modelling strategy allows us to estimate time-varying antiviral efficacy
of a regimen during the whole course of a treatment period by
incorporating the information of drug exposure and drug susceptibility.
Both simulated and real clinical data examples are given to illustrate the
proposed approach. The Bayesian approach has great potential to be used in
many aspects of viral dynamics modelling, since it allows us to fit complex
dynamic models and identify all the model parameters. Our results suggest
that the Bayesian approach to estimating parameters in HIV dynamic models is
flexible and powerful.
Journal: Journal of Applied Statistics
Pages: 155-174
Issue: 2
Volume: 33
Year: 2006
Keywords: Bayesian mixed-effects models, drug efficacy, drug resistance, HIV, MCMC, viral dynamics,
X-DOI: 10.1080/02664760500250552
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500250552
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:155-174
Template-Type: ReDIF-Article 1.0
Author-Name: Ashis Sengupta
Author-X-Name-First: Ashis
Author-X-Name-Last: Sengupta
Author-Name: Fidelis Ugwuowo
Author-X-Name-First: Fidelis
Author-X-Name-Last: Ugwuowo
Title: Modelling multi-stage processes through multivariate distributions
Abstract:
A new model combining parametric and semi-parametric approaches and
following the lines of a semi-Markov model is developed for multi-stage
processes. A bivariate sojourn time distribution derived from the
bivariate exponential distribution of Marshall & Olkin (1967) is adopted.
The results compare favourably with the usual semi-parametric approaches
that have been in use. Our approach also has several advantages over the
models in use including its amenability to statistical inference. For
example, the tests for symmetry and for independence of the marginals
of the sojourn time distributions, which were not available earlier, can
now be conveniently derived and take elegant forms. A unified
goodness-of-fit test procedure for our proposed model is also presented.
An application to human resource planning involving real-life data
from the University of Nigeria is given.
Journal: Journal of Applied Statistics
Pages: 175-188
Issue: 2
Volume: 33
Year: 2006
Keywords: Bivariate exponential, multi-stage processes, semi-Markov, semi-parametric, human resource planning,
X-DOI: 10.1080/02664760500250586
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500250586
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:175-188
Template-Type: ReDIF-Article 1.0
Author-Name: Alberto Luceno
Author-X-Name-First: Alberto
Author-X-Name-Last: Luceno
Author-Name: Jaime Puig-Pey
Author-X-Name-First: Jaime
Author-X-Name-Last: Puig-Pey
Title: The random intrinsic fast initial response of one-sided CUSUM charts
Abstract:
This article analyses the performance of a one-sided cumulative sum
(CUSUM) chart that is initialized using a random starting point following
the natural or intrinsic probability distribution of the CUSUM statistic.
By definition, this probability distribution remains stable as the chart
is used. The probability that the chart starts at zero according to this
intrinsic distribution is always smaller than one, which confers on the
chart a fast initial response feature. The article provides a fast and
accurate algorithm to compute the in-control and out-of-control average
run lengths and run-length probability distributions for one-sided CUSUM
charts initialized using this random intrinsic fast initial response
(RIFIR) scheme. The algorithm also computes the intrinsic distribution of
the CUSUM statistic and random samples extracted from this distribution.
Most importantly, no matter how the chart was initialized, if no level
shifts and no alarms have occurred before time τ > 0, the
distribution of the run length remaining after τ is provided by this
algorithm very accurately, provided that τ is not too small.
Journal: Journal of Applied Statistics
Pages: 189-201
Issue: 2
Volume: 33
Year: 2006
Keywords: Average run length, cumulative sum charts, Gaussian quadrature, Markov chains, run-length distribution, statistical process control,
X-DOI: 10.1080/02664760500250610
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500250610
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:189-201
Template-Type: ReDIF-Article 1.0
Author-Name: Nobuko Miyamoto
Author-X-Name-First: Nobuko
Author-X-Name-Last: Miyamoto
Author-Name: Kouji Tahata
Author-X-Name-First: Kouji
Author-X-Name-Last: Tahata
Author-Name: Hirokazu Ebie
Author-X-Name-First: Hirokazu
Author-X-Name-Last: Ebie
Author-Name: Sadao Tomizawa
Author-X-Name-First: Sadao
Author-X-Name-Last: Tomizawa
Title: Marginal inhomogeneity models for square contingency tables with nominal categories
Abstract:
For the analysis of square contingency tables with nominal categories,
this paper proposes two kinds of models that indicate the structure of
marginal inhomogeneity. One model states that the absolute values of log
odds of the row marginal probability to the corresponding column marginal
probability for each category i are constant for every i. The other model
states that, on the condition that an observation falls in one of the
off-diagonal cells in the square table, the absolute values of log odds of
the conditional row marginal probability to the corresponding conditional
column marginal probability for each category i are constant for every i.
These models are used when the marginal homogeneity model does not hold,
and the values of parameters in the models are useful for seeing the
degree of departure from marginal homogeneity for the data on a nominal
scale. Examples are given.
Journal: Journal of Applied Statistics
Pages: 203-215
Issue: 2
Volume: 33
Year: 2006
Keywords: Asymmetry, conditional probability, nominal category, marginal homogeneity, marginal inhomogeneity, model, square table,
X-DOI: 10.1080/02664760500251576
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500251576
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:203-215
Template-Type: ReDIF-Article 1.0
Author-Name: D. J. Spiegelhalter
Author-X-Name-First: D. J.
Author-X-Name-Last: Spiegelhalter
Author-Name: E. C. Marshall
Author-X-Name-First: E. C.
Author-X-Name-Last: Marshall
Title: Strategies for inference robustness in focused modelling
Abstract:
Advances in computation mean that it is now possible to fit a wide range
of complex models to data, but there remains the problem of selecting a
model on which to base reported inferences. Following an early suggestion
of Box & Tiao, it seems reasonable to seek 'inference robustness' in
reported models, so that alternative assumptions that are reasonably well
supported would not lead to substantially different conclusions. We
propose a four-stage modelling strategy in which we iteratively assess and
elaborate an initial model, measure the support for each of the resulting
family of models, assess the influence of adopting alternative models on
the conclusions of primary interest, and identify whether an approximate
model can be reported. The influence-support plot is then introduced as a
tool to aid model comparison. The strategy is semi-formal, in that it
could be embedded in a decision-theoretic framework but requires
substantive input for any specific application. The one restriction of the
strategy is that the quantity of interest, or 'focus', must retain its
interpretation across all candidate models. It is, therefore, applicable
to analyses whose goal is prediction, or where a set of common model
parameters are of interest and candidate models make alternative
distributional assumptions. The ideas are illustrated by two examples.
Technical issues include the calibration of the Kullback-Leibler
divergence between marginal distributions, and the use of alternative
measures of support for the range of models fitted.
Journal: Journal of Applied Statistics
Pages: 217-232
Issue: 2
Volume: 33
Year: 2006
Keywords: Influence diagnostics, hierarchical models, model choice, prediction, institutional comparisons, Markov chain Monte Carlo, Kullback-Leibler divergence,
X-DOI: 10.1080/02664760500251618
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500251618
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:217-232
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Cook
Author-X-Name-First: Steven
Author-X-Name-Last: Cook
Title: A finite-sample sensitivity analysis of the Dickey-Fuller test under local-to-unity detrending
Abstract:
In recent research, Elliott et al. (1996) have shown the use of
local-to-unity detrending via generalized least squares (GLS) to
substantially increase the power of the Dickey-Fuller (1979) unit root
test. In this paper the relationship between the extent of detrending
undertaken, determined by the detrending parameter [image omitted], and
the power of the resulting GLS-based Dickey-Fuller (DF-GLS) test is
examined. Using Monte Carlo simulation it is shown that the values of
[image omitted] suggested by Elliott et al. (1996) on the basis of a
limiting power function seldom maximize the power of the DF-GLS test for
the finite samples encountered in applied research. This result is found
to hold for the DF-GLS test including either an intercept or an intercept
and a trend term. An empirical examination of the order of integration of
the UK household savings ratio illustrates these findings, with the unit
root hypothesis rejected using values of [image omitted] other than that
proposed by Elliott et al. (1996).
Journal: Journal of Applied Statistics
Pages: 233-240
Issue: 2
Volume: 33
Year: 2006
Keywords: Dickey-Fuller test, unit roots, local-to-unity detrending, savings ratio,
X-DOI: 10.1080/02664760500251725
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500251725
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:2:p:233-240
Template-Type: ReDIF-Article 1.0
Author-Name: Richard Woodhouse
Author-X-Name-First: Richard
Author-X-Name-Last: Woodhouse
Title: Graphical solutions for structural regression assist errors-in-variables modelling
Abstract:
Structural regression attempts to reveal an underlying relationship by
compensating for errors in the variables. Ordinary least-squares
regression has an entirely different purpose and provides a relationship
between error-included variables. Structural model solutions, also known
as the errors-in-variables and measurement-error solutions, use various
inputs such as the error-variance ratio and x-error variance. This paper
proposes that more accurate structural line gradient (coefficient)
solutions will result from using the several solutions together as a
system of equations. The known data scatter, as measured by the
correlation coefficient, should always be used in choosing legitimate
combinations of x- and y-error terms. However, this is difficult using
equations. Chart solutions are presented to assist users to understand the
structural regression process, to observe the correlation coefficient
constraint, to assess the impact of their error estimates and, therefore,
to provide better quality estimates of the structural regression gradient.
Journal: Journal of Applied Statistics
Pages: 241-255
Issue: 3
Volume: 33
Year: 2006
Keywords: Correlation coefficient constraint, error compensation, error-variance ratio, line fitting, measurement-error model,
X-DOI: 10.1080/02664760500445483
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500445483
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:241-255
Template-Type: ReDIF-Article 1.0
Author-Name: E. Andersson
Author-X-Name-First: E.
Author-X-Name-Last: Andersson
Author-Name: D. Bock
Author-X-Name-First: D.
Author-X-Name-Last: Bock
Author-Name: M. Frisen
Author-X-Name-First: M.
Author-X-Name-Last: Frisen
Title: Some statistical aspects of methods for detection of turning points in business cycles
Abstract:
Methods for online turning point detection in business cycles are
discussed. The statistical properties of three likelihood-based methods
are compared. One is based on a Hidden Markov Model, another includes a
non-parametric estimation procedure and the third combines features of the
other two. The methods are illustrated by monitoring a period of the
Swedish industrial production. Evaluation measures that reflect timeliness
are used. The effects of smoothing, seasonal variation, autoregression and
multivariate issues on methods for timely detection are discussed.
Journal: Journal of Applied Statistics
Pages: 257-278
Issue: 3
Volume: 33
Year: 2006
Keywords: Monitoring, surveillance, early warning system, regime switching,
X-DOI: 10.1080/02664760500445517
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500445517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:257-278
Template-Type: ReDIF-Article 1.0
Author-Name: Fong-jung Yu
Author-X-Name-First: Fong-jung
Author-X-Name-Last: Yu
Author-Name: Jiang-liang Hou
Author-X-Name-First: Jiang-liang
Author-X-Name-Last: Hou
Title: Optimization of design parameters for x̄ control charts with multiple assignable causes
Abstract:
Duncan's economic model of Shewhart's original x̄ chart has
established its optimal and economic application for processes with the
Markovian failure characteristic. As the sample statistics show some
indications of process variations, the variable-sampling-interval (VSI)
control charts perform more effectively than the fixed sampling interval
(FSI) ones due to a higher frequency in the sampling rate. Regarding the
economic design of control charts, most studies have been dedicated to the
FSI scheme. In 1998, Bai & Lee considered the production process with a
single assignable cause and proposed an economic VSI design for a general
x̄ control chart. However, in real cases, there are multiple
assignable causes in the production process. Therefore, concerning the
operation characteristics of the real industry, this research develops an
economic model for the VSI control chart with multiple assignable causes
based on stochastic and statistics theory and determines the optimal
design parameters of the chart. A numerical example is also provided to
demonstrate the effectiveness of the proposed model and the result
indicates that the VSI chart performs more effectively than an FSI control chart.
Journal: Journal of Applied Statistics
Pages: 279-290
Issue: 3
Volume: 33
Year: 2006
Keywords: Economic design, Variable Sampling Interval (VSI), x̄ control charts, multiple assignable causes, Statistical Process Control (SPC),
X-DOI: 10.1080/02664760500445541
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500445541
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:279-290
Template-Type: ReDIF-Article 1.0
Author-Name: R. R. L. Kantam
Author-X-Name-First: R. R. L.
Author-X-Name-Last: Kantam
Author-Name: G. Srinivasa Rao
Author-X-Name-First: G. Srinivasa
Author-X-Name-Last: Rao
Author-Name: B. Sriram
Author-X-Name-First: B.
Author-X-Name-Last: Sriram
Title: An economic reliability test plan: Log-logistic distribution
Abstract:
Sampling plans in which items are put to test, to collect the lifetimes
of the items in order to decide upon accepting or rejecting a submitted
lot, are called reliability test plans. The basic probability model of the
life of the product is specified as the well-known log-logistic
distribution with a known shape parameter. For a given producer's risk,
the sample size, termination number, and waiting time to terminate the test
plan are computed. The preferability of the test plan over similar plans
existing in the literature is established with respect to cost and time of
the experiment.
Journal: Journal of Applied Statistics
Pages: 291-296
Issue: 3
Volume: 33
Year: 2006
Keywords: Log-logistic distribution, reliability test plan, producer's risk, acceptance sample number,
X-DOI: 10.1080/02664760500445681
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500445681
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:291-296
Template-Type: ReDIF-Article 1.0
Author-Name: A. M. Wade
Author-X-Name-First: A. M.
Author-X-Name-Last: Wade
Author-Name: K. Lawrence
Author-X-Name-First: K.
Author-X-Name-Last: Lawrence
Author-Name: W. Mandy
Author-X-Name-First: W.
Author-X-Name-Last: Mandy
Author-Name: D. Skuse
Author-X-Name-First: D.
Author-X-Name-Last: Skuse
Title: Charting the development of emotion recognition from 6 years of age
Abstract:
Recognition of emotions within others is a necessary life skill. We know
that this is a learnt skill, which develops throughout childhood and is
deficient in some individuals. To put individual development in context,
it is necessary to understand the nature of development amongst the normal
population. Age-related centiles can be used to add this context. The
level of emotion recognition is assessed using an ordinal outcome scale,
and hence establishing age-related centiles for these measures creates
particular analytical problems. In this paper, we use methodology
previously developed by us for monitoring the development of visual acuity
during childhood to calculate age-related centiles for emotion recognition
ratings. The ratings do not consistently improve with age and appear to be
affected by hormonal developments. A comparison of ability to rate
emotions according to the stage of pubertal development is used to
illustrate how the conversion of ordinal assessments to continuous centile
scores facilitates the investigation. The specific issues relating to the
application of the methodology to data that are not consistent in the
direction of change with age and where large amounts of data can be
gathered electronically are discussed.
Journal: Journal of Applied Statistics
Pages: 297-315
Issue: 3
Volume: 33
Year: 2006
Keywords: Ordinal, age-related centiles, emotion recognition, Ekman-Friesen test, proportional odds models,
X-DOI: 10.1080/02664760500445756
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500445756
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:297-315
Template-Type: ReDIF-Article 1.0
Author-Name: Rand Wilcox
Author-X-Name-First: Rand
Author-X-Name-Last: Wilcox
Title: Confidence intervals for prediction intervals
Abstract:
When working with a single random variable, the simplest and most obvious
approach when estimating a 1 - γ prediction interval is to estimate
the γ/2 and 1 - γ/2 quantiles. The paper compares the
small-sample properties of several methods aimed at estimating an interval
that contains the 1 - γ prediction interval with probability 1 -
α. In effect, the goal is to compute a 1 - α confidence
interval for the true 1 - γ prediction interval. The only successful
method when the sample size is small is based in part on an adaptive
kernel estimate of the underlying density. Some simulation results are
reported on how an extension to non-parametric regression performs, based
on a so-called running interval smoother.
Journal: Journal of Applied Statistics
Pages: 317-326
Issue: 3
Volume: 33
Year: 2006
Keywords: Quantile estimation, kernel density estimators, non-parametric regression,
X-DOI: 10.1080/02664760500445962
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500445962
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:317-326
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: Chi-hyuck Jun
Author-X-Name-First: Chi-hyuck
Author-X-Name-Last: Jun
Title: Repetitive group sampling procedure for variables inspection
Abstract:
This paper introduces the concept of repetitive group sampling (RGS) for
variables inspection. The repetitive group sampling plan for variables
inspection will be useful when testing is costly and destructive. The
advantages of the variables RGS plan over variables single sampling plan,
variables double sampling plan and attributes RGS plan are discussed.
Tables are also constructed for the selection of parameters of the known
and unknown standard deviation variables repetitive group sampling plans,
indexed by acceptable quality level and limiting quality level.
Journal: Journal of Applied Statistics
Pages: 327-338
Issue: 3
Volume: 33
Year: 2006
Keywords: Acceptable quality level, average sample number, limiting quality level, repetitive group sampling, sampling by variables,
X-DOI: 10.1080/02664760500446010
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500446010
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:327-338
Template-Type: ReDIF-Article 1.0
Author-Name: Kanti Mardia
Author-X-Name-First: Kanti
Author-X-Name-Last: Mardia
Author-Name: Paul McDonnell
Author-X-Name-First: Paul
Author-X-Name-Last: McDonnell
Author-Name: Alf Linney
Author-X-Name-First: Alf
Author-X-Name-Last: Linney
Title: Penalized image averaging and discrimination with facial and fishery applications
Abstract:
In this paper we use a penalized likelihood approach to image warping in
the context of discrimination and averaging. The choice of average image
is formulated statistically by minimizing a penalized likelihood, where
the likelihood measures the similarity between images after warping and
the penalty is a measure of distortion of a warping. The notions of
measures of similarity are given in terms of normalized image information.
The measures of distortion are landmark based. Thus we use a combination
of landmark and normalized image information. The average defined in the
paper is also extended by allowing random perturbation of the landmarks.
This strategy improves averages for discrimination purposes. We give here
real applications from medical and biological areas.
Journal: Journal of Applied Statistics
Pages: 339-371
Issue: 3
Volume: 33
Year: 2006
Keywords: Female and male faces, Fisher discriminant analysis, haddock and whiting fish, laser images, normalized images, penalized likelihood,
X-DOI: 10.1080/02664760500163649
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163649
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:3:p:339-371
Template-Type: ReDIF-Article 1.0
Author-Name: Stefano Barone
Author-X-Name-First: Stefano
Author-X-Name-Last: Barone
Author-Name: Alberto Lombardo
Author-X-Name-First: Alberto
Author-X-Name-Last: Lombardo
Title: Balanced Asymmetrical Nearly Orthogonal Designs for first and second order effect estimation
Abstract:
A method for constructing asymmetrical (mixed-level) designs, satisfying
the balancing and interaction estimability requirements with a number of
runs as small as possible, is proposed in this paper. The method, based on
a heuristic procedure, uses a new optimality criterion formulated here.
The proposed method demonstrates efficiency in terms of searching time and
optimality of the attained designs. A complete collection of such
asymmetrical designs with two- and three-level factors is available. A
technological application is also presented.
Journal: Journal of Applied Statistics
Pages: 373-386
Issue: 4
Volume: 33
Year: 2006
Keywords: Balancing, asymmetrical (mixed-level) designs, nearly orthogonal arrays, optimality, two- and three-level designs,
X-DOI: 10.1080/02664760500448917
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500448917
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:373-386
Template-Type: ReDIF-Article 1.0
Author-Name: Murat Kulahci
Author-X-Name-First: Murat
Author-X-Name-Last: Kulahci
Author-Name: Søren Bisgaard
Author-X-Name-First: Søren
Author-X-Name-Last: Bisgaard
Title: A generalization of the alias matrix
Abstract:
The investigation of aliases or biases is important for the
interpretation of the results from factorial experiments. For two-level
fractional factorials this can be facilitated through their group
structure. For more general arrays the alias matrix can be used. This tool
is traditionally based on the assumption that the error structure is that
associated with ordinary least squares. For situations where that is not
the case, we provide in this article a generalization of the alias matrix
applicable under the generalized least squares assumptions. We also show
that for the special case of split plot error structure, the generalized
alias matrix simplifies to the ordinary alias matrix.
Journal: Journal of Applied Statistics
Pages: 387-395
Issue: 4
Volume: 33
Year: 2006
X-DOI: 10.1080/02664760500449014
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500449014
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:387-395
Template-Type: ReDIF-Article 1.0
Author-Name: Greg Piepel
Author-X-Name-First: Greg
Author-X-Name-Last: Piepel
Title: A note comparing component-slope, Scheffe and Cox parameterizations of the linear mixture experiment model
Abstract:
A mixture experiment involves combining two or more components in various
proportions and collecting data on one or more responses. A linear mixture
model may adequately represent the relationship between a response and
mixture component proportions and be useful in screening the mixture
components. The Scheffe and Cox parameterizations of the linear mixture
model are commonly used for analyzing mixture experiment data. With the
Scheffe parameterization, the fitted coefficient for a component is the
predicted response at that pure component (i.e. single-component mixture).
With the Cox parameterization, the fitted coefficient for a mixture
component is the predicted difference in response at that pure component
and at a pre-specified reference composition. This article presents a new
component-slope parameterization, in which the fitted coefficient for a
mixture component is the predicted slope of the linear response surface
along the direction determined by that pure component and at a
pre-specified reference composition. The component-slope, Scheffe, and Cox
parameterizations of the linear mixture model are compared and their
advantages and disadvantages are discussed.
Journal: Journal of Applied Statistics
Pages: 397-403
Issue: 4
Volume: 33
Year: 2006
Keywords: Mixture component effects, Scheffe, linear mixture model, Cox linear mixture model, component-slope linear mixture model,
X-DOI: 10.1080/02664760500449170
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500449170
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:397-403
Template-Type: ReDIF-Article 1.0
Author-Name: Ron Kenett
Author-X-Name-First: Ron
Author-X-Name-Last: Kenett
Title: On the planning and design of sample surveys
Abstract:
Surveys rely on structured questions used to map out reality, using
sample observations from a population frame, into data that can be
statistically analyzed. This paper focuses on the planning and design of
surveys, making a distinction between individual surveys, household
surveys and establishment surveys. Knowledge from cognitive science is
used to provide guidelines on questionnaire design. Non-standard, but
simple, statistical methods are described for analyzing survey results.
The paper is based on experience gained by conducting over 150 customer
satisfaction surveys in Europe, America and the Far East.
Journal: Journal of Applied Statistics
Pages: 405-415
Issue: 4
Volume: 33
Year: 2006
Keywords: Questionnaire design, cognitive science, individual surveys, household surveys, establishment surveys, control charts analysis of survey data,
X-DOI: 10.1080/02664760500448974
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500448974
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:405-415
Template-Type: ReDIF-Article 1.0
Author-Name: Sungcheol Yun
Author-X-Name-First: Sungcheol
Author-X-Name-Last: Yun
Author-Name: So Young Sohn
Author-X-Name-First: So Young
Author-X-Name-Last: Sohn
Author-Name: Youngjo Lee
Author-X-Name-First: Youngjo
Author-X-Name-Last: Lee
Title: Modelling and estimating heavy-tailed non-homogeneous correlated queues: Pareto-inverse gamma HGLM with covariates
Abstract:
Evidence of communication traffic complexity reveals correlation within a
queue and heterogeneity among queues. We show how a random-effect
model can be used to accommodate these kinds of phenomena. We apply a
Pareto distribution for arrival (service) time of individual queue for
given arrival (service) rate. For modelling potential correlation in
arrival (service) times within a queue and heterogeneity of the arrival
(service) rates among queues, we use an inverse gamma distribution. This
modelling approach is then applied to the cache access log data processed
through an Internet server. We believe that our approach is potentially
useful in the area of network resource management.
Journal: Journal of Applied Statistics
Pages: 417-425
Issue: 4
Volume: 33
Year: 2006
Keywords: Within-queue correlation, between-queue variability, internet traffic, random effects linear model, hierarchical generalized linear model,
X-DOI: 10.1080/02664760500449311
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500449311
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:417-425
Template-Type: ReDIF-Article 1.0
Author-Name: Issam Samarah
Author-X-Name-First: Issam
Author-X-Name-Last: Samarah
Author-Name: Gamal Weheba
Author-X-Name-First: Gamal
Author-X-Name-Last: Weheba
Author-Name: Thomas Lacy
Author-X-Name-First: Thomas
Author-X-Name-Last: Lacy
Title: Response surface characterization of the mechanical behavior of impact-damaged sandwich composites
Abstract:
In this research, Response Surface Methodology (RSM) is employed to
characterize the influence of material configuration on the damage
tolerance and residual strength characteristics of sandwich composites.
The test specimens comprised carbon-epoxy woven fabric facesheets and
Nomex honeycomb cores. The ranges of the material
configuration used are typical of those employed in aircraft applications.
A series of carefully selected tests were used to isolate the coupled
influence of various combinations of the number of facesheet plies, core
density, and core thickness on the damage formation and residual strength
degradation due to normal impact. Response surface estimates suggest that
impact damage development and residual strength degradation are highly
material and lay-up configuration dependent. Increasing the core thickness
for a specific number of facesheet plies resulted in decreasing the impact
damage, whereas increasing the number of facesheet plies for a given core
thickness resulted in enhancing the residual strength. The derived damage
tolerance and residual strength models can lead to a better understanding
of the mechanical behavior of the impact-damaged sandwich composites, and
hence improve their design and expand their applications.
Journal: Journal of Applied Statistics
Pages: 427-437
Issue: 4
Volume: 33
Year: 2006
Keywords: Sandwich composites, damage tolerance, response surface methods, Box-Behnken design,
X-DOI: 10.1080/02664760500449295
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500449295
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:427-437
Template-Type: ReDIF-Article 1.0
Author-Name: Subha Chakraborti
Author-X-Name-First: Subha
Author-X-Name-Last: Chakraborti
Title: Parameter estimation and design considerations in prospective applications of the X chart
Abstract:
The effects of parameter estimation on the in-control performance of the
Shewhart X chart are studied in prospective (phase 2 or stage 2)
applications via a thorough examination of the attained false alarm rate
(AFAR), the conditional false alarm rate (CFAR), the conditional and the
unconditional run-length distributions, some run-length characteristics
such as the ARL, the conditional ARL (CARL), some selected percentiles
including the median, and cumulative run-length probabilities. The
examination involves both numerical evaluations and graphical displays.
The effects of parameter estimation need to be accounted for in designing
the chart. To this end, as an application of the exact formulations, chart
constants are provided for a specified in-control average run-length of
370 and 500 for a number of subgroups and subgroup sizes. These will be
useful in the implementation of the X chart in practice.
Journal: Journal of Applied Statistics
Pages: 439-459
Issue: 4
Volume: 33
Year: 2006
Keywords: Shewhart chart for the mean, run-length, false alarm rate, conditional distribution, phase 1, phase 2, average run-length, median run-length, chart constant,
X-DOI: 10.1080/02664760500163516
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500163516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:439-459
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Pomory
Author-X-Name-First: Christopher
Author-X-Name-Last: Pomory
Title: A note on calculating P values from 0.15-0.005 for the Anderson-Darling normality test using the F distribution
Abstract:
Exact P values in the range 0.15-0.005 for the Anderson-Darling statistic
can be calculated using the F distribution by modifying the asymptotic
statistic A* with a simple formula. The formula calculates F*, and P is
then obtained from the F distribution.
Journal: Journal of Applied Statistics
Pages: 461-462
Issue: 4
Volume: 33
Year: 2006
Keywords: EDF test,
X-DOI: 10.1080/02664760600677720
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600677720
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:4:p:461-462
Template-Type: ReDIF-Article 1.0
Author-Name: L. A. McSweeney
Author-X-Name-First: L. A.
Author-X-Name-Last: McSweeney
Title: Monitoring paper production using a spectral control chart designed to detect in the presence of multiple cycles
Abstract:
In this paper we introduce a spectral control chart that is designed to
detect the onset of cyclic behaviour in a process, even in the presence of
multiple cycles. This new spectral control chart is based on the
periodogram test proposed by Bølviken (1983a, b). While no more
difficult to implement than the traditional spectral control chart based on
Fisher's test statistic, this new control chart shows improvement in
detecting the presence of compound periodicity, which the chart based on
Fisher's test is not designed to handle. This is assessed using Monte
Carlo simulations to estimate and compare the average run lengths of
several spectral control charts. In addition, the spectral control charts
are applied to paper production data, published by Pandit & Wu (1993), in
which the stock flow and paper thickness are monitored. The application of
the new spectral control chart to the stock flow process detects
out-of-control behaviour that is not found using standard control charts.
This behaviour, in turn, appears to be related to out-of-control behaviour
that is observed in the paper thickness measurements later in the
production process.
Journal: Journal of Applied Statistics
Pages: 467-480
Issue: 5
Volume: 33
Year: 2006
Keywords: Periodogram, Fourier frequency, quality control, Monte Carlo methods, average run length,
X-DOI: 10.1080/02664760500446333
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500446333
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:467-480
Template-Type: ReDIF-Article 1.0
Author-Name: Roberta Zizza
Author-X-Name-First: Roberta
Author-X-Name-Last: Zizza
Title: A measure of output gap for Italy through structural time series models
Abstract:
The aim of this paper is to achieve a reliable estimate of the output gap
for Italy through the development of several models within the class of
the unobserved component time series models. These formulations imply the
decomposition of output into a trend component (the 'potential output')
and a cycle component (the 'output gap'). Both univariate and multivariate
methods will be explored. In the former, only one measure of aggregate
activity, such as GDP, is considered; in the latter, unemployment and
industrial production are introduced. A comparison with alternative
measures of output gap, mainly those published by international
organisations, will conclude.
Journal: Journal of Applied Statistics
Pages: 481-496
Issue: 5
Volume: 33
Year: 2006
Keywords: Output gap, potential output, trend and cycle decomposition, unobserved component models,
X-DOI: 10.1080/02664760500448875
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500448875
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:481-496
Template-Type: ReDIF-Article 1.0
Author-Name: Chang Dorea
Author-X-Name-First: Chang
Author-X-Name-Last: Dorea
Author-Name: Sueli Mingoti
Author-X-Name-First: Sueli
Author-X-Name-Last: Mingoti
Title: Estimating the total number of distinct species using quadrat sampling and under-dependence structure
Abstract:
To estimate the total number of distinct species in a given region,
Bayesian methods along with quadrat sampling procedures have been used by
several authors. A key underlying assumption relies on the independence
among the species. In this note, we analyse these estimates allowing a
generalized binomial dependence between species.
Journal: Journal of Applied Statistics
Pages: 497-512
Issue: 5
Volume: 33
Year: 2006
Keywords: Estimating the number of species, quadrat sampling, generalized binomial distribution,
X-DOI: 10.1080/02664760600585535
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600585535
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:497-512
Template-Type: ReDIF-Article 1.0
Author-Name: S. S. Ganguly
Author-X-Name-First: S. S.
Author-X-Name-Last: Ganguly
Title: Cumulative logit models for matched pairs case-control design: Studies with covariates
Abstract:
Binary as well as polytomous logistic models have been found useful for
estimating odds ratios when the exposure of prime interest assumes
unordered multiple levels under matched pair case-control design. In our
earlier studies, we have shown the use of a polytomous logistic model for
estimating cumulative odds ratios when the exposure of prime interest
assumes multiple ordered levels under matched pair case-control design. In
this paper, using the above model, we estimate the covariate adjusted
cumulative odds ratios, in the case of an ordinal multiple level exposure
variable under a pairwise matched case-control retrospective design. An
approach, based on asymptotic distributional results, is also described to
investigate whether or not the response categories are distinguishable
with respect to the cumulative odds ratios after adjusting the effect of
covariates. An illustrative example is presented and discussed.
Journal: Journal of Applied Statistics
Pages: 513-522
Issue: 5
Volume: 33
Year: 2006
Keywords: Logistic model, polytomous logistic model, matched pairs, odds ratio, cumulative odds ratio, deviance statistic,
X-DOI: 10.1080/02664760600585576
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600585576
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:513-522
Template-Type: ReDIF-Article 1.0
Author-Name: Kaliappa Kalirajan
Author-X-Name-First: Kaliappa
Author-X-Name-Last: Kalirajan
Author-Name: Shashanka Bhide
Author-X-Name-First: Shashanka
Author-X-Name-Last: Bhide
Title: Bias free measurement of technical efficiency
Abstract:
Technical efficiency, which is a measure of production performance of a
firm, has been estimated generally using a primal production frontier.
Since the estimation is carried out for a given level of inputs, the
efficiency measure includes the effect of 'input-mix' or
'input-allocation' and, consequently, the technical efficiency estimate is
biased. The objectives of this paper are to gauge the magnitude of
'input-mix' bias in technical efficiency estimate and to suggest a method
to measure technical efficiency eliminating the bias. The workability of
the suggested method is demonstrated through an empirical analysis using
agricultural data from India covering the period 1970-1993.
Journal: Journal of Applied Statistics
Pages: 523-533
Issue: 5
Volume: 33
Year: 2006
Keywords: Technical efficiency, input-mix bias, frontier production function, India,
X-DOI: 10.1080/02664760600585592
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600585592
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:523-533
Template-Type: ReDIF-Article 1.0
Author-Name: Jennifer Mooney
Author-X-Name-First: Jennifer
Author-X-Name-Last: Mooney
Author-Name: Ian Jolliffe
Author-X-Name-First: Ian
Author-X-Name-Last: Jolliffe
Author-Name: Peter Helms
Author-X-Name-First: Peter
Author-X-Name-Last: Helms
Title: Modelling seasonally varying data: A case study for Sudden Infant Death Syndrome (SIDS)
Abstract:
Many time series are measured monthly, either as averages or totals, and
such data often exhibit seasonal variability - the values of the series
are consistently larger for some months of the year than for others. A
typical series of this type is the number of deaths each month attributed
to SIDS (Sudden Infant Death Syndrome). Seasonality can be modelled in a
number of ways. This paper describes and discusses various methods for
modelling seasonality in SIDS data, though much of the discussion is
relevant to other seasonally varying data. There are two main approaches,
either fitting a circular probability distribution to the data, or using
regression-based techniques to model the mean seasonal behaviour. Both are
discussed in this paper.
Journal: Journal of Applied Statistics
Pages: 535-547
Issue: 5
Volume: 33
Year: 2006
Keywords: Cardioid distribution, circular data, cosinor analysis, regression, seasonality, SIDS, von Mises distribution,
X-DOI: 10.1080/02664760600585642
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600585642
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:535-547
Template-Type: ReDIF-Article 1.0
Author-Name: J. Peng
Author-X-Name-First: J.
Author-X-Name-Last: Peng
Author-Name: C. I. C. Lee
Author-X-Name-First: C. I. C.
Author-X-Name-Last: Lee
Author-Name: L. Liu
Author-X-Name-First: L.
Author-X-Name-Last: Liu
Title: Max-min multiple comparison procedure for comparing several dose levels with a zero dose control
Abstract:
The comparison of increasing doses of a compound to a zero dose control
is of interest in medical and toxicological studies. Assume that the mean
dose effects are non-decreasing among the non-zero doses of the compound.
A simple procedure that modifies Dunnett's procedure is proposed to
construct simultaneous confidence intervals for pairwise comparisons of
each dose group with the zero dose control by utilizing the ordering of
the means. The simultaneous lower bounds and upper bounds by the new
procedure are monotone, which is not the case with Dunnett's procedure.
This monotonicity is useful for categorizing dose levels. The expected gains of the new
procedure over Dunnett's procedure are studied. A real-data example shows
that the procedure compares well with its predecessor.
Journal: Journal of Applied Statistics
Pages: 549-555
Issue: 5
Volume: 33
Year: 2006
Keywords: Dunnett's procedure, simultaneous confidence intervals,
X-DOI: 10.1080/02664760600585675
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600585675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:549-555
Template-Type: ReDIF-Article 1.0
Author-Name: M. Khazaee
Author-X-Name-First: M.
Author-X-Name-Last: Khazaee
Author-Name: K. Shafie
Author-X-Name-First: K.
Author-X-Name-Last: Shafie
Title: Regression models for Boolean random sets
Abstract:
In this paper we consider the regression problem for random sets of the
Boolean-model type. Regression models of Boolean random sets using some
explanatory variables are classified according to the type of these
variables as propagation, growth or propagation-growth models. The maximum
likelihood estimation of the parameters for the propagation model is
explained in detail for some specific link functions using three methods.
These three methods of estimation are also compared in a simulation study.
Journal: Journal of Applied Statistics
Pages: 557-567
Issue: 5
Volume: 33
Year: 2006
Keywords: Random closed set, Boolean model, generalized linear model,
X-DOI: 10.1080/02664760600585683
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600585683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:5:p:557-567
Template-Type: ReDIF-Article 1.0
Author-Name: Adelaide Figueiredo
Author-X-Name-First: Adelaide
Author-X-Name-Last: Figueiredo
Title: Two-way analysis of variance for data from a concentrated bipolar Watson distribution
Abstract:
The bipolar Watson distribution is frequently used for modeling axial
data. We extend the one-way analysis of variance based on this
distribution to a two-way layout. We illustrate the method with
directional data in three dimensions.
Journal: Journal of Applied Statistics
Pages: 575-581
Issue: 6
Volume: 33
Year: 2006
Keywords: Axial data, ANOVA, directional data, Watson distribution,
X-DOI: 10.1080/02664760600679619
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:575-581
Template-Type: ReDIF-Article 1.0
Author-Name: James Stamey
Author-X-Name-First: James
Author-X-Name-Last: Stamey
Author-Name: Dean Young
Author-X-Name-First: Dean
Author-X-Name-Last: Young
Author-Name: Tom Bratcher
Author-X-Name-First: Tom
Author-X-Name-Last: Bratcher
Title: Bayesian sample-size determination for one and two Poisson rate parameters with applications to quality control
Abstract:
We formulate Bayesian approaches to the problems of determining the
required sample size for Bayesian interval estimators of a predetermined
length for a single Poisson rate, for the difference between two Poisson
rates, and for the ratio of two Poisson rates. We demonstrate the efficacy
of our Bayesian-based sample-size determination method with two real-data
quality-control examples and compare the results to frequentist
sample-size determination methods.
Journal: Journal of Applied Statistics
Pages: 583-594
Issue: 6
Volume: 33
Year: 2006
Keywords: Average coverage criterion, interval estimators, HPD intervals, coverage probability,
X-DOI: 10.1080/02664760600679643
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679643
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:583-594
Template-Type: ReDIF-Article 1.0
Author-Name: Tzong-Ru Tsai
Author-X-Name-First: Tzong-Ru
Author-X-Name-Last: Tsai
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Title: Acceptance sampling based on truncated life tests for generalized Rayleigh distribution
Abstract:
This paper considers the problem of an acceptance sampling plan for a
truncated life test when the lifetime follows the generalized Rayleigh
distribution. For different acceptance numbers, confidence levels, and
values of the ratio of the fixed experiment time to the specified mean
life, the minimum sample sizes necessary to ensure the specified mean life
are found. The operating characteristic values of the sampling plans and
producer's risk are discussed. Some tables are presented and the use of
the tables is illustrated by a numerical example.
Journal: Journal of Applied Statistics
Pages: 595-600
Issue: 6
Volume: 33
Year: 2006
Keywords: Consumer's risk, generalized Rayleigh distribution, operating characteristic curve, producer's risk, truncated life tests,
X-DOI: 10.1080/02664760600679700
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679700
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:595-600
Template-Type: ReDIF-Article 1.0
Author-Name: Ashish Das
Author-X-Name-First: Ashish
Author-X-Name-Last: Das
Author-Name: Sudhir Gupta
Author-X-Name-First: Sudhir
Author-X-Name-Last: Gupta
Author-Name: Sanpei Kageyama
Author-X-Name-First: Sanpei
Author-X-Name-Last: Kageyama
Title: A-optimal diallel crosses for test versus control comparisons
Abstract:
A-optimality of block designs for control versus test comparisons in
diallel crosses is investigated. A sufficient condition for designs to be
A-optimal is derived. Type S0 designs are defined and A-optimal type S0
designs are characterized. A lower bound to the A-efficiency of type S0
designs is also given. Using the lower bound to A-efficiency, type S0
designs are shown to yield efficient designs for test versus control
comparisons.
Journal: Journal of Applied Statistics
Pages: 601-608
Issue: 6
Volume: 33
Year: 2006
Keywords: Diallel cross, type S design, A-optimality, A-efficiency,
X-DOI: 10.1080/02664760600679726
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679726
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:601-608
Template-Type: ReDIF-Article 1.0
Author-Name: Ian Dryden
Author-X-Name-First: Ian
Author-X-Name-Last: Dryden
Author-Name: Rahman Farnoosh
Author-X-Name-First: Rahman
Author-X-Name-Last: Farnoosh
Author-Name: Charles Taylor
Author-X-Name-First: Charles
Author-X-Name-Last: Taylor
Title: Image segmentation using voronoi polygons and MCMC, with application to muscle fibre images
Abstract:
We investigate a Bayesian method for the segmentation of muscle fibre
images. The images are reasonably well approximated by a Dirichlet
tessellation, and so we use a deformable template model based on Voronoi
polygons to represent the segmented image. We consider various prior
distributions for the parameters and suggest an appropriate likelihood.
Following the Bayesian paradigm, the mathematical form for the posterior
distribution is obtained (up to an integrating constant). We introduce a
Metropolis-Hastings algorithm and a reversible jump Markov chain Monte
Carlo algorithm (RJMCMC) for simulation from the posterior when the number
of polygons is fixed or unknown. The particular moves in the RJMCMC
algorithm are birth, death and position/colour changes of the point
process which determines the location of the polygons. Segmentation of the
true image was carried out using the estimated posterior mode and
posterior mean. A simulation study is presented which is helpful for
tuning the hyperparameters and assessing the accuracy. The algorithms work
well on a real image of a muscle fibre cross-section, and an
additional parameter, which models the boundaries of the muscle fibres, is
included in the final model.
Journal: Journal of Applied Statistics
Pages: 609-622
Issue: 6
Volume: 33
Year: 2006
Keywords: Coloured tessellation, Markov chain Monte Carlo, point pattern, regularity, reversible jump, Strauss process,
X-DOI: 10.1080/02664760600679825
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679825
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:609-622
Template-Type: ReDIF-Article 1.0
Author-Name: Martin Arvidsson
Author-X-Name-First: Martin
Author-X-Name-Last: Arvidsson
Author-Name: Ida Gremyr
Author-X-Name-First: Ida
Author-X-Name-Last: Gremyr
Author-Name: Bo Bergman
Author-X-Name-First: Bo
Author-X-Name-Last: Bergman
Title: Interpretation of dispersion effects in a robust design context
Abstract:
The purpose of this paper is to discuss the interpretation of dispersion
effects in un-replicated fractional factorials from a robust design
perspective. We propose an interpretation of dispersion effects as
manifested interactions between control factors and unobserved and
uncontrolled factors, an interpretation shown to be useful in achieving
robust designs. Further, we show the consequences this interpretation has
on the identification of dispersion effects.
Journal: Journal of Applied Statistics
Pages: 623-627
Issue: 6
Volume: 33
Year: 2006
Keywords: Dispersion effects, robust design, split-plot experiments, control factors, noise factors, random noise factors,
X-DOI: 10.1080/02664760600679874
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679874
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:623-627
Template-Type: ReDIF-Article 1.0
Author-Name: Petros Hadjicostas
Author-X-Name-First: Petros
Author-X-Name-Last: Hadjicostas
Title: Maximizing proportions of correct classifications in binary logistic regression
Abstract:
In this paper, we give simple mathematical results that allow us to get
all cut-off points that maximize the overall proportion of correct
classifications in any binary classification method (and, in particular,
in binary logistic regression). In addition, we give results that allow us
to get all cut-off points that maximize a weighted combination of
specificity and sensitivity. In addition, we discuss measures of
association between predicted probabilities and observed responses, and,
in particular, we discuss the calculation of the overall percentages of
concordant, discordant, and tied pairs of input observations with
different responses. We mention that the calculation of these quantities
by SAS and Minitab is sometimes incorrect. The concepts and methods of the
paper are illustrated by a hypothetical example of school retention data.
Journal: Journal of Applied Statistics
Pages: 629-640
Issue: 6
Volume: 33
Year: 2006
Keywords: Classification, concordant pairs, cut-off points, discordant pairs, logistic regression, maximization of proportions, sensitivity, specificity,
X-DOI: 10.1080/02664760600723367
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600723367
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:6:p:629-640
Template-Type: ReDIF-Article 1.0
Author-Name: G. Yi
Author-X-Name-First: G.
Author-X-Name-Last: Yi
Author-Name: S. Coleman
Author-X-Name-First: S.
Author-X-Name-Last: Coleman
Author-Name: Q. Ren
Author-X-Name-First: Q.
Author-X-Name-Last: Ren
Title: CUSUM method in predicting regime shifts and its performance in different stock markets allowing for transaction fees
Abstract:
Statistical Process Control (SPC) is a scientific approach to quality
improvement in which data are collected and used as evidence of the
performance of a process, organisation or set of equipment. One of the SPC
techniques, the cumulative sum (CUSUM) method, first developed by E.S.
Page (1961), uses a series of cumulative sums of sample data for online
process control. This paper reviews CUSUM techniques applied to financial
markets in several different ways. The performance of the CUSUM method in
predicting regime shifts in stock market indices is then studied in
detail. Research in this field so far has not taken the transaction fees
of buying and selling into consideration. As the study in this paper
shows, the performance of the CUSUM method when transaction fees are
taken into account is quite different from when they are not. The CUSUM
plan is defined by parameters h and k. Choosing the
parameters of the method should be based on studies that take transaction
fees into account. The performances of the CUSUM in different stock
markets are also compared in this paper. The results show that the same
CUSUM plan has remarkably different performances in different stock
markets.
Journal: Journal of Applied Statistics
Pages: 647-661
Issue: 7
Volume: 33
Year: 2006
Keywords: SPC, CUSUM, regime shifts, financial markets, transaction fees,
X-DOI: 10.1080/02664760600708590
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:647-661
Template-Type: ReDIF-Article 1.0
Author-Name: Rob Deardon
Author-X-Name-First: Rob
Author-X-Name-Last: Deardon
Author-Name: Steven Gilmour
Author-X-Name-First: Steven
Author-X-Name-Last: Gilmour
Author-Name: Neil Butler
Author-X-Name-First: Neil
Author-X-Name-Last: Butler
Author-Name: Kath Phelps
Author-X-Name-First: Kath
Author-X-Name-Last: Phelps
Author-Name: Roy Kennedy
Author-X-Name-First: Roy
Author-X-Name-Last: Kennedy
Title: Designing field experiments which are subject to representation bias
Abstract:
The term 'representation bias' is used to describe the disparities that
exist between treatment effects estimated from field experiments, and
those effects that would be seen if treatments were used in the field. In
this paper we are specifically concerned with representation bias caused
by disease inoculum travelling between plots, or out of the experimental
area altogether. The scope for such bias is maximized in the case of
airborne spread diseases. This paper extends the work of Deardon et al.
(2004), using simulation methods to explore the relationship between
design and representation bias. In doing so, we illustrate the importance
of plot size and spacing, as well as treatment-to-plot allocation. We
examine a novel class of designs, incomplete column designs, to develop an
understanding of the mechanisms behind representation bias. We also
introduce general methods of designing field trials, which can be used to
limit representation bias by carefully controlling treatment to block
allocation in both incomplete column and incomplete randomized block
designs. Finally, we show how the commonly used practice of sampling from
the centres of plots, rather than entire plots, can also help to control
representation bias.
Journal: Journal of Applied Statistics
Pages: 663-678
Issue: 7
Volume: 33
Year: 2006
Keywords: Experimental design, inter-plot interference, plant pathology, plant disease dispersal simulation,
X-DOI: 10.1080/02664760600708681
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708681
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:663-678
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Title: Modifying the exact test for a binomial proportion and comparisons with other approaches
Abstract:
In this note we provide a simple continuity and tail-corrected approach
to the standard exact test for a single binomial proportion commonly used
in practice. We redefine the p-value for the two-sided alternative by
noting the skewed distribution of the sample proportion under the null
hypothesis. We illustrate that, for both one- and two-sided alternatives,
the coverage probabilities of the new methodology approach the desired
type I error α more closely, and thus we recommend these modifications to
the applied statistician for consideration.
Journal: Journal of Applied Statistics
Pages: 679-690
Issue: 7
Volume: 33
Year: 2006
Keywords: Binomial confidence interval, exact test,
X-DOI: 10.1080/02664760600708723
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708723
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:679-690
Template-Type: ReDIF-Article 1.0
Author-Name: Shih-Chou Kao
Author-X-Name-First: Shih-Chou
Author-X-Name-Last: Kao
Author-Name: Chuan-Ching Ho
Author-X-Name-First: Chuan-Ching
Author-X-Name-Last: Ho
Author-Name: Ying-Chin Ho
Author-X-Name-First: Ying-Chin
Author-X-Name-Last: Ho
Title: Transforming the exponential by minimizing the sum of the absolute differences
Abstract:
This work presents an optimal value to be used in the power
transformation to transform the exponential to normality for statistical
process control (SPC) applications. The optimal value is found by
minimizing the sum of absolute differences between two distinct cumulative
probability functions. Based on this criterion, a numerical search yields
a proposed value of 3.5142, so the transformed distribution is well
approximated by the normal distribution. Two examples are presented to
demonstrate the effectiveness of using the transformation method and its
applications in SPC. The transformed data are almost normally distributed
and the performance of the individual charts is satisfactory. Compared to
charts that use the original exponential data and probability control
limits, the individual charts constructed using the transformed
distribution are superior in appearance, ease of interpretation and
implementation by practitioners.
Journal: Journal of Applied Statistics
Pages: 691-702
Issue: 7
Volume: 33
Year: 2006
Keywords: Weibull distribution, exponential distribution, normal distribution, individual chart, probability control limits,
X-DOI: 10.1080/02664760600708780
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:691-702
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Martin
Author-X-Name-First: Michael
Author-X-Name-Last: Martin
Author-Name: Steven Roberts
Author-X-Name-First: Steven
Author-X-Name-Last: Roberts
Title: An evaluation of bootstrap methods for outlier detection in least squares regression
Abstract:
Outlier detection is a critical part of data analysis, and the use of
Studentized residuals from regression models fit using least squares is a
very common approach to identifying discordant observations in linear
regression problems. In this paper we propose a bootstrap approach to
constructing critical points for use in outlier detection in the context
of least-squares Studentized residuals, and find that this approach allows
naturally for mild departures in model assumptions such as non-Normal
error distributions. We illustrate our methodology through both a real
data example and simulated data.
Journal: Journal of Applied Statistics
Pages: 703-720
Issue: 7
Volume: 33
Year: 2006
Keywords: Case-based resampling, error distribution, externally Studentized residuals, internally Studentized residuals, jackknife-after-bootstrap, residual-based resampling, RSTUDENT,
X-DOI: 10.1080/02664760600708863
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708863
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:703-720
Template-Type: ReDIF-Article 1.0
Author-Name: Markus Neuhauser
Author-X-Name-First: Markus
Author-X-Name-Last: Neuhauser
Author-Name: Ludwig Hothorn
Author-X-Name-First: Ludwig
Author-X-Name-Last: Hothorn
Title: A robust modification of the ordered-heterogeneity test
Abstract:
An ordered heterogeneity (OH) test is a test for a trend that combines a
non-directional heterogeneity test with the rank-order information
specified under the alternative. We propose two modifications of the OH
test procedure: (1) to use the mean ranks of the groups rather than the
sample means to determine the observed ordering of the groups, and (2) to
use the maximum correlation out of the 2^(k-1) - 1 possibilities under the
alternative rather than the single ordering (1, 2, …, k),
where k is the number of independent groups. A simulation study indicates
that these two changes increase the power of the ordered heterogeneity
test when, as common in practice, the underlying distribution may deviate
from a normal distribution and the trend pattern is a priori unknown. In
contrast to the original OH test, the modified OH test can detect all
possible patterns under the alternative with a relatively high power.
Journal: Journal of Applied Statistics
Pages: 721-727
Issue: 7
Volume: 33
Year: 2006
Keywords: Comparing more than two groups, k-sample test, tests for trend, non-parametric tests, Spearman's rank correlation,
X-DOI: 10.1080/02664760600708954
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:721-727
Template-Type: ReDIF-Article 1.0
Author-Name: Alex Riba
Author-X-Name-First: Alex
Author-X-Name-Last: Riba
Author-Name: Josep Ginebra
Author-X-Name-First: Josep
Author-X-Name-Last: Ginebra
Title: Diversity of vocabulary and homogeneity of literary style
Abstract:
To help settle the debate around the authorship of Tirant lo Blanc, we
analyse the evolution of the diversity of the vocabulary used in that
book, as measured through eight different diversity indices. The
exploratory analysis reveals a clear single shift in diversity, which is
estimated through change-point techniques to occur at chapter 382, and
might indicate the existence of one main author writing about four fifths
of the book, and of a second author finishing the last fifth of the book.
Before chapter 382, the language is richer and more diverse than after it.
Journal: Journal of Applied Statistics
Pages: 729-741
Issue: 7
Volume: 33
Year: 2006
Keywords: Change-point analysis, inverse Gaussian-Poisson mixture, Sichel distribution, Simpson index, stylometry,
X-DOI: 10.1080/02664760600708970
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600708970
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:729-741
Template-Type: ReDIF-Article 1.0
Author-Name: Farid Zayeri
Author-X-Name-First: Farid
Author-X-Name-Last: Zayeri
Author-Name: Anoshirvan Kazemnejad
Author-X-Name-First: Anoshirvan
Author-X-Name-Last: Kazemnejad
Title: A latent variable regression model for asymmetric bivariate ordered categorical data
Abstract:
In many areas of medical research, especially in studies that involve
paired organs, a bivariate ordered categorical response should be
analyzed. Using a bivariate continuous distribution as the latent variable
is an interesting strategy for analyzing these data sets. In this context,
the bivariate standard normal distribution, which leads to the bivariate
cumulative probit regression model, is the most common choice. In this
paper, we introduce another latent variable regression model for modeling
bivariate ordered categorical responses. This model may be an appropriate
alternative for the bivariate cumulative probit regression model, when
postulating a symmetric form for marginal or joint distribution of
response data does not appear to be a valid assumption. We also develop
the necessary numerical procedure to obtain the maximum likelihood
estimates of the model parameters. To illustrate the proposed model, we
analyze data from an epidemiologic study to identify some of the most
important risk indicators of periodontal disease among students aged
15-19 years in Tehran, Iran.
Journal: Journal of Applied Statistics
Pages: 743-753
Issue: 7
Volume: 33
Year: 2006
Keywords: Latent variable, paired organs, bivariate cumulative model, asymmetric distribution,
X-DOI: 10.1080/02664760600709010
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600709010
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:7:p:743-753
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Den Chen
Author-X-Name-First: Wen-Den
Author-X-Name-Last: Chen
Title: Testing for spurious regression in a panel data model with the individual number and time length growing
Abstract:
This article presents a test for the spurious regression problem in a panel
data model with a growing individual number and time series length. In the
estimation, tapers are used and the integrated order for the remainder
disturbance is extended to a real number; at the same time, the spurious
regression problem can be detected without prior knowledge. Through Monte
Carlo experiments, we examine the consistent estimators by various sizes
of time length and individual number, in which the remainder disturbance
is assumed to be either stationary or non-stationary. In addition, the
asymptotic normality properties are discussed with a quasi log-likelihood
function. From the power tests we can see that the estimators are quite
successful and powerful.
Journal: Journal of Applied Statistics
Pages: 759-772
Issue: 8
Volume: 33
Year: 2006
Keywords: Spurious regression, Whittle method, panel data model, pseudo spectral density function, tapering,
X-DOI: 10.1080/02664760600741989
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600741989
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:759-772
Template-Type: ReDIF-Article 1.0
Author-Name: Tony Cooper
Author-X-Name-First: Tony
Author-X-Name-Last: Cooper
Author-Name: Mary Leitnaker
Author-X-Name-First: Mary
Author-X-Name-Last: Leitnaker
Title: Further exploratory analysis of split-plot experiments to study certain stratified effects
Abstract:
Designed experiments are a key component in many companies' improvement
strategies. Because completely randomized experiments are not always
reasonable from a cost or physical perspective, split-plot experiments are
prevalent. The recommended analysis accounts for the different sources of
variation affecting whole-plot and split-plot error. However, experiments
on industrial processes must be run, and consequently analyzed, quite
differently from ones run in a controlled environment. Such experiments
are typically subject to a wide array of uncontrolled, and barely
understood, variation. In particular, it is important to examine the
experimental results for additional, unanticipated sources of variation.
In this paper, we consider how unanticipated, stratified effects may
influence a split-plot experiment and discuss further exploratory analysis
to indicate the presence of stratified effects. Examples of such
experiments are provided, additional tests are suggested and discussed in
light of their power, and recommendations are given.
Journal: Journal of Applied Statistics
Pages: 773-786
Issue: 8
Volume: 33
Year: 2006
Keywords: Designed experiment, split-plot error, source of variation,
X-DOI: 10.1080/02664760600742201
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600742201
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:773-786
Template-Type: ReDIF-Article 1.0
Author-Name: Sanaa Ismail
Author-X-Name-First: Sanaa
Author-X-Name-Last: Ismail
Author-Name: Hesham Auda
Author-X-Name-First: Hesham
Author-X-Name-Last: Auda
Title: Bayesian and fiducial inference for the inverse Gaussian distribution via Gibbs sampler
Abstract:
This paper presents a kernel estimation of the distribution of the scale
parameter of the inverse Gaussian distribution under type II censoring
together with the distribution of the remaining time. Estimation is
carried out via the Gibbs sampling algorithm combined with a missing data
approach. Estimates and confidence intervals for the parameters of
interest are also presented.
Journal: Journal of Applied Statistics
Pages: 787-805
Issue: 8
Volume: 33
Year: 2006
Keywords: Gibbs sampler, Bayesian inference, Fiducial inference,
X-DOI: 10.1080/02664760600742268
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600742268
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:787-805
Template-Type: ReDIF-Article 1.0
Author-Name: Yuang-Chin Chiang
Author-X-Name-First: Yuang-Chin
Author-X-Name-Last: Chiang
Author-Name: Lin-An Chen
Author-X-Name-First: Lin-An
Author-X-Name-Last: Chen
Author-Name: Hsien-Chueh Peter Yang
Author-X-Name-First: Hsien-Chueh Peter
Author-X-Name-Last: Yang
Title: Symmetric quantiles and their applications
Abstract:
To develop estimators with stronger efficiencies than the trimmed means
which use the empirical quantile, Kim (1992) and Chen & Chiang (1996)
implicitly or explicitly used the symmetric quantile, and thus introduced
new trimmed means for location and linear regression models, respectively.
This study further investigates the properties of the symmetric quantile
and extends its application in several aspects. (a) The symmetric quantile
is more efficient than the empirical quantiles in asymptotic variances
when the quantile percentage α is either small or large. This reveals
that for any proposal involving the αth quantile with small or large
α, the symmetric quantile is the right choice; (b) a trimmed mean
based on it has asymptotic variance achieving a Cramer-Rao lower bound in
one heavy tail distribution; (c) an improvement of the quantiles-based
control chart by Grimshaw & Alt (1997) is discussed; (d) Monte Carlo
simulations of two new scale estimators based on symmetric quantiles also
support this new quantile.
Journal: Journal of Applied Statistics
Pages: 807-817
Issue: 8
Volume: 33
Year: 2006
Keywords: Regression quantile, scale estimator, trimmed mean,
X-DOI: 10.1080/02664760600743464
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600743464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:807-817
Template-Type: ReDIF-Article 1.0
Author-Name: E. Ayuga Tellez
Author-X-Name-First: E. Ayuga
Author-X-Name-Last: Tellez
Author-Name: A.J. Martin Fernandez
Author-X-Name-First: A.J. Martin
Author-X-Name-Last: Fernandez
Author-Name: C. Gonzalez Garcia
Author-X-Name-First: C. Gonzalez
Author-X-Name-Last: Garcia
Author-Name: E. Martinez Falero
Author-X-Name-First: E. Martinez
Author-X-Name-Last: Falero
Title: Estimation of non-parametric regression for dasometric measures
Abstract:
The aim of this paper is to describe a simulation procedure to compare
parametric regression against a non-parametric regression method, for
different functions and sets of information. The proposed methodology
improves the lack of fit at the edges of the regression curves, and an
acceptable result is obtained for the non-parametric estimation in all
studied cases. Larger differences appear at the edges of the estimation.
The results are applied to the study of dasometric variables, which do not
fulfil the normality hypothesis needed for parametric estimation. The
kernel regression shows the relationship between the studied variables,
which would not be detected with more rigid parametric models.
Journal: Journal of Applied Statistics
Pages: 819-836
Issue: 8
Volume: 33
Year: 2006
Keywords: Regression kernel, edge effect, simulation, comparison, dasometric variables,
X-DOI: 10.1080/02664760600743472
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600743472
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:819-836
Template-Type: ReDIF-Article 1.0
Author-Name: Jixian Wang
Author-X-Name-First: Jixian
Author-X-Name-Last: Wang
Title: Optimal parametric design with applications to pharmacokinetic and pharmacodynamic trials
Abstract:
This paper considers optimal parametric designs, i.e. designs represented
by probability measures determined by a set of parameters, for nonlinear
models and illustrates their use in designs for pharmacokinetic (PK) and
pharmacokinetic/pharmacodynamic (PK/PD) trials. For some practical
problems, such as designs for modelling PK/PD relationship, this is often
the only feasible type of design, as the design points follow a PK model
and cannot be directly controlled. Even for ordinary design problems the
parametric designs have some advantages over the traditional designs,
which often have too few design points for model checking and may not be
robust to model and parameter misspecifications. We first describe methods
and algorithms to construct the parametric design for ordinary nonlinear
design problems and show that the parametric designs are robust to
parameter misspecification and have good power for model discrimination.
Then we extend this design method to construct optimal repeated
measurement designs for nonlinear mixed models. We also use this
parametric design for modelling a PK/PD relationship and propose a
simulation based algorithm. The application of parametric designs is
illustrated with a three-parameter open one-compartment PK model for the
ordinary design and repeated measurement design, and an Emax model for the
pharmacokinetic/pharmacodynamic trial design.
Journal: Journal of Applied Statistics
Pages: 837-852
Issue: 8
Volume: 33
Year: 2006
Keywords: D-optimal design, model discrimination, pharmacokinetic models, repeated measure design, parametric design, PK/PD models,
X-DOI: 10.1080/02664760600743571
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600743571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:837-852
Template-Type: ReDIF-Article 1.0
Author-Name: Seong-Keon Lee
Author-X-Name-First: Seong-Keon
Author-X-Name-Last: Lee
Author-Name: Seohoon Jin
Author-X-Name-First: Seohoon
Author-X-Name-Last: Jin
Title: Decision tree approaches for zero-inflated count data
Abstract:
Many methodologies have been developed for zero-inflated data in the
field of statistics. However, there is little literature in the data
mining field, even though zero-inflated data are easily found in real
applications. In fact, there is no decision tree method that is
suitable for zero-inflated responses. To analyze a continuous target
variable with decision trees, one of the data mining techniques, we use
the F-statistic (CHAID) or variance reduction (CART) criteria to find the
best split. But these criteria are only appropriate for a continuous
target variable. If the target variable comprises rare events or
zero-inflated count data, these criteria cannot give good results because
of those attributes. In this paper, we will propose a decision tree for
zero-inflated count data, using a maximum of zero-inflated Poisson
likelihood as the split criterion. In addition, using well-known data sets
we will compare the performance of the split criteria. In the case when
the analyst is interested in lower value groups (e.g. no defect areas,
customers who do not claim), the suggested ZIP tree would be more
efficient.
Journal: Journal of Applied Statistics
Pages: 853-865
Issue: 8
Volume: 33
Year: 2006
Keywords: Data mining, decision tree, homogeneity, maximum likelihood, zero-inflated Poisson (ZIP),
X-DOI: 10.1080/02664760600743613
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600743613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:853-865
Template-Type: ReDIF-Article 1.0
Author-Name: Gabriel Escarela
Author-X-Name-First: Gabriel
Author-X-Name-Last: Escarela
Author-Name: Jacques Carriere
Author-X-Name-First: Jacques
Author-X-Name-Last: Carriere
Title: A bivariate model of claim frequencies and severities
Abstract:
Bivariate claim data come from a population that consists of insureds who
may claim either one, both or none of the two types of benefits covered by
a policy. In the present paper, we develop a statistical procedure to fit
bivariate distributions of claims in presence of covariates. This allows
for a more accurate study of insureds' choice and size in the frequency
and severity of the two types of claims. A generalised logistic model is
employed to examine the frequency probabilities, whilst the three
parameter Burr distribution is suggested to model the underlying severity
distributions. The bivariate copula model is exploited in such a way that
it allows us to adjust for a range of frequency dependence structures; a
method for assessing the adequacy of the fitted severity model is
outlined. A health claims dataset illustrates the methods; we describe the
use of orthogonal polynomials for characterising the relationship between
age and the frequency and severity models.
Journal: Journal of Applied Statistics
Pages: 867-883
Issue: 8
Volume: 33
Year: 2006
Keywords: Bivariate loss distribution, Frank's copula, Survival copula, Burr regression, Diagnostics,
X-DOI: 10.1080/02664760600743969
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600743969
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:8:p:867-883
Template-Type: ReDIF-Article 1.0
Author-Name: Ayoe Hoff
Author-X-Name-First: Ayoe
Author-X-Name-Last: Hoff
Title: Bootstrapping Malmquist Indices for Danish Seiners in the North Sea and Skagerrak
Abstract:
In connection with assessing how an ongoing development in fisheries
management may change fishing activity, evaluation of Total Factor
Productivity (TFP) change over a period, including efficiency, scale and
technology changes, is an important tool. The Malmquist index, based on
distance functions evaluated with Data Envelopment Analysis (DEA), is
often employed to estimate TFP changes. DEA is generally gaining attention
for evaluating efficiency and capacity in fisheries. One main criticism of
DEA is that it does not have any statistical foundation, i.e. that it is
not possible to make inference about DEA scores or related parameters. The
bootstrap method for estimating confidence intervals of deterministic
parameters can however be applied to estimate confidence intervals for DEA
scores. This method is applied in the present paper for assessing TFP
changes between 1987 and 1999 for the fleet of Danish seiners operating in
the North Sea and the Skagerrak.
Journal: Journal of Applied Statistics
Pages: 891-907
Issue: 9
Volume: 33
Year: 2006
Keywords: Total factor productivity change, Malmquist index, data envelopment analysis, bootstrap,
X-DOI: 10.1080/02664760600742151
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600742151
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:891-907
Template-Type: ReDIF-Article 1.0
Author-Name: Masakazu Iwasaki
Author-X-Name-First: Masakazu
Author-X-Name-Last: Iwasaki
Author-Name: Hiroe Tsubaki
Author-X-Name-First: Hiroe
Author-X-Name-Last: Tsubaki
Title: Bivariate Negative Binomial Generalized Linear Models for Environmental Count Data
Abstract:
We propose a new bivariate negative binomial model with constant
correlation structure, which was derived from a contagious bivariate
distribution of two independent Poisson mass functions, by mixing the
proposed bivariate gamma type density with constantly correlated
covariance structure (Iwasaki & Tsubaki, 2005), which satisfies the
integrability condition of McCullagh & Nelder (1989, p. 334). The proposed
bivariate gamma type density comes from a natural exponential family. Joe
(1997) points out the necessity of a multivariate gamma distribution to
derive a multivariate distribution with negative binomial margins, and the
lack of a convenient form of multivariate gamma distribution to get a
model with greater flexibility in a dependent structure with indices of
dispersion. In this paper we first derive a new bivariate negative
binomial distribution as well as the first two cumulants, and, secondly,
formulate bivariate generalized linear models with a constantly correlated
negative binomial covariance structure in addition to the moment estimator
of the components of the matrix. We finally fit the bivariate negative
binomial models to two correlated environmental data sets.
Journal: Journal of Applied Statistics
Pages: 909-923
Issue: 9
Volume: 33
Year: 2006
Keywords: Bivariate negative binomial generalized linear models (BIVARNB GLM), bivariate negative binomial distribution, bivariate gamma type GLM, bivariate count data analysis,
X-DOI: 10.1080/02664760600744157
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744157
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:909-923
Template-Type: ReDIF-Article 1.0
Author-Name: Mak Kaboudan
Author-X-Name-First: Mak
Author-X-Name-Last: Kaboudan
Title: Computational Forecasting of Wavelet-converted Monthly Sunspot Numbers
Abstract:
Monthly average sunspot numbers follow irregular cycles with complex
nonlinear dynamics. Statistical linear models constructed to forecast them
are therefore inappropriate, while nonlinear models produce solutions
sensitive to initial conditions. Two computational techniques - neural
networks and genetic programming - that have their advantages are applied
instead to the monthly numbers and their wavelet-transformed and
wavelet-denoised series. The objective is to determine if modeling
wavelet-conversions produces better forecasts than those from modeling
series' observed values. Because sunspot numbers are indicators of
geomagnetic activity, their forecast is important. Geomagnetic storms
endanger satellites and disrupt communications and power systems on Earth.
Journal: Journal of Applied Statistics
Pages: 925-941
Issue: 9
Volume: 33
Year: 2006
Keywords: Wavelets, thresholding, neural networks, genetic programming, sunspot numbers,
X-DOI: 10.1080/02664760600744215
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744215
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:925-941
Template-Type: ReDIF-Article 1.0
Author-Name: Murat Kucuk
Author-X-Name-First: Murat
Author-X-Name-Last: Kucuk
Author-Name: Necati Ağıralioğlu
Author-X-Name-First: Necati
Author-X-Name-Last: Ağıralioğlu
Title: Wavelet Regression Technique for Streamflow Prediction
Abstract:
In order to explain many hidden features of natural phenomena, the
analysis of non-stationary series is an attractive issue in various
research areas. The wavelet transform technique, which has been widely
used over the last two decades, gives better results than earlier techniques for the
analysis of earth science phenomena and for feature detection of real
measurements. In this study, a new technique is offered for streamflow
modeling by using the discrete wavelet transform. This new technique
depends on the feature detection characteristic of the wavelet transform.
The model was applied to two geographical locations with different
climates. The results were compared in terms of the energy variation and
error values of the models. The new technique offers the advantage of a
physical interpretation. It is applied to streamflow regression models,
because they are simple and widely used in practical applications.
However, one can apply this technique to other models.
Journal: Journal of Applied Statistics
Pages: 943-960
Issue: 9
Volume: 33
Year: 2006
Keywords: Streamflow prediction, discrete wavelet transform, hydrological modeling,
X-DOI: 10.1080/02664760600744298
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744298
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:943-960
Template-Type: ReDIF-Article 1.0
Author-Name: Kuo-Yuan Liang
Author-X-Name-First: Kuo-Yuan
Author-X-Name-Last: Liang
Author-Name: Jack Lee
Author-X-Name-First: Jack
Author-X-Name-Last: Lee
Author-Name: Kurt Shao
Author-X-Name-First: Kurt
Author-X-Name-Last: Shao
Title: On the Distribution of the Inverted Linear Compound of Dependent F-Variates and its Application to the Combination of Forecasts
Abstract:
This paper establishes a sampling theory for an inverted linear
combination of two dependent F-variates. It is found that the random
variable is approximately expressible in terms of a mixture of weighted
beta distributions. Operational results, including rth-order raw moments
and critical values of the density are subsequently obtained by using the
Pearson Type I approximation technique. As a contribution to the
probability theory, our findings extend Lee & Hu's (1996) recent
investigation on the distribution of the linear compound of two
independent F-variates. In terms of relevant applied works, our results
refine Dickinson's (1973) inquiry on the distribution of the optimal
combining weights estimates based on combining two independent rival
forecasts, and provide a further advancement to the general case of
combining three independent competing forecasts. Accordingly, our
conclusions give a new perception of constructing the confidence intervals
for the optimal combining weights estimates studied in the literature of
the linear combination of forecasts.
Journal: Journal of Applied Statistics
Pages: 961-973
Issue: 9
Volume: 33
Year: 2006
Keywords: Combining weights, critical values, error-variance minimizing criterion, inverted F-variates, Pearson Type I approximation,
X-DOI: 10.1080/02664760600744330
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744330
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:961-973
Template-Type: ReDIF-Article 1.0
Author-Name: Ho-Seog Kang
Author-X-Name-First: Ho-Seog
Author-X-Name-Last: Kang
Author-Name: Kee-Hoon Kang
Author-X-Name-First: Kee-Hoon
Author-X-Name-Last: Kang
Author-Name: Sung Park
Author-X-Name-First: Sung
Author-X-Name-Last: Park
Title: Minimax Designs for the Stability of Slope Estimation on Second-order Response Surfaces
Abstract:
In this paper, designs for the stability of the slope estimation on a
second-order response surface are considered. Minimization of the point
dispersion measure, which is maximized over all points in the region of
interest is taken as the optimality criterion, and the minimax properties
in some class of designs are derived in spherical and cubic regions of
interest. We study the efficiencies of the minimax designs relative to
other optimal designs with various criteria.
Journal: Journal of Applied Statistics
Pages: 975-988
Issue: 9
Volume: 33
Year: 2006
Keywords: Point dispersion measure, minimax criterion, slope variance, stable design, stability measure,
X-DOI: 10.1080/02664760600744447
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744447
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:975-988
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Illert
Author-X-Name-First: Christopher
Author-X-Name-Last: Illert
Title: Origins of Linguistic Zonation in the Australian Alps. Part 2 - Snell's Law
Abstract:
In this second paper, analysing archival SE-Australian Aboriginal
word/name lists, Snell's Law is used to deduce the likely minimal
sound-systems of pre-Ice-Age language superfamilies - some probably dating
back beyond the first occupation of Australia by humans. The deduced
'Turuwal-like' ancestral sound-system is then used as a basis for
reconstructing deictic forms apparently so ancient that they seem to even
unify 'Pama-Nyungan' and 'non-Pama-Nyungan' languages within a single system
of formal logic which, having apparently provided the semantic basis for
at least 60,000 years of speech throughout the entire Australian
continent, deserves to be called proto-Australian regardless of whether or
not it arose in SE-Asia tens of millennia before. Whatever the exact age
of this reconstructed proto-Australian, presented here for the first time,
it is an order of magnitude older than any known human language and, as
such, a 'Rosetta Stone' for human languages worldwide. It also provides an
unprecedented window into human consciousness and perception of the world
up to 75,000 years ago, which is especially significant given that humans
can only have engaged in finely controlled speech and fully modern
language since chance mutation of our FOXP2 gene about 120,000 years ago.
These truly ancient deictic forms dating halfway back to the beginning of
modern human speech, retrieved only through modern statistical analysis,
provide insight into our very origins and as such are perhaps amongst the
most precious cultural treasures that humanity currently possesses.
Journal: Journal of Applied Statistics
Pages: 989-1030
Issue: 9
Volume: 33
Year: 2006
Keywords: Phonotactic signatures, archaeo-linguistics, proto-Australian, Snell's Law,
X-DOI: 10.1080/02664760500450160
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760500450160
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:989-1030
Template-Type: ReDIF-Article 1.0
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Title: Acknowledgement of Priority: the Generalized Normal Distribution
Abstract:
Journal: Journal of Applied Statistics
Pages: 1031-1032
Issue: 9
Volume: 33
Year: 2006
X-DOI: 10.1080/02664760600938494
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600938494
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:9:p:1031-1032
Template-Type: ReDIF-Article 1.0
Author-Name: Pradeep George
Author-X-Name-First: Pradeep
Author-X-Name-Last: George
Author-Name: Madara Ogot
Author-X-Name-First: Madara
Author-X-Name-Last: Ogot
Title: A Compromise Experimental Design Method for Parametric Polynomial Response Surface Approximations
Abstract:
This study presents a compromise approach to augmentation of experimental
designs, necessitated by the expense of performing each experiment
(computational or physical), that yields higher quality parametric
polynomial response surface approximations than traditional augmentation.
Based on the D-optimality criterion as a measure of experimental design
quality, the method simultaneously considers several polynomial models
during the experimental design, resulting in good quality designs for all
models under consideration, as opposed to good quality designs only for
lower-order models, as in the case of traditional augmentation. Several
numerical examples and an engineering example are presented to illustrate
the efficacy of the approach.
Journal: Journal of Applied Statistics
Pages: 1037-1050
Issue: 10
Volume: 33
Year: 2006
Keywords: Response surface method, surrogate models,
X-DOI: 10.1080/02664760600746533
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746533
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1037-1050
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Cook
Author-X-Name-First: Steven
Author-X-Name-Last: Cook
Author-Name: Alan Speight
Author-X-Name-First: Alan
Author-X-Name-Last: Speight
Title: International Business Cycle Asymmetry and Time Irreversible Nonlinearities
Abstract:
Using tests of time reversibility, this paper provides further
statistical evidence on the long-standing conjecture in economics
concerning the potentially asymmetric behaviour of output over the
expansionary and contractionary phases of the business cycle. A particular
advantage of this approach is that it provides a discriminating test that
is instructive as to whether any asymmetries detected are due to
asymmetric shocks to a linear model, or an underlying non-linear model
with symmetric shocks, and in the latter case is informative as to the
potential form of that nonlinear model. Using a long span of international
per capita output growth data, the asymmetry detected is overwhelmingly
consistent with the long-standing perception that the output business
cycle is characterized by steeper recessions and longer more gentle
expansions, but the evidence for this form of business cycle asymmetry is
weaker in the data adjusted for the influence of outliers associated with
wars and other extreme events. Statistically significant time
irreversibility is reported for the output growth rates of almost all of
the countries considered in the full sample data, and there is evidence
that this time irreversibility is of a form implying an underlying
nonlinear model with symmetrically distributed innovations for 15 of the
22 countries considered. However, the time irreversibility test results
for the outlier-trimmed full sample data reveal significant time
irreversibility in output growth for around one half of the countries
considered, predominantly in Northern Europe and North America, and of a
form implying a nonlinear underlying model in only a further half of those
cases.
Journal: Journal of Applied Statistics
Pages: 1051-1065
Issue: 10
Volume: 33
Year: 2006
Keywords: Time reversibility, time irreversibility, nonlinearity, per capita output growth,
X-DOI: 10.1080/02664760600746582
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746582
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1051-1065
Template-Type: ReDIF-Article 1.0
Author-Name: Kamziah Abd Kudus
Author-X-Name-First: Kamziah Abd
Author-X-Name-Last: Kudus
Author-Name: A. C. Kimber
Author-X-Name-First: A. C.
Author-X-Name-Last: Kimber
Author-Name: J. Lapongan
Author-X-Name-First: J.
Author-X-Name-Last: Lapongan
Title: A Parametric Model for the Interval Censored Survival Times of Acacia Mangium Plantation in a Spacing Trial
Abstract:
Survival times for the Acacia mangium plantation in the Segaliud Lokan
Project, Sabah, East Malaysia were analysed based on 20 permanent sample
plots (PSPs) established in 1988 as a spacing experiment. The PSPs were
established following a complete randomized block design with five levels
of spacing randomly assigned to units within four blocks at different
sites. The survival times of trees in years are of interest. Since the
inventories were only conducted annually, the actual survival time for
each tree was not observed. Hence, the data set comprises censored
survival times. Initial analysis of the survival of the Acacia mangium
plantation suggested there is block by spacing interaction; a Weibull
model gives a reasonable fit to the replicate survival times within each
PSP; but a standard Weibull regression model is inappropriate because the
shape parameter differs between PSPs. In this paper we investigate the
form of the non-constant Weibull shape parameter. Parsimonious models for
the Weibull survival times have been derived using maximum likelihood
methods. The factor selection for the parameters is based on a backward
elimination procedure. The models are compared using likelihood ratio
statistics. The results suggest that both Weibull parameters depend on
spacing and block.
Journal: Journal of Applied Statistics
Pages: 1067-1074
Issue: 10
Volume: 33
Year: 2006
Keywords: Survival times, interval censored, Weibull distribution, maximum likelihood, backward elimination,
X-DOI: 10.1080/02664760600746616
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746616
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1067-1074
Template-Type: ReDIF-Article 1.0
Author-Name: Ian McHale
Author-X-Name-First: Ian
Author-X-Name-Last: McHale
Author-Name: Patrick Laycock
Author-X-Name-First: Patrick
Author-X-Name-Last: Laycock
Title: Applications of a General Stable Law Regression Model
Abstract:
In this paper we present a method for performing regression with stable
disturbances. The method of maximum likelihood is used to estimate both
distribution and regression parameters. Our approach utilises a numerical
integration procedure to calculate the stable density, followed by
sequential quadratic programming optimisation procedures to obtain
estimates and standard errors. A theoretical justification for the use of
stable law regression is given followed by two real world practical
examples of the method. First, we fit the stable law multiple regression
model to housing price data and examine how the results differ from normal
linear regression. Second, we calculate the beta coefficients for 26
companies from the Financial Times Ordinary Shares Index.
Journal: Journal of Applied Statistics
Pages: 1075-1084
Issue: 10
Volume: 33
Year: 2006
Keywords: Stable distribution, heavy-tails, extreme values, regression,
X-DOI: 10.1080/02664760600746699
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746699
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1075-1084
Template-Type: ReDIF-Article 1.0
Author-Name: Eiji Minemura
Author-X-Name-First: Eiji
Author-X-Name-Last: Minemura
Title: An Interest-rate Model Analysis Based on Data Augmentation Bayesian Forecasting
Abstract:
In this paper, the author presents an efficient method of analyzing an
interest-rate model using a new approach called 'data augmentation
Bayesian forecasting.' First, a dynamic linear model estimation was
constructed with a hierarchically-incorporated model. Next, an
observational replication was generated based on the one-step forecast
distribution derived from the model. A Markov-chain Monte Carlo sampling
method was conducted on it as a new observation and unknown parameters
were estimated. At that time, the EM algorithm was applied to establish
initial values of unknown parameters while the 'quasi Bayes factor' was
used to assess parameter candidates. 'Data augmentation Bayesian
forecasting' is a method of evaluating the transition and history of
'future,' 'present' and 'past' of an arbitrary stochastic process by which
an appropriate evaluation is conducted based on the probability measure
that has been sequentially modified with additional information. It would
be possible to use future prediction results for modifying the model to
grasp the present state or re-evaluate the past state. It would be also
possible to raise the degree of precision in predicting the future through
the modification of the present and the past. Thus, 'data augmentation
Bayesian forecasting' is applicable not only in the field of financial
data analysis but also in forecasting and controlling the stochastic
process.
Journal: Journal of Applied Statistics
Pages: 1085-1104
Issue: 10
Volume: 33
Year: 2006
Keywords: Bayesian inference, dynamic linear model, Markov-chain Monte Carlo, computational simulation, probability measure transformation,
X-DOI: 10.1080/02664760600746756
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746756
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1085-1104
Template-Type: ReDIF-Article 1.0
Author-Name: W. L. Pearn
Author-X-Name-First: W. L.
Author-X-Name-Last: Pearn
Author-Name: Y. C. Chang
Author-X-Name-First: Y. C.
Author-X-Name-Last: Chang
Author-Name: Chien-Wei Wu
Author-X-Name-First: Chien-Wei
Author-X-Name-Last: Wu
Title: Measuring Process Performance Based on Expected Loss with Asymmetric Tolerances
Abstract:
By approaching capability from the point of view of process loss, similar
to Cpm, Johnson (1992) provided the expected relative loss Le to consider
the proximity to the target value. Putting the loss in relative terms, a
user needs only to specify the target and the distance from the target at
which the product would have zero worth in order to quantify the process loss.
Tsui (1997) expressed the index Le as Le = Lot + Lpe, which provides an
uncontaminated separation between information concerning the process
relative off-target loss (Lot) and the process relative inconsistency loss
(Lpe). Unfortunately, the index Le measures process capability
inconsistently in many cases, particularly for processes with asymmetric
tolerances, and thus reflects process potential and performance
inaccurately. In this paper, we consider a generalization, which we refer
to as [image omitted], to deal with processes with asymmetric
tolerances. The generalization is shown to be superior to the original
index Le. In the case of symmetric tolerances, the new generalizations of
the process loss indices [image omitted], [image omitted] and
[image omitted] reduce to the original indices Le, Lot, and Lpe,
respectively. We investigate the statistical properties of the natural
estimators of [image omitted], [image omitted] and [image omitted]
when the underlying process is normally distributed. We
obtain the rth moment, expected value, and variance of the natural
estimators [image omitted], [image omitted], and [image omitted].
We also analyze the bias and the mean squared error in
each case. The new generalization [image omitted] measures process
loss more accurately than the original index Le.
Journal: Journal of Applied Statistics
Pages: 1105-1120
Issue: 10
Volume: 33
Year: 2006
Keywords: Asymmetric tolerances, bias, mean squared error, process capability indices, process loss indices,
X-DOI: 10.1080/02664760600746871
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746871
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1105-1120
Template-Type: ReDIF-Article 1.0
Author-Name: Bradley Ewing
Author-X-Name-First: Bradley
Author-X-Name-Last: Ewing
Author-Name: Teresa Kerr
Author-X-Name-First: Teresa
Author-X-Name-Last: Kerr
Author-Name: Mark Thompson
Author-X-Name-First: Mark
Author-X-Name-Last: Thompson
Title: Do Flow Rates Respond Asymmetrically to Water Level? Evidence from the Edwards Aquifer
Abstract:
This research examines the time series relationship between the Comal
Springs flow rate and the water level in the Edwards Aquifer (Well J-17).
The empirical methodology utilizes threshold autoregression (TAR) and
momentum-TAR models that allow for asymmetry in responses and adjustments
to a disequilibrium in the long-run cointegrating relationship. Based on
the results, an asymmetric error-correction model (AECM) is proposed to
characterize the short-run and long-run dynamic relationship between
spring flow and water level. The results have implications for the
management of water resources, water demand, and ecosystems.
Journal: Journal of Applied Statistics
Pages: 1121-1129
Issue: 10
Volume: 33
Year: 2006
Keywords: Threshold cointegration, asymmetric adjustment, spring flow, water level, Edwards Aquifer,
X-DOI: 10.1080/02664760600746905
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600746905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1121-1129
Template-Type: ReDIF-Article 1.0
Author-Name: Kenneth Rice
Author-X-Name-First: Kenneth
Author-X-Name-Last: Rice
Author-Name: David Spiegelhalter
Author-X-Name-First: David
Author-X-Name-Last: Spiegelhalter
Title: A Simple Diagnostic Plot Connecting Robust Estimation, Outlier Detection, and False Discovery Rates
Abstract:
Robust estimation of parameters, and identification of specific data
points that are discordant with an assumed model, are often treated as
different statistical problems. The two aims are, however, closely
inter-related and in many cases the two analyses are required
simultaneously. We present a simple diagnostic plot that connects existing
robust estimators with simultaneous outlier detection, and uses the
concept of false discovery rates to allow for the multiple comparisons
induced by considering each point as a potential outlier. It is
straightforward to implement, and applicable in any situation for which
robust estimation procedures exist. Several examples are given.
Journal: Journal of Applied Statistics
Pages: 1131-1147
Issue: 10
Volume: 33
Year: 2006
Keywords: Robust estimation, Outlier detection, False discovery rate,
X-DOI: 10.1080/02664760600747002
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600747002
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1131-1147
Template-Type: ReDIF-Article 1.0
Author-Name: M. C. Jones
Author-X-Name-First: M. C.
Author-X-Name-Last: Jones
Title: Book Review
Abstract:
Journal: Journal of Applied Statistics
Pages: 1149-1151
Issue: 10
Volume: 33
Year: 2006
X-DOI: 10.1080/02664760600747424
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600747424
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:33:y:2006:i:10:p:1149-1151
Template-Type: ReDIF-Article 1.0
Author-Name: Alice Whittemore
Author-X-Name-First: Alice
Author-X-Name-Last: Whittemore
Title: A Bayesian False Discovery Rate for Multiple Testing
Abstract:
Case-control studies of genetic polymorphisms and gene-environment
interactions are reporting large numbers of statistically significant
associations, many of which are likely to be spurious. This problem
reflects the low prior probability that any one null hypothesis is false,
and the large number of test results reported for a given study. In a
Bayesian approach to the low prior probabilities, Wacholder et al. (2004)
suggest supplementing the p-value for a hypothesis with its posterior
probability given the study data. In a frequentist approach to the test
multiplicity problem, Benjamini & Hochberg (1995) propose a
hypothesis-rejection rule that provides greater statistical power by
controlling the false discovery rate rather than the family-wise error
rate controlled by the Bonferroni correction. This paper defines a Bayes
false discovery rate and proposes a Bayes-based rejection rule for
controlling it. The method, which combines the Bayesian approach of
Wacholder et al. with the frequentist approach of Benjamini & Hochberg, is
used to evaluate the associations reported in a case-control study of
breast cancer risk and genetic polymorphisms of genes involved in the
repair of double-strand DNA breaks.
Journal: Journal of Applied Statistics
Pages: 1-9
Issue: 1
Volume: 34
Year: 2007
Keywords: Bayes, breast cancer, false discovery rate, false positive report probability, haplotypes, multiple comparisons, single nucleotide polymorphism,
X-DOI: 10.1080/02664760600994745
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994745
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:1-9
Template-Type: ReDIF-Article 1.0
Author-Name: Amy Ming-Fang Yen
Author-X-Name-First: Amy Ming-Fang
Author-X-Name-Last: Yen
Author-Name: Tony Hsiu-Hsi Chen
Author-X-Name-First: Tony Hsiu-Hsi
Author-X-Name-Last: Chen
Title: Mixture Multi-state Markov Regression Model
Abstract:
Although heterogeneity across individuals may be reduced when a two-state
process is extended into a multi-state process, the discrepancy between
the observed and the predicted for some states may still exist owing to
two possibilities, unobserved mixture distribution in the initial state
and the effect of measured covariates on subsequent multi-state disease
progression. In the present study, we developed a mixture Markov
exponential regression model to take account of the above-mentioned
heterogeneity across individuals (subject-to-subject variability) with a
systematic model selection based on the likelihood ratio test. The model
was successfully demonstrated by an empirical example on surveillance of
patients with small hepatocellular carcinoma treated by non-surgical
methods. The estimated results suggested that the model with the
incorporation of unobserved mixture distribution behaves better than the
one without. Complete and partial effects regarding risk factors on
different subsequent multi-state transitions were identified using a
homogeneous Markov model. The combination of both initial mixture
distribution and homogeneous Markov exponential regression model makes a
significant contribution to reducing heterogeneity across individuals and
over time for disease progression.
Journal: Journal of Applied Statistics
Pages: 11-21
Issue: 1
Volume: 34
Year: 2007
Keywords: Markov mixture model, multi-state, model selection,
X-DOI: 10.1080/02664760600994711
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994711
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:11-21
Template-Type: ReDIF-Article 1.0
Author-Name: K. D. Patterson
Author-X-Name-First: K. D.
Author-X-Name-Last: Patterson
Title: Bias Reduction through First-order Mean Correction, Bootstrapping and Recursive Mean Adjustment
Abstract:
Standard methods of estimation for autoregressive models are known to be
biased in finite samples, which has implications for estimation,
hypothesis testing, confidence interval construction and forecasting.
Three methods of bias reduction are considered here: first-order bias
correction, FOBC, where the total bias is approximated by the O(T-super--1) bias;
bootstrapping; and recursive mean adjustment, RMA. In addition, we show
how first-order bias correction is related to linear bias correction. The
practically important case where the AR model includes an unknown linear
trend is considered in detail. The fidelity of nominal to actual coverage
of confidence intervals is also assessed. A simulation study covers the
AR(1) model and a number of extensions based on the empirical AR(p) models
fitted by Nelson & Plosser (1982). Overall, which method dominates depends
on the criterion adopted: bootstrapping tends to be the best at reducing
bias, recursive mean adjustment is best at reducing mean squared error,
whilst FOBC does particularly well in maintaining the fidelity of
confidence intervals.
Journal: Journal of Applied Statistics
Pages: 23-45
Issue: 1
Volume: 34
Year: 2007
Keywords: Autoregressive model, bias, first-order correction, bootstrap bias correction, recursive mean adjustment,
X-DOI: 10.1080/02664760600994638
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:23-45
Template-Type: ReDIF-Article 1.0
Author-Name: Noriah Al-Kandari
Author-X-Name-First: Noriah
Author-X-Name-Last: Al-Kandari
Author-Name: Sana Buhamra
Author-X-Name-First: Sana
Author-X-Name-Last: Buhamra
Author-Name: S. E. Ahmed
Author-X-Name-First: S. E.
Author-X-Name-Last: Ahmed
Title: Testing and Merging Information for Effect Size Estimation
Abstract:
A large-sample test for testing the equality of two effect sizes is
presented. The null and non-null distributions of the proposed test
statistic are derived. Further, the problem of estimating the effect size
is considered when it is a priori suspected that two effect sizes may be
close to each other. The combined data from all the samples lead to a more
efficient estimator of the effect size. We propose a basis for optimally
combining estimation problems when there is uncertainty concerning the
appropriate statistical model-estimator to use in representing the
sampling process. The objective here is to produce natural adaptive
estimators with some good statistical properties. In the context of two
bivariate statistical models, the expressions for the asymptotic mean
squared error of the proposed estimators are derived and compared with the
parallel expressions for the benchmark estimators. We demonstrate that the
suggested preliminary test estimator has superior asymptotic mean squared
error performance relative to the benchmark and pooled estimators. A
simulation study and application of the methodology to real data are
presented.
Journal: Journal of Applied Statistics
Pages: 47-60
Issue: 1
Volume: 34
Year: 2007
Keywords: Effect size, pooling, preliminary test estimator, large-sample properties,
X-DOI: 10.1080/02664760600994604
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994604
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:47-60
Template-Type: ReDIF-Article 1.0
Author-Name: Agnes Herzberg
Author-X-Name-First: Agnes
Author-X-Name-Last: Herzberg
Author-Name: Richard Jarrett
Author-X-Name-First: Richard
Author-X-Name-Last: Jarrett
Title: A-Optimal Block Designs with Additional Singly Replicated Treatments
Abstract:
Block designs augmented with a number of singly-replicated
treatments, known as secondary treatments, are particularly useful for
experiments where only small amounts of material are available for some
treatments, for example new plant varieties. The designs are of particular
use in the microarray situation. Such designs are known as 'augmented
designs'. This paper obtains the properties of these designs and shows
that, with an equal number of secondary treatments in each block, the
A-optimal design is obtained by using the A-optimal design for the
original block design. It develops formulae for the variance of treatment
comparisons, for both the primary and the secondary treatments. A number
of examples are used to illustrate the results.
Journal: Journal of Applied Statistics
Pages: 61-70
Issue: 1
Volume: 34
Year: 2007
Keywords: Augmented designs, chain-block designs, coat-of-mail designs, block designs, efficiency, microarray designs, optimality, secondary treatments, singly-linked block designs,
X-DOI: 10.1080/02664760600744512
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744512
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:61-70
Template-Type: ReDIF-Article 1.0
Author-Name: Housila Singh
Author-X-Name-First: Housila
Author-X-Name-Last: Singh
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano Ruiz
Author-X-Name-Last: Espejo
Title: Double Sampling Ratio-product Estimator of a Finite Population Mean in Sample Surveys
Abstract:
It is well known that two-phase (or double) sampling is of significant
use in practice when the population parameter(s) (say, population mean
X-super-¯) of the auxiliary variate x is not known. Keeping this in
view, we have suggested a class of ratio-product estimators in two-phase
sampling with its properties. The asymptotically optimum estimators (AOEs)
in the class are identified in two different cases with their variances.
Conditions for the proposed estimator to be more efficient than the
two-phase sampling ratio, product and mean-per-unit estimators are
investigated. Comparison with single-phase sampling is also discussed. An
empirical study is carried out to demonstrate the efficiency of the
suggested estimator over conventional estimators.
Journal: Journal of Applied Statistics
Pages: 71-85
Issue: 1
Volume: 34
Year: 2007
Keywords: Auxiliary variate, double sampling ratio and product estimators, finite population mean, study variate,
X-DOI: 10.1080/02664760600994562
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994562
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:71-85
Template-Type: ReDIF-Article 1.0
Author-Name: Thorsten Thadewald
Author-X-Name-First: Thorsten
Author-X-Name-Last: Thadewald
Author-Name: Herbert Buning
Author-X-Name-First: Herbert
Author-X-Name-Last: Buning
Title: Jarque-Bera Test and its Competitors for Testing Normality - A Power Comparison
Abstract:
For testing normality we investigate the power of several tests, first of
all, the well-known test of Jarque & Bera (1980) and furthermore the tests
of Kuiper (1960) and Shapiro & Wilk (1965) as well as tests of
Kolmogorov-Smirnov and Cramer-von Mises type. The tests on normality are
based, first, on independent random variables (model I) and, second, on
the residuals in the classical linear regression (model II). We
investigate the exact critical values of the Jarque-Bera test and the
Kolmogorov-Smirnov and Cramer-von Mises tests, in the latter case for the
original and standardized observations where the unknown parameters
μ and σ have to be estimated. The power comparison is carried
out via Monte Carlo simulation assuming the model of contaminated normal
distributions with varying parameters μ and σ and different
proportions of contamination. It turns out that for the Jarque-Bera test
the approximation of critical values by the chi-square distribution does
not work very well. The test is superior in power to its competitors for
symmetric distributions with medium to long tails and for slightly
skewed distributions with long tails. The power of the Jarque-Bera test is
poor for distributions with short tails, especially if the shape is
bimodal; sometimes the test is even biased. In this case a modification
of the Cramer-von Mises test or the Shapiro-Wilk test may be recommended.
Journal: Journal of Applied Statistics
Pages: 87-105
Issue: 1
Volume: 34
Year: 2007
Keywords: Goodness-of-fit tests, tests of Kolmogorov-Smirnov and Cramer-von Mises type, Shapiro-Wilk test, Kuiper test, skewness, kurtosis, contaminated normal distribution, Monte Carlo simulation, critical values, power comparison,
X-DOI: 10.1080/02664760600994539
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994539
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:87-105
Template-Type: ReDIF-Article 1.0
Author-Name: Zhiguo Wang
Author-X-Name-First: Zhiguo
Author-X-Name-Last: Wang
Author-Name: Jinde Wang
Author-X-Name-First: Jinde
Author-X-Name-Last: Wang
Author-Name: Xue Liang
Author-X-Name-First: Xue
Author-X-Name-Last: Liang
Title: Non-parametric Estimation for NHPP Software Reliability Models
Abstract:
The non-homogeneous Poisson process (NHPP) model is a very important
class of software reliability models and is widely used in software
reliability engineering. NHPPs are characterized by their intensity
functions. In the literature it is usually assumed that the functional
forms of the intensity functions are known and only some parameters in
intensity functions are unknown. The parametric statistical methods can
then be applied to estimate or to test the unknown reliability models.
However, in realistic situations it is often the case that the functional
form of the failure intensity is not very well known or is completely
unknown. In this case we have to use functional (non-parametric)
estimation methods. The non-parametric techniques do not require any
preliminary assumption on the software models and can therefore reduce the
parametric modeling bias. The existing non-parametric methods in the
statistical literature are usually not applicable to software reliability
data. In this paper we construct some non-parametric methods to estimate
the failure intensity function of the NHPP model, taking the
particularities of the software failure data into consideration.
Journal: Journal of Applied Statistics
Pages: 107-119
Issue: 1
Volume: 34
Year: 2007
Keywords: Software reliability, NHPP model, intensity function, non-parametric estimation,
X-DOI: 10.1080/02664760600994497
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994497
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:1:p:107-119
Template-Type: ReDIF-Article 1.0
Author-Name: Howard D. Bondell
Author-X-Name-First: Howard D.
Author-X-Name-Last: Bondell
Author-Name: Aiyi Liu
Author-X-Name-First: Aiyi
Author-X-Name-Last: Liu
Author-Name: Enrique F. Schisterman
Author-X-Name-First: Enrique F.
Author-X-Name-Last: Schisterman
Title: Statistical Inference Based on Pooled Data: A Moment-Based Estimating Equation Approach
Abstract:
We consider statistical inference on parameters of a distribution when
only pooled data are observed. A moment-based estimating equation approach
is proposed to deal with situations where likelihood functions based on
pooled data are difficult to work with. We outline the method to obtain
estimates and test statistics of the parameters of interest in the general
setting. We demonstrate the approach on the family of distributions
generated by the Box-Cox transformation model, and, in the process,
construct tests for goodness of fit based on the pooled data.
Journal: Journal of Applied Statistics
Pages: 129-140
Issue: 2
Volume: 34
Year: 2007
Keywords: Pooling biospecimens, set-based observations, moments, Box-Cox transformation, goodness-of-fit, lognormal distribution,
X-DOI: 10.1080/02664760600994844
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994844
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:129-140
Template-Type: ReDIF-Article 1.0
Author-Name: Birdal Senoğlu
Author-X-Name-First: Birdal
Author-X-Name-Last: Senoğlu
Title: Robust Estimation and Hypothesis Testing of Linear Contrasts in Analysis of Covariance with Stochastic Covariates
Abstract:
Estimators of parameters are derived by using the method of modified
maximum likelihood (MML) estimation when the distribution of covariate X
and the error e are both non-normal in a simple analysis of covariance
(ANCOVA) model. We show that our estimators are efficient. We also develop
a test statistic for testing a linear contrast and show that it is robust.
We also give a real-life example.
Journal: Journal of Applied Statistics
Pages: 141-151
Issue: 2
Volume: 34
Year: 2007
Keywords: Generalized logistic, linear contrasts, modified likelihood, non-normality, robustness, stochastic covariates,
X-DOI: 10.1080/02664760600994869
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994869
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:141-151
Template-Type: ReDIF-Article 1.0
Author-Name: A. F. Militino
Author-X-Name-First: A. F.
Author-X-Name-Last: Militino
Author-Name: M. D. Ugarte
Author-X-Name-First: M. D.
Author-X-Name-Last: Ugarte
Author-Name: T. Goicoa
Author-X-Name-First: T.
Author-X-Name-Last: Goicoa
Title: A BLUP Synthetic Versus an EBLUP Estimator: An Empirical Study of a Small Area Estimation Problem
Abstract:
Model-based estimators are becoming very popular in statistical offices
because governments require accurate estimates for small domains that were
not planned for when the study was designed, as their inclusion would have
increased the cost of the study. The sample sizes in these
domains are very small or even zero; consequently, traditional direct
design-based estimators lead to unacceptably large standard errors. In
this regard, model-based estimators that 'borrow information' from related
areas by using auxiliary information are appropriate. This paper reviews,
under the model-based approach, a BLUP synthetic and an EBLUP estimator.
The goal is to obtain estimators of domain totals when there are several
domains with very small sample sizes or without sampled units. We also
provide detailed expressions of the mean squared error at different levels
of aggregation. The results are illustrated with real data from the Basque
Country Business Survey.
Journal: Journal of Applied Statistics
Pages: 153-165
Issue: 2
Volume: 34
Year: 2007
Keywords: Finite population, prediction theory, mixed models, mean squared error, business survey,
X-DOI: 10.1080/02664760600994893
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600994893
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:153-165
Template-Type: ReDIF-Article 1.0
Author-Name: E. Kolaiti
Author-X-Name-First: E.
Author-X-Name-Last: Kolaiti
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Title: A Comparison of Three-level Orthogonal Arrays in the Presence of a Possible Correlation in Observations
Abstract:
When we want to compare two designs we usually assume the standard linear
model with uncorrelated observations. In this paper we use the comparison
method proposed by Ghosh & Shen (2006) to compare three-level orthogonal
arrays with 18, 27 and 36 runs in the possible presence of correlation in
the observations.
Journal: Journal of Applied Statistics
Pages: 167-175
Issue: 2
Volume: 34
Year: 2007
Keywords: Correlation in observations, linear model, orthogonal arrays, optimal design, change of variance functions,
X-DOI: 10.1080/02664760600995056
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600995056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:167-175
Template-Type: ReDIF-Article 1.0
Author-Name: Yi-Ting Hwang
Author-X-Name-First: Yi-Ting
Author-X-Name-Last: Hwang
Author-Name: Peir-Feng Wei
Author-X-Name-First: Peir-Feng
Author-X-Name-Last: Wei
Title: A Remark on the Zhang Omnibus Test for Normality
Abstract:
Zhang (1999) proposed a novel test statistic Q for testing normality
based on the ratio of two unbiased standard deviation estimators, q1 and
q2, for the true population standard deviation σ. Mingoti & Neves
(2003) discussed some properties of q1 and q2 and showed that the variance
of q1 increases as the true population variance increases. In this paper,
we show that the distribution of q1 is not normal. As a result, percentage
points for Q derived under normality are not appropriate. Instead, percentage
points of Q are obtained using simulations. Monte Carlo simulations are
provided to evaluate the performance of the new method and Zhang's method.
Journal: Journal of Applied Statistics
Pages: 177-184
Issue: 2
Volume: 34
Year: 2007
Keywords: Empirical distribution, Monte Carlo simulation, Normality test, Q statistic,
X-DOI: 10.1080/02664760600995064
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600995064
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:177-184
Template-Type: ReDIF-Article 1.0
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Author-Name: Samuel Kotz
Author-X-Name-First: Samuel
Author-X-Name-Last: Kotz
Title: On the Linear Combination of Laplace and Logistic Random Variables
Abstract:
The distribution of linear combinations of random variables arises
explicitly in many areas of engineering. This has increased the need to
have available the widest possible range of statistical results on linear
combinations of random variables. In this note, the exact distribution of
the linear combination α X+β Y is derived when X and Y are
Laplace and logistic random variables distributed independently of each
other. Extensive tabulations of the associated percentage points obtained
by inverting the derived distribution are also given.
Journal: Journal of Applied Statistics
Pages: 185-194
Issue: 2
Volume: 34
Year: 2007
X-DOI: 10.1080/02664760600995072
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600995072
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:185-194
Template-Type: ReDIF-Article 1.0
Author-Name: Zeinab H. Amin
Author-X-Name-First: Zeinab H.
Author-X-Name-Last: Amin
Title: Tests for the Validity of the Assumption that the Underlying Distribution of Life is Pareto
Abstract:
This article considers the problem of testing the validity of the
assumption that the underlying distribution of life is Pareto. For
complete and censored samples, the relationship between the Pareto and the
exponential distributions could be of vital importance to test for the
validity of this assumption. For grouped uncensored data the classical
Pearson χ2 test based on the multinomial model can be used.
Attention in this article is confined to handling grouped data with
withdrawals within intervals. Graphical as well as analytical procedures
will be presented. Maximum likelihood estimators for the parameters of the
Pareto distribution based on grouped data will be derived.
Journal: Journal of Applied Statistics
Pages: 195-201
Issue: 2
Volume: 34
Year: 2007
Keywords: Goodness of fit tests, Pareto distribution, grouped data, Types I and II censoring, hazard rate, maximum likelihood estimator, likelihood ratio statistic,
X-DOI: 10.1080/02664760600995098
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600995098
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:195-201
Template-Type: ReDIF-Article 1.0
Author-Name: L. Corain
Author-X-Name-First: L.
Author-X-Name-Last: Corain
Author-Name: L. Salmaso
Author-X-Name-First: L.
Author-X-Name-Last: Salmaso
Title: A Non-parametric Method for Defining a Global Preference Ranking of Industrial Products
Abstract:
Although experimentation is a crucial stage in the process of research
and development of industrial products, no satisfactory procedure is
available to deal with the common and rather important industrial problem
of defining a preference ranking among all the studied product prototypes
on the basis of performances. In this paper we propose a two-stage
non-parametric procedure in which we firstly perform a set of C-sample
testing procedures, followed by multiple comparisons, in this way
evaluating a set of partial preference rankings, and secondly synthesise
the partial rankings by combining them into a global ranking that provides
a general product preference rule. The proposed method is particularly
useful in the context of industrial experimentation and offers several
advantages such as effectiveness, high flexibility and practical adherence
to real problems where preference ranking is a natural goal.
Journal: Journal of Applied Statistics
Pages: 203-216
Issue: 2
Volume: 34
Year: 2007
Keywords: Dependent rankings, industrial products, non-parametric combination, permutation tests, research and development,
X-DOI: 10.1080/02664760600995122
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600995122
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:203-216
Template-Type: ReDIF-Article 1.0
Author-Name: C. A. Glasbey
Author-X-Name-First: C. A.
Author-X-Name-Last: Glasbey
Author-Name: G. W. A. M. Van Der Heijden
Author-X-Name-First: G. W. A. M.
Author-X-Name-Last: Van Der Heijden
Title: Alignment and Sub-pixel Interpolation of Images using Fourier Methods
Abstract:
A method is proposed for both estimating and correcting a translational
mis-alignment between digital images, taking account of aliasing of
high-frequency information. A parametric model is proposed for the power-
and cross-spectra of the multivariate stochastic process that is assumed
to have generated a continuous-space version of the images. Parameters,
including those that specify misalignment, are estimated by numerical
maximum likelihood. The effectiveness of the interpolant is confirmed by
simulation and illustrated using multi-band Landsat images.
Journal: Journal of Applied Statistics
Pages: 217-230
Issue: 2
Volume: 34
Year: 2007
Keywords: Aliasing, coherency, complex Gaussian distribution, cross-spectrum, landsat image, phase spectrum, power spectrum, sub-pixel,
X-DOI: 10.1080/02664760600995155
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600995155
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:2:p:217-230
Template-Type: ReDIF-Article 1.0
Author-Name: Arup Kumar Das
Author-X-Name-First: Arup Kumar
Author-X-Name-Last: Das
Title: Application of Multivariate Analysis to Increase the Yield of Dry Cell Batteries
Abstract:
In an organization, the manufacturing process of a dry cell battery was
suffering from the problem of frequent stoppages in the assembly line. The
complete battery manufacturing operation is highly automated and
mechanized. It was suspected that excessive variation in the overall height
of the bobbin was the major reason for such stoppages. The bobbin, the inside
part of a dry cell battery acting as cathode, is formed by the battery
extrusion process. A planned experiment was carried out on the extrusion
process that identified the setting of extrusion machines and the amount
of water content in the cathode mixture as the parameters causing
variation in the bobbin characteristics. The problem of frequent stoppages
was eliminated when appropriate action was taken on these two parameters.
Finally, multivariate and univariate control schemes were developed for
online control of the bobbin characteristics.
Journal: Journal of Applied Statistics
Pages: 239-248
Issue: 3
Volume: 34
Year: 2007
Keywords: Battery extrusion process, cathode mixture, Bartlett's test, Duncan's multiple range test, MANOVA, multivariate control chart,
X-DOI: 10.1080/02664760601004619
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601004619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:239-248
Template-Type: ReDIF-Article 1.0
Author-Name: Indranil Mukhopadhyay
Author-X-Name-First: Indranil
Author-X-Name-Last: Mukhopadhyay
Author-Name: Sudipta Chatterjee
Author-X-Name-First: Sudipta
Author-X-Name-Last: Chatterjee
Author-Name: Aditya Chatterjee
Author-X-Name-First: Aditya
Author-X-Name-Last: Chatterjee
Title: Towards Enhancement of the Economy of a Thermal Power Generating System through Prediction of Plant Efficiency
Abstract:
The plant 'Heat Rate' (HR) is a measure of overall efficiency of a
thermal power generating system. It depends on a large number of factors,
some of which are non-measurable, while data relating to others are seldom
available and recorded. However, coal quality (expressed in terms of
'effective heat value' (EHV) as kcal/kg) transpires to be one of the
important factors that influence HR values, and data on EHV are available
in any thermal power generating system. In the present work, we propose a
prediction interval of the HR values on the basis of only EHV, keeping in
mind that coal quality is one of the important (but not the only) factors
that have a pronounced effect on the combustion process and hence on HR.
The underlying theory borrows the idea of providing simultaneous
confidence interval (SCI) for the coefficients of a p-th-order (p ≥ 1)
autoregressive model (AR(p)). The theory has been substantiated with the
help of real life data from a power utility (after suitable base and scale
transformation of the data to maintain the confidentiality of the
classified document). Scope for formulating strategies to enhance the
economy of a thermal power generating system has also been explored.
Journal: Journal of Applied Statistics
Pages: 249-259
Issue: 3
Volume: 34
Year: 2007
Keywords: Plant heat rate, effective heat value, dependence analysis, autoregressive process, prediction interval,
X-DOI: 10.1080/02664760601004767
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601004767
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:249-259
Template-Type: ReDIF-Article 1.0
Author-Name: Mohamed Boutahar
Author-X-Name-First: Mohamed
Author-X-Name-Last: Boutahar
Author-Name: Velayoudom Marimoutou
Author-X-Name-First: Velayoudom
Author-X-Name-Last: Marimoutou
Author-Name: Leila Nouira
Author-X-Name-First: Leila
Author-X-Name-Last: Nouira
Title: Estimation Methods of the Long Memory Parameter: Monte Carlo Analysis and Application
Abstract:
Since the seminal paper of Granger & Joyeux (1980), the concept of long
memory has attracted the attention of many statisticians and econometricians
trying to model and measure the persistence of stationary processes. Many
methods for estimating d, the long-range dependence parameter, have been
suggested since the work of Hurst (1951). They can be summarized in three
classes: the heuristic methods, the semi-parametric methods and the
maximum likelihood methods. In this paper we try, by simulation, to verify
the two main properties of d-super-ˆ: consistency and asymptotic
normality. Hence, it is very important for practitioners to
compare the performance of the various classes of estimators. The results
indicate that only the semi-parametric and the maximum likelihood methods
can give good estimators. They also suggest that the AR component of the
ARFIMA (1, d, 0) process has an important impact on the properties of the
different estimators and that the Whittle method is the best one, since it
has the smallest mean squared error. We finally carry out an empirical
application using the monthly seasonally adjusted US inflation series, in
order to illustrate the usefulness of the different estimation methods in
the context of real data.
Journal: Journal of Applied Statistics
Pages: 261-301
Issue: 3
Volume: 34
Year: 2007
Keywords: Long memory, ARFIMA (p d q) process, fractional Gaussian noise, Monte Carlo study,
X-DOI: 10.1080/02664760601004874
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601004874
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:261-301
Template-Type: ReDIF-Article 1.0
Author-Name: Lilian M. De Menezes
Author-X-Name-First: Lilian M.
Author-X-Name-Last: De Menezes
Author-Name: Ana Lasaosa
Author-X-Name-First: Ana
Author-X-Name-Last: Lasaosa
Title: Comparing Fits of Latent Trait and Latent Class Models Applied to Sparse Binary Data: An Illustration with Human Resource Management Data
Abstract:
This paper addresses the problem of comparing the fit of latent class and
latent trait models when the indicators are binary and the contingency
table is sparse. This problem is common in the analysis of data from large
surveys, where many items are associated with an unobservable variable. A
study of human resource data illustrates: (1) how the usual
goodness-of-fit tests, model selection and cross-validation criteria can
be inconclusive; (2) how model selection and evaluation procedures from
time series and economic forecasting can be applied to extend residual
analysis in this context.
Journal: Journal of Applied Statistics
Pages: 303-319
Issue: 3
Volume: 34
Year: 2007
Keywords: Multivariate statistics, latent variable models, forecast encompassing, human resource management,
X-DOI: 10.1080/02664760601004908
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601004908
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:303-319
Template-Type: ReDIF-Article 1.0
Author-Name: Andreas Diekmann
Author-X-Name-First: Andreas
Author-X-Name-Last: Diekmann
Title: Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data
Abstract:
Digits in statistical data produced by natural or social processes are
often distributed in a manner described by 'Benford's law'. Recently, a
test against this distribution was used to identify fraudulent accounting
data. This test is based on the supposition that first, second, third, and
other digits in real data follow the Benford distribution while the digits
in fabricated data do not. Is it possible to apply Benford tests to detect
fabricated or falsified scientific data as well as fraudulent financial
data? We approached this question in two ways. First, we examined the use
of the Benford distribution as a standard by checking the frequencies of
the nine possible first and ten possible second digits in published
statistical estimates. Second, we conducted experiments in which subjects
were asked to fabricate statistical estimates (regression coefficients).
The digits in these experimental data were scrutinized for possible
deviations from the Benford distribution. There were two main findings.
First, both digits of the published regression coefficients were
approximately Benford distributed or at least followed a pattern of
monotonic decline. Second, the experimental results yielded new insights
into the strengths and weaknesses of Benford tests. Surprisingly, first
digits of faked data also exhibited a pattern of monotonic decline, while
second, third, and fourth digits were distributed less in accordance with
Benford's law. At least in the case of regression coefficients, there were
indications that checks for digit-preference anomalies should focus less
on the first (i.e. leftmost) and more on later digits.
Journal: Journal of Applied Statistics
Pages: 321-329
Issue: 3
Volume: 34
Year: 2007
Keywords: Benford, first digit law, digital analysis, data fabrication, distribution of digits from regression coefficients,
X-DOI: 10.1080/02664760601004940
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601004940
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:321-329
Template-Type: ReDIF-Article 1.0
Author-Name: Kang-Mo Jung
Author-X-Name-First: Kang-Mo
Author-X-Name-Last: Jung
Title: Least Trimmed Squares Estimator in the Errors-in-Variables Model
Abstract:
We propose a robust estimator in the errors-in-variables model using the
least trimmed squares estimator. We call this estimator the orthogonal
least trimmed squares (OLTS) estimator. We show that the OLTS estimator
has a high breakdown point and appropriate equivariance properties. We
develop an algorithm for the OLTS estimate. Simulations are performed to
compare the efficiencies of the OLTS estimates with the total least
squares (TLS) estimates and a numerical example is given to illustrate the
effectiveness of the estimate.
Journal: Journal of Applied Statistics
Pages: 331-338
Issue: 3
Volume: 34
Year: 2007
Keywords: Breakdown point, equivariance, errors-in-variables model, least trimmed squares estimator, orthogonal regression,
X-DOI: 10.1080/02664760601004973
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601004973
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:331-338
Template-Type: ReDIF-Article 1.0
Author-Name: Jabu S. Sithole
Author-X-Name-First: Jabu S.
Author-X-Name-Last: Sithole
Author-Name: Peter W. Jones
Author-X-Name-First: Peter W.
Author-X-Name-Last: Jones
Title: Bivariate Longitudinal Model for Detecting Prescribing Change in Two Drugs Simultaneously with Correlated Errors
Abstract:
In the literature, bivariate responses from repeated measures data are
usually analysed as two separate responses. The two responses tend to be
related in some way, and analysing these data jointly presents an
opportunity to account for the joint movement, which may affect the
conclusions reached compared with analysing the responses separately. In
this paper, a bivariate regression model with random
effects (linear mixed model) is used to detect a change, if any, in the
prescribing habits in the UK at the general practice (family medicine)
level due to an educational intervention given repeated measures data
before and after the intervention and a control group. The message was to
increase the prescribing of one drug while simultaneously decreasing the
prescribing of another. The effects of modelling a bivariate
auto-regressive process are evaluated.
Journal: Journal of Applied Statistics
Pages: 339-352
Issue: 3
Volume: 34
Year: 2007
Keywords: Bivariate response, repeated measures data, linear mixed model, bivariate first order auto-regressive process, SAS proc mixed, educational intervention, prescribing analysis,
X-DOI: 10.1080/02664760601005020
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601005020
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:339-352
Template-Type: ReDIF-Article 1.0
Author-Name: Rand R. Wilcox
Author-X-Name-First: Rand R.
Author-X-Name-Last: Wilcox
Title: Robust ANCOVA: Some Small-sample Results when there are Multiple Groups and Multiple Covariates
Abstract:
Numerous methods have been proposed for dealing with the serious
practical problems associated with the conventional analysis of covariance
method, with an emphasis on comparing two groups when there is a single
covariate. Recently, Wilcox (2005a: section 11.8.2) outlined a method for
handling multiple covariates that allows nonlinearity and
heteroscedasticity. The method is readily extended to multiple groups, but
nothing is known about its small-sample properties. This paper compares
three variations of the method, each based on one of three measures
of location: means, medians and 20% trimmed means. The methods based on a
20% trimmed mean or median are found to avoid Type I error probabilities
well above the nominal level, but the method based on medians can be too
conservative in various situations; using a 20% trimmed mean gave the best
results in terms of Type I errors. The methods are based in part on a
running interval smoother approximation of the regression surface.
Included are comments on required sample sizes that are relevant to the
so-called curse of dimensionality.
Journal: Journal of Applied Statistics
Pages: 353-364
Issue: 3
Volume: 34
Year: 2007
Keywords: Robust methods, smoothers, heteroscedasticity, curse of dimensionality,
X-DOI: 10.1080/02664760601005053
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760601005053
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:3:p:353-364
Template-Type: ReDIF-Article 1.0
Author-Name: Jan G. De Gooijer
Author-X-Name-First: Jan G.
Author-X-Name-Last: De Gooijer
Title: Power of the Neyman Smooth Test for Evaluating Multivariate Forecast Densities
Abstract:
We compare and investigate Neyman's smooth test, its components, and the
Kolmogorov-Smirnov (KS) goodness-of-fit test for testing the uniformity of
multivariate forecast densities. Simulations indicate that the KS test
lacks power when the forecast distributions are misspecified, especially
for correlated sequences of random variables. Neyman's smooth test and its
components work well in samples of the sizes typically available, although
there sometimes are size distortions. The components provide directed
diagnosis regarding the kind of departure from the null. For illustration,
the tests are applied to forecast densities obtained from a bivariate
threshold model fitted to high-frequency financial data.
Journal: Journal of Applied Statistics
Pages: 371-381
Issue: 4
Volume: 34
Year: 2007
Keywords: Goodness-of-fit, multivariate density forecasts, uniform distribution,
X-DOI: 10.1080/02664760701231526
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:371-381
Template-Type: ReDIF-Article 1.0
Author-Name: Mark Evans
Author-X-Name-First: Mark
Author-X-Name-Last: Evans
Author-Name: Richard E. Johnston
Author-X-Name-First: Richard E.
Author-X-Name-Last: Johnston
Title: Stochastic Modelling of Times to Temperature for Furnaces Supplying Titanium Blooms to a Rolling Mill at TIMET
Abstract:
In conjunction with TIMET at Waunarlwydd (Swansea, UK) a model has been
developed that will optimise the scheduling of various blooms to their
eight furnaces so as to minimise the time taken to roll these blooms into
the finished mill products. This production scheduling model requires
reliable data on times taken for the various furnaces that heat the slabs
and blooms to reach the temperatures required for rolling. These times to
temperature are stochastic in nature and this paper identifies the
distributional form for these times using the generalised F distribution
as a modelling framework. The times to temperature were found to be
similarly distributed over all furnaces. The identified distributional
forms were incorporated into the scheduling model to optimise a particular
campaign that was run at TIMET Swansea. Amongst other conclusions, it was
found that, compared to the actual campaign, the model produced a schedule
that reduced the makespan by some 35%.
Journal: Journal of Applied Statistics
Pages: 383-397
Issue: 4
Volume: 34
Year: 2007
Keywords: Titanium, scheduling, generalised F distribution,
X-DOI: 10.1080/02664760701231575
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231575
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:383-397
Template-Type: ReDIF-Article 1.0
Author-Name: Haritini Tsangari
Author-X-Name-First: Haritini
Author-X-Name-Last: Tsangari
Title: An Alternative Methodology for Combining Different Forecasting Models
Abstract:
Many economic and financial time series exhibit heteroskedasticity, where
the variability changes are often based on recent past shocks, which cause
large or small fluctuations to cluster together. Classical ways of
modelling the changing variance include the use of Generalized
Autoregressive Conditional Heteroskedasticity (GARCH) models and Neural
Networks models. The paper starts with a comparative study of these two
models, both in terms of capturing the non-linear or heteroskedastic
structure and forecasting performance. Monthly and daily exchange rates
for three different countries are used. The paper continues with
different methods for combining forecasts of the volatility from the
competing models, in order to improve forecasting accuracy. Traditional
methods for combining the predicted values from different models, using
various weighting schemes are considered, such as the simple average or
methods that find the best weights in terms of minimizing the squared
forecast error. The main purpose of the paper is, however, to propose an
alternative methodology for combining forecasts effectively. The newly
proposed non-linear, non-parametric, kernel-based method is shown to have
the basic advantage of being unaffected by outliers, structural breaks or
shocks to the system, and it does not require a specific functional form
for the combination.
Journal: Journal of Applied Statistics
Pages: 403-421
Issue: 4
Volume: 34
Year: 2007
Keywords: GARCH models, neural networks, heteroskedasticity, combination methods, non-parametric methods, kernel regression, forecasting criteria,
X-DOI: 10.1080/02664760701231633
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231633
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:403-421
Template-Type: ReDIF-Article 1.0
Author-Name: Larry W. Taylor
Author-X-Name-First: Larry W.
Author-X-Name-Last: Taylor
Title: Nonparametric Estimation of Duration Dependence in Militarized Interstate Disputes
Abstract:
A militarized interstate dispute (MID) involves military conflict between
states with diplomatic ties and exists because two or more states have
failed to resolve their differences through diplomatic channels. Jones et
al. (1996) characterize an MID as the threat, display or use of military
force short of war. They analyze over 2000 disputes spanning two centuries
across the globe and conclude that disputes tend to be persistent once
established. In this paper, I find that the passage of time can be a
favorable factor in dispute resolution, and thus historical mechanisms for
dispute resolution favor ending, not extending, militarized disputes. I
emphasize the use of non-parametric procedures first to estimate the
hazard function and then to estimate the benefits of negotiated
settlements.
Journal: Journal of Applied Statistics
Pages: 423-441
Issue: 4
Volume: 34
Year: 2007
Keywords: Non-parametric estimation, militarized interstate dispute, duration dependence, continuous time, trimming, stochastic dominance, benefits of diplomacy,
X-DOI: 10.1080/02664760701231690
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231690
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:423-441
Template-Type: ReDIF-Article 1.0
Author-Name: Shashibhushan B. Mahadik
Author-X-Name-First: Shashibhushan B.
Author-X-Name-Last: Mahadik
Author-Name: Digambar T. Shirke
Author-X-Name-First: Digambar T.
Author-X-Name-Last: Shirke
Title: On the Superiority of a Variable Sampling Interval Control Chart
Abstract:
The paper establishes the analytical grounds of the uniform superiority
of a variable sampling interval (VSI) Shewhart control chart over the
conventional fixed sampling interval (FSI) control chart, with respect to
the zero-time performance, for a wide class of process distributions. We
provide a sufficient condition on the distribution of a control chart
statistic, and propose a criterion to determine the control limits and the
regions in the in-control area of the VSI chart, corresponding to the
different sampling intervals used by it. The condition and the criterion
together ensure the uniform zero-time superiority of the VSI chart over
the matched FSI chart, in detecting a process shift of any magnitude. It
is shown that normal, Student's t and Laplace distributions satisfy the
sufficient condition. In addition, chi-square, F and beta distributions
satisfy it, provided that these are not extremely skewed. Further, it is
illustrated that the superiority of the VSI feature is not trivial and
cannot be assured if the sufficient condition is not satisfied or the
control limits and the regions are not determined according to the
proposed criterion. An application of the result to confirm the
superiority of the VSI feature is demonstrated for the control chart for
individual observations used to monitor a milk-pouch filling process.
Journal: Journal of Applied Statistics
Pages: 443-458
Issue: 4
Volume: 34
Year: 2007
Keywords: Adaptive control chart, average time to signal, average number of samples to signal, zero-time performance, statistical process control,
X-DOI: 10.1080/02664760701231765
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231765
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:443-458
Template-Type: ReDIF-Article 1.0
Author-Name: Jeffrey E. Jarrett
Author-X-Name-First: Jeffrey E.
Author-X-Name-Last: Jarrett
Author-Name: Xia Pan
Author-X-Name-First: Xia
Author-X-Name-Last: Pan
Title: Monitoring Variability and Analyzing Multivariate Autocorrelated Processes
Abstract:
Traditional multivariate quality control charts are based on independent
observations. In this paper, we explain how to extend univariate residual
charts to multivariate cases and how to combine the traditional
statistical process control (SPC) approaches to monitor changes in process
variability in a dynamic environment. We propose using Alt's (1984) W
chart on vector autoregressive (VAR) residuals to monitor the variability
for multivariate processes in the presence of autocorrelation. We study
examples jointly using the Hotelling T2 chart on VAR residuals, the W
chart, and the Portmanteau test to diagnose the types of shift in process
parameters.
Journal: Journal of Applied Statistics
Pages: 459-469
Issue: 4
Volume: 34
Year: 2007
Keywords: SPC, variability shift, quality control for multivariate and serially correlated processes, vector autoregressive (VAR) residuals, diagnosing types of parameter shift,
X-DOI: 10.1080/02664760701231849
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231849
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:459-469
Template-Type: ReDIF-Article 1.0
Author-Name: Perla Subbaiah
Author-X-Name-First: Perla
Author-X-Name-Last: Subbaiah
Author-Name: George Xia
Author-X-Name-First: George
Author-X-Name-Last: Xia
Title: Robustness of Inference for One-sample Problem with Correlated Observations
Abstract:
The inference about the population mean based on the standard t-test
involves the assumptions of a normal population and independence of
the observations. In this paper we examine the robustness of the inference
in the presence of correlations among the observations. We consider the
simplest correlation structure AR(1) and its impact on the t-test. A
modification of the t-test suitable for this structure is suggested, and
its effect on the inference is investigated using Monte Carlo simulation.
Journal: Journal of Applied Statistics
Pages: 471-486
Issue: 4
Volume: 34
Year: 2007
Keywords: Repeated measurements, AR(1) correlation structure,
X-DOI: 10.1080/02664760701231906
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:471-486
Template-Type: ReDIF-Article 1.0
Author-Name: Andriy Andreev
Author-X-Name-First: Andriy
Author-X-Name-Last: Andreev
Author-Name: Antti Kanto
Author-X-Name-First: Antti
Author-X-Name-Last: Kanto
Author-Name: Pekka Malo
Author-X-Name-First: Pekka
Author-X-Name-Last: Malo
Title: Computational Examples of a New Method for Distribution Selection in the Pearson System
Abstract:
A considerable problem in statistics and risk management is finding
distributions that capture the complex behaviour exhibited by financial
data. The importance of higher order moments in decision making has been
well recognized and there is increasing interest in modelling with
distributions that are able to account for these effects. The Pearson
system can be used to model a wide range of distributions with various
skewness and kurtosis. This paper provides computational examples of a new
easily implemented method for selecting probability density functions from
the Pearson family of distributions. We apply this method to daily,
monthly, and annual series using a range of data from commodity markets to
macroeconomic variables.
Journal: Journal of Applied Statistics
Pages: 487-506
Issue: 4
Volume: 34
Year: 2007
Keywords: Pearson system, block bootstrap, selection criteria,
X-DOI: 10.1080/02664760701231922
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701231922
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:4:p:487-506
Template-Type: ReDIF-Article 1.0
Author-Name: Prasun Das
Author-X-Name-First: Prasun
Author-X-Name-Last: Das
Author-Name: Sasadhar Bera
Author-X-Name-First: Sasadhar
Author-X-Name-Last: Bera
Title: Standardization of Process Norms in Baker's Yeast Fermentation through Statistical Models in Comparison with Neural Networks
Abstract:
Achieving a consistent growth pattern for commercial yeast fermentation
over batches, through the addition of water, molasses and other chemicals,
is often very complex due to the biochemical reactions in operation.
Statistical regression models play a very important role in modeling the
underlying mechanism, provided it is known. In contrast, artificial neural
networks provide a wide class of general-purpose, flexible non-linear
architectures for modelling complex industrial processes. In this paper,
an attempt has been made to find a robust control system for a
time-varying yeast fermentation process through statistical means, in
comparison with non-parametric neural
network techniques. The data used in this context are obtained from an
industry producing baker's yeast through a fed-batch fermentation process.
The model accuracy for predicting the growth pattern of commercial yeast,
when compared among the various techniques used, reveals the best
performance capability with the backpropagation neural network. The
statistical model used through projection pursuit regression also shows
higher prediction accuracy. The models, thus developed, would also help to
find an optimum combination of parameters for minimizing the variability
of yeast production.
Journal: Journal of Applied Statistics
Pages: 511-527
Issue: 5
Volume: 34
Year: 2007
Keywords: Generalized linear model (GLM), multisample bootstrapping, projection pursuit regression, artificial neural network (ANN), yeast, fed-batch fermentation,
X-DOI: 10.1080/02664760701234793
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701234793
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:511-527
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Gembris
Author-X-Name-First: Daniel
Author-X-Name-Last: Gembris
Author-Name: John G. Taylor
Author-X-Name-First: John G.
Author-X-Name-Last: Taylor
Author-Name: Dieter Suter
Author-X-Name-First: Dieter
Author-X-Name-Last: Suter
Title: Evolution of Athletic Records: Statistical Effects versus Real Improvements
Abstract:
Athletic records represent the best results in a given discipline, thus
improving monotonically with time. As has already been shown, this should
not be taken as an indication that the athletes' capabilities keep
improving. In other words, a new record is not noteworthy just because it
is a new record, instead it is necessary to assess by how much the record
has improved. In this paper we derive formulae that can be used to show
that athletic records continue to improve with time, even if athletic
performance remains constant. We consider two specific examples,
the German championships and the world records in several athletic
disciplines. The analysis shows that, for the latter, true improvements
occur in 20-50% of the disciplines. The analysis is supplemented by an
application of our record estimation approach to predicting the maximum
body length of humans for a population or population group of specified
size, based on a representative sample.
Journal: Journal of Applied Statistics
Pages: 529-545
Issue: 5
Volume: 34
Year: 2007
Keywords: Records, athletics, estimation of maxima and minima,
X-DOI: 10.1080/02664760701234850
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701234850
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:529-545
Template-Type: ReDIF-Article 1.0
Author-Name: S. Magnussen
Author-X-Name-First: S.
Author-X-Name-Last: Magnussen
Author-Name: R. Reeves
Author-X-Name-First: R.
Author-X-Name-Last: Reeves
Title: Sample-based Maximum Likelihood Estimation of the Autologistic Model
Abstract:
New recursive algorithms for fast computation of the normalizing constant
for the autologistic model on the lattice make feasible a sample-based
maximum likelihood estimation (MLE) of the autologistic parameters. We
demonstrate the approach by sampling from 12 simulated 420×420 binary lattices
with square lattice plots of size 4×4, …, 7×7 and sample
sizes between 20 and 600. Sample-based results are compared with
'benchmark' MCMC estimates derived from all binary observations on a
lattice. Sample-based estimates are, on average, biased systematically by
3%-7%, a bias that can be reduced by more than half by a set of
calibrating equations. MLE estimates of sampling variances are large and
usually conservative. The variance of the parameter of spatial association
is about 2-10 times higher than the variance of the parameter of
abundance. Sample distributions of estimates were mostly non-normal. We
conclude that sample-based MLE estimation of the autologistic parameters
with an appropriate sample size and post-estimation calibration will
furnish fully acceptable estimates. Equations for predicting the expected
sampling variance are given.
Journal: Journal of Applied Statistics
Pages: 547-561
Issue: 5
Volume: 34
Year: 2007
Keywords: Markov Chain Monte Carlo, bias, sample size, cluster sampling, calibration, sampling variance,
X-DOI: 10.1080/02664760701234967
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701234967
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:547-561
Template-Type: ReDIF-Article 1.0
Author-Name: Li-Chu Chien
Author-X-Name-First: Li-Chu
Author-X-Name-Last: Chien
Author-Name: Tsung-Shan Tsou
Author-X-Name-First: Tsung-Shan
Author-X-Name-Last: Tsou
Title: Regression Diagnostic under Model Misspecification
Abstract:
We propose two novel diagnostic measures for the detection of influential
observations for regression parameters in linear regression. Traditional
diagnostic statistics focus on the effect of deletion of data points
either on parameter estimates, or on predicted values. A data point is
regarded as influential by the new methods if its inclusion determines a
significantly different likelihood function for the parameter of interest.
The likelihood function concerned is asymptotically valid for practically
all underlying distributions whose second moments exist.
Journal: Journal of Applied Statistics
Pages: 563-575
Issue: 5
Volume: 34
Year: 2007
Keywords: Influential diagnostic, robust likelihood, robust normal regression, DFBETAS, DFFITS, Cook's distance,
X-DOI: 10.1080/02664760701235014
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701235014
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:563-575
Template-Type: ReDIF-Article 1.0
Author-Name: Seema Jaggi
Author-X-Name-First: Seema
Author-X-Name-Last: Jaggi
Author-Name: Cini Varghese
Author-X-Name-First: Cini
Author-X-Name-Last: Varghese
Author-Name: V.K. Gupta
Author-X-Name-First: V.K.
Author-X-Name-Last: Gupta
Title: Optimal Circular Block Designs for Neighbouring Competition Effects
Abstract:
Competition or interference occurs when the responses to treatments in
experimental units are affected by the treatments in neighbouring units.
This may contribute to variability in experimental results and lead to
substantial losses in efficiency. The study of a competing situation needs
designs in which the competing units appear in a predetermined pattern.
This paper deals with optimality aspects of circular block designs for
studying the competition among treatments applied to neighbouring
experimental units. The model considered is a four-way classified model
consisting of the direct effect of the treatment applied to a particular
plot, the effects of the treatments applied to the immediate left and
right neighbouring units, and the block effect. Conditions have been
obtained for
the block design to be universally optimal for estimating direct and
neighbour effects. Some classes of balanced and strongly balanced complete
block designs have been identified to be universally optimal for the
estimation of direct, left and right neighbour effects and a list of
universally optimal designs for v<20 and r<100 has been
prepared.
Journal: Journal of Applied Statistics
Pages: 577-584
Issue: 5
Volume: 34
Year: 2007
Keywords: Circular block design, universal optimality, direct effects, neighbour effects, balanced and strongly balanced design,
X-DOI: 10.1080/02664760701235089
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701235089
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:577-584
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Gustafson
Author-X-Name-First: Paul
Author-X-Name-Last: Gustafson
Author-Name: S. Siddarth
Author-X-Name-First: S.
Author-X-Name-Last: Siddarth
Title: Describing the Dynamics of Attention to TV Commercials: A Hierarchical Bayes Analysis of the Time to Zap an Ad
Abstract:
This paper provides insights into the dynamics of attention to TV
commercials via an analysis of the length of time that commercials are
viewed before being 'zapped'. The model, which incorporates a flexible
baseline hazard rate and captures unobserved heterogeneity across both
consumers and commercials using a hierarchical Bayes approach, is
estimated on two datasets in which commercial viewing is captured by a
passive online device that continually monitors a household's TV viewing.
Consistent with previous findings in psychology about the nature of
attentional engagement in TV viewing, baseline hazard rates are found to
be non-monotonic. In addition, the data show considerable ad-to-ad and
household-to-household heterogeneity in zapping behavior. While one of the
datasets contains some information on characteristics of the ads, these
data do not reveal any firm links between the ad heterogeneity and the ad
characteristics. A number of methodological and computational issues arise
in the hierarchical Bayes analysis.
Journal: Journal of Applied Statistics
Pages: 585-609
Issue: 5
Volume: 34
Year: 2007
Keywords: Heterogeneity, hierarchical Bayes, marketing,
X-DOI: 10.1080/02664760701235279
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701235279
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:585-609
Template-Type: ReDIF-Article 1.0
Author-Name: A. Felipe
Author-X-Name-First: A.
Author-X-Name-Last: Felipe
Author-Name: M. L. Menendez
Author-X-Name-First: M. L.
Author-X-Name-Last: Menendez
Author-Name: L. Pardo
Author-X-Name-First: L.
Author-X-Name-Last: Pardo
Title: Order-restricted Dose-related Trend Phi-divergence Tests for Generalized Linear Models
Abstract:
In this paper a new family of test statistics is presented for testing
the independence between the binary response Y and an ordered categorical
explanatory variable X (doses) against the alternative hypothesis of an
increasing dose-response relationship between Y and X. The properties of
these test statistics are studied. This new
family of test statistics is based on the family of φ-divergence
measures and contains as a particular case the likelihood ratio test. We
pay special attention to the family of test statistics associated with the
power divergence family. A simulation study is included in order to
analyze the behavior of the power divergence family of test statistics.
Journal: Journal of Applied Statistics
Pages: 611-623
Issue: 5
Volume: 34
Year: 2007
Keywords: Phi-divergence test statistic, isotonic regression, response variable, explanatory variable,
X-DOI: 10.1080/02664760701235303
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701235303
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:611-623
Template-Type: ReDIF-Article 1.0
Author-Name: Shannon E. Allen
Author-X-Name-First: Shannon E.
Author-X-Name-Last: Allen
Author-Name: Burt Holland
Author-X-Name-First: Burt
Author-X-Name-Last: Holland
Title: Expected Mean Squares for Hierarchical Factorial Layouts with Population Imbalance
Abstract:
We introduce an analysis of variance usable for two-factor hierarchical
models where observations are incompletely sampled from unbalanced
populations of finite effects. Our new approach enables unbiased
estimation of the variance components for this type of model and allows
hypothesis testing to identify significant effects/sub-class effects. An
explanation of how these results can be generalized to factorial layouts
with more than two factors is given.
Journal: Journal of Applied Statistics
Pages: 625-637
Issue: 5
Volume: 34
Year: 2007
X-DOI: 10.1080/02664760701235352
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701235352
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:625-637
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Roger
Author-X-Name-First: Patrick
Author-X-Name-Last: Roger
Author-Name: Marie-Helene Broihanne
Author-X-Name-First: Marie-Helene
Author-X-Name-Last: Broihanne
Title: Efficiency of Betting Markets and Rationality of Players: Evidence from the French 6/49 Lotto
Abstract:
We analyse the existence of preferred numbers on the French Lotto market
and prove that this market is not strongly efficient in the sense of
Thaler & Ziemba (1988). The preference for low numbers is investigated by
means of stochastic dominance tests. The specific features of the French
Lotto game allow us to build a simple estimate of the probability
distribution of numbers actually played. The results are compared with the
(highly time-consuming) maximum likelihood estimator used by Farrell et
al. (2000). It is shown that the two methods give very close results. Our
conclusions stress the potential applications of this study in various domains.
Journal: Journal of Applied Statistics
Pages: 645-662
Issue: 6
Volume: 34
Year: 2007
Keywords: Lotto, pari-mutuel, information efficiency, maximum likelihood estimation,
X-DOI: 10.1080/02664760701236889
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701236889
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:645-662
Template-Type: ReDIF-Article 1.0
Author-Name: R.B. Arellano-Valle
Author-X-Name-First: R.B.
Author-X-Name-Last: Arellano-Valle
Author-Name: H. Bolfarine
Author-X-Name-First: H.
Author-X-Name-Last: Bolfarine
Author-Name: V.H. Lachos
Author-X-Name-First: V.H.
Author-X-Name-Last: Lachos
Title: Bayesian Inference for Skew-normal Linear Mixed Models
Abstract:
Linear mixed models (LMM) are frequently used to analyze repeated
measures data, because they are flexible in modelling the within-subject
correlation often present in this type of data. The most popular LMM
for continuous responses assumes that both the random effects and the
within-subject errors are normally distributed, which can be an
unrealistic assumption, obscuring important features of the variation
present within and among the units (or groups). This work presents
skew-normal linear mixed models (SNLMM) that relax the normality assumption
by using a multivariate skew-normal distribution, which includes the
normal distribution as a special case and provides robust estimation in mixed
models. The MCMC scheme is derived and the results of a simulation study
are provided demonstrating that standard information criteria may be used
to detect departures from normality. The procedures are illustrated using
a real data set from a cholesterol study.
Journal: Journal of Applied Statistics
Pages: 663-682
Issue: 6
Volume: 34
Year: 2007
Keywords: Bayesian inference, MCMC, Gibbs sampler, multivariate skew-normal distribution, skewness,
X-DOI: 10.1080/02664760701236905
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701236905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:663-682
Template-Type: ReDIF-Article 1.0
Author-Name: Lourdes Pozueta
Author-X-Name-First: Lourdes
Author-X-Name-Last: Pozueta
Author-Name: Xavier Tort-Martorell
Author-X-Name-First: Xavier
Author-X-Name-Last: Tort-Martorell
Author-Name: Lluis Marco
Author-X-Name-First: Lluis
Author-X-Name-Last: Marco
Title: Identifying Dispersion Effects in Robust Design Experiments—Issues and Improvements
Abstract:
The two experimental methods most commonly used for reducing the effect
of noise factors on a response of interest Y aim either to estimate a
model of the variability (V(Y), or an associated function) that is
transmitted by the noise factors, or to estimate a model of the
relationship between the response (Y) and all the control and noise factors
involved. Both methods aim to determine which control factor conditions
minimise the noise factors' effect on the response of interest, and a
series of analytical guidelines are established to reach this end. Product
array designs allow robustness problems to be solved in both ways, but
require a large number of experiments. Thus, practitioners tend to choose
more economical designs that only allow them to model the response surface
for Y. The general assumption is that both methods would lead to similar
conclusions. In this article we present a case that utilises a design
based on a product array and for which the conclusions yielded by the two
analytical methods are quite different. This example casts doubt on the
guidelines that experimental practice follows when using either of the two
methods. Based on this example, we show the causes behind these
discrepancies and propose a number of guidelines to help researchers in
the design and interpretation of robustness problems when using either of
the two methods.
Journal: Journal of Applied Statistics
Pages: 683-699
Issue: 6
Volume: 34
Year: 2007
Keywords: Robust conditions, noise factors, product array, full data array, dispersion effects, Taguchi methods, transmitted variation, quality improvement, interactions,
X-DOI: 10.1080/02664760701236947
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701236947
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:683-699
Template-Type: ReDIF-Article 1.0
Author-Name: Manoj Chacko
Author-X-Name-First: Manoj
Author-X-Name-Last: Chacko
Author-Name: P. Yageen Thomas
Author-X-Name-First: P. Yageen
Author-X-Name-Last: Thomas
Title: Estimation of a Parameter of Bivariate Pareto Distribution by Ranked Set Sampling
Abstract:
Ranked set sampling is applicable whenever the ranking of a set of sampling
units can be done easily, either by a judgement method or based on the
measurement of an auxiliary variable on the selected units. In this work, we derive
different estimators of a parameter associated with the distribution of
the study variate Y, based on a ranked-set sample obtained by using an
auxiliary variable X correlated with Y for ranking the sample units, when
(X, Y) follows a bivariate Pareto distribution. Efficiency comparisons
among these estimators are also made. Real-life data have been used to
illustrate the application of the results obtained.
Journal: Journal of Applied Statistics
Pages: 703-714
Issue: 6
Volume: 34
Year: 2007
Keywords: Ranked set sampling, bivariate Pareto distribution, best linear unbiased estimator,
X-DOI: 10.1080/02664760701236954
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701236954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:703-714
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Tai Lin
Author-X-Name-First: Chien-Tai
Author-X-Name-Last: Lin
Author-Name: Cheng-Chieh Chou
Author-X-Name-First: Cheng-Chieh
Author-X-Name-Last: Chou
Title: Empirical-distribution-function Tests for the Beta-Binomial Model
Abstract:
Empirical-distribution-function (EDF) goodness-of-fit tests are
considered for the beta-binomial model. The testing procedures based on
EDF statistics are given. A Monte Carlo study is conducted to investigate
the accuracy and power of the tests against various alternative
distributions. Our method is found to produce considerably greater power
than that of Garren et al. (2001) in most cases. The tests are applied to
data sets of the foraging behavior of herons and environmental toxicity
studies.
Journal: Journal of Applied Statistics
Pages: 715-724
Issue: 6
Volume: 34
Year: 2007
Keywords: Beta-binomial distribution, goodness-of-fit, parametric bootstrap, power, simulation,
X-DOI: 10.1080/02664760701236970
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701236970
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:715-724
Template-Type: ReDIF-Article 1.0
Author-Name: Hongmei Zhang
Author-X-Name-First: Hongmei
Author-X-Name-Last: Zhang
Title: Inferences on the Number of Unseen Species and the Number of Abundant/Rare Species
Abstract:
This paper focuses on estimating the number of species and the number of
abundant species in a specific geographic region and, consequently, on
drawing inferences on the number of rare species. The word 'species' is
generic, referring to any objects in a population that can be categorized.
In areas such as biology, ecology and literature, species frequency
distributions are usually severely skewed, in which case the population
contains a few very abundant species and many rare ones. To model such a
situation, we develop an asymmetric multinomial-Dirichlet probability
model using species frequency data. Posterior distributions on the number
of species and the number of abundant species are obtained, and posterior
inferences are induced using MCMC simulations. Simulations are used to
demonstrate and evaluate the developed methodology. We apply the method to
a DNA segment data set and a butterfly data set. Comparisons among
different approaches to inferring the number of species are also discussed
in this paper.
Journal: Journal of Applied Statistics
Pages: 725-740
Issue: 6
Volume: 34
Year: 2007
Keywords: Generalized multinomial model, Bayesian hierarchical model, Markov Chain Monte Carlo (MCMC), Dirichlet distribution, rare species,
X-DOI: 10.1080/02664760701237010
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701237010
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:725-740
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Sun
Author-X-Name-First: Juan
Author-X-Name-Last: Sun
Author-Name: Lifu Bi
Author-X-Name-First: Lifu
Author-X-Name-Last: Bi
Author-Name: Yaojun Chi
Author-X-Name-First: Yaojun
Author-X-Name-Last: Chi
Author-Name: Guowei Huang
Author-X-Name-First: Guowei
Author-X-Name-Last: Huang
Author-Name: Chun Fan
Author-X-Name-First: Chun
Author-X-Name-Last: Fan
Author-Name: Kazuo Aoki
Author-X-Name-First: Kazuo
Author-X-Name-Last: Aoki
Author-Name: Akihiro Kono
Author-X-Name-First: Akihiro
Author-X-Name-Last: Kono
Author-Name: Tian Hui
Author-X-Name-First: Tian
Author-X-Name-Last: Hui
Author-Name: Junichi Misumi
Author-X-Name-First: Junichi
Author-X-Name-Last: Misumi
Title: The Impact of Ovarian Cancer on Life Expectancy in Japan
Abstract:
The purpose of this study was to determine how life expectancy was
modified by ovarian cancer from 1950 to 2000. The contributions of ovarian
cancer to life expectancy were estimated. The age characteristics of
ovarian cancer were detected using the Gompertz relational mortality
model. The patterns between years of potential life lost (YPLL) and
mortality were obtained by fitting a linear regression equation to the
natural logarithm of their ratios. YPLLs are substantially higher in
Ireland than in Japan. However, the rates of change were much higher in
Japan than in Ireland. YPLLs changed from 0.02 year in 1950 to 0.12 year
in 2000. In Japan, there was a sixfold increase in the proportion of YPLLs
for death from ovarian cancer relative to those for death from
gynaecological cancers during the last half century. The impact of ovarian
cancer on life expectancy clearly increased, and the age-specific mortality
tended to shift toward older ages.
Journal: Journal of Applied Statistics
Pages: 741-747
Issue: 6
Volume: 34
Year: 2007
Keywords: Ovarian cancer, life expectancy, YPLLs, Gompertz,
X-DOI: 10.1080/02664760701237036
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701237036
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:741-747
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Antonio Cuesta-Albertos
Author-X-Name-First: Juan Antonio
Author-X-Name-Last: Cuesta-Albertos
Author-Name: Ricardo Fraiman
Author-X-Name-First: Ricardo
Author-X-Name-Last: Fraiman
Author-Name: Antonio Galves
Author-X-Name-First: Antonio
Author-X-Name-Last: Galves
Author-Name: Jesus Garcia
Author-X-Name-First: Jesus
Author-X-Name-Last: Garcia
Author-Name: Marcela Svarc
Author-X-Name-First: Marcela
Author-X-Name-Last: Svarc
Title: Classifying Speech Sonority Functional Data using a Projected Kolmogorov-Smirnov Approach
Abstract:
This paper addresses a linguistically motivated question of
classification of functional data, namely the statistical classification
of languages according to their rhythmic features. This is an important
open problem in phonology. The analysis is based on the information
provided by the sonority, an index of local regularity of the speech
signal. Our main tool is the projected Kolmogorov-Smirnov test, a new
goodness-of-fit test for functional data. The result obtained supports
the linguistic conjecture of the existence of three
rhythmic classes.
Journal: Journal of Applied Statistics
Pages: 749-761
Issue: 6
Volume: 34
Year: 2007
Keywords: Classification of languages, rhythmic classes, functional data, projected Kolmogorov-Smirnov test,
X-DOI: 10.1080/02664760701237077
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701237077
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:6:p:749-761
Template-Type: ReDIF-Article 1.0
Author-Name: Trevor Park
Author-X-Name-First: Trevor
Author-X-Name-Last: Park
Title: Alternative Penalty Functions for Penalized Likelihood Principal Components
Abstract:
The penalized likelihood principal component method of Park (2005) offers
flexibility in the choice of the penalty function. This flexibility allows
the method to be tailored to enhance interpretation in special cases. Of
particular interest is a penalty function in the style of the Lasso that
can be used to produce exactly zero loadings. Also of interest is a
penalty function for cases in which interpretability is best represented
by alignment with orthogonal subspaces, rather than with axis directions.
In each case, a data example is presented.
Journal: Journal of Applied Statistics
Pages: 767-777
Issue: 7
Volume: 34
Year: 2007
Keywords: Interpretation, Lasso penalty, multivariate exploratory analysis, principal component rotation, varimax,
X-DOI: 10.1080/02664760701239859
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701239859
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:767-777
Template-Type: ReDIF-Article 1.0
Author-Name: Man Yu Wong
Author-X-Name-First: Man Yu
Author-X-Name-Last: Wong
Author-Name: D.R. Cox
Author-X-Name-First: D.R.
Author-X-Name-Last: Cox
Title: On the Screening of Large Numbers of Significance Tests
Abstract:
A brief review is given of procedures for the collective analysis of a
large number of significance tests. A simple procedure previously proposed
for isolating 'real' effects on the basis of a large number of
significance tests is generalized to deal with two-sided tests and is also
related more explicitly to the false discovery rate.
Journal: Journal of Applied Statistics
Pages: 779-783
Issue: 7
Volume: 34
Year: 2007
Keywords: False discovery rate, mixture of distributions, Bayes factor, multiple testing,
X-DOI: 10.1080/02664760701240014
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701240014
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:779-783
Template-Type: ReDIF-Article 1.0
Author-Name: Mohamed El Ghourabi
Author-X-Name-First: Mohamed El
Author-X-Name-Last: Ghourabi
Author-Name: Mohamed Limam
Author-X-Name-First: Mohamed
Author-X-Name-Last: Limam
Title: Residual Responses to Change Patterns of Autocorrelated Processes
Abstract:
This article studies the residual behaviour of various stationary
processes in the presence of change patterns. Three types of change
patterns are considered: additive outliers, innovative outliers and level
shifts. Knowledge of the residual behaviour is important for monitoring
production processes. A new method of residual process control is
proposed, the patterns chart. In addition to the advantage of detecting
change patterns, it distinguishes their nature. The patterns chart's
performance is compared to the performance of the special causes control
(SCC) chart based on average run length. The results show that the
proposed method performs better than an SCC chart. A real case study
illustrates that the patterns chart has all the desirable properties of an
SCC chart while avoiding its drawbacks.
Journal: Journal of Applied Statistics
Pages: 785-798
Issue: 7
Volume: 34
Year: 2007
Keywords: Autocorrelation, outliers, residual responses, control chart, ARL,
X-DOI: 10.1080/02664760701240063
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701240063
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:785-798
Template-Type: ReDIF-Article 1.0
Author-Name: R. Vijayaraghavan
Author-X-Name-First: R.
Author-X-Name-Last: Vijayaraghavan
Title: Minimum Size Double Sampling Plans for Large Isolated Lots
Abstract:
A common approach to the design of an acceptance sampling plan is to
require that the operating characteristic (OC) curve should pass through
two designated points that would fix the curve in accordance with a
desired degree of discrimination. This paper presents a search procedure
for the selection of double sampling inspection plans of type DSP - (0, 1)
for two specified points on the OC curve, namely the acceptance quality limit,
producer's risk, limiting quality and consumer's risk. Selection of the
plans is discussed for both the cases of fraction non-conforming and the
number of non-conformities per unit.
Journal: Journal of Applied Statistics
Pages: 799-806
Issue: 7
Volume: 34
Year: 2007
Keywords: Acceptance quality limit, limiting quality, double sampling plan, operating characteristic curve, single sampling plan,
X-DOI: 10.1080/02664760701240287
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701240287
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:799-806
Template-Type: ReDIF-Article 1.0
Author-Name: Carlos Diaz Avalos
Author-X-Name-First: Carlos Diaz
Author-X-Name-Last: Avalos
Title: Spatial Modeling of Habitat Preferences of Biological Species using Markov Random Fields
Abstract:
Spatial modeling has gained interest in ecology during the past two
decades, especially in the area of biodiversity, where reliable
distribution maps are required. Several methods have been proposed to
construct distribution maps, most of them acknowledging the presence of
spatial interactions. In many cases, a key problem is the lack of true
absence data. We present here a model suitable for use when true absence
data are missing. The quality of the estimates obtained from the model is
evaluated using ROC curve analysis as well as a quadratic cost function,
computed from the false positive and false negative error rates. The model
is also tested under random and clustered scattering of the presence
records. We also present an application of the model to the construction
of distribution maps of two endemic bird species in Mexico.
Journal: Journal of Applied Statistics
Pages: 807-821
Issue: 7
Volume: 34
Year: 2007
Keywords: Biodiversity maps, Markov random fields, spatial modeling, autologistic model, species distribution,
X-DOI: 10.1080/02664760701240782
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701240782
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:807-821
Template-Type: ReDIF-Article 1.0
Author-Name: Jurate saltyte Benth
Author-X-Name-First: Jurate saltyte
Author-X-Name-Last: Benth
Author-Name: Fred Espen Benth
Author-X-Name-First: Fred Espen
Author-X-Name-Last: Benth
Author-Name: Paulius Jalinskas
Author-X-Name-First: Paulius
Author-X-Name-Last: Jalinskas
Title: A Spatial-temporal Model for Temperature with Seasonal Variance
Abstract:
We propose a spatial-temporal stochastic model for daily average surface
temperature data. First, we build a model for a single spatial location,
independently of the spatial information. The model includes trend,
seasonality, and mean reversion, together with a seasonally dependent
variance of the residuals. The spatial dependency is modelled by a
Gaussian random field. Empirical fitting to data collected in 16
measurement stations in Lithuania over more than 40 years shows that our
model captures the seasonality in the autocorrelation of the squared
residuals, a property of temperature data already observed by other
authors. We demonstrate through examples that our spatial-temporal model
is applicable for prediction and classification.
Journal: Journal of Applied Statistics
Pages: 823-841
Issue: 7
Volume: 34
Year: 2007
Keywords: Spatial-temporal random field, temperature, seasonally dependent variance,
X-DOI: 10.1080/02664760701511398
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701511398
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:823-841
Template-Type: ReDIF-Article 1.0
Author-Name: Zhang Wu
Author-X-Name-First: Zhang
Author-X-Name-Last: Wu
Author-Name: Qinan Wang
Author-X-Name-First: Qinan
Author-X-Name-Last: Wang
Title: An NP Control Chart Using Double Inspections
Abstract:
The np control chart is used widely in Statistical Process Control (SPC)
for attributes. It is difficult to design an np chart that simultaneously
satisfies a requirement on false alarm rate and has high detection
effectiveness. This is mainly because one is often unable to make the
in-control Average Run Length (ARL0) of an np chart close to a specified or
desired value. This article proposes a new np control chart which is able
to overcome the problems suffered by the conventional np chart. It is
called the Double Inspection (DI) np chart, because it uses a double
inspection scheme to decide the process status (in control or out of
control). The first inspection decides the process status according to the
number of non-conforming units found in a sample; and the second
inspection makes a decision based on the location of a particular
non-conforming unit in the sample. The double inspection scheme makes the
in-control ARL0 very close to a specified value and the out-of-control
Average Run Length (ARL1) quite small. As a result, the requirement on a
false alarm rate is satisfied and the detection effectiveness also
achieves a high level. Moreover, the DI np chart retains the operational
simplicity of the np chart to a large degree and achieves the performance
improvement without requiring extra inspection (testing whether a unit is
conforming or not).
Journal: Journal of Applied Statistics
Pages: 843-855
Issue: 7
Volume: 34
Year: 2007
Keywords: Quality control, statistical process control, control chart, double inspection, average run length,
X-DOI: 10.1080/02664760701523492
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701523492
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:843-855
Template-Type: ReDIF-Article 1.0
Author-Name: Tee Chin Chang
Author-X-Name-First: Tee Chin
Author-X-Name-Last: Chang
Author-Name: Fah Fatt Gan
Author-X-Name-First: Fah Fatt
Author-X-Name-Last: Gan
Title: Modified Shewhart Charts for High Yield Processes
Abstract:
The conventional Shewhart p or np chart is not effective for monitoring a
high yield process, a process in which the defect level is close to zero.
An improved Shewhart np chart for monitoring high yield processes is
proposed. A review of control charts for monitoring high yield processes
is first given. The run length performance of the proposed Shewhart chart
is then compared with other high yield control charts. A simple procedure
for designing the chart for processes subjected to sampling or 100%
continuous inspection is provided and this allows the chart to be
implemented easily on the factory floor. The practical aspects of
implementation of the Shewhart chart are discussed. An application of the
Shewhart chart based on a real data set is demonstrated.
Journal: Journal of Applied Statistics
Pages: 857-877
Issue: 7
Volume: 34
Year: 2007
Keywords: Average run length, binomial counts, parts-per-million non-conforming items, supplementary runs rules, statistical process control,
X-DOI: 10.1080/02664760701546279
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701546279
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:7:p:857-877
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Roldan Nofuentes
Author-X-Name-First: J. A. Roldan
Author-X-Name-Last: Nofuentes
Author-Name: J. D. Luna Del Castillo
Author-X-Name-First: J. D. Luna Del
Author-X-Name-Last: Castillo
Title: Risk of Error and the Kappa Coefficient of a Binary Diagnostic Test in the Presence of Partial Verification
Abstract:
The accuracy of a binary diagnostic test is usually measured in terms of
its sensitivity and its specificity, or through positive and negative
predictive values. Another way to describe the validity of a binary
diagnostic test is the risk of error and the kappa coefficient of the risk
of error. The risk of error is the average loss that is caused when
incorrectly classifying a non-diseased or a diseased patient, and the
kappa coefficient of the risk of error is a measure of the agreement
between the diagnostic test and the gold standard. In the presence of
partial verification of the disease, the disease status of some patients
is unknown, and therefore the evaluation of a diagnostic test cannot be
carried out through the traditional method. In this paper, we have deduced
the maximum likelihood estimators and variances of the risk of error and
of the kappa coefficient of the risk of error in the presence of partial
verification of the disease. Simulation experiments have been carried out
to study the effect of the verification probabilities on the coverage of
the confidence interval of the kappa coefficient.
Journal: Journal of Applied Statistics
Pages: 887-898
Issue: 8
Volume: 34
Year: 2007
Keywords: Covariates, Kappa, partial verification, risk, sensitivity, specificity, verification bias,
X-DOI: 10.1080/02664760701590681
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590681
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:887-898
Template-Type: ReDIF-Article 1.0
Author-Name: Neil Marks
Author-X-Name-First: Neil
Author-X-Name-Last: Marks
Title: Kolmogorov-Smirnov Test Statistic and Critical Values for the Erlang-3 and Erlang-4 Distributions
Abstract:
Following a procedure applied to the Erlang-2 distribution in a recent
paper, an adjusted Kolmogorov-Smirnov statistic and critical values are
developed for the Erlang-3 and -4 cases using data from Monte Carlo
simulations. The test statistic features compactness and ease of
implementation, and it is quite accurate for sample sizes as low as ten.
Journal: Journal of Applied Statistics
Pages: 899-906
Issue: 8
Volume: 34
Year: 2007
Keywords: Goodness-of-fit, Kolmogorov-Smirnov test, Erlang-k distribution,
X-DOI: 10.1080/02664760701590640
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590640
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:899-906
Template-Type: ReDIF-Article 1.0
Author-Name: Feng-Chang Xie
Author-X-Name-First: Feng-Chang
Author-X-Name-Last: Xie
Author-Name: Bo-Cheng Wei
Author-X-Name-First: Bo-Cheng
Author-X-Name-Last: Wei
Author-Name: Jin-Guan Lin
Author-X-Name-First: Jin-Guan
Author-X-Name-Last: Lin
Title: Case-deletion Influence Measures for the Data from Multivariate t Distributions
Abstract:
For data from multivariate t distributions, it is very hard to carry out
an influence analysis based on the probability density function, since its
expression is intractable. In this paper, we present a technique for
influence analysis based on the mixture distribution and EM algorithm. In
fact, the multivariate t distribution can be considered as a particular
Gaussian mixture by introducing the weights from the Gamma distribution.
We treat the weights as the missing data and develop the influence
analysis for the data from multivariate t distributions based on the
conditional expectation of the complete-data log-likelihood function in
the EM algorithm. Several case-deletion measures are proposed for
detecting influential observations from multivariate t distributions. Two
numerical examples are given to illustrate our methodology.
Journal: Journal of Applied Statistics
Pages: 907-921
Issue: 8
Volume: 34
Year: 2007
Keywords: Multivariate t distribution, influence analysis, EM algorithm, case-deletion, generalized Cook distance,
X-DOI: 10.1080/02664760701590574
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590574
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:907-921
Template-Type: ReDIF-Article 1.0
Author-Name: Harriet Namata
Author-X-Name-First: Harriet
Author-X-Name-Last: Namata
Author-Name: Ziv Shkedy
Author-X-Name-First: Ziv
Author-X-Name-Last: Shkedy
Author-Name: Christel Faes
Author-X-Name-First: Christel
Author-X-Name-Last: Faes
Author-Name: Marc Aerts
Author-X-Name-First: Marc
Author-X-Name-Last: Aerts
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Heide Theeten
Author-X-Name-First: Heide
Author-X-Name-Last: Theeten
Author-Name: Pierre Van Damme
Author-X-Name-First: Pierre
Author-X-Name-Last: Van Damme
Author-Name: Philippe Beutels
Author-X-Name-First: Philippe
Author-X-Name-Last: Beutels
Title: Estimation of the Force of Infection from Current Status Data Using Generalized Linear Mixed Models
Abstract:
Based on sero-prevalence data of rubella and mumps in the UK and varicella
in Belgium, we show how the force of infection, the age-specific rate at
which susceptible individuals contract infection, can be estimated using
generalized linear mixed models (McCulloch & Searle, 2001). Modelling the
dependency of the force of infection on age by penalized splines, which
involve fixed and random effects, allows us to use generalized linear
mixed models techniques to estimate both the cumulative probability of
being infected before a given age and the force of infection. Moreover,
these models permit an automatic selection of the smoothing parameter. The
smoothness of the estimated force of infection can be influenced by the
number of knots and the degree of the penalized spline used. To assess
this sensitivity, different numbers of knots and different degrees are used
and the results are compared. Simulations with different numbers of knots
and polynomial spline bases of different degrees
suggest - for estimating the force of infection from serological data -
the use of a quadratic penalized spline based on about 10 knots.
Journal: Journal of Applied Statistics
Pages: 923-939
Issue: 8
Volume: 34
Year: 2007
Keywords: Prevalence data, penalized splines, generalized linear mixed models, smoothing parameter, force of infection,
X-DOI: 10.1080/02664760701590525
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:923-939
Template-Type: ReDIF-Article 1.0
Author-Name: W. L. Pearn
Author-X-Name-First: W. L.
Author-X-Name-Last: Pearn
Author-Name: F. K. Wang
Author-X-Name-First: F. K.
Author-X-Name-Last: Wang
Author-Name: C. H. Yen
Author-X-Name-First: C. H.
Author-X-Name-Last: Yen
Title: Multivariate Capability Indices: Distributional and Inferential Properties
Abstract:
Process capability indices have been widely used in the manufacturing
industry for measuring process reproduction capability according to
manufacturing specifications. Properties of the univariate processes have
been investigated extensively, but are comparatively neglected for
multivariate processes where multiple dependent characteristics are
involved in quality measurement. In this paper, we consider two commonly
used multivariate capability indices MCp and MCpm, to evaluate
multivariate process capability. We investigate the statistical properties
of the estimated MCp and obtain the lower confidence bound for MCp. We
also consider testing MCp, and provide critical values for testing if a
multivariate process meets the preset capability requirement. In addition,
an approximate confidence interval for MCpm is derived. A simulation study
is conducted to ascertain the accuracy of the approximation. Three
examples are presented to illustrate the applicability of the obtained
results.
Journal: Journal of Applied Statistics
Pages: 941-962
Issue: 8
Volume: 34
Year: 2007
Keywords: Multivariate capability index, lower confidence bound, hypothesis testing, critical value,
X-DOI: 10.1080/02664760701590475
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590475
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:941-962
Template-Type: ReDIF-Article 1.0
Author-Name: Terence Mills
Author-X-Name-First: Terence
Author-X-Name-Last: Mills
Title: A Note on Trend Decomposition: The 'Classical' Approach Revisited with an Application to Surface Temperature Trends
Abstract:
This note reconsiders the 'classical' approach to trend estimation and
presents a modern treatment of this technique that enables trend filters
which incorporate end-effects to be constructed easily and efficiently.
The approach is illustrated by estimating recent Northern Hemispheric
temperature trends. In so doing, it shows how classical trend models may
be selected in empirical applications and indicates how this choice
determines the properties of the latest trend estimates.
Journal: Journal of Applied Statistics
Pages: 963-972
Issue: 8
Volume: 34
Year: 2007
Keywords: Trend estimation, local polynomial trend, temperature trends,
X-DOI: 10.1080/02664760701590418
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:963-972
Template-Type: ReDIF-Article 1.0
Author-Name: Edgardo Escalante-Vazquez
Author-X-Name-First: Edgardo
Author-X-Name-Last: Escalante-Vazquez
Title: SPC Study of a Brewing Process
Abstract:
The process of brewing is a complex one, in which several biological and
chemical reactions occur that involve many variables and their
interactions. This pilot study is an attempt to understand and to control
the chemical and biological nature of the process of 'beer cooking'.
Through data collection and analysis the measurement system was initially
evaluated and improved to allow the assessment of the stability of the
analysed response variable: wort's F (F is a fictitious name for this
variable due to confidentiality). Next, a deeper analysis was carried out
to characterize, improve and control the behaviour of this factor by means
of confidence intervals and several regression analyses. The way to
control F is by adding a certain amount of element X according to a
previously empirically developed table. After the analyses, this table was
questioned and a new one was developed. This study is the outcome of the
willingness of a group of people in this company to incorporate into its
traditional and, at some stages, artisan way of producing beer, the
utilization of statistical techniques for analysing and improving its
processes and products.
Journal: Journal of Applied Statistics
Pages: 973-984
Issue: 8
Volume: 34
Year: 2007
Keywords: SPC, brewing process, quality improvement,
X-DOI: 10.1080/02664760701590699
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590699
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:973-984
Template-Type: ReDIF-Article 1.0
Author-Name: Olena Babak
Author-X-Name-First: Olena
Author-X-Name-Last: Babak
Author-Name: Birgir Hrafnkelsson
Author-X-Name-First: Birgir
Author-X-Name-Last: Hrafnkelsson
Author-Name: Olafur Palsson
Author-X-Name-First: Olafur
Author-X-Name-Last: Palsson
Title: Estimation of the Length Distribution of Marine Populations in the Gaussian-multinomial Setting using the Method of Moments
Abstract:
In this paper, the problem of estimation of the length distribution of
marine populations in the Gaussian-multinomial model is considered. For
the purpose of mean and covariance parameter estimation, method of
moments estimators are developed. That is, the minimum variance linear
unbiased estimator for the mean frequency vector is derived and a
consistent estimator for the covariance matrix of the length observations
is presented. The usefulness of the proposed estimators is illustrated
with an analysis of real cod length measurement data.
Journal: Journal of Applied Statistics
Pages: 985-991
Issue: 8
Volume: 34
Year: 2007
Keywords: Gaussian-multinomial model, method of moments estimators, length distribution,
X-DOI: 10.1080/02664760701590376
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701590376
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:985-991
Template-Type: ReDIF-Article 1.0
Author-Name: Wai-Yin Poon
Author-X-Name-First: Wai-Yin
Author-X-Name-Last: Poon
Author-Name: Yat Sun Poon
Author-X-Name-First: Yat Sun
Author-X-Name-Last: Poon
Title: Local Conditional Influence
Abstract:
Through an investigation of normal curvature functions for influence
graphs of a family of perturbed models, we develop the concept of local
conditional influence. This concept can be used to study masking and
boosting effects in local influence. We identify the situation under which
the influence graph of the unperturbed model contains all the information
on these effects. The linear regression model is used for illustration and
it is shown that the concept developed is consistent with Lawrance's
(1995) approach of conditional influence in Cook's distance.
Journal: Journal of Applied Statistics
Pages: 997-1009
Issue: 8
Volume: 34
Year: 2007
Keywords: Normal curvature, curvature function, local conditional influence, masking, linear regression,
X-DOI: 10.1080/02664760600744371
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600744371
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:8:p:997-1009
Template-Type: ReDIF-Article 1.0
Author-Name: Dianxu Ren
Author-X-Name-First: Dianxu
Author-X-Name-Last: Ren
Author-Name: Roslyn Stone
Author-X-Name-First: Roslyn
Author-X-Name-Last: Stone
Title: A Bayesian Adjustment for Covariate Misclassification with Correlated Binary Outcome Data
Abstract:
Estimated associations between an outcome variable and misclassified
covariates tend to be biased when the methods of estimation that ignore
the classification error are applied. Available methods to account for
misclassification often require the use of a validation sample (i.e. a
gold standard). In practice, however, such a gold standard may be
unavailable or impractical. We propose a Bayesian approach to adjust for
misclassification in a binary covariate in the random effect logistic
model when a gold standard is not available. This Markov Chain Monte Carlo
(MCMC) approach uses two imperfect measures of a dichotomous exposure
under the assumptions of conditional independence and non-differential
misclassification. A simulated numerical example and a real clinical
example are given to illustrate the proposed approach. Our results suggest
that the estimated log odds of inpatient care and the corresponding
standard deviation are much larger in our proposed method compared with
the models ignoring misclassification. Ignoring misclassification produces
downwardly biased estimates and underestimates uncertainty.
Journal: Journal of Applied Statistics
Pages: 1019-1034
Issue: 9
Volume: 34
Year: 2007
Keywords: Bayesian approach, misclassification, logistic model, random effect logistic model, MCMC,
X-DOI: 10.1080/02664760701591895
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701591895
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1019-1034
Template-Type: ReDIF-Article 1.0
Author-Name: Pasquale Sarnacchiaro
Author-X-Name-First: Pasquale
Author-X-Name-Last: Sarnacchiaro
Author-Name: Antonello D'ambra
Author-X-Name-First: Antonello
Author-X-Name-Last: D'ambra
Title: Explorative Data Analysis and CATANOVA for Ordinal Variables: An Integrated Approach
Abstract:
Categorical analysis of variance (CATANOVA) is a statistical method
designed to analyse variability between treatments of interest to the
researcher. There are well-established links between CATANOVA and the
τ statistic of Goodman and Kruskal which, for the purpose of the
graphical identification of this variation, is partitioned using singular
value decomposition for Non-Symmetrical Correspondence Analysis (NSCA)
(D'Ambra & Lauro, 1989). The aim of this paper is to show a decomposition
of the Between Sum of Squares (BSS), measured both in CATANOVA framework
and in the statistic τ, into location, dispersion and higher order
components. This decomposition has been developed using Emerson's
orthogonal polynomials. Starting from this decomposition, a statistical
test and a confidence circle have been calculated for each component and
for each modality in which the BSS was decomposed, respectively. A
Customer Satisfaction study has been considered to explain the
methodology.
Journal: Journal of Applied Statistics
Pages: 1035-1050
Issue: 9
Volume: 34
Year: 2007
Keywords: Categorical analysis of variance, Goodman & Kruskal τ, Emerson Orthogonal polynomials, customer satisfaction, non-symmetrical correspondence analysis, confidence circle, statistical test, Andrews curve,
X-DOI: 10.1080/02664760701591937
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701591937
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1035-1050
Template-Type: ReDIF-Article 1.0
Author-Name: Jonggun Lim
Author-X-Name-First: Jonggun
Author-X-Name-Last: Lim
Author-Name: Sangun Park
Author-X-Name-First: Sangun
Author-X-Name-Last: Park
Title: Censored Kullback-Leibler Information and Goodness-of-Fit Test with Type II Censored Data
Abstract:
The Kullback-Leibler information has been considered for establishing
goodness-of-fit test statistics, which have been shown to perform very
well (Arizono & Ohta, 1989; Ebrahimi et al., 1992, etc.). In this paper, we
propose censored Kullback-Leibler information to generalize the discussion
of the Kullback-Leibler information to the censored case. Then we
establish a goodness-of-fit test statistic based on the censored
Kullback-Leibler information with Type II censored data, and compare
the test statistics with some existing test statistics for the exponential
and normal distributions.
Journal: Journal of Applied Statistics
Pages: 1051-1064
Issue: 9
Volume: 34
Year: 2007
Keywords: Entropy difference, maximum entropy distribution, minimum discrimination information loss estimation, order statistics, sample entropy,
X-DOI: 10.1080/02664760701592000
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592000
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1051-1064
Template-Type: ReDIF-Article 1.0
Author-Name: Seunggeun Hyun
Author-X-Name-First: Seunggeun
Author-X-Name-Last: Hyun
Author-Name: Yanqing Sun
Author-X-Name-First: Yanqing
Author-X-Name-Last: Sun
Title: Hypotheses Tests of Strain-specific Vaccine Efficacy Adjusted for Covariate Effects
Abstract:
In the evaluation of efficacy of a vaccine to protect against disease
caused by finitely many diverse infectious pathogens, it is often
important to assess if vaccine protection depends on variations of the
exposing pathogen. This problem can be formulated under a competing risks
model where the endpoint event is the infection and the cause of failure
is the infecting strain type determined after the infection is diagnosed.
The strain-specific vaccine efficacy is defined as one minus the
cause-specific hazard ratio (vaccine/placebo). This paper develops some
simple procedures for testing if the vaccine affords protection against
various strains and if and how the strain-specific vaccine efficacy
depends on the type of exposing strain, adjusting for covariate effects.
The Cox proportional hazards model is used to relate the cause-specific
outcomes to explanatory variables. The finite sample properties of
proposed tests are studied through simulations and are shown to have good
performances. The tests developed are applied to the data collected from
an oral cholera vaccine trial.
Journal: Journal of Applied Statistics
Pages: 1065-1073
Issue: 9
Volume: 34
Year: 2007
Keywords: Competing risks model, cause-specific hazard function, Cox proportional hazards model,
X-DOI: 10.1080/02664760701592083
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592083
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1065-1073
Template-Type: ReDIF-Article 1.0
Author-Name: J. D. Bermudez
Author-X-Name-First: J. D.
Author-X-Name-Last: Bermudez
Author-Name: J. V. Segura
Author-X-Name-First: J. V.
Author-X-Name-Last: Segura
Author-Name: E. Vercher
Author-X-Name-First: E.
Author-X-Name-Last: Vercher
Title: Holt-Winters Forecasting: An Alternative Formulation Applied to UK Air Passenger Data
Abstract:
This paper provides a formulation for the additive Holt-Winters
forecasting procedure that simplifies both obtaining maximum likelihood
estimates of all unknowns, smoothing parameters and initial conditions,
and the computation of point forecasts and reliable predictive intervals.
The stochastic component of the model is introduced by means of additive,
uncorrelated, homoscedastic and Normal errors, and then the joint
distribution of the data vector, a multivariate Normal distribution, is
obtained. In the case where a data transformation was used to improve the
fit of the model, cumulative forecasts are obtained here using a
Monte-Carlo approximation. This paper describes the method by applying it
to the series of monthly total UK air passengers collected by the Civil
Aviation Authority, a long time series from 1949 to the present day, and
compares the resulting forecasts with those obtained in previous studies.
Journal: Journal of Applied Statistics
Pages: 1075-1090
Issue: 9
Volume: 34
Year: 2007
Keywords: Exponential smoothing, time series forecasting, prediction intervals, linear model, additive error, Monte-Carlo methods,
X-DOI: 10.1080/02664760701592125
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592125
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1075-1090
Template-Type: ReDIF-Article 1.0
Author-Name: Edoardo Otranto
Author-X-Name-First: Edoardo
Author-X-Name-Last: Otranto
Author-Name: Umberto Triacca
Author-X-Name-First: Umberto
Author-X-Name-Last: Triacca
Title: Testing for Equal Predictability of Stationary ARMA Processes
Abstract:
In this work we use a measure of predictability of a time series
following a stationary ARMA process to develop a test of equal
predictability of two or more time series. The test is derived by a set of
propositions which links the structure of the AR and MA coefficients to
the predictability measure. A particular case of this general approach is
constituted by time series having a Wold decomposition with weights having
the same sign; in this framework the equal predictability is equivalent to
parallelism among ARMA models and the null hypothesis of equal
predictability is simply a set of linear restrictions. The ARMA
representation of the GARCH models presents non-negative weights, so that
this test can be extended to verify the equal predictability of squared
time series following GARCH structures.
Journal: Journal of Applied Statistics
Pages: 1091-1108
Issue: 9
Volume: 34
Year: 2007
Keywords: Forecasts, parallelism, Wold decomposition, GARCH models,
X-DOI: 10.1080/02664760701592158
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592158
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1091-1108
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Wei Wu
Author-X-Name-First: Chien-Wei
Author-X-Name-Last: Wu
Author-Name: M. H. Shu
Author-X-Name-First: M. H.
Author-X-Name-Last: Shu
Title: A Bayesian Procedure for Assessing Process Performance Based on Expected Relative Loss with Asymmetric Tolerances
Abstract:
Taguchi has introduced the loss function approach to quality improvement
by focusing on the reduction of variation around the target value. This
concept pays attention to the product designer's original intent; that is,
values of a critical characteristic at a target lead to maximum product
performance. To address this concept, Johnson (1992) proposed the concept
of expected relative squared error loss Le for symmetric cases, by
approaching capability in terms of loss functions. Unfortunately, the
index Le inconsistently measures process capability for processes with
asymmetric tolerances, and thus reflects process potential and performance
inaccurately. To remedy this, Pearn et al. (2006) proposed a modification
of expected loss index, which is referred to as [image omitted] , to
handle processes with both symmetric and asymmetric tolerances. Most
existing research on assessing process performance based on process loss
indices uses the traditional frequentist approach. However, the sampling
distribution of the estimated [image omitted] is intractable, which
makes establishing an exact confidence interval and testing process
performance difficult. In this paper, we consider an alternative
Bayesian approach to assess process
performance based on the loss index for processes with asymmetric
tolerances. Based on the derived posterior probability, a simple but
practical procedure is proposed for practitioners to assess process
performance on their shop floor, whether the manufacturing tolerance is
symmetric or asymmetric.
Journal: Journal of Applied Statistics
Pages: 1109-1123
Issue: 9
Volume: 34
Year: 2007
Keywords: Asymmetric tolerances, Bayesian approach, credible interval, expected relative loss,
X-DOI: 10.1080/02664760701592190
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592190
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1109-1123
Template-Type: ReDIF-Article 1.0
Author-Name: C. Xu
Author-X-Name-First: C.
Author-X-Name-Last: Xu
Author-Name: P. A. Dowd
Author-X-Name-First: P. A.
Author-X-Name-Last: Dowd
Author-Name: K. V. Mardia
Author-X-Name-First: K. V.
Author-X-Name-Last: Mardia
Author-Name: R. J. Fowell
Author-X-Name-First: R. J.
Author-X-Name-Last: Fowell
Author-Name: C. C. Taylor
Author-X-Name-First: C. C.
Author-X-Name-Last: Taylor
Title: Simulating Correlated Marked-point Processes
Abstract:
The area of marked-point processes is well developed but simulation is
still a challenging problem when mark correlations are to be included. In
this paper we propose the use of simulated annealing to incorporate the
spatial mark correlation into the simulations of correlated marked-point
processes. Such a simulation has wide applications in areas such as
inference and goodness-of-fit investigations of proposed models. The
technique is applied to a forest dataset for which the results are
extremely encouraging.
Journal: Journal of Applied Statistics
Pages: 1125-1134
Issue: 9
Volume: 34
Year: 2007
Keywords: Marked-point process, spatial mark correlation, point process simulation, simulated annealing,
X-DOI: 10.1080/02664760701597231
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701597231
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1125-1134
Template-Type: ReDIF-Article 1.0
Author-Name: Gopal Kanji
Author-X-Name-First: Gopal
Author-X-Name-Last: Kanji
Author-Name: Parvesh Chopra
Author-X-Name-First: Parvesh
Author-X-Name-Last: Chopra
Title: Poverty as a System: Human Contestability Approach to Poverty Measurement
Abstract:
Since Sen's (1976) paper on poverty measurement, a substantial
literature, both theoretical and empirical, has emerged. There have been
several recent efforts to derive poverty measures based on different
approaches and axioms. These poverty indices are based on the head-count
ratio, poverty gaps and the distribution of income; they are narrow in
approach and suffer from several drawbacks. The purpose of the
present paper is to introduce a new poverty measure based on a holistic
and system modelling approach. Based on Chopra's human contestability
(Chopra, 2003, 2007) approach to poverty, this new approach to measuring
poverty has been developed using a structural equation model based on
Kanji's business excellence model (Kanji, 2002) to create the proposed
poverty model. We construct a latent variable structural equation model to
measure the contestability excellence within certain boundaries of the
societal system. It will provide us with a measurement of poverty in a
society or community in terms of human contestability. A higher human
contestability index indicates lower poverty within the society.
Strengths and weaknesses of various components will also indicate which
characteristics of the individual require extra societal or government
support to remove poverty. However, there remains considerable
disagreement on the best way to achieve this.
Journal: Journal of Applied Statistics
Pages: 1135-1158
Issue: 9
Volume: 34
Year: 2007
Keywords: Human contestability, system approach, poverty model, poverty dimensions, Kanji-Chopra poverty model, poverty measurement,
X-DOI: 10.1080/02664760701619142
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701619142
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:9:p:1135-1158
Template-Type: ReDIF-Article 1.0
Author-Name: Ignacio Vidal
Author-X-Name-First: Ignacio
Author-X-Name-Last: Vidal
Author-Name: Pilar Iglesias
Author-X-Name-First: Pilar
Author-X-Name-Last: Iglesias
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Title: Influential Observations in the Functional Measurement Error Model
Abstract:
In this work we propose Bayesian measures to quantify the influence of
observations on the structural parameters of the simple measurement error
model (MEM). Different influence measures, like those based on
q-divergence between posterior distributions and Bayes risk, are studied
to evaluate the influence. A strategy based on the perturbation function
and MCMC samples is used to compute these measures. The samples from the
posterior distributions are obtained by using the Metropolis-Hastings
algorithm and assuming specific proper prior distributions. The results
are illustrated with an application to a real example modeled with MEM in
the literature.
Journal: Journal of Applied Statistics
Pages: 1165-1183
Issue: 10
Volume: 34
Year: 2007
Keywords: MEM, Influence measures, Bayes risk, q-divergence, Perturbation function, Metropolis-Hastings, Gibbs sampling,
X-DOI: 10.1080/02664760701592703
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592703
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1165-1183
Template-Type: ReDIF-Article 1.0
Author-Name: Pilar Olave
Author-X-Name-First: Pilar
Author-X-Name-Last: Olave
Author-Name: Manuel Salvador
Author-X-Name-First: Manuel
Author-X-Name-Last: Salvador
Title: Semi-parametric Bayesian Analysis of the Proportional Hazard Rate Model: An Application to the Effect of Training Programs on Graduate Unemployment
Abstract:
In this paper, we introduce a semi-parametric Bayesian methodology based
on the proportional hazard model that assumes that the baseline hazard
function is constant over segments but, in contrast to what is usually
assumed in the literature, does not specify in advance the periods at
which the function changes. The methodology is applied to explore the
impact of Vocational Training courses offered by the University of
Zaragoza (Spain) on the duration of the initial periods of unemployment
experienced by graduate leavers. The framework is very flexible and allows
us, in particular, to capture the presence of seasonality in the job
insertion of graduates.
Journal: Journal of Applied Statistics
Pages: 1185-1205
Issue: 10
Volume: 34
Year: 2007
Keywords: Bayesian survival analysis, semi-parametric models, proportional hazard, training programs, labor market,
X-DOI: 10.1080/02664760701592752
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592752
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1185-1205
Template-Type: ReDIF-Article 1.0
Author-Name: Alberto Luceno
Author-X-Name-First: Alberto
Author-X-Name-Last: Luceno
Title: A Universal QQ-Plot for Continuous Non-homogeneous Populations
Abstract:
This article presents a universal quantile-quantile (QQ) plot that may be
used to assess the fit of a family of absolutely continuous distribution
functions in a possibly non-homogeneous population. This plot is more
general than probability plotting papers because it may be used for
distributions having more than two parameters. It is also more general
than standard quantile-quantile plots because it may be used for families
of not-necessarily identical distributions. In particular, the universal
QQ plot may be used in the context of non-homogeneous Poisson processes,
generalized linear models, and other general models.
Journal: Journal of Applied Statistics
Pages: 1207-1223
Issue: 10
Volume: 34
Year: 2007
Keywords: Generalized linear model, goodness of fit, Kolmogorov-Smirnov, non-homogeneous Poisson process, plot points, probability plotting papers,
X-DOI: 10.1080/02664760701592786
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592786
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1207-1223
Template-Type: ReDIF-Article 1.0
Author-Name: David Clark
Author-X-Name-First: David
Author-X-Name-Last: Clark
Author-Name: Louise Ryan
Author-X-Name-First: Louise
Author-X-Name-Last: Ryan
Author-Name: F. L. Lucas
Author-X-Name-First: F. L.
Author-X-Name-Last: Lucas
Title: A Multi-state Piecewise Exponential Model of Hospital Outcomes after Injury
Abstract:
To allow more accurate prediction of hospital length of stay (LOS) after
serious injury or illness, a multi-state model is proposed, in which
transitions from the hospitalized state to three possible outcome states
(home, long-term care, or death) are assumed to follow constant rates for
each of a limited number of time periods. This results in a piecewise
exponential (PWE) model for each outcome. Transition rates may be affected
by time-varying covariates, which can be estimated from a reference
database using standard statistical software and Poisson regression. A PWE
model combining the three outcomes allows prediction of LOS. Records of
259,941 injured patients from the US Nationwide Inpatient Sample were used
to create such a multi-state PWE model with four time periods. Hospital
mortality and LOS for patient subgroups were calculated from this model,
and time-varying covariate effects were estimated. Early mortality was
increased by anatomic injury severity or penetrating mechanism, but these
effects diminished with time; age and male sex remained strong predictors
of mortality in all time periods. Rates of discharge home decreased
steadily with time, while rates of transfer to long-term care peaked at
five days. Predicted and observed LOS and mortality were similar for
multiple subgroups. Conceptual background and methods of calculation are
discussed and demonstrated. Multi-state PWE models may be useful to
describe hospital outcomes, especially when many patients are not
discharged home.
Journal: Journal of Applied Statistics
Pages: 1225-1239
Issue: 10
Volume: 34
Year: 2007
Keywords: LOS, injury, model, multi-state, piecewise exponential, competing risks,
X-DOI: 10.1080/02664760701592836
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592836
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1225-1239
Template-Type: ReDIF-Article 1.0
Author-Name: Chrys Caroni
Author-X-Name-First: Chrys
Author-X-Name-Last: Caroni
Author-Name: Nedret Billor
Author-X-Name-First: Nedret
Author-X-Name-Last: Billor
Title: Robust Detection of Multiple Outliers in Grouped Multivariate Data
Abstract:
Many methods have been developed for detecting multiple outliers in a
single multivariate sample, but very few for the case where there may be
groups in the data. We propose a method of simultaneously determining
groups (as in cluster analysis) and detecting outliers, which are points
that are distant from every group. Our method is an adaptation of the
BACON algorithm proposed by Billor, Hadi and Velleman for the robust
detection of multiple outliers in a single group of multivariate data.
There are two versions of our method, depending on whether or not the
groups can be assumed to have equal covariance matrices. The effectiveness
of the method is illustrated by its application to two real data sets and
further shown by a simulation study for different sample sizes and
dimensions for 2 and 3 groups, with and without planted outliers in the
data. When the number of groups is not known in advance, the algorithm
could be used as a robust method of cluster analysis, by running it for
various numbers of groups and choosing the best solution.
Journal: Journal of Applied Statistics
Pages: 1241-1250
Issue: 10
Volume: 34
Year: 2007
Keywords: Multivariate data, outliers, robust methods, BACON, cluster analysis,
X-DOI: 10.1080/02664760701592877
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592877
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1241-1250
Template-Type: ReDIF-Article 1.0
Author-Name: Paramjit Gill
Author-X-Name-First: Paramjit
Author-X-Name-Last: Gill
Author-Name: Tim Swartz
Author-X-Name-First: Tim
Author-X-Name-Last: Swartz
Author-Name: Michael Treschow
Author-X-Name-First: Michael
Author-X-Name-Last: Treschow
Title: A Stylometric Analysis of King Alfred's Literary Works
Abstract:
For centuries, Alfred the Great was judged to have translated several
Latin texts into Old English. Many scholars, however, have expressed doubt
whether Alfred could have done all of this work. With the availability of
the Old English Corpus in electronic form, it is feasible to subject the
texts to statistical stylometric analysis. We approach the problem from a
Bayesian perspective where key words are identified and frequencies of the
key words are tabulated for seven relevant texts. The question of
authorship falls into the general statistical problem of classification
where several simple innovations to classical agglomerative procedures are
introduced. Our results suggest that one translation that has been
traditionally attributed to Alfred (The First Fifty Prose Psalms) tends to
distinguish itself from texts that are known to be Alfredian.
Journal: Journal of Applied Statistics
Pages: 1251-1258
Issue: 10
Volume: 34
Year: 2007
Keywords: Agglomerative techniques, Bayesian methods, classification, Dirichlet distribution, disputed authorship, entropy, hierarchical clustering, multinomial distribution, Old English,
X-DOI: 10.1080/02664760701592992
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701592992
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1251-1258
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Yasumasa Takahashi
Author-X-Name-First: Daniel Yasumasa
Author-X-Name-Last: Takahashi
Author-Name: Luiz Antonio Baccalá
Author-X-Name-First: Luiz Antonio
Author-X-Name-Last: Baccalá
Author-Name: Koichi Sameshima
Author-X-Name-First: Koichi
Author-X-Name-Last: Sameshima
Title: Connectivity Inference between Neural Structures via Partial Directed Coherence
Abstract:
This paper describes the rigorous asymptotic distributions of the
recently introduced partial directed coherence (PDC) - a frequency domain
description of Granger causality between multivariate time series
represented by vector autoregressive models. We show that, when not zero,
PDC is asymptotically normally distributed and therefore provides a means of
comparing different strengths of connection between observed time series.
Zero PDC indicates an absence of a direct connection between time series,
and its otherwise asymptotically normal behavior degenerates into that of
a mixture of chi-square variables allowing the computation of
rigorous thresholds for connectivity tests using either numerical
integration or approximate numerical methods. A Monte Carlo study
illustrates the power of the test under PDC nullity. An analysis of
electroencephalographic data, before and during an epileptic seizure
episode, is used to portray the usefulness of the test in a real
application.
Journal: Journal of Applied Statistics
Pages: 1259-1273
Issue: 10
Volume: 34
Year: 2007
Keywords: Partial directed coherence, epilepsy, Granger causality, connectivity,
X-DOI: 10.1080/02664760701593065
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701593065
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1259-1273
Template-Type: ReDIF-Article 1.0
Author-Name: Øyvind Langsrud
Author-X-Name-First: Øyvind
Author-X-Name-Last: Langsrud
Author-Name: Kjetil Jørgensen
Author-X-Name-First: Kjetil
Author-X-Name-Last: Jørgensen
Author-Name: Ragni Ofstad
Author-X-Name-First: Ragni
Author-X-Name-Last: Ofstad
Author-Name: Tormod Næs
Author-X-Name-First: Tormod
Author-X-Name-Last: Næs
Title: Analyzing Designed Experiments with Multiple Responses
Abstract:
This paper is an overview of a unified framework for analyzing designed
experiments with univariate or multivariate responses. Both categorical
and continuous design variables are considered. To handle unbalanced data,
we introduce the so-called Type II* sums of squares. This means that the
results are independent of the scale chosen for continuous design
variables. Furthermore, it does not matter whether two-level variables are
coded as categorical or continuous. Overall testing of all responses is
done by 50-50 MANOVA, which handles several highly correlated responses.
Univariate p-values for each response are adjusted by using rotation
testing. To illustrate multivariate effects, mean values and mean
predictions are illustrated in a principal component score plot or
directly as curves. For the unbalanced cases, we introduce a new variant
of adjusted means, which are independent of the coding of two-level
variables. The methodology is exemplified by case studies from cheese and
fish pudding production.
Journal: Journal of Applied Statistics
Pages: 1275-1296
Issue: 10
Volume: 34
Year: 2007
Keywords: 50-50 MANOVA, general linear model, least-squares means, multiple testing, principal component, rotation test, unbalanced factorial design,
X-DOI: 10.1080/02664760701594246
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701594246
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:10:p:1275-1296
Template-Type: ReDIF-Article 1.0
Author-Name: Rand Wilcox
Author-X-Name-First: Rand
Author-X-Name-Last: Wilcox
Title: Post-hoc analyses in multiple regression based on prediction error
Abstract:
A well-known problem in multiple regression is that it is possible to
reject the hypothesis that all slope parameters are equal to zero, yet
when applying the usual Student's T-test to the individual parameters, no
significant differences are found. An alternative strategy is to estimate
prediction error via the 0.632 bootstrap method for all models of interest
and declare the parameters associated with the model that yields the
smallest prediction error to differ from zero. The main results in this
paper are that this latter strategy can have practical value versus
Student's T; replacing squared error with absolute error can be beneficial
in some situations and replacing least squares with an extension of the
Theil-Sen estimator can substantially increase the probability of
identifying the correct model under circumstances that are described.
Journal: Journal of Applied Statistics
Pages: 9-17
Issue: 1
Volume: 35
Year: 2008
Keywords: multiple comparisons, prediction error, bootstrap methods, robust regression,
X-DOI: 10.1080/02664760701683288
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683288
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:9-17
Template-Type: ReDIF-Article 1.0
Author-Name: Erik Mønness
Author-X-Name-First: Erik
Author-X-Name-Last: Mønness
Author-Name: Kim Pearce
Author-X-Name-First: Kim
Author-X-Name-Last: Pearce
Author-Name: Shirley Coleman
Author-X-Name-First: Shirley
Author-X-Name-Last: Coleman
Title: Comparing a survey and a conjoint study: the future vision of water intermediaries
Abstract:
This paper compares and contrasts two methods of obtaining opinions using
questionnaires. As the name suggests, a conjoint study makes it possible
to consider several attributes jointly. Conjoint analysis is a statistical
method to analyse preferences. However, conjoint analysis requires a
certain amount of effort by the respondent. The alternative is ordinary
survey questions, answered one at a time. Survey questions are easier to
grasp mentally, but they do not challenge the respondent to prioritize.
This investigation has utilized both methods, survey and conjoint, making
it possible to compare them on real data. Attribute importance, attribute
correlations, case clustering and attribute grouping are evaluated by both
methods. Correspondence between how the two methods measure the attribute
in question is also given. Overall, both methods yield the same picture
concerning the relative importance of the attributes. Taken one attribute
at a time, the correspondence between the methods varies from good to no
correspondence. Considering all attributes together by cluster analysis of
the cases, the conjoint and survey data yield different cluster
structures. The attributes are grouped by factor analysis, and there is
reasonable correspondence. The data originate from the EU project 'New
Intermediary services and the transformation of urban water supply and
wastewater disposal systems in Europe'.
Journal: Journal of Applied Statistics
Pages: 19-30
Issue: 1
Volume: 35
Year: 2008
Keywords: questionnaire, conjoint analysis, survey methods,
X-DOI: 10.1080/02664760701683379
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683379
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:19-30
Template-Type: ReDIF-Article 1.0
Author-Name: Hongying Dai
Author-X-Name-First: Hongying
Author-X-Name-Last: Dai
Author-Name: Richard Charnigo
Author-X-Name-First: Richard
Author-X-Name-Last: Charnigo
Title: Omnibus testing and gene filtration in microarray data analysis
Abstract:
When thousands of tests are performed simultaneously to detect
differentially expressed genes in microarray analysis, the number of Type
I errors can be immense if a multiplicity adjustment is not made. However,
due to the large scale, traditional adjustment methods require very
stringent significance levels for individual tests, which yield low power
for detecting alterations. In this work, we describe how two omnibus tests
can be used in conjunction with a gene filtration process to circumvent
difficulties due to the large scale of testing. These two omnibus tests,
the D-test and the modified likelihood ratio test (MLRT), can be used to
investigate whether a collection of P-values has arisen from the
Uniform(0,1) distribution or whether the Uniform(0,1) distribution
contaminated by another Beta distribution is more appropriate. In the
former case, attention can be directed to a smaller part of the genome; in
the latter event, parameter estimates for the contamination model provide
a frame of reference for multiple comparisons. Unlike the likelihood ratio
test (LRT), both the D-test and MLRT enjoy simple limiting distributions
under the null hypothesis of no contamination, so critical values can be
obtained from standard tables. Simulation studies demonstrate that the
D-test and MLRT are superior to the AIC, BIC, and Kolmogorov-Smirnov test.
A case study illustrates omnibus testing and filtration.
Journal: Journal of Applied Statistics
Pages: 31-47
Issue: 1
Volume: 35
Year: 2008
Keywords: multiple comparisons, P-values, Beta contamination model, MMLEs, D-test, MLRT,
X-DOI: 10.1080/02664760701683528
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:31-47
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaomo Jiang
Author-X-Name-First: Xiaomo
Author-X-Name-Last: Jiang
Author-Name: Sankaran Mahadevan
Author-X-Name-First: Sankaran
Author-X-Name-Last: Mahadevan
Title: Bayesian validation assessment of multivariate computational models
Abstract:
Multivariate model validation is a complex decision-making problem
involving comparison of multiple correlated quantities, based upon the
available information and prior knowledge. This paper presents a Bayesian
risk-based decision method for validation assessment of multivariate
predictive models under uncertainty. A generalized likelihood ratio is
derived as a quantitative validation metric based on Bayes' theorem and
Gaussian distribution assumption of errors between validation data and
model prediction. The multivariate model is then assessed based on the
comparison of the likelihood ratio with a Bayesian decision threshold, a
function of the decision costs and prior of each hypothesis. The
probability density function of the likelihood ratio is constructed using
the statistics of multiple response quantities and Monte Carlo simulation.
The proposed methodology is implemented in the validation of a transient
heat conduction model, using a multivariate data set from experiments. The
Bayesian methodology provides a quantitative approach to facilitate
rational decisions in multivariate model assessment under uncertainty.
Journal: Journal of Applied Statistics
Pages: 49-65
Issue: 1
Volume: 35
Year: 2008
Keywords: Bayesian statistics, decision making, risk, reliability, model validation, multivariate statistics,
X-DOI: 10.1080/02664760701683577
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683577
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:49-65
Template-Type: ReDIF-Article 1.0
Author-Name: Gulser Koksal
Author-X-Name-First: Gulser
Author-X-Name-Last: Koksal
Author-Name: Burcu Kantar
Author-X-Name-First: Burcu
Author-X-Name-Last: Kantar
Author-Name: Taylan Ali Ula
Author-X-Name-First: Taylan Ali
Author-X-Name-Last: Ula
Author-Name: Murat Caner Testik
Author-X-Name-First: Murat Caner
Author-X-Name-Last: Testik
Title: The effect of Phase I sample size on the run length performance of control charts for autocorrelated data
Abstract:
Traditional control charts assume independence of observations obtained
from the monitored process. However, if the observations are
autocorrelated, these charts often do not perform as intended by the
design requirements. Recently, several control charts have been proposed
to deal with autocorrelated observations. The residual chart, modified
Shewhart chart, EWMAST chart, and ARMA chart are such charts widely used
for monitoring the occurrence of assignable causes in a process when the
process exhibits inherent autocorrelation. Besides autocorrelation, one
other issue is the unknown values of true process parameters to be used in
the control chart design, which are often estimated from a reference
sample of in-control observations. Performances of the above-mentioned
control charts for autocorrelated processes are significantly affected by
the sample size used in a Phase I study to estimate the control chart
parameters. In this study, we investigate the effect of Phase I sample
size on the run length performance of these four charts for monitoring the
changes in the mean of an autocorrelated process, namely an AR(1) process.
A discussion of the practical implications of the results and suggestions
on the sample size requirements for effective process monitoring are
provided.
Journal: Journal of Applied Statistics
Pages: 67-87
Issue: 1
Volume: 35
Year: 2008
Keywords: autocorrelated data, sample size, residual chart, EWMAST chart, modified Shewhart chart, ARMA chart, run length,
X-DOI: 10.1080/02664760701683619
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:67-87
Template-Type: ReDIF-Article 1.0
Author-Name: S. Chakraborti
Author-X-Name-First: S.
Author-X-Name-Last: Chakraborti
Author-Name: S. W. Human
Author-X-Name-First: S. W.
Author-X-Name-Last: Human
Title: Properties and performance of the c-chart for attributes data
Abstract:
The effects of parameter estimation are examined for the well-known
c-chart for attributes data. The exact run length distribution is obtained
for Phase II applications, when the true average number of
non-conformities, c, is unknown, by conditioning on the observed number of
non-conformities in a set of reference data (from Phase I). Expressions
for various chart performance characteristics, such as the average run
length (ARL), the standard deviation of the run length (SDRL) and the
median run length (MDRL) are also obtained. Examples show that the actual
performance of the chart, both in terms of the false alarm rate (FAR) and
the in-control ARL, can be substantially different from what might be
expected when c is known, in that an exceedingly large number of false
alarms are observed, unless the number of inspection units (the size of
the reference dataset) used to estimate c is very large, much larger than
is commonly used or recommended in practice. In addition, the actual FAR
and the in-control ARL values can be very different from the nominally
expected values such as 0.0027 (or ARL0=370), particularly when c is
small, even with large amounts of reference data. A summary and
conclusions are offered.
Journal: Journal of Applied Statistics
Pages: 89-100
Issue: 1
Volume: 35
Year: 2008
Keywords: non-conformities, defects, Shewhart, statistical process control, Phase I, Phase II, parameter estimation, Poisson distribution, run length, average run length, percentiles, in-control, out-of-control,
X-DOI: 10.1080/02664760701683643
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683643
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:89-100
Template-Type: ReDIF-Article 1.0
Author-Name: Mahdi Alkhamisi
Author-X-Name-First: Mahdi
Author-X-Name-Last: Alkhamisi
Author-Name: Ghadban Khalaf
Author-X-Name-First: Ghadban
Author-X-Name-Last: Khalaf
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: The effect of fat-tailed error terms on the properties of systemwise RESET test
Abstract:
The small sample properties of the systemwise RESET (Regression
Specification Error Test) test for functional misspecification are
investigated using normal and non-normal error terms. When using normally
distributed or less heavy-tailed error terms, we find Rao's
multivariate F-test to perform best among the alternative test methods
(i.e. Wald, Lagrange Multiplier and Likelihood Ratio). When bootstrap
critical values are used, however, all test methods perform satisfactorily
in almost all situations. All of the methods (even the Rao test) perform
extremely badly when the error terms are very heavy tailed.
Journal: Journal of Applied Statistics
Pages: 101-113
Issue: 1
Volume: 35
Year: 2008
Keywords: systemwise test of functional misspecification, non-normal error terms, small sample properties,
X-DOI: 10.1080/02664760701683676
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701683676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:101-113
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Wang
Author-X-Name-First: Daniel
Author-X-Name-Last: Wang
Author-Name: Michael Conerly
Author-X-Name-First: Michael
Author-X-Name-Last: Conerly
Title: Evaluating the power of Minitab's data subsetting lack of fit test in multiple linear regression
Abstract:
Minitab's data subsetting lack of fit test (denoted XLOF) is a
combination of Burn and Ryan's test and Utts' test for testing lack of fit
in linear regression models. As an alternative to the classical or pure
error lack of fit test, it does not require replicates of predictor
variables. However, due to the uncertainty about its performance, XLOF
still remains unfamiliar to regression users while the well-known
classical lack of fit test is not applicable to regression data without
replicates. So far this procedure has not been mentioned in any textbooks
and has not been included in any other software packages. This study
assesses the performance of XLOF in detecting lack of fit in linear
regressions without replicates by comparing the power with the classic
test. The power of XLOF is simulated using Minitab macros for variables
with several forms of curvature. These comparisons lead to pragmatic
suggestions on the use of XLOF. The results show that XLOF is superior to
the classical test. It should be noted that the replication required by
the classical test makes it unavailable for most regression data, whereas
XLOF can be as powerful as the classical test even without replicates.
Journal: Journal of Applied Statistics
Pages: 115-124
Issue: 1
Volume: 35
Year: 2008
Keywords: Minitab XLOF, lack of fit test, linear regression, diagnosis, power, simulation,
X-DOI: 10.1080/02664760701775381
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775381
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:1:p:115-124
Template-Type: ReDIF-Article 1.0
Author-Name: R. L. J. Coetzer
Author-X-Name-First: R. L. J.
Author-X-Name-Last: Coetzer
Author-Name: D. H. Morgan
Author-X-Name-First: D. H.
Author-X-Name-Last: Morgan
Author-Name: H. Maumela
Author-X-Name-First: H.
Author-X-Name-Last: Maumela
Title: Optimization of a catalyst system through the sequential application of experimental design techniques
Abstract:
The selective oligomerisation of ethylene to higher alpha olefins is an
area of much recent interest. In this regard, Sasol Technology R{&}D has
developed a homogeneous catalyst system based on bis-sulfanylamine (SNS)
complexes of chromium for the selective trimerisation of ethylene to
1-hexene. It is activated by methylaluminoxane (MAO), which is an
extremely expensive activator. This paper discusses how, through the
sequential application of experimental design and response surface
techniques, the activator requirements of the catalyst system were reduced
12-fold, whilst the catalyst activity was improved ca. three-fold on a
g/g Cr/h basis and ca. nine-fold on a g/g MAO basis. This
reduction in the amount of MAO required led to economically attractive
catalyst activities for the production of 1-hexene, and would not have
been possible without the use of experimental design techniques. This
paper will demonstrate the process of investigation through the use of
sequential experimental design in practice.
Journal: Journal of Applied Statistics
Pages: 131-147
Issue: 2
Volume: 35
Year: 2008
Keywords: experimental design, methylaluminoxane, oligomerisation, response surface modelling,
X-DOI: 10.1080/02664760701775613
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:131-147
Template-Type: ReDIF-Article 1.0
Author-Name: R. Vijayaraghavan
Author-X-Name-First: R.
Author-X-Name-Last: Vijayaraghavan
Author-Name: K. Rajagopal
Author-X-Name-First: K.
Author-X-Name-Last: Rajagopal
Author-Name: A. Loganathan
Author-X-Name-First: A.
Author-X-Name-Last: Loganathan
Title: A procedure for selection of a gamma-Poisson single sampling plan by attributes
Abstract:
Design and evaluation of sampling plans by attributes and by variables
are important aspects in the area of acceptance sampling research. Various
procedures for the selection of conventional single sampling by attributes
have been developed and are available in the literature. This paper
presents a design methodology and tables for the selection of parameters
of single sampling plans for specified requirements (strengths) under the
conditions of a gamma prior and Poisson sampling distribution. The
relative efficiency of gamma-Poisson single sampling plans over
conventional plans is discussed through empirical illustrations.
Journal: Journal of Applied Statistics
Pages: 149-160
Issue: 2
Volume: 35
Year: 2008
Keywords: sampling inspection by attributes, Bayesian acceptance sampling plan, consumer's risk, gamma-Poisson single sampling plan, gamma prior, operating characteristic curve, Poisson distribution, producer's risk,
X-DOI: 10.1080/02664760701775654
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775654
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:149-160
Template-Type: ReDIF-Article 1.0
Author-Name: Kim Pearce
Author-X-Name-First: Kim
Author-X-Name-Last: Pearce
Author-Name: Shirley Coleman
Author-X-Name-First: Shirley
Author-X-Name-Last: Coleman
Title: Modern-day perception of historic footwear and its links to preference
Abstract:
The importance of emotion in consumer preference is explored in the
subject of Kansei Engineering. The Kansei methodology has been
successfully adopted by many large companies in recent years. Currently, a
European Union Fifth framework project called 'Kensys' (Kansei Engineering
System) is being implemented to look at the application of Kansei
engineering in the field of footwear. The Kensys project is being
conducted in collaboration with several SMEs and this paper reports a
study that has been carried out with one of the SMEs who designs and makes
reproduction historic and specialist footwear. In addition, respondent
views on 'real' products from history and reproduction footwear are
compared. We report on the views of respondents in general and look at
gender differences, the comparison of non-experts' views versus experts'
views and we also look at differences due to age. The study was carried
out in the UK and in Spain. The views in both countries are compared.
Journal: Journal of Applied Statistics
Pages: 161-178
Issue: 2
Volume: 35
Year: 2008
Keywords: Kansei Engineering, emotional response, design,
X-DOI: 10.1080/02664760701775498
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775498
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:161-178
Template-Type: ReDIF-Article 1.0
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Author-Name: Jose Diaz-Garcia
Author-X-Name-First: Jose
Author-X-Name-Last: Diaz-Garcia
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Title: Influence diagnostics in the capital asset pricing model under elliptical distributions
Abstract:
In this paper we consider the Capital Asset Pricing Model under
Elliptical (symmetric) Distributions. This class of distributions, which
contains the normal distribution, t, contaminated normal and power
exponential, among others, offers a more flexible framework for modelling
asset prices or returns. In order to analyze the sensitivity of the
maximum likelihood estimators to possible outliers and/or atypical
returns, the local influence method was implemented. The results are
illustrated using a set of shares from companies that trade on the
Chilean Stock Market. Our main conclusion is that symmetric distributions having heavier
tails than those of the normal distribution, especially the t distribution
with small degrees of freedom, show a better fit and allow the reduction
of the influence of atypical returns in the maximum likelihood estimators.
Journal: Journal of Applied Statistics
Pages: 179-192
Issue: 2
Volume: 35
Year: 2008
Keywords: robust estimation, diagnostics, local influence, elliptical distributions,
X-DOI: 10.1080/02664760701775712
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775712
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:179-192
Template-Type: ReDIF-Article 1.0
Author-Name: Emil Berendt
Author-X-Name-First: Emil
Author-X-Name-Last: Berendt
Title: Contracts, livestock, and the Bernoulli process: an application of statistics to B. Traven's 'Cattle Drive'
Abstract:
One of the pivotal devices B. Traven employs in his short story 'The
Cattle Drive' is a contract between the cattle owner and the trail boss
who brings the livestock to market. By specifying a per-diem rate, the
contract appears to encourage a wage-maximizing trail boss to delay the
delivery of the cattle. However, a statistical model of the contract
demonstrates that a rational trail boss has an incentive to maintain a
rapid rate of travel. The article concludes that statistics can be applied
in non-traditional ways such as to the analysis of the plot of a fictional
story. The statistical model suggests plausible alternative endings to the
story based on various parameter assumptions. Finally, it demonstrates
that a well-crafted story can provide an excellent case study of how
contracts create incentives and influence decision-making.
Journal: Journal of Applied Statistics
Pages: 193-202
Issue: 2
Volume: 35
Year: 2008
Keywords: B. Traven, contract, principal-agent problem, binomial, Cattle Drive, wage, literature,
X-DOI: 10.1080/02664760701775571
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:193-202
Template-Type: ReDIF-Article 1.0
Author-Name: Pankaj Sinha
Author-X-Name-First: Pankaj
Author-X-Name-Last: Sinha
Author-Name: Ashok Bansal
Author-X-Name-First: Ashok
Author-X-Name-Last: Bansal
Title: Bayesian optimization analysis with ML-II ε-contaminated prior
Abstract:
In this paper we derive the predictive density function of a future
observation when prior distribution for unknown mean of a normal
population is a Type-II maximum likelihood ε-contaminated prior. The
derived predictive distribution is applied to the problem of optimization
of a regression nature in the decisive prediction framework.
Journal: Journal of Applied Statistics
Pages: 203-211
Issue: 2
Volume: 35
Year: 2008
Keywords: ε-contaminated prior, type II maximum likelihood technique, optimization analysis, decisive prediction,
X-DOI: 10.1080/02664760701775415
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775415
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:203-211
Template-Type: ReDIF-Article 1.0
Author-Name: David Bock
Author-X-Name-First: David
Author-X-Name-Last: Bock
Title: Aspects on the control of false alarms in statistical surveillance and the impact on the return of financial decision systems
Abstract:
In systems for online detection of regime shifts, a process is
continually observed. Based on the data available an alarm is given when
there is enough evidence of a change. There is a risk of a false alarm and
here two different ways of controlling the false alarms are compared: a
fixed average run length until the first false alarm and a fixed
probability of any false alarm (fixed size). The two approaches are
evaluated in terms of the timeliness of alarms. A system with a fixed size
is found to have a drawback: the ability to detect a change deteriorates
with the time of the change. Consequently, the probability of successful
detection will tend to zero and the expected delay of a motivated alarm
tends to infinity. This drawback is present even when the size is set to
be very large (close to one). Utility measures expressing the costs for a
false or a too late alarm are used in the comparison. How the choice of
the best approach can be guided by the parameters of the process and the
different costs of alarms is demonstrated. The technique is illustrated by
financial transactions of the Hang Seng Index.
Journal: Journal of Applied Statistics
Pages: 213-227
Issue: 2
Volume: 35
Year: 2008
Keywords: monitoring, surveillance, repeated decisions, moving average, Shewhart method,
X-DOI: 10.1080/02664760701775431
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701775431
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:2:p:213-227
Template-Type: ReDIF-Article 1.0
Author-Name: Rafael Pino-Mejias
Author-X-Name-First: Rafael
Author-X-Name-Last: Pino-Mejias
Author-Name: Mercedes Carrasco-Mairena
Author-X-Name-First: Mercedes
Author-X-Name-Last: Carrasco-Mairena
Author-Name: Antonio Pascual-Acosta
Author-X-Name-First: Antonio
Author-X-Name-Last: Pascual-Acosta
Author-Name: Maria-Dolores Cubiles-De-La-Vega
Author-X-Name-First: Maria-Dolores
Author-X-Name-Last: Cubiles-De-La-Vega
Author-Name: Joaquin Munoz-Garcia
Author-X-Name-First: Joaquin
Author-X-Name-Last: Munoz-Garcia
Title: A comparison of classification models to identify the Fragile X Syndrome
Abstract:
The main models of machine learning are briefly reviewed and considered
for building a classifier to identify the Fragile X Syndrome (FXS). We
have analyzed 172 patients potentially affected by FXS in Andalusia
(Spain) and, by means of a DNA test, each member of the data set is known
to belong to one of two classes: affected or not affected. The whole
predictor set, consisting of 40 variables, and a reduced set with only nine
predictors significantly associated with the response are considered. Four
alternative base classification models have been investigated: logistic
regression, classification trees, multilayer perceptron and support vector
machines. For both predictor sets, the best accuracy, considering both the
mean and the standard deviation of the test error rate, is achieved by the
support vector machines, confirming the increasing importance of this
learning algorithm. Three ensemble methods - bagging, random forests and
boosting - were also considered, amongst which the bagged versions of
support vector machines stand out, especially when they are constructed
with the reduced set of predictor variables. The analysis of the
sensitivity, the specificity and the area under the ROC curve agrees with
the main conclusions drawn from the accuracy results. All of these
models can be fitted using free R software.
Journal: Journal of Applied Statistics
Pages: 233-244
Issue: 3
Volume: 35
Year: 2008
Keywords: fragile X syndrome, support vector machines, multilayer perceptron, classification trees, logistic regression, ensemble methods, R system,
X-DOI: 10.1080/02664760701832976
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701832976
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:233-244
Template-Type: ReDIF-Article 1.0
Author-Name: Raghu Nandan Sengupta
Author-X-Name-First: Raghu Nandan
Author-X-Name-Last: Sengupta
Title: Use of asymmetric loss functions in sequential estimation problems for multiple linear regression
Abstract:
When estimating in a practical situation, asymmetric loss functions are
often preferred over squared error loss, as the former are more
appropriate in many estimation problems. We consider here
the problem of fixed precision point estimation of a linear parametric
function in beta for the multiple linear regression model using asymmetric
loss functions. Due to the presence of nuisance parameters, the sample
size for the estimation problem is not known beforehand, and hence we take
recourse to adaptive multistage sampling methodologies. We discuss
some multistage sampling techniques and compare the performance of
these methodologies using simulation runs. The codes
for our proposed models are implemented in MATLAB 7.0.1 run
on a Pentium IV machine. Finally, we highlight the significance of such
asymmetric loss functions with a few practical examples.
Journal: Journal of Applied Statistics
Pages: 245-261
Issue: 3
Volume: 35
Year: 2008
Keywords: loss function, risk, bounded risk, asymmetric loss function, LINEX loss function, relative LINEX loss function, stopping rule, multistage sampling procedure, purely sequential sampling procedure, batch sequential sampling procedure,
X-DOI: 10.1080/02664760701833388
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833388
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:245-261
Template-Type: ReDIF-Article 1.0
Author-Name: Francesco Pauli
Author-X-Name-First: Francesco
Author-X-Name-Last: Pauli
Author-Name: Laura Rizzi
Author-X-Name-First: Laura
Author-X-Name-Last: Rizzi
Title: Summer temperature effects on deaths and hospital admissions among the elderly population in two Italian cities
Abstract:
In developed countries the effects of climate on health status are mainly
due to temperature. Our analysis aims to examine statistically the
relationship between summer climate conditions and the daily frequency of
health episodes (deaths or hospital admissions). We expect to find a
U-shaped relationship between temperature and the frequency of events
occurring in summer among the elderly population resident in Milano
and Brescia. As covariates, we use hourly temperature records from
observation sites located in Milano and Brescia. The analysis is
performed using generalized additive models (GAMs), where the response
variable is the daily number of events, which varies as a possibly
non-linear function of meteorological variables measured on the same or
the previous day. We consider separate models for Milano and Brescia and
then compare temperature effects between the two towns and among different
age classes. Moreover, we consider separate models for all diagnosed
events, for those due to respiratory disease, and for those due to
circulatory pathologies. Model selection is a central problem; the basic
methods used are the UBRE and GCV criteria but, instead of conditioning
all final conclusions on the best model according to the chosen criterion,
we investigate the effect of model selection by implementing a bootstrap
procedure.
Journal: Journal of Applied Statistics
Pages: 263-276
Issue: 3
Volume: 35
Year: 2008
Keywords: temperature, deaths, hospital admissions, generalized additive models, model selection criteria, bootstrap,
X-DOI: 10.1080/02664760701833354
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833354
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:263-276
Template-Type: ReDIF-Article 1.0
Author-Name: P. Angelopoulos
Author-X-Name-First: P.
Author-X-Name-Last: Angelopoulos
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Title: Detecting active effects in unreplicated designs
Abstract:
Unreplicated factorial designs pose a difficult problem in analysis
because there are no degrees of freedom left to estimate the error. Daniel
[Technometrics 1 (1959), pp. 311-341] proposed an ingenious graphical
method that does not require σ to be estimated. Here we try to put
Daniel's method into a formal framework and remove the subjectivity that
it carries. A simulation study shows that the
proposed method behaves better than Lenth's [Technometrics 31 (1989), pp.
469-473] popular method.
Journal: Journal of Applied Statistics
Pages: 277-281
Issue: 3
Volume: 35
Year: 2008
Keywords: unreplicated design, factorial, effect, outliers,
X-DOI: 10.1080/02664760701833008
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833008
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:277-281
Template-Type: ReDIF-Article 1.0
Author-Name: Sung-Soo Kim
Author-X-Name-First: Sung-Soo
Author-X-Name-Last: Kim
Author-Name: Sung Park
Author-X-Name-First: Sung
Author-X-Name-Last: Park
Author-Name: W. J. Krzanowski
Author-X-Name-First: W. J.
Author-X-Name-Last: Krzanowski
Title: Simultaneous variable selection and outlier identification in linear regression using the mean-shift outlier model
Abstract:
We provide a method for simultaneous variable selection and outlier
identification using the mean-shift outlier model. The procedure consists
of two steps: the first step is to identify potential outliers, and the
second step is to perform all possible subset regressions for the
mean-shift outlier model containing the potential outliers identified in
step 1. This procedure is helpful for model selection while simultaneously
considering outlier identification, and can be used to identify multiple
outliers. In addition, we can evaluate the impact on the regression model
of simultaneous omission of variables and interesting observations. In an
example, we provide detailed output from the R system, and compare the
results with those using posterior model probabilities as proposed by
Hoeting et al. [Comput. Stat. Data Anal. 22 (1996), pp. 252-270] for
simultaneous variable selection and outlier identification.
Journal: Journal of Applied Statistics
Pages: 283-291
Issue: 3
Volume: 35
Year: 2008
Keywords: multiple outliers, variable selection, mean-shift outlier model, all-subset regressions,
X-DOI: 10.1080/02664760701833040
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833040
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:283-291
Template-Type: ReDIF-Article 1.0
Author-Name: Tonglin Zhang
Author-X-Name-First: Tonglin
Author-X-Name-Last: Zhang
Author-Name: Ge Lin
Author-X-Name-First: Ge
Author-X-Name-Last: Lin
Title: Identification of local clusters for count data: a model-based Moran's I test
Abstract:
We set out IDR as a loglinear-model-based Moran's I test for Poisson
count data that resembles the Moran's I residual test for Gaussian data.
We evaluate its type I and type II error probabilities via simulations,
and demonstrate its utility via a case study. When population sizes are
heterogeneous, IDR is effective in detecting local clusters by local
association terms with an acceptable type I error probability. When used
in conjunction with local spatial association terms in loglinear models,
IDR can also indicate the existence of a first-order global cluster that can
hardly be removed by local spatial association terms. In this situation,
IDR should not be directly applied for local cluster detection. In the
case study of St. Louis homicides, we bridge loglinear model methods for
parameter estimation to exploratory data analysis, so that a uniform
association term can be defined with spatially varied contributions among
spatial neighbors. The method makes use of exploratory tools such as
Moran's I scatter plots and residual plots to evaluate the magnitude of
deviance residuals, and it is effective in modeling the shape, the
elevation and the magnitude of a local cluster in the model-based test.
Journal: Journal of Applied Statistics
Pages: 293-306
Issue: 3
Volume: 35
Year: 2008
Keywords: cluster and clustering, deviance residual, Moran's I, permutation test, spatial autocorrelation, type I error probability,
X-DOI: 10.1080/02664760701833248
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833248
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:293-306
Template-Type: ReDIF-Article 1.0
Author-Name: Anuradha Roy
Author-X-Name-First: Anuradha
Author-X-Name-Last: Roy
Title: Computation aspects of the parameter estimates of linear mixed effects model in multivariate repeated measures set-up
Abstract:
The number of parameters mushrooms in a linear mixed effects (LME) model
in the case of multivariate repeated measures data. Computation of these
parameters becomes a real problem as the number of response variables or
the number of time points increases. The problem becomes more intricate
and involved with the addition of further random effects. A multivariate
analysis is not possible in a small-sample
setting. We propose a method to estimate these many parameters in bits and
pieces from baby models, by taking a subset of response variables at a
time, and finally using these bits and pieces at the end to get the
parameter estimates for the mother model, with all variables taken
together. Applying this method one can calculate the fixed effects, the
best linear unbiased predictions (BLUPs) for the random effects in the
model, and also the BLUPs at each time of observation for each response
variable, to monitor the effectiveness of the treatment for each subject.
The proposed method is illustrated with an example of multiple response
variables measured over multiple time points arising from a clinical trial
in osteoporosis.
Journal: Journal of Applied Statistics
Pages: 307-320
Issue: 3
Volume: 35
Year: 2008
Keywords: best linear unbiased prediction, covariance structures, linear mixed effects model, multivariate repeated measures data,
X-DOI: 10.1080/02664760701833271
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833271
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:307-320
Template-Type: ReDIF-Article 1.0
Author-Name: Murari Singh
Author-X-Name-First: Murari
Author-X-Name-Last: Singh
Author-Name: Michael Jones
Author-X-Name-First: Michael
Author-X-Name-Last: Jones
Title: Modelling spatial-temporal covariance structures in monocropping barley trials
Abstract:
In long-term trials, not only are individual plot errors correlated over
time but there is also a consistent underlying spatial variability in
field conditions. The current study sought the most appropriate covariance
structure of errors correlated in three dimensions for evaluating the
productivity and time-trends in the barley yield data from the
monocropping system established in northern Syria. The best
spatial-temporal model found reflected the contribution of
autocorrelations in spatial and temporal dimensions with estimates varying
with the yield variable and location. Compared with a control structure
based on independent errors, this covariance structure improved the
significance of the fertilizer effect and the interaction with year.
Time-trends were estimated in two ways: by accounting for the seasonal
variable contribution in annual variability (Method 1), which is suitable
for detecting significant trends in short data series; and by using the
linear component of the orthogonal polynomial on time (year), which is
appropriate for long series (Method 2). Method 1 strengthened time-trend
detection compared with the method of Jones and Singh [J. Agri. Sci.,
Cambridge 135 (2000), pp. 251-259] which assumed independence of temporal
errors. Most estimates of yield trends over time from fertilizer
application were numerically greater than the corresponding linear trends
estimated from orthogonal polynomials in time (Method 2), reflecting the
effect of accounting for seasonal variables. Grain yield declined over
time at the drier site in the absence of nitrogen or phosphorus
application, but positive trends were observed fairly generally for straw
yield and for grain yield under higher levels of fertilizer inputs. It is
suggested that analyses of long-term trials on other crops and cropping
systems in other agro-ecological zones could be improved by taking spatial
and temporal variability into account in the data evaluation.
Journal: Journal of Applied Statistics
Pages: 321-333
Issue: 3
Volume: 35
Year: 2008
Keywords: barley monocropping, long-term trials, REML, spatial-temporal covariance, time-trend,
X-DOI: 10.1080/02664760701832992
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701832992
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:321-333
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Lai Tang
Author-X-Name-First: Man-Lai
Author-X-Name-Last: Tang
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Author-Name: Ping-Shing Chan
Author-X-Name-First: Ping-Shing
Author-X-Name-Last: Chan
Title: On the bootstrap quantile-treatment-effect test
Abstract:
Let {X1, …, Xn} and {Y1, …, Ym} be two samples of
independent and identically distributed observations with common
continuous cumulative distribution functions F(x)=P(X≤x) and
G(y)=P(Y≤y), respectively. In this article, we would like to test
the no quantile treatment effect hypothesis H0: F=G. We develop a
bootstrap quantile-treatment-effect test procedure for testing H0 under
the location-scale shift model. Our test procedure avoids the calculation
of the check function (which is non-differentiable at the origin and makes
solving the quantile effects difficult in typical quantile regression
analysis). The limiting null distribution of the test procedure is derived
and the procedure is shown to be consistent against a broad family of
alternatives. Simulation studies show that our proposed test procedure
attains its type I error rate close to the pre-chosen significance level
even for small sample sizes. Our test procedure is illustrated with two
real data sets on the lifetimes of guinea pigs from a treatment-control
experiment.
Journal: Journal of Applied Statistics
Pages: 335-350
Issue: 3
Volume: 35
Year: 2008
Keywords: Brownian bridge, bootstrap, Monte Carlo simulation, order statistics, two-sample case, quantile-treatment-effects,
X-DOI: 10.1080/02664760701834725
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701834725
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:3:p:335-350
Template-Type: ReDIF-Article 1.0
Author-Name: Enrique Gonzalez-Davila
Author-X-Name-First: Enrique
Author-X-Name-Last: Gonzalez-Davila
Author-Name: Josep Ginebra
Author-X-Name-First: Josep
Author-X-Name-Last: Ginebra
Author-Name: Roberto Dorta-Guerra
Author-X-Name-First: Roberto
Author-X-Name-Last: Dorta-Guerra
Title: Sample size determination for 2k-r experiments with a binomial response
Abstract:
This paper provides closed form expressions for the sample size for
two-level factorial experiments when the response is the number of
defectives. The sample sizes are obtained by approximating the two-sided
test for no effect through tests for the mean of a normal distribution,
and borrowing the classical sample size solution for that problem. The
proposals are appraised relative to the exact sample sizes computed
numerically, without appealing to any approximation to the binomial
distribution, and the use of the sample size tables provided is
illustrated through an example.
Journal: Journal of Applied Statistics
Pages: 357-367
Issue: 4
Volume: 35
Year: 2008
Keywords: factorial experiments, binary data, sample size, deviance,
X-DOI: 10.1080/02664760701833669
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701833669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:357-367
Template-Type: ReDIF-Article 1.0
Author-Name: M. Ruiz
Author-X-Name-First: M.
Author-X-Name-Last: Ruiz
Author-Name: F. J. Giron
Author-X-Name-First: F. J.
Author-X-Name-Last: Giron
Author-Name: C. J. Perez
Author-X-Name-First: C. J.
Author-X-Name-Last: Perez
Author-Name: J. Martin
Author-X-Name-First: J.
Author-X-Name-Last: Martin
Author-Name: C. Rojano
Author-X-Name-First: C.
Author-X-Name-Last: Rojano
Title: A Bayesian model for multinomial sampling with misclassified data
Abstract:
In this paper the issue of making inferences with misclassified data from
a noisy multinomial process is addressed. A Bayesian model for making
inferences about the proportions and the noise parameters is developed.
The problem is reformulated in a more tractable form by introducing
auxiliary or latent random vectors. This allows for an easy-to-implement
Gibbs sampling-based algorithm to generate samples from the distributions
of interest. An illustrative example related to elections is also
presented.
Journal: Journal of Applied Statistics
Pages: 369-382
Issue: 4
Volume: 35
Year: 2008
Keywords: Bayesian inference, Gibbs sampling, misclassified data, noisy multinomial process,
X-DOI: 10.1080/02664760701834832
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701834832
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:369-382
Template-Type: ReDIF-Article 1.0
Author-Name: Jose Manuel Pavia-Miralles
Author-X-Name-First: Jose Manuel
Author-X-Name-Last: Pavia-Miralles
Author-Name: Beatriz Larraz-Iribas
Author-X-Name-First: Beatriz
Author-X-Name-Last: Larraz-Iribas
Title: Quick counts from non-selected polling stations
Abstract:
Countless examples can be found of misleading forecasts by both campaign
and exit polls affecting, among others, British, French, and Spanish
elections. This has seriously damaged their image. Therefore,
procedures should be used that minimize errors, especially on election
night when errors are more noticeable, in order to maintain people's trust
in surveys. This paper proposes a method to obtain quick and early outcome
forecasts on election night. The idea is to partially sample some
(arbitrarily chosen) polling stations and use the consistency that polling stations
show between elections to predict the final results. Model accuracy is
analysed through simulation using seven different types of samples in four
elections. The efficacy of the technique is also tested by predicting the
2005 Eusko Legebiltzarra elections from real data. Results confirm that
the procedure generates highly reliable and accurate forecasts.
Furthermore, compared with the classical quick count strategy, the method
is revealed as much more robust and precise.
Journal: Journal of Applied Statistics
Pages: 383-405
Issue: 4
Volume: 35
Year: 2008
Keywords: election forecasts, error observation, generalized linear regression, pseudodata augmentation, Spanish elections,
X-DOI: 10.1080/02664760701834881
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701834881
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:383-405
Template-Type: ReDIF-Article 1.0
Author-Name: Patricia Espinheira
Author-X-Name-First: Patricia
Author-X-Name-Last: Espinheira
Author-Name: Silvia Ferrari
Author-X-Name-First: Silvia
Author-X-Name-Last: Ferrari
Author-Name: Francisco Cribari-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Cribari-Neto
Title: On beta regression residuals
Abstract:
We propose two new residuals for the class of beta regression models, and
numerically evaluate their behaviour relative to the residuals proposed by
Ferrari and Cribari-Neto. Monte Carlo simulation results and empirical
applications using real and simulated data are provided. The results
favour one of the residuals we propose.
Journal: Journal of Applied Statistics
Pages: 407-419
Issue: 4
Volume: 35
Year: 2008
Keywords: beta distribution, beta regression, maximum likelihood estimation, proportions, residuals,
X-DOI: 10.1080/02664760701834931
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701834931
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:407-419
Template-Type: ReDIF-Article 1.0
Author-Name: Arup Ranjan Mukhopadhyay
Author-X-Name-First: Arup Ranjan
Author-X-Name-Last: Mukhopadhyay
Title: Multivariate attribute control chart using Mahalanobis D2 statistic
Abstract:
Process control involves repeated hypothesis testing based on several
samples. However, process control is not exactly hypothesis testing as
such, since it also deals with the detection of non-random patterns of
variation in a fleeting kind of population, whereas hypothesis testing is
principally meant for a stagnant population. Dr Walter A. Shewhart
introduced a graphic method for doing this testing in a fleeting
population in 1924. This graphic method came to be known as the control
chart and is widely used throughout the world today for process management
purposes. Subsequently there was much advancement in process control
techniques. In particular, when more than one variable was involved,
process control techniques were developed mainly by Hicks (1955), Jackson
(1956 and 1959) and Montgomery and Wadsworth (1972), based on the
pioneering work of Hotelling in 1931. Most of them worked in the area
of multivariate variable control charts with the underlying distribution
as multivariate normal. When more than one attribute variable is involved,
some work relating to tests of hypothesis was done by Mahalanobis (1946).
That work was also based on the Hotelling T2 test. This paper extends
the concept of 'Mahalanobis distance' to the case of a multinomial
distribution and thereby proposes a multivariate attribute control chart.
Journal: Journal of Applied Statistics
Pages: 421-429
Issue: 4
Volume: 35
Year: 2008
Keywords: Euclidean distance, Mahalanobis distance, multinomial distribution, correlation matrix, variance covariance matrix,
X-DOI: 10.1080/02664760701834980
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701834980
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:421-429
Template-Type: ReDIF-Article 1.0
Author-Name: Abdullah Almasri
Author-X-Name-First: Abdullah
Author-X-Name-Last: Almasri
Author-Name: Håkan Locking
Author-X-Name-First: Håkan
Author-X-Name-Last: Locking
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: Testing for climate warming in Sweden during 1850-1999, using wavelets analysis
Abstract:
This paper describes an alternative approach for testing for the
existence of a trend in a time series. The test has been constructed
using wavelet analysis, which has the ability to decompose a time series
into low-frequency (trend) and high-frequency (noise) components. Under
the normality assumption, the test statistic is distributed as F. However,
using generated empirical critical values, the properties of the test
statistic have been investigated under different conditions and different
types of wavelet. The Haar wavelet has been shown to exhibit the highest
power among the wavelet types considered. The methodology has been applied
to real temperature data in Sweden for the period 1850-1999. The results
indicate a significant increasing trend, which agrees with the 'global
warming' hypothesis during the last 100 years.
Journal: Journal of Applied Statistics
Pages: 431-443
Issue: 4
Volume: 35
Year: 2008
Keywords: wavelet analysis, trend, global warming,
X-DOI: 10.1080/02664760701835011
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835011
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:431-443
Template-Type: ReDIF-Article 1.0
Author-Name: Viswanathan Shankar
Author-X-Name-First: Viswanathan
Author-X-Name-Last: Shankar
Author-Name: Shrikant Bangdiwala
Author-X-Name-First: Shrikant
Author-X-Name-Last: Bangdiwala
Title: Behavior of agreement measures in the presence of zero cells and biased marginal distributions
Abstract:
Kappa and B assess agreement between two observers independently
classifying N units into k categories. We study their behavior under zero
cells in the contingency table and unbalanced asymmetric marginal
distributions. Zero cells arise when a cross-classification is never
endorsed by both observers; biased marginal distributions occur when some
categories are preferred differently between the observers. Simulations
studied the distributions of the unweighted and weighted statistics for
k=4, under fixed proportions of diagonal agreement and different
off-diagonal patterns, with various sample sizes, and under various zero
cell count scenarios. Marginal distributions were first uniform and
homogeneous, and then unbalanced and asymmetric. Results for the unweighted
kappa and B statistics were comparable to the work of Munoz and Bangdiwala,
even with zero cells. A slightly increased variation was observed as the sample size
decreased. Weighted statistics did show greater variation as the number of
zero cells increased, with weighted kappa increasing substantially more
than weighted B. Under biased marginal distributions, weighted kappa with
Cicchetti weights was higher than with squared weights. Both statistics
for observer agreement behaved well under zero cells. The weighted B was
less variable than the weighted kappa under similar circumstances and
different weights. In general, B's performance and graphical
interpretation make it preferable to kappa under the studied scenarios.
Journal: Journal of Applied Statistics
Pages: 445-464
Issue: 4
Volume: 35
Year: 2008
Keywords: Cohen's kappa, Bangdiwala's B, observer bias, zero cell,
X-DOI: 10.1080/02664760701835052
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:445-464
Template-Type: ReDIF-Article 1.0
Author-Name: Hsiuying Wang
Author-X-Name-First: Hsiuying
Author-X-Name-Last: Wang
Title: Ranking responses in multiple-choice questions
Abstract:
In many studies, the questionnaire is a common survey tool. Two kinds of
questions are designed: single-choice questions and
multiple-choice questions. For single-choice questions, the methodology
for analysis has been provided in the literature. However, the
analysis of multiple-choice questions is not as well established as
that for single-choice questions. Recently, much
literature has been published about testing the marginal independence
between two questions involving at least one multiple-choice question.
However, another important problem regarding this topic is ranking the
responses in a multiple-choice question. The issue is whether there are
significant differences in the popularity of particular responses within
the same question. In this paper, methodologies for ranking responses are proposed.
Journal: Journal of Applied Statistics
Pages: 465-474
Issue: 4
Volume: 35
Year: 2008
Keywords: single-choice question, multiple-choice question, survey, likelihood ratio test, Wald test, ranking consistency,
X-DOI: 10.1080/02664760801924533
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760801924533
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:4:p:465-474
Template-Type: ReDIF-Article 1.0
Author-Name: D. J. Best
Author-X-Name-First: D. J.
Author-X-Name-Last: Best
Author-Name: J. C. W. Rayner
Author-X-Name-First: J. C. W.
Author-X-Name-Last: Rayner
Author-Name: O. Thas
Author-X-Name-First: O.
Author-X-Name-Last: Thas
Title: X2 and its components as tests of normality for grouped data
Abstract:
We consider testing for an unobservable normal distribution with
unspecified mean and variance. It is only possible to observe the counts
in groups with boundaries specified before sighting the data. On the basis
of a small power study, we recommend the usual X2 test be used as an
omnibus test, augmented by informal examination of the first two non-zero
components of X2. We also recommend use of maximum likelihood and method
of moments estimation.
Journal: Journal of Applied Statistics
Pages: 481-492
Issue: 5
Volume: 35
Year: 2008
Keywords: critical values, improved grouped normal models, maximum-likelihood estimation, method of moments estimation, power study,
X-DOI: 10.1080/02664760701835219
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835219
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:481-492
Template-Type: ReDIF-Article 1.0
Author-Name: Dilip Nachane
Author-X-Name-First: Dilip
Author-X-Name-Last: Nachane
Author-Name: Jose Clavel
Author-X-Name-First: Jose
Author-X-Name-Last: Clavel
Title: Forecasting interest rates: a comparative assessment of some second-generation nonlinear models
Abstract:
Modeling and forecasting of interest rates has traditionally proceeded in
the framework of linear stationary methods such as ARMA and VAR, but only
with moderate success. We examine here three methods, which account for
several specific features of the real world asset prices such as
nonstationarity and nonlinearity. Our three candidate methods are based,
respectively, on a combined wavelet artificial neural network (WANN)
analysis, a mixed spectrum (MS) analysis and nonlinear ARMA models with
Fourier coefficients (FNLARMA). These models are applied to weekly data on
interest rates in India and their forecasting performance is evaluated
vis-a-vis three GARCH models [GARCH (1,1), GARCH-M (1,1) and EGARCH (1,1)]
as well as the random walk model. Both the WANN and MS methods show marked
improvement over the other benchmark models, and may thus hold
considerable potential for real-world modeling and forecasting of
financial data.
Journal: Journal of Applied Statistics
Pages: 493-514
Issue: 5
Volume: 35
Year: 2008
Keywords: interest rates, wavelets, artificial neural networks, mixed spectra, nonlinear ARMA, GARCH, forecast comparisons,
X-DOI: 10.1080/02664760701835243
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835243
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:493-514
Template-Type: ReDIF-Article 1.0
Author-Name: Takafumi Isogai
Author-X-Name-First: Takafumi
Author-X-Name-Last: Isogai
Author-Name: Hiroaki Uchida
Author-X-Name-First: Hiroaki
Author-X-Name-Last: Uchida
Author-Name: Susumu Miyama
Author-X-Name-First: Susumu
Author-X-Name-Last: Miyama
Author-Name: Sadao Nishiyama
Author-X-Name-First: Sadao
Author-X-Name-Last: Nishiyama
Title: Statistical modeling of enamel rater value data
Abstract:
Enamel rater value (ERV) from a quick stress test is commonly used to
evaluate the integrity of the organic coating on the inside of an
aluminum (Al) can. A large positive ERV is taken to indicate the degree
of imperfect coating coverage, i.e. the size of the exposed Al area. If
an Al can filled with a beverage has an exposed Al area due to imperfect
coating coverage, Al dissolution occurs through corrosion. Thus a smaller
ERV is desirable to prevent Al dissolution. However, quantitative
evaluations of ERV data, as well as of the accumulated quantity of Al
dissolution, have never been published, because ERV involves the
complicated anodic dissolution of an exposed Al area. Recently, our
experimental study has revealed a relationship between ERV and the size
of the exposed Al area. This relationship enables us to construct a
descriptive statistical model for ERV data and to evaluate coating
effects for Al cans. Furthermore, empirical evidence suggests that the
instantaneous quantity of Al dissolution is proportional to ERV. Using
this fact, we derive a predictive statistical model for the accumulated
quantity of Al dissolution in an Al can.
Journal: Journal of Applied Statistics
Pages: 515-535
Issue: 5
Volume: 35
Year: 2008
Keywords: enamel rater value, aluminum cans, corrosion, aluminum dissolution, generalized gamma distribution, new power normal family,
X-DOI: 10.1080/02664760701835342
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835342
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:515-535
Template-Type: ReDIF-Article 1.0
Author-Name: Stephen Duffy
Author-X-Name-First: Stephen
Author-X-Name-Last: Duffy
Author-Name: Anne-Helene Olsen
Author-X-Name-First: Anne-Helene
Author-X-Name-Last: Olsen
Author-Name: Rhian Gabe
Author-X-Name-First: Rhian
Author-X-Name-Last: Gabe
Author-Name: Laszlo Tabar
Author-X-Name-First: Laszlo
Author-X-Name-Last: Tabar
Author-Name: Jane Warwick
Author-X-Name-First: Jane
Author-X-Name-Last: Warwick
Author-Name: Hilary Fielder
Author-X-Name-First: Hilary
Author-X-Name-Last: Fielder
Author-Name: Laufey Tryggvadottir
Author-X-Name-First: Laufey
Author-X-Name-Last: Tryggvadottir
Author-Name: Olorunsola Agbaje
Author-X-Name-First: Olorunsola
Author-X-Name-Last: Agbaje
Title: Screening opportunity bias in case-control studies of cancer screening
Abstract:
In case-control evaluations of cancer screening, subjects who have died
from the cancer in question (cases) are compared with those who have not
(controls) with respect to screening histories. This method is subject to
a rather subtle bias, among others, whereby the cases have greater
opportunity to have been screened than the controls. In this paper, we
propose a method of correction for this bias. We demonstrate its use on
two case-control studies of mammographic screening for breast cancer.
Journal: Journal of Applied Statistics
Pages: 537-546
Issue: 5
Volume: 35
Year: 2008
Keywords: cancer screening, case-control study, opportunity bias,
X-DOI: 10.1080/02664760701835755
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835755
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:537-546
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Cook
Author-X-Name-First: Steven
Author-X-Name-Last: Cook
Title: The sensitivity of robust unit root tests
Abstract:
The power properties of the rank-based Dickey-Fuller (DF) unit root test
of Granger and Hallman [C. Granger and J. Hallman, Nonlinear
transformations of integrated time series, J. Time Ser. Anal. 12 (1991),
pp. 207-218] and the range unit root tests of Aparicio et al. [F.
Aparicio, A. Escribano, and A. Sipols, Range unit root (RUR) tests: Robust
against non-linearities, error distributions, structural breaks and
outliers, J. Time Ser. Anal. 27 (2006), pp. 545-576] are considered when
applied to near-integrated time series processes with differing initial
conditions. The results obtained show the empirical powers of the tests to
be generally robust to smaller deviations of the initial condition of the
time series from its underlying deterministic component, particularly for
more highly stationary processes. However, dramatic decreases in power are
observed when either the mean or variance of the deviation of the initial
condition is increased. The robustness of the rank- and range-based unit
root tests and their higher power results relative to the seminal DF test
have both been noted previously in the econometrics literature. These
results are questioned by the findings of the present paper.
Journal: Journal of Applied Statistics
Pages: 547-557
Issue: 5
Volume: 35
Year: 2008
Keywords: unit roots, range-based tests, range unit root tests, initial conditions, Monte Carlo simulation,
X-DOI: 10.1080/02664760701835797
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835797
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:547-557
Template-Type: ReDIF-Article 1.0
Author-Name: Sat Gupta
Author-X-Name-First: Sat
Author-X-Name-Last: Gupta
Author-Name: Javid Shabbir
Author-X-Name-First: Javid
Author-X-Name-Last: Shabbir
Title: On improvement in estimating the population mean in simple random sampling
Abstract:
Kadilar and Cingi [Ratio estimators in simple random sampling, Appl.
Math. Comput. 151 (3) (2004), pp. 893-902] introduced some ratio-type
estimators of finite population mean under simple random sampling.
Recently, Kadilar and Cingi [New ratio estimators using correlation
coefficient, Interstat 4 (2006), pp. 1-11] have suggested another form of
ratio-type estimators by modifying the estimator developed by Singh and
Tailor [Use of known correlation coefficient in estimating the finite
population mean, Stat. Transit. 6 (2003), pp. 555-560]. Kadilar and Cingi
[Improvement in estimating the population mean in simple random sampling,
Appl. Math. Lett. 19 (1) (2006), pp. 75-79] have suggested yet another
class of ratio-type estimators by taking a weighted average of the two
known classes of estimators referenced above. In this article, we propose
an alternative form of ratio-type estimators which are better than the
competing ratio, regression, and other ratio-type estimators considered
here. The results are also supported by the analysis of three real data
sets that were considered by Kadilar and Cingi.
Journal: Journal of Applied Statistics
Pages: 559-566
Issue: 5
Volume: 35
Year: 2008
Keywords: ratio-type estimators, mean square error (MSE), transformation, efficiency,
X-DOI: 10.1080/02664760701835839
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:559-566
Template-Type: ReDIF-Article 1.0
Author-Name: Adam Branscum
Author-X-Name-First: Adam
Author-X-Name-Last: Branscum
Author-Name: Timothy Hanson
Author-X-Name-First: Timothy
Author-X-Name-Last: Hanson
Author-Name: Ian Gardner
Author-X-Name-First: Ian
Author-X-Name-Last: Gardner
Title: Bayesian non-parametric models for regional prevalence estimation
Abstract:
We developed a flexible non-parametric Bayesian model for regional
disease-prevalence estimation based on cross-sectional data that are
obtained from several subpopulations or clusters such as villages, cities,
or herds. The subpopulation prevalences are modeled with a mixture
distribution that allows for zero prevalence. The distribution of
prevalences among diseased subpopulations is modeled as a mixture of
finite Polya trees. Inferences can be obtained for (1) the proportion of
diseased subpopulations in a region, (2) the distribution of regional
prevalences, (3) the mean and median prevalence in the region, (4) the
prevalence of any sampled subpopulation, and (5) predictive distributions
of prevalences for regional subpopulations not included in the study,
including the predictive probability of zero prevalence. We focus on
prevalence estimation using data from a single diagnostic test, but we
also briefly discuss the scenario where two conditionally dependent (or
independent) diagnostic tests are used. Simulated data demonstrate the
utility of our non-parametric model over parametric analysis. An example
involving brucellosis in cattle is presented.
Journal: Journal of Applied Statistics
Pages: 567-582
Issue: 5
Volume: 35
Year: 2008
Keywords: disease-prevalence estimation, Polya trees, prediction,
X-DOI: 10.1080/02664760701835862
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760701835862
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:567-582
Template-Type: ReDIF-Article 1.0
Author-Name: Bahadur Singh
Author-X-Name-First: Bahadur
Author-X-Name-Last: Singh
Author-Name: Susan Halabi
Author-X-Name-First: Susan
Author-X-Name-Last: Halabi
Author-Name: Michael Schell
Author-X-Name-First: Michael
Author-X-Name-Last: Schell
Title: Sample size selection in clinical trials when population means are subject to a partial order: one-sided ordered alternatives
Abstract:
The statistical methodology under order restriction is mathematically
complex. Thus, we provide a brief methodological background on
order-restricted likelihood ratio tests in the normal-theory case to give
a basic understanding of their applications, and relegate the more
technical details to the appendices. For data analysis, algorithms for
computing the order-restricted estimates and computation of p-values are
described. A two-step procedure is presented for obtaining the sample size
in clinical trials when the minimum power, say 0.80 or 0.90, is specified
and the normal means satisfy an order restriction. Using this approach
results in a 14-24% reduction in the required sample size when one-sided
ordered alternatives are used, as illustrated by several examples.
Journal: Journal of Applied Statistics
Pages: 583-600
Issue: 5
Volume: 35
Year: 2008
Keywords: likelihood ratio tests, minimum power, simple order, simple tree ordering, simple loop ordering, two-step procedure,
X-DOI: 10.1080/02664760801924780
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760801924780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:5:p:583-600
Template-Type: ReDIF-Article 1.0
Author-Name: Teresa Alpuim
Author-X-Name-First: Teresa
Author-X-Name-Last: Alpuim
Author-Name: Abdel El-Shaarawi
Author-X-Name-First: Abdel
Author-X-Name-Last: El-Shaarawi
Title: On the efficiency of regression analysis with AR(p) errors
Abstract:
In this paper we will consider a linear regression model with the
sequence of error terms following an autoregressive stationary process.
The statistical properties of the maximum likelihood and least squares
estimators of the regression parameters will be summarized. Then, it will
be proved that, for some typical cases of the design matrix, both methods
produce asymptotically equivalent estimators. These estimators are also
asymptotically efficient. Such cases include the most commonly used models
to describe trend and seasonality like polynomial trends, dummy variables
and trigonometric polynomials. Further, a very convenient asymptotic
formula for the covariance matrix will be derived. It will be illustrated
through a brief simulation study that, for the simple linear trend model,
the result applies even for sample sizes as small as 20.
Journal: Journal of Applied Statistics
Pages: 717-737
Issue: 7
Volume: 35
Year: 2008
Keywords: linear regression, autoregressive stationary process, maximum likelihood, least squares, trend, seasonality, linear difference equation,
X-DOI: 10.1080/02664760600679775
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760600679775
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:717-737
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Prescott
Author-X-Name-First: Philip
Author-X-Name-Last: Prescott
Author-Name: Norman Draper
Author-X-Name-First: Norman
Author-X-Name-Last: Draper
Title: D-optimal mixture component-amount designs for quadratic and cubic models
Abstract:
When the total amount of a mixture of ingredients needs to be taken into
account (in addition to the composition of its ingredients), an
experimental design requires several levels of the amount. Designs for
such situations are discussed, and D-optimal choices are made for fitting
quadratic and cubic models, for various numbers of experimental units.
Journal: Journal of Applied Statistics
Pages: 739-749
Issue: 7
Volume: 35
Year: 2008
Keywords: component amounts, D-optimality, mixtures, Scheffe models, Scheffe designs,
X-DOI: 10.1080/02664760801997133
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760801997133
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:739-749
Template-Type: ReDIF-Article 1.0
Author-Name: Simos Meintanis
Author-X-Name-First: Simos
Author-X-Name-Last: Meintanis
Title: New inference procedures for generalized Poisson distributions
Abstract:
A common feature for compound Poisson and Katz distributions is that both
families may be viewed as generalizations of the Poisson law. In this
paper, we present a unified approach in testing the fit to any
distribution belonging to either of these families. The test involves the
probability generating function, and it is shown to be consistent under
general alternatives. The asymptotic null distribution of the test
statistic is obtained, and an effective bootstrap procedure is employed in
order to investigate the performance of the proposed test with real and
simulated data. Comparisons with classical methods based on the empirical
distribution function are also included.
Journal: Journal of Applied Statistics
Pages: 751-762
Issue: 7
Volume: 35
Year: 2008
Keywords: empirical probability generating function, compound Poisson distribution, goodness-of-fit test, Katz laws,
X-DOI: 10.1080/02664760801997174
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760801997174
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:751-762
Template-Type: ReDIF-Article 1.0
Author-Name: Sueli Mingoti
Author-X-Name-First: Sueli
Author-X-Name-Last: Mingoti
Author-Name: Julia De Carvalho
Author-X-Name-First: Julia
Author-X-Name-Last: De Carvalho
Author-Name: Joab De Oliveira Lima
Author-X-Name-First: Joab
Author-X-Name-Last: De Oliveira Lima
Title: On the estimation of serial correlation in Markov-dependent production processes
Abstract:
In this paper, we present a study of the estimation of the serial
correlation for Markov chain models, which are often used in the quality
control of autocorrelated processes. Two estimators, non-parametric and
multinomial, for the correlation coefficient are discussed. They are
compared with the maximum likelihood estimator [U.N. Bhat and R. Lal,
Attribute control charts for Markov dependent production process, IIE
Trans. 22 (2) (1990), pp. 181-188.] by using some theoretical facts and
the Monte Carlo simulation under several scenarios that consider large and
small correlations as well as a range of fractions (p) of non-conforming
items. The theoretical results show that for any value of p≠0.5 and
processes with autocorrelation higher than 0.5, the multinomial is more
precise than maximum likelihood. However, the maximum likelihood is better
when the autocorrelation is smaller than 0.5. The estimators are similar
for p=0.5. Considering the average of all simulated scenarios, the
multinomial estimator presented lower mean error values and higher
precision, being, therefore, an alternative to estimate the serial
correlation. The performance of the non-parametric estimator was
reasonable only for correlation higher than 0.5, with some improvement for
p=0.5.
Journal: Journal of Applied Statistics
Pages: 763-771
Issue: 7
Volume: 35
Year: 2008
Keywords: Markov chain, serial correlation estimation, autocorrelated processes,
X-DOI: 10.1080/02664760802005688
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802005688
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:763-771
Template-Type: ReDIF-Article 1.0
Author-Name: Sigyn Mark
Author-X-Name-First: Sigyn
Author-X-Name-Last: Mark
Author-Name: Sture Holm
Author-X-Name-First: Sture
Author-X-Name-Last: Holm
Title: Test and prediction in factorial models with independent variance estimates
Abstract:
The multiple inference character of several tests in the same application
is usually taken into consideration by requiring that the tests have a
multiple level of significance. Also, a prediction problem in an
application with several possible predictor variables requires that the
multiple inference character of the problem be considered. This is not
done in the methods commonly used to choose predictor variables.
Here, we discuss both the test and prediction methods in two-level
factorial designs and suggest a principle for choosing variables that is
based on multiple-inference thinking. By means of an example, we
demonstrate that the proposed principle leads to the use of fewer
prediction variables than does the Akaike method.
Journal: Journal of Applied Statistics
Pages: 773-782
Issue: 7
Volume: 35
Year: 2008
Keywords: prediction, multiple inference, factorial design, Akaike's method,
X-DOI: 10.1080/02664760802005852
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802005852
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:773-782
Template-Type: ReDIF-Article 1.0
Author-Name: Mahmoud Mahmoud
Author-X-Name-First: Mahmoud
Author-X-Name-Last: Mahmoud
Author-Name: William Woodall
Author-X-Name-First: William
Author-X-Name-Last: Woodall
Author-Name: Robert Davis
Author-X-Name-First: Robert
Author-X-Name-Last: Davis
Title: Performance comparison of some likelihood ratio-based statistical surveillance methods
Abstract:
Using Markov chain representations, we evaluate and compare the
performance of cumulative sum (CUSUM) and Shiryayev-Roberts methods in
terms of the zero- and steady-state average run length and worst-case
signal resistance measures. We also calculate the signal resistance values
from the worst- to the best-case scenarios for both methods. Our
results support the recommendation that Shewhart limits be used with the
CUSUM and Shiryayev-Roberts methods, especially for small shifts in the
process mean that the methods are designed to detect optimally.
Journal: Journal of Applied Statistics
Pages: 783-798
Issue: 7
Volume: 35
Year: 2008
Keywords: CUSUM chart, likelihood ratio, Shiryayev-Roberts chart, Shewhart chart, statistical process control, statistical surveillance,
X-DOI: 10.1080/02664760802005878
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802005878
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:783-798
Template-Type: ReDIF-Article 1.0
Author-Name: Edward Boone
Author-X-Name-First: Edward
Author-X-Name-Last: Boone
Author-Name: Susan Simmons
Author-X-Name-First: Susan
Author-X-Name-Last: Simmons
Author-Name: Haikun Bao
Author-X-Name-First: Haikun
Author-X-Name-Last: Bao
Author-Name: Ann Stapleton
Author-X-Name-First: Ann
Author-X-Name-Last: Stapleton
Title: Bayesian hierarchical regression models for detecting QTLs in plant experiments
Abstract:
Quantitative trait loci (QTL) mapping is a growing field in statistical
genetics. In plants, QTL detection experiments often feature replicates or
clones within a specific genetic line. In this work, a Bayesian
hierarchical regression model is applied to simulated QTL data and to a
dataset from Arabidopsis thaliana plants to locate QTLs
associated with cotyledon opening. A conditional model search strategy
based on Bayesian model averaging is utilized to reduce the computational
burden.
Journal: Journal of Applied Statistics
Pages: 799-808
Issue: 7
Volume: 35
Year: 2008
Keywords: hierarchical models, Bayesian statistics, quantitative trait loci, Bayesian model averaging, recombinant inbred Lines,
X-DOI: 10.1080/02664760802005910
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802005910
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:799-808
Template-Type: ReDIF-Article 1.0
Author-Name: A. F. B. Costa
Author-X-Name-First: A. F. B.
Author-X-Name-Last: Costa
Author-Name: M. A. G. Machado
Author-X-Name-First: M. A. G.
Author-X-Name-Last: Machado
Title: Bivariate control charts with double sampling
Abstract:
In this article, we consider the T2 chart with double sampling to control
bivariate processes (BDS chart). During the first stage of the sampling,
n1 items of the sample are inspected and two quality characteristics (x;
y) are measured. If the Hotelling statistic T2 for the mean vector of
(x; y) is less than w, the sampling is interrupted. If this statistic is
greater than CL1, where CL1>w, the control chart signals an
out-of-control condition. If it falls between w and CL1, the sampling
goes on to the second stage, where the remaining n2 items of the sample
are inspected and T2 for the mean vector of the whole sample is computed.
During the second stage of the sampling, the control chart signals an
out-of-control condition when this statistic is larger than CL2. A
comparative study shows that the BDS chart detects process disturbances
faster than the standard bivariate T2 chart and the adaptive bivariate T2
charts with variable sample size and/or variable sampling interval.
Journal: Journal of Applied Statistics
Pages: 809-822
Issue: 7
Volume: 35
Year: 2008
Keywords: the Hotelling statistic T-super-2, bivariate processes, double sampling,
X-DOI: 10.1080/02664760802061939
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802061939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:809-822
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Robinson
Author-X-Name-First: Andrew
Author-X-Name-Last: Robinson
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 823-824
Issue: 7
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802066615
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802066615
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:7:p:823-824
Template-Type: ReDIF-Article 1.0
Author-Name: Herve Cardot
Author-X-Name-First: Herve
Author-X-Name-Last: Cardot
Author-Name: Philippe Maisongrande
Author-X-Name-First: Philippe
Author-X-Name-Last: Maisongrande
Author-Name: Robert Faivre
Author-X-Name-First: Robert
Author-X-Name-Last: Faivre
Title: Varying-time random effects models for longitudinal data: unmixing and temporal interpolation of remote-sensing data
Abstract:
Remote sensing is a helpful tool for crop monitoring or vegetation-growth
estimation at a country or regional scale. However, satellite images
generally have to cope with a compromise between the time frequency of
observations and their resolution (i.e. pixel size). When concerned with
high temporal resolution, we have to work with information on the basis of
kilometric pixels, named mixed pixels, that represent aggregated responses
of multiple land cover. Disaggregation or unmixing is then necessary to
downscale from the square kilometer to the local dynamic of each theme
(crop, wood, meadows, etc.). Assuming the land use is known, that is to
say the proportion of each theme within each mixed pixel, we propose to
address the downscaling issue through the generalization of varying-time
regression models for longitudinal data and/or functional data by
introducing random individual effects. The estimators are built by
expanding the mixed-pixel trajectories with B-spline functions and
maximizing the log-likelihood with a backfitting-ECME algorithm. A BLUP
formula then allows us to obtain the 'best possible' estimates of the local
temporal responses of each crop when observing mixed-pixel trajectories.
We show that this model has many potential applications in remote sensing,
and an interesting one consists of coupling high and low spatial
resolution images in order to perform temporal interpolation of high
spatial resolution images (20 m), increasing the knowledge on
particular crops in very precise locations. The unmixing and temporal
high-resolution interpolation approaches are illustrated on remote-sensing
data obtained over South-Western France during the year 2002.
Journal: Journal of Applied Statistics
Pages: 827-846
Issue: 8
Volume: 35
Year: 2008
Keywords: backfitting, BLUP, covariance function, downscaling, ECME, functional data, mixed effects, mixed pixels, splines, SPOT/VGT, SPOT/HRVIR, remote sensing,
X-DOI: 10.1080/02664760802061970
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802061970
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:827-846
Template-Type: ReDIF-Article 1.0
Author-Name: Cody Hamilton
Author-X-Name-First: Cody
Author-X-Name-Last: Hamilton
Author-Name: Tom Bratcher
Author-X-Name-First: Tom
Author-X-Name-Last: Bratcher
Author-Name: James Stamey
Author-X-Name-First: James
Author-X-Name-Last: Stamey
Title: Bayesian subset selection approach to ranking normal means
Abstract:
In this article, we consider a Bayesian approach to the problem of
ranking the means of normally distributed populations, which is a common
problem in the biological sciences. We use a decision-theoretic approach
with a straightforward loss function to determine a set of candidate
rankings. This loss function allows the researcher to balance the risk of
not including the correct ranking with the risk of increasing the number
of rankings selected. We apply our new procedure to an example regarding
the effect of zinc on the diversity of diatom species.
Journal: Journal of Applied Statistics
Pages: 847-851
Issue: 8
Volume: 35
Year: 2008
Keywords: ranking, multiple comparisons, posterior approximation, Gibbs sampler,
X-DOI: 10.1080/02664760802124174
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802124174
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:847-851
Template-Type: ReDIF-Article 1.0
Author-Name: Herbert Buning
Author-X-Name-First: Herbert
Author-X-Name-Last: Buning
Author-Name: Michael Rietz
Author-X-Name-First: Michael
Author-X-Name-Last: Rietz
Title: Adaptive bootstrap tests and its competitors in the c-sample scale problem
Abstract:
This paper deals with a study of different types of tests for the
two-sided c-sample scale problem. We consider the classical parametric
test of Bartlett [M.S. Bartlett, Properties of sufficiency and statistical
tests, Proc. R. Stat. Soc. Ser. A. 160 (1937), pp. 268-282], several
nonparametric tests, especially the test of Fligner and Killeen [M.A.
Fligner and T.J. Killeen, Distribution-free two-sample tests for scale, J.
Amer. Statist. Assoc. 71 (1976), pp. 210-213], the test of Levene [H.
Levene, Robust tests for equality of variances, in Contribution to
Probability and Statistics, I. Olkin, ed., Stanford University Press, Palo
Alto, 1960, pp. 278-292] and a robust version of it introduced by Brown
and Forsythe [M.B. Brown and A.B. Forsythe, Robust tests for the equality
of variances, J. Amer. Statist. Assoc. 69 (1974), pp. 364-367] as well as
two adaptive tests proposed by Buning [H. Buning, Adaptive tests for the
c-sample location problem - the case of two-sided alternatives, Comm.
Statist.Theory Methods. 25 (1996), pp. 1569-1582] and Buning [H. Buning,
An adaptive test for the two sample scale problem, Nr. 2003/10,
Diskussionsbeitrage des Fachbereich Wirtschaftswissenschaft der Freien
Universitat Berlin, Volkswirtschaftliche Reihe, 2003], which are based on
the principle of Hogg [R.V. Hogg, Adaptive robust procedures. A partial
review and some suggestions for future applications and theory, J. Amer.
Statist. Assoc. 69 (1974), pp. 909-927]. For all the tests we also use
bootstrap sampling strategies. We compare all the tests via Monte Carlo
methods by investigating the level α and power β of the tests
for distributions with different strength of tailweight and skewness and
for various sample sizes. It turns out that the test of Fligner and
Killeen in combination with the bootstrap is the best one among all tests
considered.
Journal: Journal of Applied Statistics
Pages: 853-866
Issue: 8
Volume: 35
Year: 2008
Keywords: bootstrap, sampling strategies, parametric, nonparametric, robustified and adaptive tests, tailweight skewness, nonnormality, α-robustness, power comparison,
X-DOI: 10.1080/02664760802124257
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802124257
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:853-866
Template-Type: ReDIF-Article 1.0
Author-Name: P. S. Chan
Author-X-Name-First: P. S.
Author-X-Name-Last: Chan
Author-Name: H. K. T. Ng
Author-X-Name-First: H. K. T.
Author-X-Name-Last: Ng
Author-Name: N. Balakrishnan
Author-X-Name-First: N.
Author-X-Name-Last: Balakrishnan
Title: Statistical inference for start-up demonstration tests with rejection of units upon observing d failures
Abstract:
In this paper, we consider the statistical inference for the success
probability in the case of start-up demonstration tests in which rejection
of units is possible when a pre-fixed number of failures is observed
before the required number of consecutive successes is achieved for
acceptance of the unit. Since the expected value of the stopping time is
not a monotone function of the unknown parameter, the method of moments is
not useful in this situation. Therefore, we discuss two estimation methods
for the success probability: (1) the maximum likelihood estimation (MLE)
via the expectation-maximization (EM) algorithm and (2) Bayesian
estimation with a beta prior. We examine the small-sample properties of
the MLE and Bayesian estimator. Finally, we present an example to
illustrate the method of inference discussed here.
Journal: Journal of Applied Statistics
Pages: 867-878
Issue: 8
Volume: 35
Year: 2008
Keywords: start-up demonstration test, maximum likelihood estimator, EM-algorithm, runs, Bayesian estimation, probability generating function,
X-DOI: 10.1080/02664760802124455
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802124455
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:867-878
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Den Chen
Author-X-Name-First: Wen-Den
Author-X-Name-Last: Chen
Title: Detecting and identifying interventions with the Whittle spectral approach in a long memory panel data model
Abstract:
This article provides a procedure for the detection and identification of
outliers in the spectral domain where the Whittle maximum likelihood
estimator of the panel data model proposed by Chen [W.D. Chen, Testing for
spurious regression in a panel data model with the individual number and
time length growing, J. Appl. Stat. 33(88) (2006b), pp. 759-772] is
implemented. We extend the approach of Chang and co-workers [I. Chang,
G.C. Tiao, and C. Chen, Estimation of time series parameters in the
presence of outliers, Technometrics 30 (2) (1988), pp. 193-204] to the
spectral domain and through the Whittle approach we can quickly detect and
identify the type of outliers. A fixed effects panel data model is used,
in which the remainder disturbance is assumed to be a fractional
autoregressive integrated moving-average (ARFIMA) process and the
likelihood ratio criterion is obtained directly through the modified
inverse Fourier transform. This saves considerable time, especially when the
estimated model involves a huge data-set. Through Monte Carlo
experiments, the consistency of the estimator is examined by growing the
individual number N and time length T, in which the long memory remainder
disturbances are contaminated with two types of outliers: additive outlier
and innovation outlier. From the power tests, we see that the estimators
are quite successful and powerful. In the empirical study, we apply the
model to Taiwan's computer motherboard industry. Weekly data from 1
January 2000 to 31 October 2006 of nine familiar companies are used. The
proposed model has a smaller mean square error and shows more distinctive
aggressive properties than the raw data model does.
Journal: Journal of Applied Statistics
Pages: 879-892
Issue: 8
Volume: 35
Year: 2008
Keywords: long memory, intervention, additive outlier, innovation outlier, Whittle approach, spectral density function, panel data model,
X-DOI: 10.1080/02664760802125213
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802125213
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:879-892
Template-Type: ReDIF-Article 1.0
Author-Name: Gemechis Djira
Author-X-Name-First: Gemechis
Author-X-Name-Last: Djira
Author-Name: Volker Guiard
Author-X-Name-First: Volker
Author-X-Name-Last: Guiard
Author-Name: Frank Bretz
Author-X-Name-First: Frank
Author-X-Name-Last: Bretz
Title: Efficient and easy-to-use sample size formulas in ratio-based non-inferiority tests
Abstract:
In many biomedical applications, tests for the classical hypotheses based
on the difference of treatment means in a one-way layout can be replaced
by tests for ratios (or tests for relative changes). This approach is well
noted for its simplicity in defining the margins, as for example in tests
for non-inferiority. Here, we derive approximate and efficient sample size
formulas in a multiple testing situation and then thoroughly investigate
the relative performance of hypothesis testing based on the ratios of
treatment means when compared with differences of means. The results will
be illustrated with an example on simultaneous tests for non-inferiority.
Journal: Journal of Applied Statistics
Pages: 893-900
Issue: 8
Volume: 35
Year: 2008
Keywords: relative margin, sample size, multivariate t, normal approximation,
X-DOI: 10.1080/02664760802125544
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802125544
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:893-900
Template-Type: ReDIF-Article 1.0
Author-Name: Sermin Elevli
Author-X-Name-First: Sermin
Author-X-Name-Last: Elevli
Author-Name: Nevin Uzgoren
Author-X-Name-First: Nevin
Author-X-Name-Last: Uzgoren
Author-Name: Birol Elevli
Author-X-Name-First: Birol
Author-X-Name-Last: Elevli
Title: Correspondence analysis of repair data: a case study for electric cable shovels
Abstract:
In mining operations, effective maintenance scheduling is very important
because of its effect on the performance of equipment and production
costs. Classifying equipment on the basis of repair durations is
considered one of the essential tasks in scheduling maintenance activities
effectively. In this study, repair data of electric cable shovels used in
the Western Coal Company, Turkey, has been analyzed using correspondence
analysis to classify shovels in terms of repair durations. Correspondence
analysis, particularly helpful in analysing cross-tabular data in the form
of numerical frequencies, has provided a graphical display that permitted
more rapid interpretation and understanding of the repair data. The
results indicated that there are five groups of shovels according to their
repair duration. In particular, shovels numbered 2, 3, 7, 10 and 11 required
a repair duration of <1 h and maintained relatively good service
condition when compared with the others. Thus, priority might be given to
repair them in maintenance job scheduling even if there is another failed
shovel waiting to be serviced. This type of information will help mine
managers to increase the number of available shovels in operation.
Journal: Journal of Applied Statistics
Pages: 901-908
Issue: 8
Volume: 35
Year: 2008
Keywords: shovel, repair data, correspondence analysis,
X-DOI: 10.1080/02664760802125627
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802125627
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:901-908
Template-Type: ReDIF-Article 1.0
Author-Name: George Halkos
Author-X-Name-First: George
Author-X-Name-Last: Halkos
Author-Name: Ilias Kevork
Author-X-Name-First: Ilias
Author-X-Name-Last: Kevork
Title: A sequential procedure for testing the existence of a random walk model in finite samples
Abstract:
Given the random walk model, we show, for the traditional unrestricted
regression used in testing stationarity, that no matter what the initial
value of the random walk is or its drift or its error standard deviation,
the sampling distributions of certain statistics remain unchanged. Using
Monte Carlo simulations, we estimate, for different finite samples, the
sampling distributions of these statistics. After smoothing the
percentiles of the empirical sampling distributions, we come up with a new
set of critical values for testing the existence of a random walk, if each
statistic is being used on an individual basis. Combining the new sets of
critical values, we finally suggest a general methodology for testing for
a random walk model.
Journal: Journal of Applied Statistics
Pages: 909-925
Issue: 8
Volume: 35
Year: 2008
Keywords: random walk, critical values, uncertainty,
X-DOI: 10.1080/02664760802185290
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802185290
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:909-925
Template-Type: ReDIF-Article 1.0
Author-Name: Donghoh Kim
Author-X-Name-First: Donghoh
Author-X-Name-Last: Kim
Author-Name: Youngjo Lee
Author-X-Name-First: Youngjo
Author-X-Name-Last: Lee
Author-Name: Hee-Seok Oh
Author-X-Name-First: Hee-Seok
Author-X-Name-Last: Oh
Title: A fast wavelet approach for recovering damaged images
Abstract:
A wavelet method is proposed for recovering damaged images. The proposed
method combines wavelet shrinkage with preprocessing based on a binning
process and an imputation procedure that is designed to extend the scope
of wavelet shrinkage to data with missing values and perturbed locations.
The proposed algorithm, termed the BTW algorithm, is simple to implement
and efficient for recovering an image. Furthermore, this algorithm can be
easily applied to wavelet regression for one-dimensional (1-D) signal
estimation with irregularly spaced data. Simulation studies and real
examples show that the proposed method can produce highly effective
results.
Journal: Journal of Applied Statistics
Pages: 927-938
Issue: 8
Volume: 35
Year: 2008
Keywords: binning process, imputation, missing pixel, perturbed location, scattered data, wavelet shrinkage,
X-DOI: 10.1080/02664760802187478
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802187478
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:927-938
Template-Type: ReDIF-Article 1.0
Author-Name: Kepher Makambi
Author-X-Name-First: Kepher
Author-X-Name-Last: Makambi
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 939-940
Issue: 8
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802066672
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802066672
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:939-940
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan Bakouch
Author-X-Name-First: Hassan
Author-X-Name-Last: Bakouch
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 941-942
Issue: 8
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802066714
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802066714
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:941-942
Template-Type: ReDIF-Article 1.0
Author-Name: David Wooff
Author-X-Name-First: David
Author-X-Name-Last: Wooff
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 943-944
Issue: 8
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802187494
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802187494
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:8:p:943-944
Template-Type: ReDIF-Article 1.0
Author-Name: Robert Aykroyd
Author-X-Name-First: Robert
Author-X-Name-Last: Aykroyd
Title: Editorial
Abstract:
Journal: Journal of Applied Statistics
Pages: 945-946
Issue: 9
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802373342
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802373342
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:945-946
Template-Type: ReDIF-Article 1.0
Author-Name: Sugnet Gardner-Lubbe
Author-X-Name-First: Sugnet
Author-X-Name-Last: Gardner-Lubbe
Author-Name: Niël Le Roux
Author-X-Name-First: Niël
Author-X-Name-Last: Le Roux
Author-Name: John Gowers
Author-X-Name-First: John
Author-X-Name-Last: Gowers
Title: Measures of fit in principal component and canonical variate analyses
Abstract:
Treating principal component analysis (PCA) and canonical variate
analysis (CVA) as methods for approximating tables, we develop measures,
collectively termed predictivity, that assess the quality of fit
independently for each variable and for all dimensionalities. We
illustrate their use with data from aircraft development, the African
timber industry and copper froth measurements from the mining industry.
Similar measures are described for assessing the predictivity associated
with the individual samples (in the case of PCA and CVA) or group means
(in the case of CVA). For these measures to be meaningful, certain
essential orthogonality conditions must hold that are shown to be
satisfied by predictivity.
Journal: Journal of Applied Statistics
Pages: 947-965
Issue: 9
Volume: 35
Year: 2008
Keywords: biplots, canonical variate analysis, measures of fit, prediction, principal component analysis,
X-DOI: 10.1080/02664760802185399
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802185399
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:947-965
Template-Type: ReDIF-Article 1.0
Author-Name: Edward Boone
Author-X-Name-First: Edward
Author-X-Name-Last: Boone
Author-Name: Bronson Bullock
Author-X-Name-First: Bronson
Author-X-Name-Last: Bullock
Title: Spatial correlation matrix selection using Bayesian model averaging to characterize inter-tree competition in loblolly pine trees
Abstract:
Many applications of statistical methods for data that are spatially
correlated require the researcher to specify the correlation structure of
the data. This can be a difficult task as there are many candidate
structures. Some spatial correlation structures depend on the distance
between the observed data points while others rely on neighborhood
structures. In this paper, Bayesian methods that systematically determine
the 'best' correlation structure from a predefined class of structures are
proposed. Bayes factors, Highest Probability Models, and Bayesian Model
Averaging are employed to determine the 'best' correlation structure and
to average across these structures to create a non-parametric alternative
structure for a loblolly pine data-set with known tree coordinates. Tree
diameters and heights were measured and an investigation into the spatial
dependence between the trees was conducted. Results showed that the most
probable model for the spatial correlation structure agreed with
allometric trends for loblolly pine. A combined Matern, simultaneous
autoregressive model and conditional autoregressive model best described
the inter-tree competition among the loblolly pine tree data considered in
this research.
Journal: Journal of Applied Statistics
Pages: 967-977
Issue: 9
Volume: 35
Year: 2008
Keywords: autocorrelation, Bayes factors, BMA, geostatistical models, lattice models, Pinus taeda,
X-DOI: 10.1080/02664760802185845
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802185845
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:967-977
Template-Type: ReDIF-Article 1.0
Author-Name: David Almorza
Author-X-Name-First: David
Author-X-Name-Last: Almorza
Author-Name: M. Hortensia Garcia
Author-X-Name-First: M. Hortensia
Author-X-Name-Last: Garcia
Title: Results of exploratory data analysis in the broken stick model
Abstract:
The broken stick model is a model of the abundance of species in a
habitat, and it has been widely extended. In this paper, we present
results from exploratory data analysis of this model. To obtain some of
the statistics, we formulate the broken stick model as a probability
distribution function based on the same model, and we provide an
expression for the cumulative distribution function, which is needed to
obtain the results from exploratory data analysis. The inequalities we
present are useful in ecological studies that apply broken stick models.
These results are also useful for testing the goodness of fit of the
broken stick model as an alternative to the chi-square test, which has
often been the main test used. Therefore, these results may be used in
several alternative and complementary ways for testing the goodness of fit
of the broken stick model.
Journal: Journal of Applied Statistics
Pages: 979-983
Issue: 9
Volume: 35
Year: 2008
Keywords: broken stick model, exploratory data analysis,
X-DOI: 10.1080/02664760802187536
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802187536
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:979-983
Template-Type: ReDIF-Article 1.0
Author-Name: Claudia Lautensack
Author-X-Name-First: Claudia
Author-X-Name-Last: Lautensack
Title: Fitting three-dimensional Laguerre tessellations to foam structures
Abstract:
Foam models, especially random tessellations, are powerful tools to study
the relations between the geometric structure of foams and their physical
properties. In this paper, we propose the use of random Laguerre
tessellations, weighted versions of the well-known Voronoi tessellations,
as models for the microstructure of foams. Based on geometric
characteristics estimated from a tomographic image of a closed-cell
polymer foam, we fit a Laguerre tessellation model to the material. It is
shown that this model allows for a better fit of the geometric structure
of the foam than some classical Voronoi tessellation models.
Journal: Journal of Applied Statistics
Pages: 985-995
Issue: 9
Volume: 35
Year: 2008
Keywords: cell characteristics, closed foam, foam model, Laguerre tessellation, random tessellation, 3D image, volume image,
X-DOI: 10.1080/02664760802188112
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802188112
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:985-995
Template-Type: ReDIF-Article 1.0
Author-Name: Carmen Armero
Author-X-Name-First: Carmen
Author-X-Name-Last: Armero
Author-Name: Antonio Lopez-Quilez
Author-X-Name-First: Antonio
Author-X-Name-Last: Lopez-Quilez
Author-Name: Rut Lopez-Sanchez
Author-X-Name-First: Rut
Author-X-Name-Last: Lopez-Sanchez
Title: Bayesian assessment of times to diagnosis in breast cancer screening
Abstract:
Breast cancer is one of the diseases with the most profound impact on
health in developed countries and mammography is the most popular method
for detecting breast cancer at a very early stage. This paper focuses on
the waiting period from a positive mammogram until a confirmatory
diagnosis is carried out in hospital. Generalized linear mixed models are
used to perform the statistical analysis, always within the Bayesian
reasoning. Markov chain Monte Carlo algorithms are applied for estimation
by simulating the posterior distribution of the parameters and
hyperparameters of the model through the free software WinBUGS.
Journal: Journal of Applied Statistics
Pages: 997-1009
Issue: 9
Volume: 35
Year: 2008
Keywords: Bayesian statistics, breast cancer screening program, generalized linear mixed models, Markov Chain Monte Carlo,
X-DOI: 10.1080/02664760802191397
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802191397
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:997-1009
Template-Type: ReDIF-Article 1.0
Author-Name: Nicolas Bousquet
Author-X-Name-First: Nicolas
Author-X-Name-Last: Bousquet
Title: Diagnostics of prior-data agreement in applied Bayesian analysis
Abstract:
This article focuses on the definition and the study of a binary Bayesian
criterion which measures a statistical agreement between a subjective
prior and data information. The setting of this work is concrete Bayesian
studies. It is an alternative and a complementary tool to the method
recently proposed by Evans and Moshonov [M. Evans and H. Moshonov,
Checking for prior-data conflict, Bayesian Anal. 1 (2006), pp. 893-914].
Both methods aim to assist the Bayesian analyst, from the preliminary
analysis to the posterior computation. Our criterion is defined as a
ratio of Kullback-Leibler divergences; two of its main features are that
it simplifies checking a hierarchical prior and that it can be used as a
default calibration tool to obtain flat but proper priors in applications.
Discrete and continuous distributions exemplify the approach and an
industrial case study in reliability, involving the Weibull distribution,
is highlighted.
Journal: Journal of Applied Statistics
Pages: 1011-1029
Issue: 9
Volume: 35
Year: 2008
Keywords: prior-data conflict, expert opinion, subjective prior, objective prior, Kullback-Leibler divergence, discrete distributions, lifetime distributions,
X-DOI: 10.1080/02664760802192981
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802192981
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:1011-1029
Template-Type: ReDIF-Article 1.0
Author-Name: A. Georgievska
Author-X-Name-First: A.
Author-X-Name-Last: Georgievska
Author-Name: L. Georgievska
Author-X-Name-First: L.
Author-X-Name-Last: Georgievska
Author-Name: A. Stojanovic
Author-X-Name-First: A.
Author-X-Name-Last: Stojanovic
Author-Name: N. Todorovic
Author-X-Name-First: N.
Author-X-Name-Last: Todorovic
Title: Sovereign rescheduling probabilities in emerging markets: a comparison with credit rating agencies' ratings
Abstract:
This study estimates default probabilities of 124 emerging countries from
1981 to 2002 as a function of a set of macroeconomic and political
variables. The estimated probabilities are then compared with the default
rates implied by sovereign credit ratings of three major international
credit rating agencies (CRAs) - Moody's Investor's Service, Standard &
Poor's and Fitch Ratings. Sovereign debt default probabilities are used by
investors in pricing sovereign bonds and loans as well as in determining
country risk exposure. The study finds that CRAs usually underestimate the
risk of sovereign debt as the sovereign credit ratings from rating
agencies are usually too optimistic.
Journal: Journal of Applied Statistics
Pages: 1031-1051
Issue: 9
Volume: 35
Year: 2008
Keywords: sovereign debt, default probabilities, credit rating agencies, credit ratings,
X-DOI: 10.1080/02664760802193112
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802193112
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:1031-1051
Template-Type: ReDIF-Article 1.0
Author-Name: Cristina Rueda-Sabater
Author-X-Name-First: Cristina
Author-X-Name-Last: Rueda-Sabater
Author-Name: Pedro Alvarez-Esteban
Author-X-Name-First: Pedro
Author-X-Name-Last: Alvarez-Esteban
Title: The analysis of age-specific fertility patterns via logistic models
Abstract:
In this paper, we introduce logistic models to analyse fertility curves.
The models are formulated as linear models of the log odds of fertility
and are defined in terms of parameters that are interpreted as measures of
level, location and shape of the fertility schedule. This parameterization
is useful for the evaluation, and interpretation of fertility trends and
projections of future period fertility. For a series of years, the
proposed models admit a state-space formulation that allows a coherent
joint estimation of parameters and forecasting. The main features of the
models compared with other alternatives are the functional simplicity, the
flexibility, and the interpretability of the parameters. These and other
features are analysed in this paper using examples and theoretical
results. Data from different countries are analysed, and to validate the
logistic approach, we compare the goodness of fit of the new model against
well-known alternatives; the analysis gives superior results in most
developed countries.
Journal: Journal of Applied Statistics
Pages: 1053-1070
Issue: 9
Volume: 35
Year: 2008
Keywords: logistic model, fertility schedule, state-space model, maximum-likelihood estimation, tempo, quantum,
X-DOI: 10.1080/02664760802192999
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802192999
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:1053-1070
Template-Type: ReDIF-Article 1.0
Author-Name: Klara Goethals
Author-X-Name-First: Klara
Author-X-Name-Last: Goethals
Author-Name: Paul Janssen
Author-X-Name-First: Paul
Author-X-Name-Last: Janssen
Author-Name: Luc Duchateau
Author-X-Name-First: Luc
Author-X-Name-Last: Duchateau
Title: Frailty models and copulas: similarities and differences
Abstract:
Copulas and frailty models are important tools to model bivariate
survival data. Equivalence between Archimedean copula models and shared
frailty models, e.g. between the Clayton-Oakes copula model and the shared
gamma frailty model, has often been claimed in the literature. In this
note we show that, in both models, there is indeed a well-known
equivalence between the copula functions; the modeling of the marginal
survival functions, however, is quite different. The latter fact leads to
different joint survival functions.
Journal: Journal of Applied Statistics
Pages: 1071-1079
Issue: 9
Volume: 35
Year: 2008
Keywords: bivariate survival data, Clayton-Oakes copula, positive stable frailty, shared gamma frailty model,
X-DOI: 10.1080/02664760802271389
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802271389
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:9:p:1071-1079
Template-Type: ReDIF-Article 1.0
Author-Name: Honghu Liu
Author-X-Name-First: Honghu
Author-X-Name-Last: Liu
Author-Name: Yan Zheng
Author-X-Name-First: Yan
Author-X-Name-Last: Zheng
Author-Name: Jie Shen
Author-X-Name-First: Jie
Author-X-Name-Last: Shen
Title: Goodness-of-fit measures of R2 for repeated measures mixed effect models
Abstract:
Linear mixed effects model (LMEM) is efficient in modeling repeated
measures longitudinal data. However, little research has been done in
developing goodness-of-fit measures that can evaluate the models,
particularly those that can be interpreted in an absolute sense without
referencing a null model. This paper proposes three coefficients of
determination (R2) as goodness-of-fit measures for LMEM with repeated
measures longitudinal data. Theorems are presented describing the
properties of R2 and relationships between the R2 statistics. A simulation
study was conducted to evaluate and compare the R2 along with other
criteria from the literature. Finally, we applied the proposed R2 to real
virologic response data from an HIV-patient cohort. We conclude that our
proposed R2 statistics have more advantages than other goodness-of-fit
measures in the literature, in terms of robustness to sample size,
intuitive interpretation, well-defined range, and no need to specify
a null model.
Journal: Journal of Applied Statistics
Pages: 1081-1092
Issue: 10
Volume: 35
Year: 2008
Keywords: repeated measures, R-square, linear mixed effects model, fixed effects, random effects, simulation,
X-DOI: 10.1080/02664760802124422
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802124422
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1081-1092
Template-Type: ReDIF-Article 1.0
Author-Name: Lawrence Raffalovich
Author-X-Name-First: Lawrence
Author-X-Name-Last: Raffalovich
Author-Name: Glenn Deane
Author-X-Name-First: Glenn
Author-X-Name-Last: Deane
Author-Name: David Armstrong
Author-X-Name-First: David
Author-X-Name-Last: Armstrong
Author-Name: Hui-Shien Tsao
Author-X-Name-First: Hui-Shien
Author-X-Name-Last: Tsao
Title: Model selection procedures in social research: Monte-Carlo simulation results
Abstract:
Model selection strategies play an important, if not explicit, role in
quantitative research. The inferential properties of these strategies are
largely unknown; therefore, there is little basis for recommending (or
avoiding) any particular set of strategies. In this paper, we evaluate
several commonly used model selection procedures [Bayesian information
criterion (BIC), adjusted R2, Mallows' Cp, Akaike information criteria
(AIC), AICc, and stepwise regression] using Monte-Carlo simulation of
model selection when the true data generating processes (DGP) are known.
We find that the ability of these selection procedures to include
important variables and exclude irrelevant variables increases with the
size of the sample and decreases with the amount of noise in the model.
None of the model selection procedures do well in small samples, even when
the true DGP is largely deterministic; thus, data mining in small samples
should be avoided entirely. Instead, the implicit uncertainty in model
specification should be explicitly discussed. In large samples, BIC is
better than the other procedures at correctly identifying most of the
generating processes we simulated, and stepwise does almost as well. In
the absence of strong theory, both BIC and stepwise appear to be
reasonable model selection strategies in large samples. Under the
conditions simulated, adjusted R2, Mallows' Cp, AIC, and AICc are clearly
inferior and should be avoided.
Journal: Journal of Applied Statistics
Pages: 1093-1114
Issue: 10
Volume: 35
Year: 2008
Keywords: model selection, BIC, AIC, stepwise regression,
X-DOI: 10.1080/03081070802203959
File-URL: http://www.tandfonline.com/doi/abs/10.1080/03081070802203959
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1093-1114
Template-Type: ReDIF-Article 1.0
Author-Name: Donatella Vicari
Author-X-Name-First: Donatella
Author-X-Name-Last: Vicari
Author-Name: Johan Rene Van Dorp
Author-X-Name-First: Johan Rene
Author-X-Name-Last: Van Dorp
Author-Name: Samuel Kotz
Author-X-Name-First: Samuel
Author-X-Name-Last: Kotz
Title: Two-sided generalized Topp and Leone (TS-GTL) distributions
Abstract:
Over 50 years ago, in a 1955 issue of JASA, a paper on a bounded
continuous distribution by Topp and Leone [C.W. Topp and F.C. Leone, A
family of J-shaped frequency functions, J. Am. Stat. Assoc. 50(269)
(1955), pp. 209-219] appeared (the subject was dormant for over 40 years
but recently the family was resurrected). Here, we shall investigate the
so-called Two-Sided Generalized Topp and Leone (TS-GTL) distributions.
This family of distributions is constructed by extending the Generalized
Two-Sided Power (GTSP) family to a new two-sided framework of
distributions, where the first (second) branch arises from the
distribution of the largest (smallest) order statistic. The TS-GTL
distribution is generated from this framework by sampling from a slope
(reflected slope) distribution for the first (second) branch. The
resulting five-parameter TS-GTL family of distributions turns out to be
flexible, encompassing the uniform, triangular, GTSP and two-sided slope
distributions into a single family. In addition, the probability density
functions may have bimodal shapes or shapes with a jump
discontinuity at the 'threshold' parameter. We will discuss some
properties of the TS-GTL family and describe a maximum likelihood
estimation (MLE) procedure. A numerical example of the MLE procedure is
provided by means of a bimodal Galaxy M87 data set concerning V-I color
indices of 80 globular clusters. A comparison with a Gaussian mixture fit
is presented.
Journal: Journal of Applied Statistics
Pages: 1115-1129
Issue: 10
Volume: 35
Year: 2008
Keywords: bimodal distribution, maximum likelihood estimation, order statistics,
X-DOI: 10.1080/02664760802230583
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802230583
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1115-1129
Template-Type: ReDIF-Article 1.0
Author-Name: Terence Mills
Author-X-Name-First: Terence
Author-X-Name-Last: Mills
Title: Predicting body fat using weight-height indices
Abstract:
While body fat is the most accurate measure of obesity, its measurement
requires special equipment that can be costly and time-consuming to
operate. Attention has thus typically focused on the easier-to-calculate
body mass index (BMI). However, the ability of BMI to accurately identify
obesity has been increasingly questioned. This paper focuses attention on
whether more general body mass indices are appropriate measures of body
fat. Using a data set of body fat, height, and weight measurements,
general models are estimated which nest a wide variety of weight-height
indices as special cases. In the absence of a race and gender
categorisation, the conventional BMI was found to be the appropriate index
with which to predict body fat. When such a categorisation was made,
however, the BMI was never selected as the appropriate index. In general,
predicted female body fat was some 10 kg higher than that of a male
of identical build and predicted % body fat was over 11 percentage points
higher, but age effects were smaller for females. Considerable racial
differences in predicted body fat were found for males, but such
differences were less marked for females. The implications of this finding
for interpreting recent research on the effect of obesity on health,
society, and economic factors are considered.
Journal: Journal of Applied Statistics
Pages: 1131-1138
Issue: 10
Volume: 35
Year: 2008
Keywords: body fat, BMI, height-weight indices, obesity,
X-DOI: 10.1080/02664760802264707
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802264707
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1131-1138
Template-Type: ReDIF-Article 1.0
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Title: Estimation of the two-parameter bathtub-shaped lifetime distribution with progressive censoring
Abstract:
In this paper, we investigate the estimation problem concerning a
progressively type-II censored sample from the two-parameter
bathtub-shaped lifetime distribution. We use the maximum likelihood method
to obtain the point estimators of the parameters. We also provide a method
for constructing an exact confidence interval and an exact joint
confidence region for the parameters. Two numerical examples are presented
to illustrate the method of inference developed here. Finally, Monte Carlo
simulation studies are used to assess the performance of our proposed
method.
Journal: Journal of Applied Statistics
Pages: 1139-1150
Issue: 10
Volume: 35
Year: 2008
Keywords: confidence interval, hazard function, joint confidence region, maximum likelihood estimator, pivot, progressive type-II censoring,
X-DOI: 10.1080/02664760802264996
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802264996
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1139-1150
Template-Type: ReDIF-Article 1.0
Author-Name: Subburaj Ramasamy
Author-X-Name-First: Subburaj
Author-X-Name-Last: Ramasamy
Author-Name: Gopal Govindasamy
Author-X-Name-First: Gopal
Author-X-Name-Last: Govindasamy
Title: A software reliability growth model addressing learning
Abstract:
Goel proposed a generalization of the Goel-Okumoto (G-O) software
reliability growth model (SRGM), in order to model the failure intensity
function, i.e. the rate of occurrence of failures (ROCOF) that initially
increases and then decreases (I/D), which occurs in many projects due to
the learning phenomenon of the testing team and a few other causes. The
ROCOF of the generalized non-homogeneous Poisson process (NHPP) model can
be expressed in the same mathematical form as that of a two-parameter
Weibull function. However, this SRGM is susceptible to wide fluctuations
in time between failures and sometimes it seems unable to recognize the
I/D pattern of ROCOF present in the datasets and hence does not adequately
describe such data. The authors therefore propose a shifted Weibull
function ROCOF instead for the generalized NHPP model. This modification
to the Goel-generalized NHPP model results in an SRGM that seems to
perform better consistently, as confirmed by the goodness of fit statistic
and predictive validity metrics, when applied to failure datasets of 11
software projects with widely varying characteristics. A case study on
software release time determination using the proposed SRGM is also given.
Journal: Journal of Applied Statistics
Pages: 1151-1168
Issue: 10
Volume: 35
Year: 2008
Keywords: failure intensity function, goodness of fit statistic, mean value function, NHPP model, predictive validity, SRGM,
X-DOI: 10.1080/02664760802270621
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802270621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1151-1168
Template-Type: ReDIF-Article 1.0
Author-Name: Stavros Degiannakis
Author-X-Name-First: Stavros
Author-X-Name-Last: Degiannakis
Title: ARFIMAX and ARFIMAX-TARCH realized volatility modeling
Abstract:
ARFIMAX models are applied in estimating the intra-day realized
volatility of the CAC40 and DAX30 indices. Volatility clustering and
asymmetry characterize the logarithmic realized volatility of both
indices. The ARFIMAX model with time-varying conditional
heteroskedasticity is the best performing specification and, at least in
the case of DAX30, provides statistically superior next trading day's
realized volatility forecasts.
Journal: Journal of Applied Statistics
Pages: 1169-1180
Issue: 10
Volume: 35
Year: 2008
Keywords: ARFIMAX, realized volatility, TARCH, volatility forecasting,
X-DOI: 10.1080/02664760802271017
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802271017
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1169-1180
Template-Type: ReDIF-Article 1.0
Author-Name: Lin-An Chen
Author-X-Name-First: Lin-An
Author-X-Name-Last: Chen
Author-Name: Hsien-Chueh Peter Yang
Author-X-Name-First: Hsien-Chueh
Author-X-Name-Last: Peter Yang
Author-Name: Chau-Shyun Tang
Author-X-Name-First: Chau-Shyun
Author-X-Name-Last: Tang
Title: Mode type quasi-range and its applications
Abstract:
Building on the consideration of closeness, we propose the mode
quasi-range as an alternative scale parameter. The application of this
scale parameter to formulating the population standard deviation is
investigated, leading to a sample estimator of the standard deviation that
is efficient from the point of view of asymptotic variance. Monte Carlo
studies, in terms of finite-sample efficiency and robustness of the
breakdown point, have been performed for the sample mode quasi-range. This
study reveals that the closeness-based mode quasi-range is satisfactory
because the statistical procedures based on it are efficient and less
misleading when drawing conclusions from the sample results.
Journal: Journal of Applied Statistics
Pages: 1181-1192
Issue: 10
Volume: 35
Year: 2008
Keywords: breakdown point, range, robustness, quasi-range, scale parameter,
X-DOI: 10.1080/02664760802271082
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802271082
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1181-1192
Template-Type: ReDIF-Article 1.0
Author-Name: Abbas Moghimbeigi
Author-X-Name-First: Abbas
Author-X-Name-Last: Moghimbeigi
Author-Name: Mohammed Reza Eshraghian
Author-X-Name-First: Mohammed Reza
Author-X-Name-Last: Eshraghian
Author-Name: Kazem Mohammad
Author-X-Name-First: Kazem
Author-X-Name-Last: Mohammad
Author-Name: Brian Mcardle
Author-X-Name-First: Brian
Author-X-Name-Last: Mcardle
Title: Multilevel zero-inflated negative binomial regression modeling for over-dispersed count data with extra zeros
Abstract:
Count data with excess zeros often occur in areas such as public health,
epidemiology, psychology, sociology, engineering, and agriculture.
Zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial
(ZINB) regression are useful for modeling such data, but because of the
hierarchical study design or the data collection procedure, zero-inflation
and correlation may occur simultaneously. To overcome these challenges,
ZIP or ZINB regression may still be used. In this paper, multilevel ZINB
regression is used to overcome these problems. The parameters are
estimated by an expectation-maximization algorithm in conjunction with
penalized likelihood and restricted maximum likelihood estimates for the
variance components. Alternative modeling strategies, namely the ZIP
distribution, are also considered. An application of the proposed model is
shown on decayed, missing, and filled teeth of children aged 12 years.
Journal: Journal of Applied Statistics
Pages: 1193-1202
Issue: 10
Volume: 35
Year: 2008
Keywords: count data, EM algorithm, multilevel, negative binomial regression, Poisson regression, zero-inflation,
X-DOI: 10.1080/02664760802273203
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802273203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:10:p:1193-1202
Template-Type: ReDIF-Article 1.0
Author-Name: Zeinab Amin
Author-X-Name-First: Zeinab
Author-X-Name-Last: Amin
Title: Bayesian inference for the Pareto lifetime model under progressive censoring with binomial removals
Abstract:
This paper considers the estimation and prediction problems when
lifetimes are Pareto-distributed and are collected under Type II
progressive censoring with random removals, where the number of units
removed at each failure time follows a binomial distribution. The analysis
is carried out within the Bayesian context.
Journal: Journal of Applied Statistics
Pages: 1203-1217
Issue: 11
Volume: 35
Year: 2008
Keywords: Bayesian estimation, Bayesian prediction, Gibbs sampling, missing data, natural conjugate prior, non-informative prior, Pareto distribution, progressive censoring, Type II censoring, total test time remaining,
X-DOI: 10.1080/09537280802187634
File-URL: http://www.tandfonline.com/doi/abs/10.1080/09537280802187634
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1203-1217
Template-Type: ReDIF-Article 1.0
Author-Name: Eun Sug Park
Author-X-Name-First: Eun Sug
Author-X-Name-Last: Park
Author-Name: Roger Smith
Author-X-Name-First: Roger
Author-X-Name-Last: Smith
Author-Name: Thomas Freeman
Author-X-Name-First: Thomas
Author-X-Name-Last: Freeman
Author-Name: Clifford Spiegelman
Author-X-Name-First: Clifford
Author-X-Name-Last: Spiegelman
Title: A Bayesian approach for improved pavement performance prediction
Abstract:
We present a method for predicting future pavement distresses such as
longitudinal cracking. These predicted distress values are used to plan
road repairs. Pavement cracking data are characterized by large inherent
variability in measured cracking and an extremely small number of
observations, which calls for a parametric Bayesian approach. We model
theoretical pavement distress with a sigmoidal equation with coefficients
based on prior engineering knowledge. We show that a Bayesian formulation
akin to Kalman filtering gives sensible predictions and provides
defensible uncertainty statements for predictions. The method is
demonstrated on data collected by the Texas Transportation Institute at
several sites in Texas. The predictions behave in a reasonable and
statistically valid manner.
Journal: Journal of Applied Statistics
Pages: 1219-1238
Issue: 11
Volume: 35
Year: 2008
Keywords: pavement management information system, Bayesian adjustment, state-space models, Kalman filtering, Markov chain Monte Carlo,
X-DOI: 10.1080/02664760802318651
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802318651
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1219-1238
Template-Type: ReDIF-Article 1.0
Author-Name: V. G. Cancho
Author-X-Name-First: V. G.
Author-X-Name-Last: Cancho
Author-Name: Reiko Aoki
Author-X-Name-First: Reiko
Author-X-Name-Last: Aoki
Author-Name: V. H. Lachos
Author-X-Name-First: V. H.
Author-X-Name-Last: Lachos
Title: Bayesian analysis for a skew extension of the multivariate null intercept measurement error model
Abstract:
The skew-normal distribution is a class of distributions that includes the
normal distribution as a special case. In this paper, we explore the use
of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis
in a multivariate, null intercept, measurement error model [R. Aoki, H.
Bolfarine, J.A. Achcar, and D. Leao Pinto Jr, Bayesian analysis of a
multivariate null intercept error-in-variables regression model, J.
Biopharm. Stat. 13(4) (2003b), pp. 763-771] where the unobserved value of
the covariate (latent variable) follows a skew-normal distribution. The
results and methods are applied to a real dental clinical trial presented
in [A. Hadgu and G. Koch, Application of generalized estimating equations
to a dental randomized clinical trial, J. Biopharm. Stat. 9 (1999), pp.
161-178].
Journal: Journal of Applied Statistics
Pages: 1239-1251
Issue: 11
Volume: 35
Year: 2008
Keywords: Skew-normal distribution, Gibbs algorithm, skewness, multivariate null intercepts model, measurement error,
X-DOI: 10.1080/02664760802319667
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802319667
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1239-1251
Template-Type: ReDIF-Article 1.0
Author-Name: M. D. Ugarte
Author-X-Name-First: M. D.
Author-X-Name-Last: Ugarte
Author-Name: A. F. Militino
Author-X-Name-First: A. F.
Author-X-Name-Last: Militino
Author-Name: T. Goicoa
Author-X-Name-First: T.
Author-X-Name-Last: Goicoa
Title: Adjusting economic estimates in business surveys
Abstract:
Statistics for small areas within larger regions are increasingly required
for many economic variables. However, when the estimates of the small
areas within a larger region are added up, the results do not match
those obtained with the appropriate estimator originally derived for the
larger region. To avoid discrepancies between estimates, benchmarking
methods are commonly used in practice. In this paper, we discuss the
suitability of using a restricted predictor versus a traditional direct
calibrated estimator. The results are illustrated with the 2000 Business
Survey of the Basque Country, Spain.
Journal: Journal of Applied Statistics
Pages: 1253-1265
Issue: 11
Volume: 35
Year: 2008
Keywords: benchmarking, restricted predictor, prorating estimator, linear mixed model, EBLUP, synthetic,
X-DOI: 10.1080/02664760802319709
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802319709
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1253-1265
Template-Type: ReDIF-Article 1.0
Author-Name: Zhang Wu
Author-X-Name-First: Zhang
Author-X-Name-Last: Wu
Author-Name: Jianxin Jiao
Author-X-Name-First: Jianxin
Author-X-Name-Last: Jiao
Author-Name: Ying Liu
Author-X-Name-First: Ying
Author-X-Name-Last: Liu
Title: A binomial CUSUM chart for detecting large shifts in fraction nonconforming
Abstract:
This article studies a unique feature of the binomial CUSUM chart in
which the difference (d_t - d_0) is replaced by (d_t - d_0)^2 in the formulation of
the cumulative sum C_t (where d_t and d_0 are the actual and in-control
numbers of nonconforming units, respectively, in a sample). Performance
studies are reported and the results reveal that this new feature is able
to increase the detection effectiveness when the fraction nonconforming p
becomes three to four times as large as the in-control value p_0. The
design of the new binomial CUSUM chart is presented along with the
calculation of the in-control and out-of-control Average Run Lengths (ARL_0
and ARL_1).
Journal: Journal of Applied Statistics
Pages: 1267-1276
Issue: 11
Volume: 35
Year: 2008
Keywords: quality control, statistical process control, attribute control chart, binomial CUSUM control chart, fraction nonconforming,
X-DOI: 10.1080/02664760802320533
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802320533
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1267-1276
Template-Type: ReDIF-Article 1.0
Author-Name: Md. Mostafizur Rahman
Author-X-Name-First: Md. Mostafizur
Author-X-Name-Last: Rahman
Author-Name: Jian-Ping Zhu
Author-X-Name-First: Jian-Ping
Author-X-Name-Last: Zhu
Author-Name: M. Sayedur Rahman
Author-X-Name-First: M. Sayedur
Author-X-Name-Last: Rahman
Title: Impact study of volatility modelling of Bangladesh stock index using non-normal density
Abstract:
This article examines a wide variety of popular volatility models for
stock index return, including the random walk (RW), autoregressive,
generalized autoregressive conditional heteroscedasticity (GARCH), and
asymmetric GARCH models with normal and non-normal (Student's t and
generalized error) distributional assumptions. Fitting these models to the
Chittagong stock index return data from the period 2 January 1999 to 29
December 2005, we found that the asymmetric GARCH/GARCH model fits better
under the assumption of non-normal distribution than under normal
distribution. Non-parametric specification tests show that the RW-GARCH,
RW-TGARCH, RW-EGARCH, and RW-APARCH models under the Student's
t-distributional assumption are significant at the 5% level. Finally, the
study suggests that these four models are suitable for the Chittagong
Stock Exchange of Bangladesh. We believe that this study would be of great
benefit to investors and policy makers at home and abroad.
Journal: Journal of Applied Statistics
Pages: 1277-1292
Issue: 11
Volume: 35
Year: 2008
Keywords: random walk, GARCH, asymmetric GARCH, non-parametric specification test, Student's t-distribution, generalized error distribution,
X-DOI: 10.1080/02664760802320574
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802320574
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1277-1292
Template-Type: ReDIF-Article 1.0
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Author-Name: Chun-Tao Chang
Author-X-Name-First: Chun-Tao
Author-X-Name-Last: Chang
Author-Name: Kang-Jun Liao
Author-X-Name-First: Kang-Jun
Author-X-Name-Last: Liao
Author-Name: Syuan-Rong Huang
Author-X-Name-First: Syuan-Rong
Author-X-Name-Last: Huang
Title: Planning of progressive group-censoring life tests with cost considerations
Abstract:
This paper considers a life test under progressive type I group censoring
with a Weibull failure time distribution. The maximum likelihood method is
used to derive the estimators of the parameters of the failure time
distribution. In practice, several variables, such as the number of test
units, the number of inspections, and the length of inspection interval
are related to the precision of estimation and the cost of experiment. An
inappropriate setting of these decision variables not only wastes the
resources of the experiment but also reduces the precision of estimation.
One problem arising from designing a life test is the restricted budget of
experiment. Therefore, under the constraint that the total cost of
experiment does not exceed a pre-determined budget, this paper provides an
algorithm to determine the optimal decision variables by considering three
different criteria. An example is discussed to illustrate the proposed
method. A sensitivity analysis is also conducted.
Journal: Journal of Applied Statistics
Pages: 1293-1304
Issue: 11
Volume: 35
Year: 2008
Keywords: A-optimality, D-optimality, E-optimality, grouped data, maximum likelihood method, progressive censoring, Weibull distribution,
X-DOI: 10.1080/02664760802382392
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382392
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1293-1304
Template-Type: ReDIF-Article 1.0
Author-Name: Dong Wan Shin
Author-X-Name-First: Dong Wan
Author-X-Name-Last: Shin
Author-Name: Yoon Young Jung
Author-X-Name-First: Yoon Young
Author-X-Name-Last: Jung
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Title: Double unit root tests for cross-sectionally dependent panel data
Abstract:
This paper proposes various double unit root tests for cross-sectionally
dependent panel data. The cross-sectional correlation is handled by the
projection method [P.C.B. Phillips and D. Sul, Dynamic panel estimation
and homogeneity testing under cross section dependence, Econom. J. 6
(2003), pp. 217-259; H.R. Moon and B. Perron, Testing for a unit root in
panels with dynamic factors, J. Econom. 122 (2004), pp. 81-126] or the
subtraction method [J. Bai and S. Ng, A PANIC attack on unit roots and
cointegration, Econometrica 72 (2004), pp. 1127-1177]. Pooling or
averaging is applied to combine results from different panel units. Also,
to estimate autoregressive parameters the ordinary least squares
estimation [D.P. Hasza and W.A. Fuller, Estimation for autoregressive
processes with unit roots, Ann. Stat. 7 (1979), pp. 1106-1120] or the
symmetric estimation [D.L. Sen and D.A. Dickey, Symmetric test for second
differencing in univariate time series, J. Bus. Econ. Stat. 5 (1987), pp.
463-473] are used, and to adjust mean functions the ordinary mean
adjustment or the recursive mean adjustment are used. Combinations of
different methods in defactoring to eliminate the cross-sectional
dependency, integrating results from panel units, estimating the
parameters, and adjusting mean functions yield various available tests
for double unit roots in panel data. Simple asymptotic distributions of
the proposed test statistics are derived, which can be used to find
critical values of the test statistics. We perform a Monte Carlo
experiment to compare the performance of these tests and to suggest
optimal tests for a given panel data set. Application of the proposed tests to
real data, the yearly export panel data sets of several Latin-American
countries for the past 50 years, illustrates the usefulness of the
proposed tests for panel data, in that they reveal stronger evidence of
double unit roots than the componentwise double unit root tests of Hasza
and Fuller [Estimation for autoregressive processes with unit roots, Ann.
Stat. 7 (1979), pp. 1106-1120] or Sen and Dickey [Symmetric test for
second differencing in univariate time series, J. Bus. Econ. Stat. 5
(1987), pp. 463-473].
Journal: Journal of Applied Statistics
Pages: 1305-1321
Issue: 11
Volume: 35
Year: 2008
Keywords: panel double unit roots, defactoring, recursive adjustment, symmetric estimation, nonstationarity,
X-DOI: 10.1080/02664760802382400
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382400
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:11:p:1305-1321
Template-Type: ReDIF-Article 1.0
Author-Name: Ronny Vallejos
Author-X-Name-First: Ronny
Author-X-Name-Last: Vallejos
Title: Assessing the association between two spatial or temporal sequences
Abstract:
This paper deals with the codispersion coefficient for spatial and
temporal series. We present some results and simulations concerning the
codispersion coefficient in the context of spatial models. The results
obtained are immediate consequences of the asymptotic normality of the
sample codispersion coefficient and show certain limitations of the
coefficient. New simulation studies provide information about the
performance of the coefficient with respect to other coefficients of
spatial association. The behavior of the codispersion coefficient under
additively contaminated processes is also studied via Monte Carlo
simulations. In the context of time series, explicit expressions for the
asymptotic variance of the sample version of the coefficient are given for
autoregressive and moving average processes. Resampling methods are used
to compute the variance of the coefficient. A real data example is
presented to explore how well the codispersion coefficient captures the
comovement between two time series in practice.
Journal: Journal of Applied Statistics
Pages: 1323-1343
Issue: 12
Volume: 35
Year: 2008
Keywords: spatial association, autoregressive models, correlation coefficient, codispersion coefficient, time series,
X-DOI: 10.1080/02664760802382418
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1323-1343
Template-Type: ReDIF-Article 1.0
Author-Name: M. Jahanshahi
Author-X-Name-First: M.
Author-X-Name-Last: Jahanshahi
Author-Name: M. H. Sanati
Author-X-Name-First: M. H.
Author-X-Name-Last: Sanati
Author-Name: Z. Babaei
Author-X-Name-First: Z.
Author-X-Name-Last: Babaei
Title: Optimization of parameters for the fabrication of gelatin nanoparticles by the Taguchi robust design method
Abstract:
The Taguchi method is a statistical approach that overcomes the limitations
of factorial and fractional factorial experiments by simplifying and
standardizing the fractional factorial design. The objective of this study
was to optimize the fabrication of gelatin nanoparticles by applying the
Taguchi design method. Gelatin nanoparticles have been extensively studied
in our previous works as an appropriate carrier for drug delivery, since
they are biodegradable, non-toxic, are not usually contaminated with
pyrogens and possess relatively low antigenicity. The Taguchi method with an
L16 orthogonal array robust design was implemented to optimize the
experimental conditions for this purpose. Four key process parameters - temperature,
gelatin concentration, agitation speed and the amount of acetone - were
considered for the optimization of gelatin nanoparticles. As a result of
the Taguchi analysis in this study, temperature and the amount of acetone were
the parameters most influencing the particle size. For characterizing the
nanoparticle sample, atomic force microscope and scanning electron
microscope were employed. In this study, a minimum size of gelatin
nanoparticles was obtained at 50 °C temperature, 45 mg/ml gelatin
concentration, 80 ml acetone and 700 rpm agitation speed. The nanoparticle
size at the determined condition was less than 174 nm.
Journal: Journal of Applied Statistics
Pages: 1345-1353
Issue: 12
Volume: 35
Year: 2008
Keywords: gelatin, drug carrier, nanoparticles, optimization, Taguchi method, statistical experimental design,
X-DOI: 10.1080/02664760802382426
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382426
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1345-1353
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Austin
Author-X-Name-First: Peter
Author-X-Name-Last: Austin
Title: The large-sample performance of backwards variable elimination
Abstract:
Prior studies have shown that automated variable selection results in
models with substantially inflated estimates of the model R^2, and that a
large proportion of selected variables are truly noise variables. These
earlier studies used simulated data sets whose sample sizes were at most
100. We used Monte Carlo simulations to examine the large-sample
performance of backwards variable elimination. We found that in large
samples, backwards variable elimination resulted in estimates of R^2 that
were at most marginally biased. However, even in large samples, backwards
elimination tended to identify the correct regression model in a minority
of the simulated data sets.
Journal: Journal of Applied Statistics
Pages: 1355-1370
Issue: 12
Volume: 35
Year: 2008
Keywords: variable selection methods, model selection methods, regression models, Monte Carlo simulations, backwards variable elimination,
X-DOI: 10.1080/02664760802382434
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382434
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1355-1370
Template-Type: ReDIF-Article 1.0
Author-Name: Shaul Bar-Lev
Author-X-Name-First: Shaul
Author-X-Name-Last: Bar-Lev
Title: Point and confidence interval estimates for a global maximum via extreme value theory
Abstract:
The aim of this paper is to provide some practical aspects of point and
interval estimates of the global maximum of a function using extreme value
theory. Consider a real-valued function f: D → ℝ defined on a
bounded interval D such that f is either not known analytically or is
known analytically but has a rather complicated analytic form. We assume
that f possesses a global maximum attained, say, at u* ∈ D with
maximal value x* = max_{u ∈ D} f(u) ≐ f(u*). The problem of seeking the
optimum of a function which is more or less unknown to the observer has
resulted in the development of a large variety of search techniques. In
this paper we use the extreme-value approach as appears in Dekkers et al.
[A moment estimator for the index of an extreme-value distribution, Ann.
Statist. 17 (1989), pp. 1833-1855] and de Haan [Estimation of the minimum
of a function using order statistics, J. Amer. Statist. Assoc. 76 (1981),
pp. 467-469]. We impose some Lipschitz conditions on the functions being
investigated and through repeated simulation-based samplings, we provide
various practical interpretations of the parameters involved as well as
point and interval estimates for x*.
Journal: Journal of Applied Statistics
Pages: 1371-1381
Issue: 12
Volume: 35
Year: 2008
Keywords: extreme value theory, global maximum, search techniques,
X-DOI: 10.1080/02664760802382442
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382442
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1371-1381
Template-Type: ReDIF-Article 1.0
Author-Name: Krishna Saha
Author-X-Name-First: Krishna
Author-X-Name-Last: Saha
Title: Semiparametric estimation for the dispersion parameter in the analysis of over- or underdispersed count data
Abstract:
This paper investigates several semiparametric estimators of the
dispersion parameter in the analysis of over- or underdispersed count data
when there is no likelihood available. In the context of estimating the
dispersion parameter, we consider the double-extended quasi-likelihood
(DEQL), the pseudo-likelihood and the optimal quadratic estimating (OQE)
equations method and compare them with the maximum likelihood method, the
method of moments and the extended quasi-likelihood through simulation
study. The simulation study shows that the estimator based on the DEQL has
superior bias and efficiency property for moderate and large sample size,
and for small sample size the estimator based on the OQE equations
outperforms the other estimators. Three real-life data sets arising in
biostatistical practice are analyzed, and the findings from these
analyses are quite similar to those from the simulation study.
Journal: Journal of Applied Statistics
Pages: 1383-1397
Issue: 12
Volume: 35
Year: 2008
Keywords: dispersion parameter, maximum likelihood, negative binomial model, semiparametric procedures, toxicological data,
X-DOI: 10.1080/02664760802382459
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382459
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1383-1397
Template-Type: ReDIF-Article 1.0
Author-Name: P. Angelopoulos
Author-X-Name-First: P.
Author-X-Name-Last: Angelopoulos
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Title: Some robust parameter designs from orthogonal arrays
Abstract:
Robust parameter design, originally proposed by Taguchi [System of
Experimental Design, Vols. I and II, UNIPUB, New York, 1987], is an
offline production technique for reducing variation and improving a
product's quality by using product arrays. However, the use of the product
arrays results in an exorbitant number of runs. To overcome this drawback,
several scientists proposed the use of combined arrays, where the control
and noise factors are combined in a single array. In this paper, we use
non-isomorphic orthogonal arrays as combined arrays, in order to identify
a model that contains all the main effects (control and noise), their
control-by-noise interactions and their control-by-control interactions
with high efficiency. Some cases where the control-by-control-by-noise
interactions are of interest are also considered.
Journal: Journal of Applied Statistics
Pages: 1399-1408
Issue: 12
Volume: 35
Year: 2008
Keywords: robust parameter design, combined array, control and noise factors, orthogonal arrays, identifiable models, validation,
X-DOI: 10.1080/02664760802382467
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1399-1408
Template-Type: ReDIF-Article 1.0
Author-Name: Arnab Maity
Author-X-Name-First: Arnab
Author-X-Name-Last: Maity
Author-Name: Michael Sherman
Author-X-Name-First: Michael
Author-X-Name-Last: Sherman
Title: On adaptive linear regression
Abstract:
Ordinary least squares (OLS) is omnipresent in regression modeling.
Occasionally, least absolute deviations (LAD) or other methods are used as
an alternative when there are outliers. Although some data adaptive
estimators have been proposed, they are typically difficult to implement.
In this paper, we propose an easy-to-compute adaptive estimator which is
simply a linear combination of OLS and LAD. We demonstrate large-sample
normality of our estimator and show that its performance is close to best
for both light-tailed (e.g. normal and uniform) and heavy-tailed (e.g.
double exponential and t_3) error distributions. We demonstrate this
through three simulation studies and illustrate our method on state public
expenditures and luteinizing hormone data sets. We conclude that our method
is general and easy to use, and that it gives good efficiency across a wide
range of error distributions.
Journal: Journal of Applied Statistics
Pages: 1409-1422
Issue: 12
Volume: 35
Year: 2008
Keywords: adaptive regression, heavy-tailed error, least absolute deviation regression, mean squared error, ordinary least-squares regression,
X-DOI: 10.1080/02664760802382475
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382475
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1409-1422
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Pienaar
Author-X-Name-First: Jacques
Author-X-Name-Last: Pienaar
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 1423-1424
Issue: 12
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802193328
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802193328
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1423-1424
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Pienaar
Author-X-Name-First: Jacques
Author-X-Name-Last: Pienaar
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 1425-1426
Issue: 12
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802193336
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802193336
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1425-1426
Template-Type: ReDIF-Article 1.0
Author-Name: Stuart Barber
Author-X-Name-First: Stuart
Author-X-Name-Last: Barber
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 1427-1428
Issue: 12
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760802366742
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802366742
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:12:p:1427-1428
Template-Type: ReDIF-Article 1.0
Author-Name: A. Mukhopadhyay
Author-X-Name-First: A.
Author-X-Name-Last: Mukhopadhyay
Author-Name: A. Iqbal
Author-X-Name-First: A.
Author-X-Name-Last: Iqbal
Title: Prediction of mechanical property of steel strips using multivariate adaptive regression splines
Abstract:
In recent times, the problem of prediction of properties of a steel strip
has attracted enormous attention from different communities such as
statistics, data mining, soft computing, and engineering. This is due to
the prospective benefits of reduction in testing and inventory cost,
increase in yield, and improvement in delivery compliance. The complexity
of the problem arises due to its dependency on the chemical composition of
the steel, and a number of processing parameters. To predict the
mechanical properties of the strip (yield strength, ultimate tensile
strength, and elongation), a model based on multivariate adaptive
regression spline has been developed. It is found that the prediction
agrees well with the actual measured data.
Journal: Journal of Applied Statistics
Pages: 1-9
Issue: 1
Volume: 36
Year: 2009
Keywords: data mining, MARS, property prediction, soft computing, statistics, steel,
X-DOI: 10.1080/02664760802193252
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802193252
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:1-9
Template-Type: ReDIF-Article 1.0
Author-Name: Min Kim
Author-X-Name-First: Min
Author-X-Name-Last: Kim
Author-Name: Bong-Jin Yum
Author-X-Name-First: Bong-Jin
Author-X-Name-Last: Yum
Title: Reliability acceptance sampling plans for the Weibull distribution under accelerated Type-I censoring
Abstract:
Type-I censored reliability acceptance sampling plans (RASPs) are
developed for the Weibull lifetime distribution with unknown shape and
scale parameters such that the producer and consumer risks are satisfied.
It is assumed that the life test is conducted at an accelerated condition
for which the acceleration factor (AF) is known, and each item is
continuously monitored for failure. Sensitivity analyses are also
conducted to assess the effect of the uncertainty in the assumed AF on the
actual producer and consumer risks, and a method is developed for
constructing RASPs that can accommodate the uncertainty in AF.
Journal: Journal of Applied Statistics
Pages: 11-20
Issue: 1
Volume: 36
Year: 2009
Keywords: reliability acceptance sampling plan, Type-I censoring, producer risk, consumer risk, acceleration factor,
X-DOI: 10.1080/02664760802382483
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382483
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:11-20
Template-Type: ReDIF-Article 1.0
Author-Name: Jose Dias Curto
Author-X-Name-First: Jose Dias
Author-X-Name-Last: Curto
Author-Name: Jose Castro Pinto
Author-X-Name-First: Jose Castro
Author-X-Name-Last: Pinto
Title: The coefficient of variation asymptotic distribution in the case of non-iid random variables
Abstract:
Due to the widespread use of the coefficient of variation in empirical
finance, we derive its asymptotic sampling distribution in the case of
non-iid random variables to deal with autocorrelation and/or conditional
heteroskedasticity stylized facts of financial returns. We also propose
statistical tests for the comparison of two coefficients of variation
based on asymptotic normality and studentized time-series bootstrap. In an
illustrative example, we analyze the monthly return volatility of six
stock market indexes during the years 1990-2007.
Journal: Journal of Applied Statistics
Pages: 21-32
Issue: 1
Volume: 36
Year: 2009
Keywords: coefficient of variation, autocorrelation, conditional heteroskedasticity, non-iid random variables,
X-DOI: 10.1080/02664760802382491
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382491
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:21-32
Template-Type: ReDIF-Article 1.0
Author-Name: P. Angelopoulos
Author-X-Name-First: P.
Author-X-Name-Last: Angelopoulos
Author-Name: H. Evangelaras
Author-X-Name-First: H.
Author-X-Name-Last: Evangelaras
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Title: Model identification using 27 runs three level orthogonal arrays
Abstract:
In this paper we examine all the combinatorially non-isomorphic OA(27, q,
3, t), with 3≤q≤13 three-level quantitative factors, with
respect to model identification, estimation capacity and efficiency. We
use the popular D-efficiency criterion to evaluate the ability of each
design considered in estimating the parameters of a second-order model
with adequate efficiency. The prior selection of the 'middle' level of
factors plays an important role in the results.
Journal: Journal of Applied Statistics
Pages: 33-38
Issue: 1
Volume: 36
Year: 2009
Keywords: orthogonal arrays, quantitative factors, geometric isomorphism, hidden projection properties,
X-DOI: 10.1080/02664760802382509
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382509
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:33-38
Template-Type: ReDIF-Article 1.0
Author-Name: Claire Weston
Author-X-Name-First: Claire
Author-X-Name-Last: Weston
Author-Name: John Thompson
Author-X-Name-First: John
Author-X-Name-Last: Thompson
Title: The definition of start time in cancer treatment studies analysed by non-mixture cure models
Abstract:
Non-mixture cure models are derived from a simplified representation of
the biological process that takes place after treatment for cancer. These
models are intended to represent the time from the end of treatment to the
time of first recurrence of the cancer in studies when a proportion of
those treated are completely cured. However, for many studies, other start
times are more relevant. In a clinical trial, it may be more natural to
model the time from randomisation rather than the time from the end of
treatment and in an epidemiological study, the time from diagnosis might
be more meaningful. Some simulations and two real studies of childhood
cancer are presented to show that starting from time of diagnosis or
randomisation can affect the estimates of the cure fraction. The
susceptibility of different parametric kernels to errors caused by using
start times other than the end of treatment is also assessed. Analysing
failures on treatment and relapse after completing the treatment as two
processes offers a simple way of overcoming many of these problems.
Journal: Journal of Applied Statistics
Pages: 39-52
Issue: 1
Volume: 36
Year: 2009
Keywords: non-mixture cure model, parametric survival, paediatric cancer,
X-DOI: 10.1080/02664760802382517
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:39-52
Template-Type: ReDIF-Article 1.0
Author-Name: P. Economou
Author-X-Name-First: P.
Author-X-Name-Last: Economou
Author-Name: C. Caroni
Author-X-Name-First: C.
Author-X-Name-Last: Caroni
Title: Fitting parametric frailty and mixture models under biased sampling
Abstract:
Biased sampling from an underlying distribution with p.d.f. f(t),
t>0, implies that observations follow the weighted distribution with
p.d.f. f_w(t) = w(t)f(t)/E[w(T)] for a known weight function w. In
particular, the function w(t) = t^α has important applications,
including length-biased sampling (α=1) and area-biased sampling
(α=2). We first consider here the maximum likelihood estimation of
the parameters of a distribution f(t) under biased sampling from a
censored population in a proportional hazards frailty model where a
baseline distribution (e.g. Weibull) is mixed with a continuous frailty
distribution (e.g. Gamma). A right-censored observation contributes a term
proportional to w(t)S(t) to the likelihood; this is not the same as S_w(t),
so the problem of fitting the model does not simply reduce to fitting the
weighted distribution. We present results on the distribution of frailty
in the weighted distribution and develop an EM algorithm for estimating
the parameters of the model in the important Weibull-Gamma case. We also
give results for the case where f(t) is a finite mixture distribution.
Results are presented for uncensored data and for Type I right censoring.
Simulation results are presented, and the methods are illustrated on a set
of lifetime data.
Journal: Journal of Applied Statistics
Pages: 53-66
Issue: 1
Volume: 36
Year: 2009
Keywords: weighted distribution, biased sampling, frailty, finite mixture, Weibull distribution, Burr distribution, Type I right censoring, EM algorithm,
X-DOI: 10.1080/02664760802382525
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:53-66
Template-Type: ReDIF-Article 1.0
Author-Name: Myung-Hoe Huh
Author-X-Name-First: Myung-Hoe
Author-X-Name-Last: Huh
Author-Name: Yong Lim
Author-X-Name-First: Yong
Author-X-Name-Last: Lim
Title: Weighting variables in K-means clustering
Abstract:
The aim of this study is to assign weights w_1, …, w_m to m
clustering variables Z_1, …, Z_m, so that k groups are uncovered that
reveal more meaningful within-group coherence. We propose a new criterion
to be minimized, which is the sum of the weighted within-cluster sums of
squares and a penalty for the heterogeneity in the variable weights w_1,
…, w_m. We present the computing algorithm for such k-means
clustering, a working procedure to determine a suitable value of the
penalty constant, and numerical examples, of which one is simulated and
the other two are real.
Journal: Journal of Applied Statistics
Pages: 67-78
Issue: 1
Volume: 36
Year: 2009
Keywords: K-means clustering, variable weighting, penalty constant,
X-DOI: 10.1080/02664760802382533
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802382533
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:67-78
Template-Type: ReDIF-Article 1.0
Author-Name: Yih Su
Author-X-Name-First: Yih
Author-X-Name-Last: Su
Author-Name: Jing-Shiang Hwang
Author-X-Name-First: Jing-Shiang
Author-X-Name-Last: Hwang
Title: A two-phase approach to estimating time-varying parameters in the capital asset pricing model
Abstract:
Following the development of the economy and the diversification of
investment, mutual funds have become a popular investment tool. Choosing
excellent targets from hundreds of mutual funds has become increasingly
crucial to investors. The capital asset pricing model (CAPM) has been
widely used in the capital cost estimation and performance evaluation of
mutual funds. In this study, we propose a new two-phase approach to
estimating the time-varying parameters of CAPM. We implemented a
simulation study to evaluate the efficiency of the proposed method and
compared it with the commonly used state space and rolling regression
methods. The results showed that the new method is more efficient in most
scenarios. Moreover, the proposed approach is practical, as there is no
need to judge and adjust the estimation process for different
situations. Finally, we applied the proposed method to equity mutual funds
in the Taiwan stock market and reported the performances of two funds for
demonstration.
Journal: Journal of Applied Statistics
Pages: 79-89
Issue: 1
Volume: 36
Year: 2009
Keywords: CAPM, two-phase estimation, time-varying parameter,
X-DOI: 10.1080/02664760802443871
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443871
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:79-89
Template-Type: ReDIF-Article 1.0
Author-Name: Jamel Jouini
Author-X-Name-First: Jamel
Author-X-Name-Last: Jouini
Title: Analysis of structural break models based on the evolutionary spectrum: Monte Carlo study and application
Abstract:
We investigate the instability problem of the covariance structure of
time series by combining the non-parametric approach based on the
evolutionary spectral density theory of Priestley [Evolutionary spectra
and non-stationary processes, J. R. Statist. Soc., 27 (1965), pp. 204-237;
Wavelets and time-dependent spectral analysis, J. Time Ser. Anal., 17
(1996), pp. 85-103] and the parametric approach based on linear regression
models of Bai and Perron [Estimating and testing linear models with
multiple structural changes, Econometrica 66 (1998), pp. 47-78]. A Monte
Carlo study is presented to evaluate the performance of some parametric
testing and estimation procedures for models characterized by breaks in
variance. We attempt to see whether these procedures perform in the same
way as models characterized by mean-shifts as investigated by Bai and
Perron [Multiple structural change models: a simulation analysis, in:
Econometric Theory and Practice: Frontiers of Analysis and Applied
Research, D. Corbea, S. Durlauf, and B.E. Hansen, eds., Cambridge
University Press, 2006, pp. 212-237]. We also provide an analysis of
financial data series for which the stability of the covariance function
is doubtful.
Journal: Journal of Applied Statistics
Pages: 91-110
Issue: 1
Volume: 36
Year: 2009
Keywords: evolutionary spectrum, break dates, size and power, coverage rates, selection procedures,
X-DOI: 10.1080/02664760802443889
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443889
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:91-110
Template-Type: ReDIF-Article 1.0
Author-Name: Ulysses Brown
Author-X-Name-First: Ulysses
Author-X-Name-Last: Brown
Author-Name: Stephen Knouse
Author-X-Name-First: Stephen
Author-X-Name-Last: Knouse
Author-Name: James Stewart
Author-X-Name-First: James
Author-X-Name-Last: Stewart
Author-Name: Ruby Beale
Author-X-Name-First: Ruby
Author-X-Name-Last: Beale
Title: The relationship between unit diversity and perceptions of organizational performance in the military
Abstract:
Structural equation modeling techniques are used to examine the
relationship between demographic diversity and perceptions of
organizational performance in military units. Analyzing data from the
Military Equal Opportunity Climate Survey reveals that higher female and
minority representation reduces females' and minorities' perceptions of
organizational effectiveness, respectively. Identical factors appear to
influence the perceptions of organizational performance across these two
subgroups of employees. The results demonstrate the importance of
conducting separate analyses for subgroups in examining the effects of
demographic diversity on organizational performance.
Journal: Journal of Applied Statistics
Pages: 111-120
Issue: 1
Volume: 36
Year: 2009
Keywords: demographic diversity, military units, organizational performance, structural equation modeling,
X-DOI: 10.1080/02664760802443905
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:1:p:111-120
Template-Type: ReDIF-Article 1.0
Author-Name: M. Bebbington
Author-X-Name-First: M.
Author-X-Name-Last: Bebbington
Author-Name: C. D. Lai
Author-X-Name-First: C. D.
Author-X-Name-Last: Lai
Author-Name: R. Zitikis
Author-X-Name-First: R.
Author-X-Name-Last: Zitikis
Title: Modeling lactation curves: classical parametric models re-examined and modified
Abstract:
A large number of methods for modeling lactation curves have been
proposed - parametric and nonparametric, mathematically or biologically
oriented. The most popular of these are methods that express the milk
yield in terms of time via a parametric nonlinear functional equation.
This is intuitive and allows for relatively easy mathematical and
biological interpretations of the parameters involved. Interestingly, as
far as we are aware, all such models generate nonzero milk yields on the
whole positive time half-line, even though real lactation curves always
have finite range, with spans of approximately 300 days for dairy cows.
For this reason, we re-examine a number of existing parametric models, and
modify them to produce finite-range lactation curves that fit remarkably
well to data of milk yields from New Zealand cows. The use of daily or
weekly yields rather than the monthly yields normally considered reveals
considerable variation that is usually suppressed. Both individual and
herd lactation curves are examined in the present paper, and median-based
procedures explored as alternatives to the usual average-based methods.
These suggestions offer further insights into the existing literature on
modeling lactation curves.
Journal: Journal of Applied Statistics
Pages: 121-133
Issue: 2
Volume: 36
Year: 2009
Keywords: lactation curve, parametric function, Wood curve, finite-range modification,
X-DOI: 10.1080/02664760802443897
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443897
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:121-133
Template-Type: ReDIF-Article 1.0
Author-Name: Shey-Huei Sheu
Author-X-Name-First: Shey-Huei
Author-X-Name-Last: Sheu
Author-Name: Yu-Tai Hsieh
Author-X-Name-First: Yu-Tai
Author-X-Name-Last: Hsieh
Title: The extended GWMA control chart
Abstract:
This study extends the generally weighted moving average (GWMA) control
chart by imitating the double exponentially weighted moving average
(DEWMA) technique. The proposed chart is called the double generally
weighted moving average (DGWMA) control chart. Simulation is employed to
evaluate the average run length characteristics of the GWMA, DEWMA and
DGWMA control charts. An extensive comparison of these control charts
reveals that the DGWMA control chart with time-varying control limits is
more sensitive than the GWMA and the DEWMA control charts for detecting
medium shifts in the mean of a process when the shifts are between 0.5 and
1.5 standard deviations. Additionally, the GWMA control chart performs
better when the mean shift is below 0.5 standard deviations, and the
DEWMA control chart performs better when the mean shift is above 1.5
standard deviations. The design of the DGWMA control chart is also
discussed.
Journal: Journal of Applied Statistics
Pages: 135-147
Issue: 2
Volume: 36
Year: 2009
Keywords: GWMA control chart, DGWMA control chart, DEWMA control chart, average run length, time-varying control limits,
X-DOI: 10.1080/02664760802443913
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443913
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:135-147
Template-Type: ReDIF-Article 1.0
Author-Name: J. M. Vilar
Author-X-Name-First: J. M.
Author-X-Name-Last: Vilar
Author-Name: R. Cao
Author-X-Name-First: R.
Author-X-Name-Last: Cao
Author-Name: M. C. Ausin
Author-X-Name-First: M. C.
Author-X-Name-Last: Ausin
Author-Name: C. Gonzalez-Fragueiro
Author-X-Name-First: C.
Author-X-Name-Last: Gonzalez-Fragueiro
Title: Nonparametric analysis of aggregate loss models
Abstract:
This paper describes a nonparametric approach to make inferences for
aggregate loss models in the insurance framework. We assume that an
insurance company provides a historical sample of claims given by claim
occurrence times and claim sizes. Furthermore, information may be
incomplete as claims may be censored and/or truncated. In this context,
the main goal of this work consists of fitting a probability model for the
total amount that will be paid on all claims during a fixed future time
period. In order to solve this prediction problem, we propose a new
methodology based on nonparametric estimators for the density functions
with censored and truncated data, the use of Monte Carlo simulation
methods and bootstrap resampling. The developed methodology is useful to
compare alternative pricing strategies in different insurance decision
problems. The proposed procedure is illustrated with a real dataset
provided by the insurance department of an international commercial
company.
Journal: Journal of Applied Statistics
Pages: 149-166
Issue: 2
Volume: 36
Year: 2009
Keywords: aggregate loss models, kernel estimator, Monte Carlo method, bootstrap, censored and truncated claims,
X-DOI: 10.1080/02664760802443921
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443921
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:149-166
Template-Type: ReDIF-Article 1.0
Author-Name: S. S. K. Haputhantri
Author-X-Name-First: S. S. K.
Author-X-Name-Last: Haputhantri
Author-Name: J. Moreau
Author-X-Name-First: J.
Author-X-Name-Last: Moreau
Author-Name: S. Lek
Author-X-Name-First: S.
Author-X-Name-Last: Lek
Title: Exploring gillnet catch efficiency of sardines in the coastal waters of Sri Lanka by means of three statistical techniques: a comparison of linear and nonlinear modelling techniques
Abstract:
The present investigation was undertaken to study the gillnet catch
efficiency of sardines in the coastal waters of Sri Lanka using commercial
catch and effort data. Commercial catch and effort data of small mesh
gillnet fishery were collected in five fisheries districts during the
period May 1999-August 2002. Gillnet catch efficiency of sardines was
investigated by developing predictive models of catch rates using data on
commercial fisheries and environmental variables. Three statistical
techniques [multiple linear regression, generalized additive model and
regression tree model (RTM)] were employed to predict the catch rates of
trenched sardine Amblygaster sirm (key target species of small mesh
gillnet fishery) and other sardines (Sardinella longiceps, S. gibbosa, S.
albella and S. sindensis). The data collection programme was conducted for
another six months and the models were tested on new data. RTMs were found
to be the strongest in terms of reliability and accuracy of the
predictions. The two operational characteristics used here for model
formulation (i.e. depth of fishing and number of gillnet pieces used per
fishing operation) were more useful as predictor variables than the
environmental variables. The study revealed that the catch rates of
A. sirm increase rapidly with sea depth up to around
32 m.
Journal: Journal of Applied Statistics
Pages: 167-179
Issue: 2
Volume: 36
Year: 2009
Keywords: fisheries, modelling, multiple linear regression, generalized additive models, regression tree models,
X-DOI: 10.1080/02664760802443939
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:167-179
Template-Type: ReDIF-Article 1.0
Author-Name: Giovanni Celano
Author-X-Name-First: Giovanni
Author-X-Name-Last: Celano
Title: Robust design of adaptive control charts for manual manufacturing/inspection workstations
Abstract:
Often the manufacturing and inspection workstations in a manufacturing
process coincide: in these workstations, the statistical process control
(SPC) task of collecting sample statistics related to a
critical-to-quality parameter must be performed by the same worker who
completes the working operations on a part. The aim of this study is to
design a local SPC inspection
procedure implementing an adaptive Shewhart control chart locally managed
by the worker within the manufacturing workstation: the economic design of
the inspection procedure is constrained by the expected number of false
alarms issued and is restricted to those designs feasible with respect to
the available shared labour resource. Furthermore, a robust approach that
models the shift of the controlled parameter mean as a random variable is
taken into account. The numerical analysis allows the most influencing
environmental process factors to be captured and commented upon. The
obtained results show that a few process operating parameters drive the
choice of performing a robust optimization and the selection of the
optimal SPC adaptive procedure.
Journal: Journal of Applied Statistics
Pages: 181-203
Issue: 2
Volume: 36
Year: 2009
Keywords: statistical process control, control chart, economic design, robust optimization, labour resource,
X-DOI: 10.1080/02664760802443947
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443947
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:181-203
Template-Type: ReDIF-Article 1.0
Author-Name: Malin Albing
Author-X-Name-First: Malin
Author-X-Name-Last: Albing
Author-Name: Kerstin Vannman
Author-X-Name-First: Kerstin
Author-X-Name-Last: Vannman
Title: Skewed zero-bound distributions and process capability indices for upper specifications
Abstract:
A common practical situation in process capability analysis, which is not
well developed theoretically, is when the quality characteristic of
interest has a skewed distribution with a long tail towards relatively
large values and an upper specification limit only exists. In such
situations, it is not uncommon that the smallest possible value of the
characteristic is 0 and this is also the best value to obtain. Hence a
target value 0 is assumed to exist. We investigate a new class of process
capability indices for this situation. Two estimators of the proposed
index are studied and the asymptotic distributions of these estimators are
derived. Furthermore, we suggest a decision procedure useful when drawing
conclusions about the capability at a given significance level, based on
the estimated indices and their asymptotic distributions. A simulation
study is also performed, assuming that the quality characteristic is
Weibull-distributed, to investigate the true significance level when the
sample size is finite.
Journal: Journal of Applied Statistics
Pages: 205-221
Issue: 2
Volume: 36
Year: 2009
Keywords: capability index, skewed distributions, one-sided specification interval, upper specification limit, zero-bound process data, target value 0, hypothesis testing, Weibull distribution,
X-DOI: 10.1080/02664760802443954
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:205-221
Template-Type: ReDIF-Article 1.0
Author-Name: S. Cabras
Author-X-Name-First: S.
Author-X-Name-Last: Cabras
Author-Name: M. E. Castellanos
Author-X-Name-First: M. E.
Author-X-Name-Last: Castellanos
Title: Default Bayesian goodness-of-fit tests for the skew-normal model
Abstract:
In this paper we propose a series of goodness-of-fit tests for the family
of skew-normal models when all parameters are unknown. As the null
distributions of the considered test statistics depend only on the
asymmetry parameter, we use a default and proper prior on the skewness
parameter, leading to the prior predictive p-value advocated by G. Box.
The goodness-of-fit tests proposed here depend only on the sample size and
exhibit full agreement between nominal and actual size. They also have
good power against local alternative models that account for
asymmetry in the data.
Journal: Journal of Applied Statistics
Pages: 223-232
Issue: 2
Volume: 36
Year: 2009
Keywords: EDF test, model checking, prior predictive distribution, power, p-values, size of test,
X-DOI: 10.1080/02664760802443988
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443988
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:223-232
Template-Type: ReDIF-Article 1.0
Author-Name: Weiqi Luo
Author-X-Name-First: Weiqi
Author-X-Name-Last: Luo
Title: Analysing ecological data
Abstract:
Journal: Journal of Applied Statistics
Pages: 233-234
Issue: 2
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802340267
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802340267
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:233-234
Template-Type: ReDIF-Article 1.0
Author-Name: Stuart Barber
Author-X-Name-First: Stuart
Author-X-Name-Last: Barber
Title: Bioinformatics
Abstract:
Journal: Journal of Applied Statistics
Pages: 235-236
Issue: 2
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802340275
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802340275
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:2:p:235-236
Template-Type: ReDIF-Article 1.0
Author-Name: John Kornak
Author-X-Name-First: John
Author-X-Name-Last: Kornak
Author-Name: Bruce Dunham
Author-X-Name-First: Bruce
Author-X-Name-Last: Dunham
Author-Name: Deborah Hall
Author-X-Name-First: Deborah
Author-X-Name-Last: Hall
Author-Name: Mark Haggard
Author-X-Name-First: Mark
Author-X-Name-Last: Haggard
Title: Nonlinear voxel-based modelling of the haemodynamic response in fMRI
Abstract:
A common assumption for data analysis in functional magnetic resonance
imaging is that the response signal can be modelled as the convolution of
a haemodynamic response (HDR) kernel with a stimulus reference function.
Early approaches modelled spatially constant HDR kernels, but more
recently spatially varying models have been proposed. However, convolution
limits the flexibility of these models and their ability to capture
spatial variation. Here, a range of (nonlinear) parametric curves are
fitted by least squares minimisation directly to individual voxel HDRs
(i.e., without using convolution). A 'constrained gamma curve' is proposed
as an efficient form for fitting the HDR at each individual voxel. This
curve allows for spatial variation in the delay of the HDR, but places a
global constraint on the temporal spread. The approach of directly fitting
individual parameters of HDR shape is demonstrated to lead to an improved
fit of response estimates.
Journal: Journal of Applied Statistics
Pages: 237-253
Issue: 3
Volume: 36
Year: 2009
Keywords: constrained gamma curve, haemodynamic response function, functional magnetic resonance imaging, least squares estimation, nonlinear curve fitting, polynomial curve fitting,
X-DOI: 10.1080/02664760802443962
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443962
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:237-253
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Chih Chiu
Author-X-Name-First: Wen-Chih
Author-X-Name-Last: Chiu
Title: Generally weighted moving average control charts with fast initial response features
Abstract:
The generally weighted moving average (GWMA) control chart is an
extension of the exponentially weighted moving average (EWMA) control
chart. Recently, some approaches have been proposed to modify EWMA charts
with fast initial response (FIR) features. We introduce these approaches
in GWMA-type charts. Via simulation, various control schemes are designed
and their average run lengths are computed and compared. Based on the
overall performance, it is shown that the DGWMA chart is the best choice,
especially when the shift is moderate, and that GWMA charts with an
additional FIR feature perform well only in detecting large shifts during
the initial stage.
Journal: Journal of Applied Statistics
Pages: 255-275
Issue: 3
Volume: 36
Year: 2009
Keywords: EWMA, GWMA, control charts, average run length, fast initial response,
X-DOI: 10.1080/02664760802443970
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443970
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:255-275
Template-Type: ReDIF-Article 1.0
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Title: A bivariate distribution with gamma and beta marginals with application to drought data
Abstract:
The first known bivariate distribution with gamma and beta marginals is
introduced. Various representations are derived for its joint probability
density function (pdf), joint cumulative distribution function (cdf),
product moments, conditional pdfs, conditional cdfs, conditional moments,
joint moment generating function, joint characteristic function and
entropies. The method of maximum likelihood and the method of moments are
used to derive the associated estimation procedures as well as the Fisher
information matrix, variance-covariance matrix and the profile likelihood
confidence intervals. An application to drought data from Nebraska is
provided. Some other applications are also discussed. Finally, an
extension of the bivariate distribution to the multivariate case is
proposed.
Journal: Journal of Applied Statistics
Pages: 277-301
Issue: 3
Volume: 36
Year: 2009
Keywords: beta distribution, drought modeling, gamma distribution,
X-DOI: 10.1080/02664760802443996
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802443996
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:277-301
Template-Type: ReDIF-Article 1.0
Author-Name: Li Wang
Author-X-Name-First: Li
Author-X-Name-Last: Wang
Author-Name: Scott Kowalski
Author-X-Name-First: Scott
Author-X-Name-Last: Kowalski
Author-Name: G. Geoffrey Vining
Author-X-Name-First: G. Geoffrey
Author-X-Name-Last: Vining
Title: Orthogonal blocking of response surface split-plot designs
Abstract:
When all experimental runs cannot be performed under homogeneous
conditions, blocking can be used to increase the power for testing the
treatment effects. Orthogonal blocking provides the same estimator of the
polynomial effects as the one that would be obtained by ignoring the
blocks. In many real-life design scenarios, there is at least one factor
that is hard to change, leading to a split-plot structure. This paper
shows that for a balanced ordinary least squares-generalized least squares
equivalent split-plot design, orthogonal blocking can be achieved.
Orthogonally blocked split-plot central composite designs are constructed
and a catalog is provided.
Journal: Journal of Applied Statistics
Pages: 303-321
Issue: 3
Volume: 36
Year: 2009
Keywords: central composite designs, orthogonal blocking, design of experiments, split-plot experiments,
X-DOI: 10.1080/02664760802444002
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802444002
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:303-321
Template-Type: ReDIF-Article 1.0
Author-Name: Joao Santos
Author-X-Name-First: Joao
Author-X-Name-Last: Santos
Author-Name: Solange Leite
Author-X-Name-First: Solange
Author-X-Name-Last: Leite
Title: Long-term variability of the temperature time series recorded in Lisbon
Abstract:
As a case study for application in climate change studies, daily air
temperature records in Lisbon are analysed by applying advanced
statistical methodologies that take into account the dynamic nature of
climate. A trend analysis based on two non-parametric tests (Spearman and
Mann-Kendall) revealed the presence of statistically significant upward
trends in the maximum temperatures, mainly during March. The minimum
temperatures do not present significant trends, with the exception of
March where a relatively weak positive trend is detected. A singular
spectral analysis combined with a maximum entropy spectral analysis
enables the detection of regularities in the annual mean time series of
the maximum and minimum temperatures. A quasi-periodic oscillation with a
peak period of about 50 years is superimposed on the linear trends. For
the maximum temperature, a secondary oscillation with a peak period of nearly
20 years is also identified. No other regularities are isolated in these
time series. The study is enhanced by applying an extreme value analysis
to the extreme winter and summer temperatures. The generalized extreme
value distribution family is shown to provide high-quality adjustments to
the distributions, and a description of the temperatures related to
different return periods and risks is given.
Journal: Journal of Applied Statistics
Pages: 323-337
Issue: 3
Volume: 36
Year: 2009
Keywords: climate change, trends, extremes, oscillations, air temperature, Portugal,
X-DOI: 10.1080/02664760802449159
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802449159
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:323-337
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Hans Franses
Author-X-Name-First: Philip Hans
Author-X-Name-Last: Franses
Author-Name: Bert de Groot
Author-X-Name-First: Bert
Author-X-Name-Last: de Groot
Author-Name: Rianne Legerstee
Author-X-Name-First: Rianne
Author-X-Name-Last: Legerstee
Title: Testing for harmonic regressors
Abstract:
This paper reports on the Wald test for α1=α2=0 in the
regression model [image omitted] where κ is estimated using
nonlinear least squares. As this situation is not standard, we provide
critical values for further use. An illustration using quarterly GDP in the
Netherlands is given. A power study shows that choosing inappropriate
starting values for κ leads to a quick loss of power.
Journal: Journal of Applied Statistics
Pages: 339-346
Issue: 3
Volume: 36
Year: 2009
Keywords: harmonic regressors, critical values,
X-DOI: 10.1080/02664760802454837
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802454837
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:339-346
Template-Type: ReDIF-Article 1.0
Author-Name: A.H.M. Rahmatullah Imon
Author-X-Name-First: A.H.M.
Author-X-Name-Last: Rahmatullah Imon
Title: Deletion residuals in the detection of heterogeneity of variances in linear regression
Abstract:
The heterogeneity of error variance often causes a huge interpretive
problem in linear regression analysis. Before taking any remedial measures
we first need to detect this problem. A large number of diagnostic plots
are now available in the literature for detecting heteroscedasticity of
error variances. Among them the 'residuals' and 'fits' (R-F) plot is very
popular and commonly used. In the R-F plot residuals are plotted against
the fitted responses, where both these components are obtained using the
ordinary least squares (OLS) method. It is now evident that the OLS fits
and residuals suffer a huge setback in the presence of unusual
observations and hence the R-F plot may not exhibit the real scenario. The
deletion residuals based on a data set free from all unusual cases should
estimate the true errors in a better way than the OLS residuals. In this
paper we propose 'deletion residuals' and the 'deletion fits' (DR-DF) plot
for the detection of the heterogeneity of error variances in a linear
regression model to get a more convincing and reliable graphical display.
Examples show that this plot locates unusual observations more clearly
than the R-F plot. The advantage of using deletion residuals in the
detection of heteroscedasticity of error variance is investigated through
Monte Carlo simulations under a variety of situations.
Journal: Journal of Applied Statistics
Pages: 347-358
Issue: 3
Volume: 36
Year: 2009
Keywords: R-F plot, unusual observations, deletion residuals, robust regression, DR-DF plot,
X-DOI: 10.1080/02664760802466237
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802466237
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:3:p:347-358
Template-Type: ReDIF-Article 1.0
Author-Name: Pablo Martinez-Camblor
Author-X-Name-First: Pablo
Author-X-Name-Last: Martinez-Camblor
Author-Name: Aina Yanez
Author-X-Name-First: Aina
Author-X-Name-Last: Yanez
Title: Testing the equality of diagnostic effectiveness of one measure with respect to k different features
Abstract:
In several cases the same measurement is used as a marker for two or more
population features, and it is useful to test whether this measurement has
the same diagnostic effectiveness with respect to different features. In
this paper we use the area under the receiver operating characteristic
curve as an index of the discriminatory power between continuous variables
and population features (eventually, two or more diseases), and we propose
a test of the equality of the diagnostic effectiveness of this
measurement.
Journal: Journal of Applied Statistics
Pages: 359-367
Issue: 4
Volume: 36
Year: 2009
Keywords: ROC curve, sensitivity, specificity, area under ROC curve (AUC),
X-DOI: 10.1080/02664760802464471
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802464471
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:359-367
Template-Type: ReDIF-Article 1.0
Author-Name: E. Cankaya
Author-X-Name-First: E.
Author-X-Name-Last: Cankaya
Author-Name: N. R. J. Fieller
Author-X-Name-First: N. R. J.
Author-X-Name-Last: Fieller
Title: Quantal models: a review with additional methodological development
Abstract:
Analysis of quantal models is a particular aspect of the general problem
of investigating multimodality. The distinction is that the spacings
between modes are integral multiples of some unspecified fundamental unit
and that the number of modes is not defined. Such semi-structured models
arise in a wide variety of contexts such as biology, cosmology,
archaeology and molecular physics. This paper presents a brief review of
their historical development in such areas as an aid to their recognition
in other contexts as well as giving guidance to their analysis from the
statistical viewpoint. The available methodology for their analysis is
collated into a coherent and self-contained account, establishing various
optimality properties under particular parametric distributional
assumptions. An illustrative power study shows how dependence on sample
size and failure of assumptions such as underlying distribution, origin of
measurements and independence affect the power of various analyses. These
aspects are illustrated by an example from developmental biology.
Journal: Journal of Applied Statistics
Pages: 369-384
Issue: 4
Volume: 36
Year: 2009
Keywords: cosine quantogram, megalithic yard, quantal model, multimodality, power,
X-DOI: 10.1080/02664760802466195
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802466195
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:369-384
Template-Type: ReDIF-Article 1.0
Author-Name: L. Bettendorf
Author-X-Name-First: L.
Author-X-Name-Last: Bettendorf
Author-Name: S. A. van der Geest
Author-X-Name-First: S. A.
Author-X-Name-Last: van der Geest
Author-Name: G. H. Kuper
Author-X-Name-First: G. H.
Author-X-Name-Last: Kuper
Title: Do daily retail gasoline prices adjust asymmetrically?
Abstract:
This paper analyses adjustments in the Dutch retail gasoline prices. We
estimate an error correction model on changes in the daily retail price
for gasoline (taxes excluded) for the period 1996-2004, taking care of
volatility clustering by estimating an EGARCH model. It turns out that the
volatility process is asymmetrical: a positive shock to the retail price
has a greater effect on the variance of the retail price than a negative
shock. We conclude that the retail price and the spot price do not drift
apart in the long run. However, there is a faster reaction to upward
changes in spot prices than to downward changes in spot prices in the
short run. This asymmetry starts 3 days after the change in the spot price
and lasts for 4 days.
Journal: Journal of Applied Statistics
Pages: 385-397
Issue: 4
Volume: 36
Year: 2009
Keywords: asymmetry, retail gasoline prices, volatility,
X-DOI: 10.1080/02664760802466468
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802466468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:385-397
Template-Type: ReDIF-Article 1.0
Author-Name: Christian Weiss
Author-X-Name-First: Christian
Author-X-Name-Last: Weiss
Title: Monitoring correlated processes with binomial marginals
Abstract:
Few approaches for monitoring autocorrelated attribute data have been
proposed in the literature. If the marginal process distribution is
binomial, then the binomial AR(1) model, as a realistic and
well-interpretable process model, may be adequate. Based on known and newly
derived statistical properties of this model, we shall develop approaches
to monitor a binomial AR(1) process, and investigate their performance in
a simulation study. A case study demonstrates the applicability of the
binomial AR(1) model and of the proposed control charts to problems from
statistical process control.
Journal: Journal of Applied Statistics
Pages: 399-414
Issue: 4
Volume: 36
Year: 2009
Keywords: binomial AR(1) models, statistical process control, control charts, case study,
X-DOI: 10.1080/02664760802468803
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802468803
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:399-414
Template-Type: ReDIF-Article 1.0
Author-Name: S. H. Lin
Author-X-Name-First: S. H.
Author-X-Name-Last: Lin
Author-Name: R. S. Wang
Author-X-Name-First: R. S.
Author-X-Name-Last: Wang
Title: Inferences on a linear combination of K multivariate normal mean vectors
Abstract:
In this paper, the hypothesis testing and confidence region construction
for a linear combination of mean vectors for K independent multivariate
normal populations are considered. A new generalized pivotal quantity and
a new generalized test variable are derived based on the concepts of
generalized p-values and generalized confidence regions. When only two
populations are considered, our results are equivalent to those proposed
by Gamage et al. [Generalized p-values and confidence regions for the
multivariate Behrens-Fisher problem and MANOVA, J. Multivariate Anal. 88
(2004), pp. 117-189] in the bivariate case, which is also known as the
bivariate Behrens-Fisher problem. However, in some higher-dimensional
cases, these two results are quite different. The generalized confidence
region is illustrated with two numerical examples, and the merits of the
proposed method are numerically compared with those of the existing
methods with respect to their expected areas and coverage probabilities
under different scenarios.
Journal: Journal of Applied Statistics
Pages: 415-428
Issue: 4
Volume: 36
Year: 2009
Keywords: coverage probability, generalized confidence region, generalized pivotal quantity, generalized test variable, heteroscedasticity, type I error,
X-DOI: 10.1080/02664760802474231
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474231
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:415-428
Template-Type: ReDIF-Article 1.0
Author-Name: Rahim Mahmoudvand
Author-X-Name-First: Rahim
Author-X-Name-Last: Mahmoudvand
Author-Name: Hossein Hassani
Author-X-Name-First: Hossein
Author-X-Name-Last: Hassani
Title: Two new confidence intervals for the coefficient of variation in a normal distribution
Abstract:
In this article we introduce an approximately unbiased estimator for the
population coefficient of variation, τ, in a normal distribution.
The accuracy of this estimator is examined by several criteria. Using this
estimator and its variance, two approximate confidence intervals for
τ are introduced. The performance of the new confidence intervals is
compared to those obtained by current methods.
Journal: Journal of Applied Statistics
Pages: 429-442
Issue: 4
Volume: 36
Year: 2009
Keywords: coefficient of variation, confidence interval, normal distribution,
X-DOI: 10.1080/02664760802474249
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474249
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:429-442
Template-Type: ReDIF-Article 1.0
Author-Name: Vitor Ozaki
Author-X-Name-First: Vitor
Author-X-Name-Last: Ozaki
Author-Name: Ralph Silva
Author-X-Name-First: Ralph
Author-X-Name-Last: Silva
Title: Bayesian ratemaking procedure of crop insurance contracts with skewed distribution
Abstract:
Over the years, crop insurance programs have become the focus of agricultural
policy in the USA, Spain, Mexico, and more recently in Brazil. Given the
increasing interest in insurance, accurate calculation of the premium rate
is of great importance. We address the crop-yield distribution issue and
its implications in pricing an insurance contract considering the dynamic
structure of the data and incorporating the spatial correlation in the
hierarchical Bayesian framework. Results show that empirical (insurers')
rates are higher in low-risk areas and lower in high-risk areas. Such
methodological improvement is primarily important in situations of limited
data.
Journal: Journal of Applied Statistics
Pages: 443-452
Issue: 4
Volume: 36
Year: 2009
Keywords: crop insurance, Bayesian hierarchical model, premium rate, skew-normal distribution, spatial correlation,
X-DOI: 10.1080/02664760802474256
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474256
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:443-452
Template-Type: ReDIF-Article 1.0
Author-Name: Qiwei Liang
Author-X-Name-First: Qiwei
Author-X-Name-Last: Liang
Author-Name: Huajiang Li
Author-X-Name-First: Huajiang
Author-X-Name-Last: Li
Author-Name: Paul Mendes
Author-X-Name-First: Paul
Author-X-Name-Last: Mendes
Author-Name: Hans Roethig
Author-X-Name-First: Hans
Author-X-Name-Last: Roethig
Author-Name: Kim Frost-Pineda
Author-X-Name-First: Kim
Author-X-Name-Last: Frost-Pineda
Title: Using bootstrap method to evaluate the estimates of nicotine equivalents from linear mixed model and generalized estimating equation
Abstract:
Twenty-four-hour urinary excretion of nicotine equivalents, a biomarker
for exposure to cigarette smoke, has been widely used in biomedical
studies in recent years. Its accurate estimate is important for examining
human exposure to tobacco smoke. The objective of this article is to
compare the bootstrap confidence intervals of nicotine equivalents with
the standard confidence intervals derived from the linear mixed model
(LMM) and the generalized estimating equation. We use the percentile
bootstrap method because it has practical value for real-life applications
and it works well
with nicotine data. To preserve the within-subject correlation of nicotine
equivalents between repeated measures, we bootstrap the repeated measures
of each subject as a vector. The results indicate that the bootstrap in
most cases gives better estimates than the LMM and generalized estimating
equation without bootstrap.
Journal: Journal of Applied Statistics
Pages: 453-463
Issue: 4
Volume: 36
Year: 2009
Keywords: bootstrap estimates, linear mixed models, generalized estimating equations, nicotine equivalents,
X-DOI: 10.1080/02664760802638074
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802638074
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:453-463
Template-Type: ReDIF-Article 1.0
Author-Name: Z.Q. John Lu
Author-X-Name-First: Z.Q.
Author-X-Name-Last: John Lu
Title: Bayesian biostatistics and diagnostic medicine
Abstract:
Journal: Journal of Applied Statistics
Pages: 465-466
Issue: 4
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802340283
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802340283
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:465-466
Template-Type: ReDIF-Article 1.0
Author-Name: Mukesh Srivastava
Author-X-Name-First: Mukesh
Author-X-Name-Last: Srivastava
Author-Name: M. Abbas
Author-X-Name-First: M.
Author-X-Name-Last: Abbas
Title: Topics in biostatistics
Abstract:
Journal: Journal of Applied Statistics
Pages: 467-468
Issue: 4
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802340325
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802340325
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:467-468
Template-Type: ReDIF-Article 1.0
Author-Name: Steff Lewis
Author-X-Name-First: Steff
Author-X-Name-Last: Lewis
Title: Sample size calculations in clinical research
Abstract:
Journal: Journal of Applied Statistics
Pages: 469-469
Issue: 4
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802366775
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802366775
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:469-469
Template-Type: ReDIF-Article 1.0
Author-Name: Elisabeth Deviere
Author-X-Name-First: Elisabeth
Author-X-Name-Last: Deviere
Title: Analyzing linguistic data: a practical introduction to statistics using R
Abstract:
Journal: Journal of Applied Statistics
Pages: 471-472
Issue: 4
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802366783
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802366783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:4:p:471-472
Template-Type: ReDIF-Article 1.0
Author-Name: Seung-Hwan Lee
Author-X-Name-First: Seung-Hwan
Author-X-Name-Last: Lee
Title: Some estimators and tests for accelerated hazards model using weighted cumulative hazard difference
Abstract:
For a censored two-sample problem, Chen and Wang [Y.Q. Chen and M.-C.
Wang, Analysis of accelerated hazards models, J. Am. Statist. Assoc. 95
(2000), pp. 608-618] introduced the accelerated hazards model. The
scale-change parameter in this model characterizes the association between
the two groups. However, its estimator involves the unknown density in the
asymptotic variance. Thus, to make an inference on the parameter,
numerically intensive methods are needed. The goal of this article is to
propose a simple estimation method in which estimators are asymptotically
normal with a density-free asymptotic variance. Some lack-of-fit tests are
also obtained from this. These tests are related to Gill-Schumacher type
tests [R.D. Gill and M. Schumacher, A simple test of the proportional
hazards assumption, Biometrika 74 (1987), pp. 289-300] in which the
estimating functions are evaluated at two different weight functions
yielding two estimators that are close to each other. Numerical studies
show that for some weight functions, the estimators and tests perform
well. The proposed procedures are illustrated in two applications.
Journal: Journal of Applied Statistics
Pages: 473-482
Issue: 5
Volume: 36
Year: 2009
Keywords: accelerated hazards model, two-sample censored data, estimation, lack-of-fit tests,
X-DOI: 10.1080/02664760802474264
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:473-482
Template-Type: ReDIF-Article 1.0
Author-Name: Lei Shi
Author-X-Name-First: Lei
Author-X-Name-Last: Shi
Author-Name: Hongyuan Sun
Author-X-Name-First: Hongyuan
Author-X-Name-Last: Sun
Author-Name: Peng Bai
Author-X-Name-First: Peng
Author-X-Name-Last: Bai
Title: Bayesian confidence interval for difference of the proportions in a 2×2 table with structural zero
Abstract:
This article studies the construction of a Bayesian confidence interval
for risk difference in a 2×2 table with structural zero. The exact
posterior distribution of the risk difference is derived under the
Dirichlet prior distribution, and a tail-based interval is used to
construct the Bayesian confidence interval. The frequentist performance of
the tail-based interval is investigated and compared with the score-based
interval by simulation. Our results show that the tail-based interval
under the Jeffreys prior performs as well as or better than the score-based
confidence interval.
Journal: Journal of Applied Statistics
Pages: 483-494
Issue: 5
Volume: 36
Year: 2009
Keywords: 2×2 table with structural zero, risk difference, Bayesian analysis, Dirichlet prior, confidence interval,
X-DOI: 10.1080/02664760802474272
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:483-494
Template-Type: ReDIF-Article 1.0
Author-Name: J. B. Shah
Author-X-Name-First: J. B.
Author-X-Name-Last: Shah
Author-Name: M. N. Patel
Author-X-Name-First: M. N.
Author-X-Name-Last: Patel
Title: Bayesian estimation of parameters of mixed geometric failure models from Type I group censored sample
Abstract:
The estimation of the parameters of a mixed geometric lifetime model,
using a Bayesian approach and a Type I group censored sample, is
investigated in the case of two subpopulations. The Bayes estimates are
derived for squared error, minimum expected, general entropy and linex
loss functions under informative and diffuse priors. A practical example
given by Nelson (W.B. Nelson, Hazard plotting methods for analysis of the
life data with different failure models, J. Qual. Technol. 2 (1970), pp.
126-149) is considered. A simulation study of the associated risks is also carried out.
Journal: Journal of Applied Statistics
Pages: 495-506
Issue: 5
Volume: 36
Year: 2009
Keywords: Bayes estimator, geometric model, loss functions, mixture distribution, risk function, simulation, Type I group censoring,
X-DOI: 10.1080/02664760802553422
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802553422
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:495-506
Template-Type: ReDIF-Article 1.0
Author-Name: M. Habshah
Author-X-Name-First: M.
Author-X-Name-Last: Habshah
Author-Name: M. R. Norazan
Author-X-Name-First: M. R.
Author-X-Name-Last: Norazan
Author-Name: A.H.M. Rahmatullah Imon
Author-X-Name-First: A.H.M.
Author-X-Name-Last: Rahmatullah Imon
Title: The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression
Abstract:
Leverage values are used in regression diagnostics as measures of
influential observations in the $X$-space. Detection of high leverage
values is crucial because they are responsible for misleading conclusions
about the fit of a regression model, for multicollinearity problems, and
for the masking and/or swamping of outliers, etc. Much
work has been done on the identification of single high leverage points
and it is generally believed that the problem of detection of a single
high leverage point has been largely resolved. But there is no general
agreement among the statisticians about the detection of multiple high
leverage points. When a group of high leverage points is present in a data
set, mainly because of the masking and/or swamping effects the commonly
used diagnostic methods fail to identify them correctly. On the other
hand, the robust alternative methods can identify the high leverage points
correctly but they have a tendency to identify too many low leverage
points to be points of high leverages which is not also desired. An
attempt has been made to make a compromise between these two approaches.
We propose an adaptive method where the suspected high leverage points are
identified by robust methods and then the low leverage points (if any) are
put back into the estimation data set after diagnostic checking. The
usefulness of our newly proposed method for the detection of multiple high
leverage points is studied by some well-known data sets and Monte Carlo
simulations.
Journal: Journal of Applied Statistics
Pages: 507-520
Issue: 5
Volume: 36
Year: 2009
Keywords: diagnostic-robust generalized potentials, group deletion, high leverage points, masking, robust Mahalanobis distance, minimum volume ellipsoid, Monte Carlo simulation,
X-DOI: 10.1080/02664760802553463
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802553463
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:507-520
Template-Type: ReDIF-Article 1.0
Author-Name: Wanbo Lu
Author-X-Name-First: Wanbo
Author-X-Name-Last: Lu
Author-Name: Tzong-Ru Tsai
Author-X-Name-First: Tzong-Ru
Author-X-Name-Last: Tsai
Title: Interval censored sampling plans for the log-logistic lifetime distribution
Abstract:
The log-logistic distribution is one of the popular distributions in
life-testing applications. This article develops an acceptance sampling
procedure for the log-logistic lifetime distribution based on grouped data
when the shape parameter is given. Both producer and consumer risks are
considered to develop the ordinary, approximate and simulated sampling
plans. Some of the proposed sampling plans are tabulated; moreover, those
three types of sampling plans are compared with each other under the same
censoring rates. The use of these tables is illustrated by an example.
Journal: Journal of Applied Statistics
Pages: 521-536
Issue: 5
Volume: 36
Year: 2009
Keywords: consumer risk, log-logistic distribution, grouped data, maximum likelihood estimator, producer risk,
X-DOI: 10.1080/02664760802554180
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802554180
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:521-536
Template-Type: ReDIF-Article 1.0
Author-Name: A. C. Antonakis
Author-X-Name-First: A. C.
Author-X-Name-Last: Antonakis
Author-Name: M. E. Sfakianakis
Author-X-Name-First: M. E.
Author-X-Name-Last: Sfakianakis
Title: Assessing naive Bayes as a method for screening credit applicants
Abstract:
The naive Bayes rule (NBR) is a popular and often highly effective
technique for constructing classification rules. This study examines the
effectiveness of NBR as a method for constructing classification rules
(credit scorecards) in the context of screening credit applicants (credit
scoring). For this purpose, the study uses two real-world credit scoring
data sets to benchmark NBR against linear discriminant analysis, logistic
regression analysis, k-nearest neighbours, classification trees and neural
networks. Of the two aforementioned data sets, the first one is taken from
a major Greek bank whereas the second one is the Australian Credit
Approval data set taken from the UCI Machine Learning Repository
(available at http://www.ics.uci.edu/~mlearn/MLRepository.html). The
predictive ability of scorecards is measured by the total percentage of
correctly classified cases, the Gini coefficient and the bad rate amongst
accepts. In each of the data sets, NBR is found to have a lower predictive
ability than some of the other five methods under all measures used.
Reasons that may negatively affect the predictive ability of NBR relative
to that of alternative methods in the context of credit scoring are
examined.
Journal: Journal of Applied Statistics
Pages: 537-545
Issue: 5
Volume: 36
Year: 2009
Keywords: credit scorecard, credit scoring, credit risk, naive Bayes, retail banking,
X-DOI: 10.1080/02664760802554263
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802554263
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:537-545
Template-Type: ReDIF-Article 1.0
Author-Name: Luis Martins
Author-X-Name-First: Luis
Author-X-Name-Last: Martins
Title: Unit root tests and dramatic shifts with infinite variance processes
Abstract:
A model which explains data that is subject to sudden structural changes
of unspecified nature is presented. The structural shifts are generated by
a random walk component whose innovations belong to the normal domain of
attraction of a symmetric stable law. To test the model against the
stationarity case, several non-parametric and regression-based statistics
are studied. The non-parametric tests are a generalization of the variance
ratio test to innovations with heavy-tailed distributions. The tests are
consistent, are shown to have good finite-sample size and power
properties, and are applied to a set of economic variables.
Journal: Journal of Applied Statistics
Pages: 547-571
Issue: 5
Volume: 36
Year: 2009
Keywords: unit root, stable processes, partial sums, limit distributions, empirical size and power,
X-DOI: 10.1080/02664760802554321
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802554321
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:547-571
Template-Type: ReDIF-Article 1.0
Author-Name: Jose Manuel Herrerias-Velasco
Author-X-Name-First: Jose Manuel
Author-X-Name-Last: Herrerias-Velasco
Author-Name: Rafael Herrerias-Pleguezuelo
Author-X-Name-First: Rafael
Author-X-Name-Last: Herrerias-Pleguezuelo
Author-Name: Johan Rene van Dorp
Author-X-Name-First: Johan Rene
Author-X-Name-Last: van Dorp
Title: The generalized two-sided power distribution
Abstract:
The generalized standard two-sided power (GTSP) distribution was
mentioned only in passing by Kotz and van Dorp [Beyond Beta: Other
Continuous Families of Distributions with Bounded Support and
Applications, World Scientific Press, Singapore, 2004]. In this paper, we
shall further investigate this three-parameter distribution by presenting
some novel properties and use its more general form to contrast the
chronology of developments of various authors on the two-parameter TSP
distribution since its initial introduction. GTSP distributions allow for
J-shaped forms of the pdf, whereas TSP distributions are limited to
U-shaped and unimodal forms. Hence, GTSP distributions possess the same
three distributional shapes as the classical beta distributions. A novel
method and algorithm for the indirect elicitation of the two power
parameters of the GTSP distribution is developed. We present a Project
Evaluation Review Technique example that utilizes this algorithm and
demonstrates the benefit of separate powers for the two branches of an
activity's GTSP distribution when estimating project completion time
uncertainty.
Journal: Journal of Applied Statistics
Pages: 573-587
Issue: 5
Volume: 36
Year: 2009
Keywords: Parameter elicitation, moment ratio diagram, PERT technique,
X-DOI: 10.1080/02664760802582850
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802582850
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:573-587
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Author-Name: Maurie Markman
Author-X-Name-First: Maurie
Author-X-Name-Last: Markman
Author-Name: Mark Brady
Author-X-Name-First: Mark
Author-X-Name-Last: Brady
Title: A permutation test approach to phase II historical control trials
Abstract:
In this note we outline 15 years of Gynecologic Oncology Group (GOG)
experience conducting a series of phase II second-line intraperitoneal
trials in the treatment of ovarian cancer. Using this information, the
goal is to define a new permutation approach to historical control phase
II trials in ovarian cancer. We utilize seven previous phase II GOG trials
in our database to illustrate our methodology.
Journal: Journal of Applied Statistics
Pages: 589-599
Issue: 6
Volume: 36
Year: 2009
Keywords: clinical trial, non-randomized,
X-DOI: 10.1080/02664760802474280
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474280
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:589-599
Template-Type: ReDIF-Article 1.0
Author-Name: Ayman Baklizi
Author-X-Name-First: Ayman
Author-X-Name-Last: Baklizi
Author-Name: B.M. Golam Kibria
Author-X-Name-First: B.M.
Author-X-Name-Last: Golam Kibria
Title: One and two sample confidence intervals for estimating the mean of skewed populations: an empirical comparative study
Abstract:
In this paper we consider and propose some confidence intervals for
estimating the mean or difference of means of skewed populations. We
extend the median t interval to the two sample problem. Further, we
suggest using the bootstrap to find the critical points for use in the
calculation of median t intervals. A simulation study has been conducted
to compare the performance of the intervals, and a real-life example is
considered to illustrate the application of the methods.
Journal: Journal of Applied Statistics
Pages: 601-609
Issue: 6
Volume: 36
Year: 2009
Keywords: bootstrapping, coverage probability, interval estimation, Johnson's t, mean, Student's t, skewness,
X-DOI: 10.1080/02664760802474298
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802474298
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:601-609
Template-Type: ReDIF-Article 1.0
Author-Name: Veena Shetty
Author-X-Name-First: Veena
Author-X-Name-Last: Shetty
Author-Name: Christopher Morrell
Author-X-Name-First: Christopher
Author-X-Name-Last: Morrell
Author-Name: Samer Najjar
Author-X-Name-First: Samer
Author-X-Name-Last: Najjar
Title: Modeling a cross-sectional response variable with longitudinal predictors: an example of pulse pressure and pulse wave velocity
Abstract:
We wish to model pulse wave velocity (PWV) as a function of longitudinal
measurements of pulse pressure (PP) at the same and prior visits at which
the PWV is measured. A number of approaches are compared. First, we use
the PP at the same visit as the PWV in a linear regression model. In
addition, we use the average of all available PPs as the explanatory
variable in a linear regression model. Next, a two-stage process is
applied. The longitudinal PP is modeled using a linear mixed-effects
model. This modeled PP is used in the regression model to describe PWV. An
approach for using the longitudinal PP data is to obtain a measure of the
cumulative burden, the area under the PP curve. This area under the curve
is used as an explanatory variable to model PWV. Finally, a joint Bayesian
model is constructed similar to the two-stage model.
Journal: Journal of Applied Statistics
Pages: 611-619
Issue: 6
Volume: 36
Year: 2009
Keywords: mixed effects, linear regression, area under the curve,
X-DOI: 10.1080/02664760802478208
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802478208
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:611-619
Template-Type: ReDIF-Article 1.0
Author-Name: S. Rao Jammalamadaka
Author-X-Name-First: S. Rao
Author-X-Name-Last: Jammalamadaka
Author-Name: Md. Aleemuddin Siddiqi
Author-X-Name-First: Md. Aleemuddin
Author-X-Name-Last: Siddiqi
Author-Name: Kaushik Ghosh
Author-X-Name-First: Kaushik
Author-X-Name-Last: Ghosh
Title: Analysis of microtubule dynamics using growth curve models
Abstract:
Microtubules are part of the structural network within a cell's
cytoplasm, providing structural support as well as taking part in many of
the cellular processes. A large body of data provides evidence that the
dynamics of microtubules in a cell are responsible for the performance of
many critical cellular functions such as cell division. In this article,
we study the effect of four different isoforms of a protein tau on
microtubule dynamics using growth curve models. The results show that a
linear growth curve model is sufficient to explain the data. Moreover, we
find that a mutated version of a 3-repeat tau protein has a similar effect
as a 4-repeat tau protein on microtubule dynamics. The latter findings
conform with the biological understanding of the effect of the protein tau
on microtubule dynamics.
Journal: Journal of Applied Statistics
Pages: 621-631
Issue: 6
Volume: 36
Year: 2009
Keywords: growth curves, microtubule dynamics, polynomial regression, splines, MANOVA, Wilks' Lambda,
X-DOI: 10.1080/02664760802479131
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802479131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:621-631
Template-Type: ReDIF-Article 1.0
Author-Name: Ammarin Thakkinstian
Author-X-Name-First: Ammarin
Author-X-Name-Last: Thakkinstian
Author-Name: John Thompson
Author-X-Name-First: John
Author-X-Name-Last: Thompson
Author-Name: Cosetta Minelli
Author-X-Name-First: Cosetta
Author-X-Name-Last: Minelli
Author-Name: John Attia
Author-X-Name-First: John
Author-X-Name-Last: Attia
Title: Choosing between per-genotype, per-allele, and trend approaches for initial detection of gene-disease association
Abstract:
There are a number of approaches to detect candidate gene-disease
associations including: (i) 'per-genotype', which looks for any difference
across the genotype groups without making any assumptions about the
direction of the effect or the genetic model; (ii) 'per-allele', which
assumes an additive genetic model, i.e. an effect for each allele copy;
and (iii) linear trend, which looks for an incremental effect across the
genotype groups. We simulated a number of gene-disease associations,
varying odds ratios, allele frequency, genetic model, and deviation from
Hardy-Weinberg equilibrium (HWE) and tested the performance of each of the
three methods to detect the associations, where performance was judged by
looking at critical values, power, coverage, bias, and root mean square
error. Results indicate that the per-allele method is very susceptible to
false positives and false negatives when deviations from HWE occur. The
linear trend test appears to have the best power under most simulated
scenarios, but can sometimes be biased and have poor coverage. These
results indicate that of these strategies a linear trend test may be best
for initially testing an association, and the per-genotype approach may be
best for estimating the magnitude of the association.
Journal: Journal of Applied Statistics
Pages: 633-646
Issue: 6
Volume: 36
Year: 2009
Keywords: per-genotype, per-allele, power, bias, gene-disease association,
X-DOI: 10.1080/02664760802484990
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802484990
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:633-646
Template-Type: ReDIF-Article 1.0
Author-Name: Gabriel Escarela
Author-X-Name-First: Gabriel
Author-X-Name-Last: Escarela
Author-Name: Luis Carlos Perez-Ruiz
Author-X-Name-First: Luis Carlos
Author-X-Name-Last: Perez-Ruiz
Author-Name: Russell Bowater
Author-X-Name-First: Russell
Author-X-Name-Last: Bowater
Title: A copula-based Markov chain model for the analysis of binary longitudinal data
Abstract:
A fully parametric first-order autoregressive (AR(1)) model is proposed
to analyse binary longitudinal data. By using a discretized version of a
copula, the modelling approach allows one to construct separate models for
the marginal response and for the dependence between adjacent responses.
In particular, the transition model considered here discretizes the
Gaussian copula in such a way that the marginal is a Bernoulli
distribution. A probit link is used to take into account concomitant
information in the behaviour of the underlying marginal distribution.
Fixed and time-varying covariates can be included in the model. The method
is simple and is a natural extension of the AR(1) model for Gaussian
series. Since the approach put forward is likelihood-based, it allows
interpretations and inferences to be made that are not possible with
semi-parametric approaches such as those based on generalized estimating
equations. Data from a study designed to reduce the exposure of children
to the sun are used to illustrate the methods.
Journal: Journal of Applied Statistics
Pages: 647-657
Issue: 6
Volume: 36
Year: 2009
Keywords: copula, discrete time series, Markov regression models, maximum likelihood, probit regression model, serial correlation,
X-DOI: 10.1080/02664760802499287
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499287
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:647-657
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaomo Jiang
Author-X-Name-First: Xiaomo
Author-X-Name-Last: Jiang
Author-Name: Sankaran Mahadevan
Author-X-Name-First: Sankaran
Author-X-Name-Last: Mahadevan
Title: Bayesian inference method for model validation and confidence extrapolation
Abstract:
This paper presents a Bayesian-hypothesis-testing-based methodology for
model validation and confidence extrapolation under uncertainty, using
limited test data. An explicit expression of the Bayes factor is derived
for the interval hypothesis testing. The interval method is compared with
the Bayesian point null hypothesis testing approach. The Bayesian network
with Markov Chain Monte Carlo simulation and Gibbs sampling is explored
for extrapolating the inference from the validated domain at the component
level to the untested domain at the system level. The effect of the number
of experiments on the confidence in the model validation decision is
investigated. The probabilities of Type I and Type II errors in
decision-making during the model validation and confidence extrapolation
are quantified. The proposed methodologies are applied to a structural
mechanics problem. Numerical results demonstrate that the Bayesian
methodology provides a quantitative approach to facilitate rational
decisions in model validation and confidence extrapolation under
uncertainty.
Journal: Journal of Applied Statistics
Pages: 659-677
Issue: 6
Volume: 36
Year: 2009
Keywords: Bayesian statistics, Bayes factor, hypothesis testing, model validation, extrapolation,
X-DOI: 10.1080/02664760802499295
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499295
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:659-677
Template-Type: ReDIF-Article 1.0
Author-Name: F. Javier Trivez
Author-X-Name-First: F. Javier
Author-X-Name-Last: Trivez
Author-Name: Beatriz Catalan
Author-X-Name-First: Beatriz
Author-X-Name-Last: Catalan
Title: Detecting level shifts in ARMA-GARCH (1,1) Models
Abstract:
The purpose of this article is to present a new method to detect level
shifts in the context of conditional heteroscedastic models. First, we
define precisely what type of outlier we are referring to, a concept that
has been scarcely touched in the field of GARCH (1,1) models, and then we
go on to present our methodology based on the nature of the Lagrange
multiplier tests. The validity and efficiency of the proposed procedure
are demonstrated through different simulation experiments. To conclude, we
present a practical application of the method to the time series of
returns of US short-term interest rates.
Journal: Journal of Applied Statistics
Pages: 679-697
Issue: 6
Volume: 36
Year: 2009
Keywords: level outliers, volatility outliers, level shifts, GARCH models, LM tests,
X-DOI: 10.1080/02664760802499303
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499303
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:679-697
Template-Type: ReDIF-Article 1.0
Author-Name: Ana Militino
Author-X-Name-First: Ana
Author-X-Name-Last: Militino
Title: Time Series Analysis
Abstract:
Journal: Journal of Applied Statistics
Pages: 699-700
Issue: 6
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802366809
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802366809
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:699-700
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Robinson
Author-X-Name-First: Andrew
Author-X-Name-Last: Robinson
Title: Adaptive design theory and implementation using SAS and R
Abstract:
Journal: Journal of Applied Statistics
Pages: 701-702
Issue: 6
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802366817
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802366817
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:701-702
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Pienaar
Author-X-Name-First: Jacques
Author-X-Name-Last: Pienaar
Title: Probability and statistics with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 703-704
Issue: 6
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802416539
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802416539
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:703-704
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Pienaar
Author-X-Name-First: Jacques
Author-X-Name-Last: Pienaar
Title: Control Techniques for Complex Networks
Abstract:
Journal: Journal of Applied Statistics
Pages: 705-705
Issue: 6
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802416554
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802416554
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:705-705
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Pienaar
Author-X-Name-First: Jacques
Author-X-Name-Last: Pienaar
Title: Random networks for communication: from statistical physics to information systems
Abstract:
Journal: Journal of Applied Statistics
Pages: 707-708
Issue: 6
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802416562
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802416562
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:6:p:707-708
Template-Type: ReDIF-Article 1.0
Author-Name: Antonello Maruotti
Author-X-Name-First: Antonello
Author-X-Name-Last: Maruotti
Title: Fairness of the national health service in Italy: a bivariate correlated random effects model
Abstract:
The primary purpose of this paper is to comprehensively assess
households' burden due to health payments. Starting from the fairness
approach developed by the World Health Organization, we analyse the burden
of healthcare payments on Italian households by modeling catastrophic
payments and impoverishment due to healthcare expenditures. For this
purpose, we propose to extend the analysis of fairness in financing
contribution through a generalized linear mixed model by introducing a
bivariate correlated random effects model, where association between the
outcomes is modeled through individual- and outcome-specific latent
effects which are assumed to be correlated. We discuss model parameter
estimation in a finite mixture context. Using this model specification,
the fairness of the Italian national health service is investigated.
Journal: Journal of Applied Statistics
Pages: 709-722
Issue: 7
Volume: 36
Year: 2009
Keywords: fairness, healthcare, random effects models, binary data, non-parametric maximum likelihood,
X-DOI: 10.1080/02664760802499311
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499311
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:709-722
Template-Type: ReDIF-Article 1.0
Author-Name: Montezuma Dumangane
Author-X-Name-First: Montezuma
Author-X-Name-Last: Dumangane
Author-Name: Nicoletta Rosati
Author-X-Name-First: Nicoletta
Author-X-Name-Last: Rosati
Author-Name: Anna Volossovitch
Author-X-Name-First: Anna
Author-X-Name-Last: Volossovitch
Title: Departure from independence and stationarity in a handball match
Abstract:
This paper analyses direct and indirect forms of dependence in the
probability of scoring in a handball match, taking into account the mutual
influence of both playing teams. Non-identical distribution (i.d.) and
non-stationarity, which are commonly observed in sport games, are studied
through the specification of time-varying parameters. The model accounts
for the binary character of the dependent variable, and for unobserved
heterogeneity. The parameter dynamics are specified by a first-order
auto-regressive process. Data from the Handball World Championships
2001-2005 show that the dynamics of handball violate both independence and
i.d., and in some cases exhibit non-stationary behaviour.
Journal: Journal of Applied Statistics
Pages: 723-741
Issue: 7
Volume: 36
Year: 2009
Keywords: binary choice, dynamic panel data, time-varying parameters, unobserved heterogeneity, dependence, non-stationarity,
X-DOI: 10.1080/02664760802499329
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499329
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:723-741
Template-Type: ReDIF-Article 1.0
Author-Name: Alexander Correa
Author-X-Name-First: Alexander
Author-X-Name-Last: Correa
Author-Name: Pere Grima
Author-X-Name-First: Pere
Author-X-Name-Last: Grima
Author-Name: Xavier Tort-Martorell
Author-X-Name-First: Xavier
Author-X-Name-Last: Tort-Martorell
Title: Experimentation order with good properties for 2k factorial designs
Abstract:
Randomizing the order of experimentation in a factorial design does not
always achieve the desired effect of neutralizing the influence of unknown
factors. In fact, with some very reasonable assumptions, an important
proportion of random orders achieves the same degree of protection as that
obtained by experimenting in the standard order of the design matrix. In
addition, randomization can induce a large number of changes in factor
levels and thus make experimentation expensive and difficult. De Leon et
al. [Experimentation order in factorial designs with 8 or 16 runs, J.
Appl. Stat. 32 (2005), pp. 297-313] proposed experimentation orders for
designs with eight or 16 runs that combine an excellent level of
protection against the influence of unknown factors, with the minimum
number of changes in factor levels. This article presents a new
methodology to obtain experimentation orders with the desired properties
for designs with any number of runs.
Journal: Journal of Applied Statistics
Pages: 743-754
Issue: 7
Volume: 36
Year: 2009
Keywords: randomization, experimentation order, factorial design, bias protection, minimum number of level changes,
X-DOI: 10.1080/02664760802499337
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499337
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:743-754
Template-Type: ReDIF-Article 1.0
Author-Name: Rabindra Nath Das
Author-X-Name-First: Rabindra Nath
Author-X-Name-Last: Das
Author-Name: Sung Park
Author-X-Name-First: Sung
Author-X-Name-Last: Park
Title: A measure of robust slope-rotatability for second-order response surface experimental designs
Abstract:
In response surface methodology, rotatability and slope-rotatability are
natural and highly desirable properties for second-order regression
models. In this paper a measure of robust slope-rotatability for
second-order response surface designs with a general correlated error
structure is developed and illustrated with different examples for
autocorrelated error structure.
Journal: Journal of Applied Statistics
Pages: 755-767
Issue: 7
Volume: 36
Year: 2009
Keywords: response surface design, rotatability, robust rotatability, robust slope-rotatability, weak slope-rotatability, weak slope-rotatability region,
X-DOI: 10.1080/02664760802499345
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499345
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:755-767
Template-Type: ReDIF-Article 1.0
Author-Name: Binbing Yu
Author-X-Name-First: Binbing
Author-X-Name-Last: Yu
Title: Approximating the risk score for disease diagnosis using MARS
Abstract:
In disease screening and diagnosis, often multiple markers are measured
and combined to improve the accuracy of diagnosis. McIntosh and Pepe
[Combining several screening tests: optimality of the risk score,
Biometrics 58 (2002), pp. 657-664] showed that the risk score, defined as
the probability of disease conditional on multiple markers, is the optimal
function for classification based on the Neyman-Pearson lemma. They
proposed a two-step procedure to approximate the risk score. However, the
resulting receiver operating characteristic (ROC) curve is only defined in
a subrange (L, h) of false-positive rates in (0,1) and the determination
of the lower limit L needs extra prior information. In practice, most
diagnostic tests are not perfect, and it is rare that a single marker is
uniformly better than the others. Using simulation, I show
that multivariate adaptive regression spline is a useful tool to
approximate the risk score when combining multiple markers, especially
when ROC curves from multiple tests cross. The resulting ROC curve is
defined over the whole range (0,1), is easy to implement, and has an
intuitive interpretation. Sample code for the application is given in the
appendix.
Journal: Journal of Applied Statistics
Pages: 769-778
Issue: 7
Volume: 36
Year: 2009
Keywords: multivariate adaptive regression spline (MARS), Neyman-Pearson lemma, risk score, ROC,
X-DOI: 10.1080/02664760802499352
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802499352
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:769-778
Template-Type: ReDIF-Article 1.0
Author-Name: Loredana Ureche-Rangau
Author-X-Name-First: Loredana
Author-X-Name-Last: Ureche-Rangau
Author-Name: Quiterie de Rorthays
Author-X-Name-First: Quiterie
Author-X-Name-Last: de Rorthays
Title: More on the volatility-trading volume relationship in emerging markets: The Chinese stock market
Abstract:
This paper empirically investigates the characteristics in terms of
volatility and trading volume relationships of the Chinese stock markets,
and specifically of the stocks comprising the SSE180 index. Our results
show that, contrary to previous evidence, both volatility and trading
volume appear to be multi-fractal and highly intermittent, suggesting a
common long-run behaviour in addition to the common short-term behaviour
underlined by former studies. Moreover, the trading volume seems to have
no explanatory power for volatility persistence when introduced in the
conditional variance equation. Finally, the sign of the trading volume
coefficients is mainly negative, hence showing a negative correlation
between the two variables.
Journal: Journal of Applied Statistics
Pages: 779-799
Issue: 7
Volume: 36
Year: 2009
Keywords: volatility persistence, long-memory, trading volume,
X-DOI: 10.1080/02664760802509101
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802509101
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:779-799
Template-Type: ReDIF-Article 1.0
Author-Name: E. Ciavolino
Author-X-Name-First: E.
Author-X-Name-Last: Ciavolino
Author-Name: J. J. Dahlgaard
Author-X-Name-First: J. J.
Author-X-Name-Last: Dahlgaard
Title: Simultaneous Equation Model based on the generalized maximum entropy for studying the effect of management factors on enterprise performance
Abstract:
The aim of this paper is to study the effect of management factors on
enterprise performance, using a survey led by the University Consortium
in Engineering for Quality and Innovation. The
relationships between management factors and enterprise performance are
formalized by a Simultaneous Equation Model based on the generalized
maximum entropy (GME) estimation method. The format of this paper is as
follows. In Section 2, the data collected, the questionnaire evaluation,
and the management model analytical formulation are introduced. In Section
3, the GME formulation is specified, showing the main characteristics of
the estimation method. In Section 4, the results and a comparison among
GME, partial least squares (PLS), and maximum likelihood estimation (MLE)
are shown. In Section 5, concluding remarks are discussed.
Journal: Journal of Applied Statistics
Pages: 801-815
Issue: 7
Volume: 36
Year: 2009
Keywords: generalized maximum entropy, human resources, leadership, maximum likelihood estimation, partial least squares, performance, Simultaneous Equation Model, strategic planning,
X-DOI: 10.1080/02664760802510026
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802510026
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:801-815
Template-Type: ReDIF-Article 1.0
Author-Name: Christoph Hanck
Author-X-Name-First: Christoph
Author-X-Name-Last: Hanck
Title: Cross-sectional correlation robust tests for panel cointegration
Abstract:
We use meta-analytic procedures to develop new tests for panel
cointegration, combining p-values from time-series cointegration tests on
the units of the panel. The tests are robust to heterogeneity and
cross-sectional dependence between the panel units. To achieve the latter,
we employ a sieve bootstrap procedure with joint resampling of the units'
residuals. A simulation study shows that the tests can have substantially
smaller size distortion than tests ignoring the presence of
cross-sectional dependence while preserving high power. We apply the tests
to a panel of post-Bretton Woods data to test for weak purchasing power
parity.
Journal: Journal of Applied Statistics
Pages: 817-833
Issue: 7
Volume: 36
Year: 2009
Keywords: panel cointegration tests, cross-sectional dependence, sieve bootstrap,
X-DOI: 10.1080/02664760802510042
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802510042
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:7:p:817-833
Template-Type: ReDIF-Article 1.0
Author-Name: Bo He
Author-X-Name-First: Bo
Author-X-Name-Last: He
Author-Name: Clyde Martin
Author-X-Name-First: Clyde
Author-X-Name-Last: Martin
Title: Statistical analysis of error for fourth-order ordinary differential equation solvers
Abstract:
We develop an autoregressive integrated moving average (ARIMA) model to
study the statistical behavior of the numerical error generated from three
fourth-order ordinary differential equation solvers: Milne's method,
Adams-Bashforth method and a new method that randomly switches between the
Milne and Adams-Bashforth methods. With the actual error data based on
three differential equations, we desire to identify an ARIMA model for
each data series. Results show that some of the data series can be
described by ARIMA models but others cannot. Based on the mathematical
form of the numerical error, other statistical models should be
investigated in the future. Finally, we assess the multivariate normality
of the sample mean error generated by the switching method.
Journal: Journal of Applied Statistics
Pages: 835-852
Issue: 8
Volume: 36
Year: 2009
Keywords: differential equations, numerical error, switching,
X-DOI: 10.1080/02664760802510034
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802510034
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:835-852
Template-Type: ReDIF-Article 1.0
Author-Name: Agustin Hernandez Bastida
Author-X-Name-First: Agustin Hernandez
Author-X-Name-Last: Bastida
Author-Name: Emilio Gomez Deniz
Author-X-Name-First: Emilio Gomez
Author-X-Name-Last: Deniz
Author-Name: Jose Maria Perez Sanchez
Author-X-Name-First: Jose Maria Perez
Author-X-Name-Last: Sanchez
Title: Bayesian robustness of the compound Poisson distribution under bidimensional prior: an application to the collective risk model
Abstract:
The distribution of the aggregate claims in one year plays an important
role in Actuarial Statistics for computing, for example, insurance
premiums when both the number and size of the claims must be implemented
into the model. When the number of claims follows a Poisson distribution
the aggregated distribution is called the compound Poisson distribution.
In this article we assume that the claim size follows an exponential
distribution and later we make an extensive study of this model by
assuming a bidimensional prior distribution for the parameters of the
Poisson and exponential distributions with gamma marginals. This study
leads to expressions for the net premiums and the marginal and posterior
distributions in terms of some well-known special functions used in
statistics. A Bayesian robustness study of this model is then carried
out. Bayesian robustness for bidimensional models was treated in depth in
the 1990s, producing numerous results, but few applications dealing with
this problem can be found in the literature.
Journal: Journal of Applied Statistics
Pages: 853-869
Issue: 8
Volume: 36
Year: 2009
Keywords: Bayesian robustness, Bessel function, class of distributions, compound, generalized hypergeometric function, net premium,
X-DOI: 10.1080/02664760802510059
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802510059
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:853-869
Template-Type: ReDIF-Article 1.0
Author-Name: Carla Machado
Author-X-Name-First: Carla
Author-X-Name-Last: Machado
Author-Name: Carlos Daniel Paulino
Author-X-Name-First: Carlos Daniel
Author-X-Name-Last: Paulino
Author-Name: Francisco Nunes
Author-X-Name-First: Francisco
Author-X-Name-Last: Nunes
Title: Deprivation analysis based on Bayesian latent class models
Abstract:
This article seeks to measure deprivation among Portuguese households,
taking into account four well-being dimensions - housing, durable goods,
economic strain and social relationships - with survey data from the
European Community Household Panel. We propose a multi-stage approach to a
cross-sectional analysis, side-stepping the sparse nature of the
contingency tables caused by the large number of variables considered and
bringing together partial and overall analyses of deprivation that are
based on Bayesian latent class models via Markov Chain Monte Carlo
methods. The outcomes demonstrate that there was a substantial improvement
in overall household well-being between 1995 and 2001. The dimensions that
most contributed to the risk of household deprivation were found to be
economic strain and social relationships.
Journal: Journal of Applied Statistics
Pages: 871-891
Issue: 8
Volume: 36
Year: 2009
Keywords: poverty, deprivation, Bayesian latent class model, label-switching, MCMC method, Portugal,
X-DOI: 10.1080/02664760802520769
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802520769
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:871-891
Template-Type: ReDIF-Article 1.0
Author-Name: Ganesh Dutta
Author-X-Name-First: Ganesh
Author-X-Name-Last: Dutta
Author-Name: Premadhis Das
Author-X-Name-First: Premadhis
Author-X-Name-Last: Das
Author-Name: Nripes Mandal
Author-X-Name-First: Nripes
Author-X-Name-Last: Mandal
Title: Optimum covariate designs in split-plot and strip-plot design set-ups
Abstract:
The problem considered is that of finding optimum covariate designs for
estimation of covariate parameters in standard split-plot and strip-plot
design set-ups with the levels of the whole-plot factor in r randomised
blocks. An extended version of a mixed orthogonal array is also
introduced and used to construct such optimum covariate designs.
Hadamard matrices, as usual, play the key role in such constructions.
Journal: Journal of Applied Statistics
Pages: 893-906
Issue: 8
Volume: 36
Year: 2009
Keywords: split-plot designs, strip-plot designs, covariates, optimal designs, extended mixed orthogonal array, Hadamard matrices,
X-DOI: 10.1080/02664760802520777
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802520777
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:893-906
Template-Type: ReDIF-Article 1.0
Author-Name: Giovanni De Luca
Author-X-Name-First: Giovanni
Author-X-Name-Last: De Luca
Author-Name: Giorgia Rivieccio
Author-X-Name-First: Giorgia
Author-X-Name-Last: Rivieccio
Title: Archimedean copulae for risk measurement
Abstract:
In this paper some Archimedean copula functions for bivariate financial
returns are studied. The choice of this family is due to its ability to
capture tail dependence, an association measure detectable in many
bivariate financial time series. A time-varying version of these
copulae is also investigated. Finally, the Value-at-Risk is computed and
its performance is compared across different copula specifications.
Journal: Journal of Applied Statistics
Pages: 907-924
Issue: 8
Volume: 36
Year: 2009
Keywords: copula, time-varying parameters, daily equity returns, risk management, value-at-risk,
X-DOI: 10.1080/02664760802520785
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802520785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:907-924
Template-Type: ReDIF-Article 1.0
Author-Name: Miroslav Ristic
Author-X-Name-First: Miroslav
Author-X-Name-Last: Ristic
Title: R programming for bioinformatics
Abstract:
Journal: Journal of Applied Statistics
Pages: 925-925
Issue: 8
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802695884
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802695884
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:925-925
Template-Type: ReDIF-Article 1.0
Author-Name: Filia Vonta
Author-X-Name-First: Filia
Author-X-Name-Last: Vonta
Title: The frailty model
Abstract:
Journal: Journal of Applied Statistics
Pages: 927-928
Issue: 8
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802695892
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802695892
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:927-928
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Baxter
Author-X-Name-First: Paul
Author-X-Name-Last: Baxter
Title: An introduction to generalised linear models, third edition
Abstract:
Journal: Journal of Applied Statistics
Pages: 929-930
Issue: 8
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802695900
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802695900
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:929-930
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Risk analysis
Abstract:
Journal: Journal of Applied Statistics
Pages: 931-931
Issue: 8
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760802695918
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802695918
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:8:p:931-931
Template-Type: ReDIF-Article 1.0
Author-Name: B. J. Gajewski
Author-X-Name-First: B. J.
Author-X-Name-Last: Gajewski
Author-Name: R. Lee
Author-X-Name-First: R.
Author-X-Name-Last: Lee
Author-Name: M. Bott
Author-X-Name-First: M.
Author-X-Name-Last: Bott
Author-Name: U. Piamjariyakul
Author-X-Name-First: U.
Author-X-Name-Last: Piamjariyakul
Author-Name: R. L. Taunton
Author-X-Name-First: R. L.
Author-X-Name-Last: Taunton
Title: On estimating the distribution of data envelopment analysis efficiency scores: an application to nursing homes' care planning process
Abstract:
Data envelopment analysis (DEA) is a deterministic econometric model for
calculating efficiency by using data from an observed set of
decision-making units (DMUs). We propose a method for calculating the
distribution of efficiency scores. Our framework relies on estimating data
from an unobserved set of DMUs. The model provides posterior predictive
data for the unobserved DMUs to augment the frontier in the DEA that
provides a posterior predictive distribution for the efficiency scores. We
explore the method on a multiple-input and multiple-output DEA model. The
data for the example are from a comprehensive examination of how nursing
homes complete a standardized mandatory assessment of residents.
Journal: Journal of Applied Statistics
Pages: 933-944
Issue: 9
Volume: 36
Year: 2009
Keywords: bounded DEA, MCMC, binomial distribution, posterior predictive distribution, sampling, imputation,
X-DOI: 10.1080/02664760802552986
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802552986
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:933-944
Template-Type: ReDIF-Article 1.0
Author-Name: Mohamed Boutahar
Author-X-Name-First: Mohamed
Author-X-Name-Last: Boutahar
Title: Comparison of non-parametric and semi-parametric tests in detecting long memory
Abstract:
The first two stages in modelling time series are hypothesis testing and
estimation. For long memory time series, the second stage was studied in
[M. Boutahar et al., Estimation methods of the long memory parameter:
Monte Carlo analysis and application, J. Appl. Statist. 34(3), pp.
261-301], in which we presented some estimation methods of the long
memory parameter. The present paper addresses the first stage, and hence
completes the former, by exploring some tests for detecting long memory
in time series. We consider two kinds of tests: the non-parametric class
and the semi-parametric one. We derive the limiting distribution of the
non-parametric tests under the null of short memory, and we show that
they are consistent against the alternative of long memory. We also
perform some Monte Carlo simulations to analyse the size distortion and
the power of all proposed tests. We conclude that for large sample sizes
the two classes are equivalent, but for small sample sizes the
non-parametric class is better than the semi-parametric one.
Journal: Journal of Applied Statistics
Pages: 945-972
Issue: 9
Volume: 36
Year: 2009
Keywords: hypothesis testing, long memory, power, short memory, size,
X-DOI: 10.1080/02664760802562464
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802562464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:945-972
Template-Type: ReDIF-Article 1.0
Author-Name: Vito Muggeo
Author-X-Name-First: Vito
Author-X-Name-Last: Muggeo
Author-Name: Massimo Attanasio
Author-X-Name-First: Massimo
Author-X-Name-Last: Attanasio
Author-Name: Mariano Porcu
Author-X-Name-First: Mariano
Author-X-Name-Last: Porcu
Title: A segmented regression model for event history data: an application to the fertility patterns in Italy
Abstract:
We propose a segmented discrete-time model for the analysis of event
history data in demographic research. Through a unified regression
framework, the model provides estimates of the effects of explanatory
variables and jointly accommodates non-proportional differences in a
flexible way via segmented relationships. The main appeal lies in the
ready availability of parameters, changepoints, and slopes, which may
provide meaningful and intuitive information on the topic. Furthermore,
specific linear
constraints on the slopes may also be set to investigate particular
patterns. We investigate the intervals between cohabitation and first
childbirth and from first to second childbirth using individual data for
Italian women from the Second National Survey on Fertility. The model
provides insights into the dramatic decrease in fertility experienced in
Italy, in that it detects a 'common' tendency in delaying the onset of
childbearing for the more recent cohorts and a 'specific' postponement
strictly depending on the educational level and age at cohabitation.
Journal: Journal of Applied Statistics
Pages: 973-988
Issue: 9
Volume: 36
Year: 2009
Keywords: segmented regression, discrete-time hazard models, changepoints, parity progression, event occurrence data,
X-DOI: 10.1080/02664760802552994
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802552994
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:973-988
Template-Type: ReDIF-Article 1.0
Author-Name: P. Singh
Author-X-Name-First: P.
Author-X-Name-Last: Singh
Title: Multiple comparisons of several populations with more than one control with respect to scale parameter
Abstract:
In this paper, one-sided and two-sided test procedures for comparing
several treatments with more than one control with respect to scale
parameter are proposed. The proposed test procedures are inverted to
obtain the associated simultaneous confidence intervals. The multiple
comparisons of test treatments with the best control are also developed.
The computation of the critical points, required to implement the proposed
procedures, is discussed by taking the normal probability model.
Applications of the proposed test procedures to two-parameter exponential
probability model are also demonstrated.
Journal: Journal of Applied Statistics
Pages: 989-998
Issue: 9
Volume: 36
Year: 2009
Keywords: best treatment, critical points, experiment-wise error, exponential distribution, multiple comparisons,
X-DOI: 10.1080/02664760802663098
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802663098
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:989-998
Template-Type: ReDIF-Article 1.0
Author-Name: Manisha Pal
Author-X-Name-First: Manisha
Author-X-Name-Last: Pal
Author-Name: Nripes Mandal
Author-X-Name-First: Nripes
Author-X-Name-Last: Mandal
Title: Optimum designs for estimation of optimum point under cost constraint
Abstract:
In this paper, we consider the estimation of the optimum factor
combination in a response surface model. Assuming that the response
function is quadratic concave and there is a linear cost constraint on the
factor combination, we attempt to find the optimum design using the trace
optimality criterion. As the criterion function involves the unknown
parameters, we adopt a pseudo-Bayesian approach to resolve the problem.
Journal: Journal of Applied Statistics
Pages: 999-1008
Issue: 9
Volume: 36
Year: 2009
Keywords: response surface model, second-order models, mixture experiments, weighted centroid designs, trace criterion,
X-DOI: 10.1080/02664760802582264
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802582264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:999-1008
Template-Type: ReDIF-Article 1.0
Author-Name: Prasun Das
Author-X-Name-First: Prasun
Author-X-Name-Last: Das
Author-Name: Arup Kumar Das
Author-X-Name-First: Arup Kumar
Author-X-Name-Last: Das
Author-Name: Saddam Hossain
Author-X-Name-First: Saddam
Author-X-Name-Last: Hossain
Title: Forecasting models for developing control scheme to improve furnace run length
Abstract:
In petrochemical industries, gaseous feedstocks such as ethane and
propane are cracked in furnaces to produce ethylene and propylene as the
main products and as inputs for downstream plants. Low furnace run
length (FRL) increases the frequency of furnace decoking, reduces
productivity, and shortens coil life. Coil pressure ratio (CPR) and tube
metal temperature (TMT) are the two most important performance measures
for the FRL in deciding when furnace decoking is needed. This article,
therefore, develops prediction models for CPR and TMT based on the
critical process parameters, which support the necessary control
measures and provide prior indication of the need for decoking.
Regression-based time series and double exponential smoothing techniques
are used to build the models. The effective operating ranges of the
critical process parameters are found using a simulation-based approach.
The models are expected to serve as guiding principles for increasing
the average run length of the furnace.
Journal: Journal of Applied Statistics
Pages: 1009-1019
Issue: 9
Volume: 36
Year: 2009
Keywords: furnace run length, decoking, coil pressure ratio, tube metal temperature, operating ranges, simulation,
X-DOI: 10.1080/02664760902803255
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902803255
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:1009-1019
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Title: A group acceptance sampling plan for truncated life test having Weibull distribution
Abstract:
In this paper, a group acceptance sampling plan for a truncated life test
is proposed for the case where multiple items can be tested
simultaneously as a group in a tester, assuming that the lifetime of a
product follows the Weibull distribution with a known shape parameter.
The design
parameters such as the number of groups and the acceptance number will be
determined by satisfying the producer's and the consumer's risks at the
specified quality levels, while the termination time and the number of
testers are specified. The results are explained with tables and examples.
Journal: Journal of Applied Statistics
Pages: 1021-1027
Issue: 9
Volume: 36
Year: 2009
Keywords: acceptance sampling, consumer's risk, operating characteristics, producer's risk, truncated life test,
X-DOI: 10.1080/02664760802566788
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802566788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:1021-1027
Template-Type: ReDIF-Article 1.0
Author-Name: J. Mazucheli
Author-X-Name-First: J.
Author-X-Name-Last: Mazucheli
Author-Name: J. A. Achcar
Author-X-Name-First: J. A.
Author-X-Name-Last: Achcar
Author-Name: E. A. Coelho-Barros
Author-X-Name-First: E. A.
Author-X-Name-Last: Coelho-Barros
Author-Name: F. Louzada-Neto
Author-X-Name-First: F.
Author-X-Name-Last: Louzada-Neto
Title: Infant mortality model for lifetime data
Abstract:
In this paper we introduce a parametric model for handling lifetime data
where an early lifetime can be related to the infant-mortality failure or
to the wear processes but we do not know which risk is responsible for the
failure. The maximum likelihood approach and the sampling-based approach
are used to obtain the inferences of interest. Some special cases of the
proposed model are studied via Monte Carlo methods for the size and power
of hypothesis tests. To illustrate the proposed methodology, we present
an example based on a real data set.
Journal: Journal of Applied Statistics
Pages: 1029-1036
Issue: 9
Volume: 36
Year: 2009
Keywords: hazard models, infant-mortality failure, mixture models, Monte Carlo study, Weibull model, bootstrap,
X-DOI: 10.1080/02664760802526907
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802526907
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:1029-1036
Template-Type: ReDIF-Article 1.0
Author-Name: N. R. Parsons
Author-X-Name-First: N. R.
Author-X-Name-Last: Parsons
Author-Name: S. G. Gilmour
Author-X-Name-First: S. G.
Author-X-Name-Last: Gilmour
Author-Name: R. N. Edmondson
Author-X-Name-First: R. N.
Author-X-Name-Last: Edmondson
Title: Analysis of robust design experiments with time-dependent ordinal response characteristics: a quality improvement study from the horticulture industry
Abstract:
An approach to the analysis of time-dependent ordinal quality score data
from robust design experiments is developed and applied to an experiment
from commercial horticultural research, using concepts of product
robustness and longevity that are familiar to analysts in engineering
research. A two-stage analysis is used to develop models describing the
effects of a number of experimental treatments on the rate of post-sales
product quality decline. The first stage uses a polynomial function on a
transformed scale to approximate the quality decline for an individual
experimental unit using derived coefficients and the second stage uses a
joint mean and dispersion model to investigate the effects of the
experimental treatments on these derived coefficients. The approach,
developed specifically for an application in horticulture, is exemplified
with data from a trial testing ornamental plants that are subjected to a
range of treatments during production and home-life. The results of the
analysis show how a number of control and noise factors affect the rate of
post-production quality decline. Although the model is used to analyse
quality data from a trial on ornamental plants, the approach developed is
expected to be more generally applicable to a wide range of other complex
production systems.
Journal: Journal of Applied Statistics
Pages: 1037-1054
Issue: 9
Volume: 36
Year: 2009
Keywords: joint mean-dispersion model, ordinal scores, proportional odds model, robust product design, two-stage analysis,
X-DOI: 10.1080/02664760802566796
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802566796
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:9:p:1037-1054
Template-Type: ReDIF-Article 1.0
Author-Name: Holger Hurtgen
Author-X-Name-First: Holger
Author-X-Name-Last: Hurtgen
Author-Name: Daniel Gervini
Author-X-Name-First: Daniel
Author-X-Name-Last: Gervini
Title: Semiparametric shape-invariant models for periodic data
Abstract:
This article presents a novel shape-invariant modeling approach to
quasi-periodic data. We propose a dynamic semiparametric method that
estimates the common cycle shape in a nonparametric way and the individual
phase and amplitude variability in a parametric way. An efficient
algorithm to compute the estimators is proposed. The behavior of the
estimators is studied by simulation and by a real-data example.
Journal: Journal of Applied Statistics
Pages: 1055-1065
Issue: 10
Volume: 36
Year: 2009
Keywords: circadian rhythms, nonparametric regression, spline smoothing,
X-DOI: 10.1080/02664760802562472
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802562472
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1055-1065
Template-Type: ReDIF-Article 1.0
Author-Name: E. M. Conlon
Author-X-Name-First: E. M.
Author-X-Name-Last: Conlon
Author-Name: B. L. Postier
Author-X-Name-First: B. L.
Author-X-Name-Last: Postier
Author-Name: B. A. Methe
Author-X-Name-First: B. A.
Author-X-Name-Last: Methe
Author-Name: K. P. Nevin
Author-X-Name-First: K. P.
Author-X-Name-Last: Nevin
Author-Name: D. R. Lovley
Author-X-Name-First: D. R.
Author-X-Name-Last: Lovley
Title: Hierarchical Bayesian meta-analysis models for cross-platform microarray studies
Abstract:
The development of new technologies to measure gene expression has
called for statistical methods to integrate findings across
multiple-platform studies. A common goal of microarray analysis is to
identify genes with differential expression between two conditions, such
as treatment versus control. Here, we introduce a hierarchical Bayesian
meta-analysis model to pool gene expression studies from different
microarray platforms: spotted DNA arrays and short oligonucleotide arrays.
The studies have different array design layouts, each with multiple
sources of data replication, including repeated experiments, slides and
probes. Our model produces the gene-specific posterior probability of
differential expression, which is the basis for inference. In simulations
combining two and five independent studies, our meta-analysis model
outperformed separate analyses for three commonly used comparison
measures; it also showed improved receiver operating characteristic
curves. When combining spotted DNA and CombiMatrix short oligonucleotide
array studies of Geobacter sulfurreducens, our meta-analysis model
discovered more genes for fixed thresholds of posterior probability of
differential expression and Bayesian false discovery than individual study
analyses. We also examine an alternative model and compare models using
the deviance information criterion.
Journal: Journal of Applied Statistics
Pages: 1067-1085
Issue: 10
Volume: 36
Year: 2009
Keywords: Bayesian statistics, meta-analysis, microarray data, multiple platform, Markov chain Monte Carlo, deviance information criterion,
X-DOI: 10.1080/02664760802562480
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802562480
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1067-1085
Template-Type: ReDIF-Article 1.0
Author-Name: G. K. Skinner
Author-X-Name-First: G. K.
Author-X-Name-Last: Skinner
Author-Name: G. H. Freeman
Author-X-Name-First: G. H.
Author-X-Name-Last: Freeman
Title: Soccer matches as experiments: how often does the 'best' team win?
Abstract:
Models in which the number of goals scored by a team in a soccer match
follow a Poisson distribution, or a closely related one, have been widely
discussed. We here consider a soccer match as an experiment to assess
which of two teams is superior and examine the probability that the
outcome of the experiment (match) truly represents the relative abilities
of the two teams. Given a final score, it is possible by using a Bayesian
approach to quantify the probability that it was or was not the case that
'the best team won'. For typical scores, the probability of a misleading
result is significant. Modifying the rules of the game to increase the
typical number of goals scored would improve the situation, but a level of
confidence that would normally be regarded as satisfactory could not be
obtained unless the character of the game was radically changed.
Journal: Journal of Applied Statistics
Pages: 1087-1095
Issue: 10
Volume: 36
Year: 2009
Keywords: football, soccer, experiment design, Poisson statistics, Bayesian,
X-DOI: 10.1080/02664760802715922
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802715922
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1087-1095
Template-Type: ReDIF-Article 1.0
Author-Name: Moawia Alghalith
Author-X-Name-First: Moawia
Author-X-Name-Last: Alghalith
Title: Empirical comparative statics under price and output uncertainty
Abstract:
This article provides empirical comparative statics under simultaneous
price and output uncertainty. In so doing, it presents a simple (one-step)
and general statistical methodology under price and output uncertainty.
Journal: Journal of Applied Statistics
Pages: 1097-1100
Issue: 10
Volume: 36
Year: 2009
Keywords: estimating equations, output uncertainty, price uncertainty, utility,
X-DOI: 10.1080/02664760802562506
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802562506
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1097-1100
Template-Type: ReDIF-Article 1.0
Author-Name: Corrado Crocetta
Author-X-Name-First: Corrado
Author-X-Name-Last: Crocetta
Author-Name: Nicola Loperfido
Author-X-Name-First: Nicola
Author-X-Name-Last: Loperfido
Title: Maximum likelihood estimation of correlation between maximal oxygen consumption and the 6-min walk test in patients with chronic heart failure
Abstract:
Maximal oxygen consumption (VO2max) is the standard measurement used to
quantify cardiovascular functional capacity and aerobic fitness.
Unfortunately, it is a costly, impractical and labour-intensive measure to
obtain. The 6-min walk test (6MWT) also assesses cardiopulmonary function,
but in contrast to the VO2max test, it is inexpensive and can be performed
almost anywhere. Various medical studies have addressed the correlation
between VO2max and 6MWT in patients with chronic heart failure. Of
particular interest, from a medical point of view, is the conditional
correlation between the two measures given the individual's height,
weight, age and gender. In this paper, we have calculated the maximum
likelihood estimate of the conditional correlation in patients with
chronic heart failure under the assumption of skew normality. Data were
recorded from 98 patients in the Operative Unit of Thoracic Surgery in
Bari, Italy. The estimated conditional correlation was found to be much
smaller than estimated marginal correlations reported in the medical
literature.
Journal: Journal of Applied Statistics
Pages: 1101-1108
Issue: 10
Volume: 36
Year: 2009
Keywords: cardiopulmonary exercise testing, correlation, maximal oxygen consumption, 6-min walk test, skew-normal distribution,
X-DOI: 10.1080/02664760802653545
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802653545
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1101-1108
Template-Type: ReDIF-Article 1.0
Author-Name: Ofer Harel
Author-X-Name-First: Ofer
Author-X-Name-Last: Harel
Title: The estimation of R2 and adjusted R2 in incomplete data sets using multiple imputation
Abstract:
The coefficient of determination, known also as the R2, is a common
measure in regression analysis. Many scientists use the R2 and the
adjusted R2 on a regular basis. In most cases, the researchers treat the
coefficient of determination as an index of 'usefulness' or 'goodness of
fit,' and in some cases, they even treat it as a model selection tool. In
cases in which the data is incomplete, most researchers and common
statistical software will use complete case analysis in order to estimate
the R2, a procedure that might lead to biased results. In this paper, I
introduce the use of multiple imputation for the estimation of R2 and
adjusted R2 in incomplete data sets. I illustrate my methodology using a
biomedical example.
Journal: Journal of Applied Statistics
Pages: 1109-1118
Issue: 10
Volume: 36
Year: 2009
Keywords: coefficient of determination, incomplete data, multiple imputation, linear regression,
X-DOI: 10.1080/02664760802553000
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802553000
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1109-1118
Template-Type: ReDIF-Article 1.0
Author-Name: Nirian Martin
Author-X-Name-First: Nirian
Author-X-Name-Last: Martin
Author-Name: Leandro Pardo
Author-X-Name-First: Leandro
Author-X-Name-Last: Pardo
Title: On the asymptotic distribution of Cook's distance in logistic regression models
Abstract:
It sometimes occurs that one or more components of the data exert a
disproportionate influence on the model estimation. We need a reliable
tool for identifying such troublesome cases in order to decide whether to
eliminate them from the sample, when the data collection was badly
carried out, or otherwise to take care in the use of the model, because
the results could be affected by such components. Since a measure for
detecting influential cases in the linear regression setting was proposed
by Cook [Detection of influential observations in linear regression,
Technometrics 19 (1977), pp. 15-18], apart from the same measure for
other models, several new measures have been suggested as single-case
diagnostics. For most of them, some cutoff values have been recommended
(see [D.A. Belsley, E. Kuh, and R.E. Welsch, Regression Diagnostics:
Identifying Influential Data and Sources of Collinearity, 2nd ed., John
Wiley & Sons, New York, Chichester, Brisbane, 2004], for instance);
however, the lack of a quantile-type cutoff for Cook's statistic has led
analysts to rely only on index plots as diagnostic tools. Focusing on
logistic regression, the aim of this paper is to provide the asymptotic
distribution of Cook's distance in order to obtain a meaningful cutoff
point for detecting influential and leverage observations.
Journal: Journal of Applied Statistics
Pages: 1119-1146
Issue: 10
Volume: 36
Year: 2009
Keywords: Cook's distance, logistic regression, maximum likelihood estimation, outlier, leverage,
X-DOI: 10.1080/02664760802562498
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802562498
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1119-1146
Template-Type: ReDIF-Article 1.0
Author-Name: R. H. Rieger
Author-X-Name-First: R. H.
Author-X-Name-Last: Rieger
Author-Name: C. R. Weinberg
Author-X-Name-First: C. R.
Author-X-Name-Last: Weinberg
Title: Testing for violations of the homogeneity needed for conditional logistic regression
Abstract:
In epidemiologic studies where the outcome is binary, the data often
arise as clusters, as when siblings, friends or neighbors are used as
matched controls in a case-control study. Conditional logistic regression
(CLR) is typically used for such studies to estimate the odds ratio for an
exposure of interest. However, CLR assumes the exposure coefficient is the
same in every cluster, and CLR-based inference can be badly biased when
homogeneity is violated. Existing methods for testing goodness-of-fit for
CLR are not designed to detect such violations. Good alternative methods
of analysis exist if one suspects there is heterogeneity across clusters.
However, routine use of alternative robust approaches when there is no
appreciable heterogeneity could cause loss of precision and be
computationally difficult, particularly if the clusters are small. We
propose a simple non-parametric test, the test of heterogeneous
susceptibility (THS), to assess the assumption of homogeneity of a
coefficient across clusters. The test is easy to apply and provides
guidance as to the appropriate method of analysis. Simulations demonstrate
that the THS has reasonable power to reveal violations of homogeneity. We
illustrate by applying the THS to a study of periodontal disease.
Journal: Journal of Applied Statistics
Pages: 1147-1157
Issue: 10
Volume: 36
Year: 2009
Keywords: clustered binary outcomes, conditional logistic regression, heterogeneity of response,
X-DOI: 10.1080/02664760802638124
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802638124
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1147-1157
Template-Type: ReDIF-Article 1.0
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Title: Designing of a variables two-plan system by minimizing the average sample number
Abstract:
This article proposes a variables two-plan sampling system, called the
tightened-normal-tightened (TNT) sampling inspection scheme, where the
quality characteristic follows a normal distribution or a lognormal
distribution and has an upper or a lower specification limit. The TNT
variables sampling inspection scheme will be useful when testing is costly
and destructive. The advantages of the variables TNT scheme over variables
single and double sampling plans and attributes TNT scheme are discussed.
Tables are also constructed for the selection of parameters of known and
unknown standard deviation variables TNT schemes for a given acceptable
quality level (AQL) and limiting quality level (LQL). The problem is
formulated as a nonlinear programming problem in which the objective function to be
minimized is the average sample number and the constraints are related to
lot acceptance probabilities at AQL and LQL under the operating
characteristic curve.
Journal: Journal of Applied Statistics
Pages: 1159-1172
Issue: 10
Volume: 36
Year: 2009
Keywords: average sample number, OC curve, sampling system, two-plan system,
X-DOI: 10.1080/02664760802562514
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802562514
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1159-1172
Template-Type: ReDIF-Article 1.0
Author-Name: Miroslav Ristic
Author-X-Name-First: Miroslav
Author-X-Name-Last: Ristic
Title: Probability with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 1173-1173
Issue: 10
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760902811555
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902811555
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1173-1173
Template-Type: ReDIF-Article 1.0
Author-Name: M. Dolores Ugarte
Author-X-Name-First: M. Dolores
Author-X-Name-Last: Ugarte
Title: Longitudinal data analysis
Abstract:
Journal: Journal of Applied Statistics
Pages: 1175-1176
Issue: 10
Volume: 36
Year: 2009
X-DOI: 10.1080/02664760902811563
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902811563
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:10:p:1175-1176
Template-Type: ReDIF-Article 1.0
Author-Name: Silvia Salini
Author-X-Name-First: Silvia
Author-X-Name-Last: Salini
Author-Name: Ron Kenett
Author-X-Name-First: Ron
Author-X-Name-Last: Kenett
Title: Bayesian networks of customer satisfaction survey data
Abstract:
A Bayesian network (BN) is a probabilistic graphical model that
represents a set of variables and their probabilistic dependencies.
Formally, BNs are directed acyclic graphs whose nodes represent variables,
and whose arcs encode the conditional dependencies among the variables.
Nodes can represent any kind of variable, be it a measured parameter, a
latent variable, or a hypothesis. They are not restricted to
representing random variables, which forms the “Bayesian” aspect of a BN.
Efficient algorithms exist that perform inference and learning in BNs. BNs
that model sequences of variables are called dynamic BNs. In this context,
[A. Harel, R. Kenett, and F. Ruggeri, Modeling web usability diagnostics
on the basis of usage statistics, in Statistical Methods in eCommerce
Research, W. Jank and G. Shmueli, eds., Wiley, 2008] provide a comparison
between Markov Chains and BNs in the analysis of web usability from
e-commerce data. A comparison of regression models, structural equation
models, and BNs is presented in Anderson et al. [R.D. Anderson, R.D.
Mackoy, V.B. Thompson, and G. Harrell, A Bayesian network estimation of
the service-profit chain for transport service satisfaction, Decision
Sciences 35(4), (2004), pp. 665-689]. In this article we apply BNs to the
analysis of customer satisfaction surveys and demonstrate the potential of
the approach. In particular, BNs offer advantages in implementing models
of cause and effect over other statistical techniques designed primarily
for testing hypotheses. Other advantages include the ability to conduct
probabilistic inference for prediction and diagnostic purposes with an
output that can be intuitively understood by managers.
Journal: Journal of Applied Statistics
Pages: 1177-1189
Issue: 11
Volume: 36
Year: 2009
Keywords: Bayesian networks, customer satisfaction, Eurobarometer, service quality,
X-DOI: 10.1080/02664760802587982
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802587982
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1177-1189
Template-Type: ReDIF-Article 1.0
Author-Name: Angela He
Author-X-Name-First: Angela
Author-X-Name-Last: He
Author-Name: Alan Wan
Author-X-Name-First: Alan
Author-X-Name-Last: Wan
Title: Predicting daily highs and lows of exchange rates: a cointegration analysis
Abstract:
This article presents empirical evidence that links the daily highs and
lows of exchange rates of the US dollar against two other major currencies
over a 15-year period. We find that the log high and log low of an
exchange rate are cointegrated, and the error correction term is
well-approximated by the range, which is defined as the difference between
the log high and log low. We further assess the empirical relevance of
jointly analyzing the highs, lows and the ranges by comparing the range
forecasts generated from the cointegration framework with those from
random walk and autoregressive integrated moving average (ARIMA)
specifications. The ability of range forecasts as predictors of implied
volatility for a European style currency option is also evaluated. Our
results show that aside from a limited set of exceptions, the
cointegration framework generally outperforms the random walk and ARIMA
models in an out-of-sample forecast contest.
Journal: Journal of Applied Statistics
Pages: 1191-1204
Issue: 11
Volume: 36
Year: 2009
Keywords: daily high, daily low, direction of change, implied volatility, VECM,
X-DOI: 10.1080/02664760802578304
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802578304
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1191-1204
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Wei Wu
Author-X-Name-First: Chien-Wei
Author-X-Name-Last: Wu
Author-Name: Tsai-Yu Lin
Author-X-Name-First: Tsai-Yu
Author-X-Name-Last: Lin
Title: A Bayesian procedure for assessing process performance based on the third-generation capability index
Abstract:
Capability indices that qualify process potential and process performance
are practical tools for successful quality improvement activities and
quality program implementation. Most existing methods to assess process
capability were derived on the basis of the traditional frequentist point
of view. This paper considers the problem of estimating and testing
process capability based on the third-generation capability index Cpmk
from the Bayesian point of view. We first derive the posterior probability
p that the process under investigation is capable. The one-sided credible
interval, a Bayesian analog of the classical lower confidence interval,
can be obtained to assess process performance. To investigate the
effectiveness of the derived results, a series of simulations was
undertaken. The results indicate that the performance of the proposed
Bayesian approach depends strongly on the value of
ξ=(μ-T)/σ. It performs very well, with accurate
coverage rates, when μ is sufficiently far from T. In those cases,
performance remains acceptable even when the sample size n is
as small as 25.
Journal: Journal of Applied Statistics
Pages: 1205-1223
Issue: 11
Volume: 36
Year: 2009
Keywords: Bayesian approach, lower confidence bound, process capability, posterior probability, sampling distribution,
X-DOI: 10.1080/02664760802582298
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802582298
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1205-1223
Template-Type: ReDIF-Article 1.0
Author-Name: Seok-Oh Jeong
Author-X-Name-First: Seok-Oh
Author-X-Name-Last: Jeong
Author-Name: Kee-Hoon Kang
Author-X-Name-First: Kee-Hoon
Author-X-Name-Last: Kang
Title: Nonparametric estimation of value-at-risk
Abstract:
This paper develops a fully nonparametric method for estimating
value-at-risk based on the adaptive volatility estimation and the
nonparametric quantile estimation. The proposed method is simple, fast and
easy to implement. We evaluate its numerical performance on the basis of
a Monte Carlo study for numerous models. We also provide an empirical
application to Korean Stock Price Index (KOSPI) data, which proved
successful in backtesting.
Journal: Journal of Applied Statistics
Pages: 1225-1238
Issue: 11
Volume: 36
Year: 2009
Keywords: value-at-risk, volatility, local homogeneity, quantile estimation, risk management, KOSPI,
X-DOI: 10.1080/02664760802607517
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802607517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1225-1238
Template-Type: ReDIF-Article 1.0
Author-Name: Kenneth Berry
Author-X-Name-First: Kenneth
Author-X-Name-Last: Berry
Author-Name: Janis Johnston
Author-X-Name-First: Janis
Author-X-Name-Last: Johnston
Author-Name: Paul Mielke
Author-X-Name-First: Paul
Author-X-Name-Last: Mielke
Title: Exact and resampling probability values for the Piccarreta nominal-ordinal index of association
Abstract:
Exact, resampling, and Pearson type III permutation methods are provided
to compute probability values for Piccarreta's nominal-ordinal index of
association. The resampling permutation method provides good approximate
probability values based on the proportion of resampled test statistic
values equal to or greater than the observed test statistic value.
Journal: Journal of Applied Statistics
Pages: 1239-1249
Issue: 11
Volume: 36
Year: 2009
Keywords: contingency tables, nominal-ordinal association, resampling, permutation, probability,
X-DOI: 10.1080/02664760802578312
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802578312
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1239-1249
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Sun
Author-X-Name-First: Juan
Author-X-Name-Last: Sun
Author-Name: Xiaohui Ouyang
Author-X-Name-First: Xiaohui
Author-X-Name-Last: Ouyang
Author-Name: Hidekatsu Yoshioka
Author-X-Name-First: Hidekatsu
Author-X-Name-Last: Yoshioka
Author-Name: Wenli Wang
Author-X-Name-First: Wenli
Author-X-Name-Last: Wang
Author-Name: Chun Fan
Author-X-Name-First: Chun
Author-X-Name-Last: Fan
Author-Name: Hongwei Li
Author-X-Name-First: Hongwei
Author-X-Name-Last: Li
Author-Name: Jianru Wang
Author-X-Name-First: Jianru
Author-X-Name-Last: Wang
Author-Name: Yalin Liu
Author-X-Name-First: Yalin
Author-X-Name-Last: Liu
Author-Name: Li Su
Author-X-Name-First: Li
Author-X-Name-Last: Su
Author-Name: Heping Ma
Author-X-Name-First: Heping
Author-X-Name-Last: Ma
Author-Name: Ying Liu
Author-X-Name-First: Ying
Author-X-Name-Last: Liu
Author-Name: Yuwen Zhang
Author-X-Name-First: Yuwen
Author-X-Name-Last: Zhang
Author-Name: Xingguang Zhang
Author-X-Name-First: Xingguang
Author-X-Name-Last: Zhang
Author-Name: Xuemei Wang
Author-X-Name-First: Xuemei
Author-X-Name-Last: Wang
Author-Name: Yueling Hu
Author-X-Name-First: Yueling
Author-X-Name-Last: Hu
Title: A progressive rise in stomach cancer-related mortality rate during 1970-1995 in Japanese individuals over 85 years of age
Abstract:
A large number of studies have shown a gradual fall in stomach
cancer-related mortality rate during the last decade. Here we analyzed the
pattern of stomach cancer-related mortality rates in Japanese individuals aged >85
years from 1970 to 1995. We used data for the entire population of Japan.
The magnitude of change was measured by relative risk and
cause-elimination life tables to distinguish time trends in mortality
rates of stomach cancer for individuals over 85 years of age compared with
other age groups (55-84 years). In the over-85 age group, stomach cancer
mortality increased from 374 in 1970 to 662 in 1995 per 100,000 (77%) for
males and from 232 to 296 per 100,000 (27%) for females. Using the 55-59
years group as the reference category, the relative risk increased from
2.3 to 9.9 and from 2.8 to 11.1 in men and women, respectively. The
effects of mortality on life expectancy also increased 1.5 times and 1.1
times, respectively. Our results showed a rise of stomach cancer mortality
in Japanese aged over 85 years, which paralleled the increase in relative
risk and negative contribution to life expectancy. While the mortality of
younger age groups is decreasing, the changeover from increase to
decrease in the over-85 age group is only just beginning.
Journal: Journal of Applied Statistics
Pages: 1251-1258
Issue: 11
Volume: 36
Year: 2009
Keywords: stomach cancer, mortality, age, Japan, old,
X-DOI: 10.1080/02664760802582272
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802582272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1251-1258
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas Longford
Author-X-Name-First: Nicholas
Author-X-Name-Last: Longford
Title: Analysis of all-zero binomial outcomes with borderline and equilibrium priors
Abstract:
This article is concerned with the analysis of a random sample from a
binomial distribution when all the outcomes are zero (or unity). We
discuss how elicitation of the prior can be reduced to asking the expert
whether (and which of) the so-called borderline or equilibrium priors are
plausible.
Journal: Journal of Applied Statistics
Pages: 1259-1265
Issue: 11
Volume: 36
Year: 2009
Keywords: beta distribution, borderline prior, equilibrium prior, expected loss, plausible prior,
X-DOI: 10.1080/02664760802603813
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802603813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1259-1265
Template-Type: ReDIF-Article 1.0
Author-Name: Getachew Dagne
Author-X-Name-First: Getachew
Author-X-Name-Last: Dagne
Author-Name: James Snyder
Author-X-Name-First: James
Author-X-Name-Last: Snyder
Title: Bayesian hierarchical duration model for repeated events: an application to behavioral observations
Abstract:
This article presents a continuous-time Bayesian model for analyzing
durations of behavior displays in social interactions. Duration data of
social interactions are often complex because of repeated behaviors
(events) at individual or group (e.g. dyad) level, multiple behaviors
(multistates), and several choices of exit from a current event (competing
risks). A multilevel, multistate model is proposed to adequately
characterize the behavioral processes. The model incorporates
dyad-specific and transition-specific random effects to account for
heterogeneity among dyads and interdependence among competing risks. The
proposed method is applied to child-parent observational data derived from
the School Transitions Project to assess the relation of emotional
expression in child-parent interaction to risk for early and persisting
child conduct problems.
Journal: Journal of Applied Statistics
Pages: 1267-1279
Issue: 11
Volume: 36
Year: 2009
Keywords: competing risks, event history, survival, multilevel models, multistates, Bayesian inference, semi-Markov models,
X-DOI: 10.1080/02664760802587032
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802587032
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1267-1279
Template-Type: ReDIF-Article 1.0
Author-Name: S. Sampath
Author-X-Name-First: S.
Author-X-Name-Last: Sampath
Author-Name: V. Varalakshmi
Author-X-Name-First: V.
Author-X-Name-Last: Varalakshmi
Author-Name: R. Geetha
Author-X-Name-First: R.
Author-X-Name-Last: Geetha
Title: Estimation under systematic sampling schemes for parabolic populations
Abstract:
In this paper, three sampling-estimating strategies involving linear,
balanced and modified systematic sampling are considered for the
estimation of a finite population total in the presence of parabolic
trend. Using appropriate super-population models, their performances are
evaluated. For super-population models with constant variance, the Yates
corrected estimator under linear systematic sampling is shown to perform
well. Choices of variance functions under which modified and balanced
systematic sampling perform well are also identified based on extensive
numerical studies.
Journal: Journal of Applied Statistics
Pages: 1281-1292
Issue: 11
Volume: 36
Year: 2009
Keywords: finite population, systematic sampling, super-population models, average variances,
X-DOI: 10.1080/02664760802578338
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802578338
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1281-1292
Template-Type: ReDIF-Article 1.0
Author-Name: S. Mukhopadhyay
Author-X-Name-First: S.
Author-X-Name-Last: Mukhopadhyay
Author-Name: S. W. Looney
Author-X-Name-First: S. W.
Author-X-Name-Last: Looney
Title: Quantile dispersion graphs to compare the efficiencies of cluster randomized designs
Abstract:
The purpose of this article is to compare efficiencies of several cluster
randomized designs using the method of quantile dispersion graphs (QDGs).
A cluster randomized design is considered whenever subjects are randomized
at a group level but analyzed at the individual level. Prior knowledge
of the correlation between subjects within the same cluster is
necessary to design these cluster randomized trials. Using the QDG
approach, we are able to compare several cluster randomized designs
without requiring any information on the intracluster correlation. For a
given design, several quantiles of the power function, which are directly
related to the effect size, are obtained for several effect sizes. The
quantiles depend on the intracluster correlation present in the model. The
dispersion of these quantiles over the space of the unknown intracluster
correlation is determined, and then depicted by the QDGs. Two applications
of the proposed methodology are presented.
Journal: Journal of Applied Statistics
Pages: 1293-1305
Issue: 11
Volume: 36
Year: 2009
Keywords: quantile dispersion graphs, power function, intracluster correlation, effect size, noncentrality parameter,
X-DOI: 10.1080/02664760902914508
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914508
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1293-1305
Template-Type: ReDIF-Article 1.0
Author-Name: Jonathan Bradley
Author-X-Name-First: Jonathan
Author-X-Name-Last: Bradley
Author-Name: David Farnsworth
Author-X-Name-First: David
Author-X-Name-Last: Farnsworth
Title: Testing for Mutual Exclusivity
Abstract:
A test for two events being mutually exclusive is presented for the case
in which there are known rates of misclassification of the events. The
test can be utilized in other situations, such as to test whether a set is
a subset of another set. In the test, the null value of the probability of
the intersection is replaced by the expected value of the number
determined to be in the intersection by the imperfect diagnostic tools.
The test statistic is the number in a sample that is judged to be in the
intersection. Medical testing applications are emphasized.
Journal: Journal of Applied Statistics
Pages: 1307-1314
Issue: 11
Volume: 36
Year: 2009
Keywords: intersection test, misclassification rate, misdiagnosis rate, mutual exclusivity test, power, subset test,
X-DOI: 10.1080/02664760802582306
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802582306
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:11:p:1307-1314
Template-Type: ReDIF-Article 1.0
Author-Name: Luigi D'Ambra
Author-X-Name-First: Luigi
Author-X-Name-Last: D'Ambra
Author-Name: Onur Koksoy
Author-X-Name-First: Onur
Author-X-Name-Last: Koksoy
Author-Name: Biagio Simonetti
Author-X-Name-First: Biagio
Author-X-Name-Last: Simonetti
Title: Cumulative correspondence analysis of ordered categorical data from industrial experiments
Abstract:
Most studies of quality improvement deal with ordered categorical data
from industrial experiments. Accounting for the ordering of such data
plays an important role in effectively determining the optimal factor
level combination. This paper utilizes correspondence analysis to
develop a procedure to improve the ordered categorical response in a
multifactor state system based on Taguchi's statistic. Users may find the
proposed procedure in this paper to be attractive because we suggest a
simple and also popular statistical tool for graphically identifying the
really important factors and determining the levels to improve process
quality. A case study for optimizing the polysilicon deposition process in
a very large-scale integrated circuit is provided to demonstrate the
effectiveness of the proposed procedure.
Journal: Journal of Applied Statistics
Pages: 1315-1328
Issue: 12
Volume: 36
Year: 2009
Keywords: ordered categories, correspondence analysis, quality engineering, experimental design, Taguchi's statistic,
X-DOI: 10.1080/02664760802638090
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802638090
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1315-1328
Template-Type: ReDIF-Article 1.0
Author-Name: Rosa Arboretti Giancristofaro
Author-X-Name-First: Rosa
Author-X-Name-Last: Arboretti Giancristofaro
Author-Name: Stefano Bonnini
Author-X-Name-First: Stefano
Author-X-Name-Last: Bonnini
Author-Name: Luigi Salmaso
Author-X-Name-First: Luigi
Author-X-Name-Last: Salmaso
Title: Employment status and education/employment relationship of PhD graduates from the University of Ferrara
Abstract:
Two sample surveys of Post-Docs were planned and carried out at the
University of Ferrara in 2004 and 2007, aimed at determining the
professional status of Post-Docs, the relationship between their PhD
education and employment, and their satisfaction with certain aspects of
the education and research program. As part of these surveys, two
methodological contributions were developed. The first concerns an
extension of the non-parametric combination of dependent rankings to
construct a synthesis of composite indicators measuring satisfaction with
particular aspects of PhD programs [R. Arboretti Giancristofaro and L.
Salmaso, Global ranking indicators with application to the evaluation of
PhD programs, Atti del Convegno “Valutazione e Customer
Satisfaction per la Qualita dei Servizi”, Roma, 8-9 Settembre 2005,
pp. 19-22; R. Arboretti Giancristofaro, S. Bonnini, and L. Salmaso, A
performance indicator for multivariate data, Quad. Stat. 9 (2007), pp.
1-29; R. Arboretti Giancristofaro, F. Pesarin, and L. Salmaso,
Nonparametric approaches for multivariate testing with mixed variables and
for ranking on ordered categorical variables with an application to the
evaluation of PhD programs, in Real Data Analysis, S. Sawilowsky, ed., a
volume in Quantitative Methods in Education and the Behavioral Sciences:
Issues, Research and Teaching, Ronald C. Serlin, series ed., Information
Age Publishing, Charlotte, North Carolina, 2007, pp. 355-385]. The
procedure was applied to highlight differences in the interviewed
Post-Docs' multivariate satisfaction profiles in relation to two aspects:
the education/employment relationship, and employment expectations and
opportunities. The second consists of an inferential procedure providing a
solution to the problem of hypothesis testing, where the objective is to
compare the heterogeneity of two populations on the basis of sampling data
[G.R. Arboretti, S. Bonnini, and F. Pesarin, A permutation approach for
testing heterogeneity in two-sample categorical variables, Stat. Comput.
(2009) doi: 10.1007/S11222-008-9085-8.]. The procedure was applied to
compare the degrees of heterogeneity of Post-Doc judgments in the two
surveys with regard to the adequacy of the PhD education for the work
carried out.
Journal: Journal of Applied Statistics
Pages: 1329-1344
Issue: 12
Volume: 36
Year: 2009
Keywords: employment survey, performance indicators, heterogeneity tests,
X-DOI: 10.1080/02664760802638108
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802638108
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1329-1344
Template-Type: ReDIF-Article 1.0
Author-Name: S. B. Caudill
Author-X-Name-First: S. B.
Author-X-Name-Last: Caudill
Author-Name: F. G. Mixon
Author-X-Name-First: F. G.
Author-X-Name-Last: Mixon
Title: More on testing the normality assumption in the Tobit Model
Abstract:
In a recent volume of this journal, Holden [Testing the normality
assumption in the Tobit Model, J. Appl. Stat. 31 (2004) pp. 521-532]
presents Monte Carlo evidence comparing several tests for departures from
normality in the Tobit Model. This study adds to the work of Holden by
considering another test, and several information criteria, for detecting
departures from normality in the Tobit Model. The test given here is a
modified likelihood ratio statistic based on a partially adaptive
estimator of the Censored Regression Model using the approach of Caudill
[A partially adaptive estimator for the Censored Regression Model based on
a mixture of normal distributions, Working Paper, Department of Economics,
Auburn University, 2007]. The information criteria examined include
Akaike's Information Criterion (AIC), the Consistent AIC (CAIC), the
Bayesian Information Criterion (BIC), and Akaike's BIC (ABIC). In
terms of fewest 'rejections' of a true null, the best performance is
exhibited by the CAIC and the BIC, although, like some of the statistics
examined by Holden, there are computational difficulties with each.
Journal: Journal of Applied Statistics
Pages: 1345-1352
Issue: 12
Volume: 36
Year: 2009
Keywords: Censored Regression Model, departures from normality,
X-DOI: 10.1080/02664760802653578
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802653578
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1345-1352
Template-Type: ReDIF-Article 1.0
Author-Name: Y. Zhao
Author-X-Name-First: Y.
Author-X-Name-Last: Zhao
Author-Name: A. H. Lee
Author-X-Name-First: A. H.
Author-X-Name-Last: Lee
Author-Name: V. Burke
Author-X-Name-First: V.
Author-X-Name-Last: Burke
Author-Name: K. K. W. Yau
Author-X-Name-First: K. K. W.
Author-X-Name-Last: Yau
Title: Testing for zero-inflation in count series: application to occupational health
Abstract:
Count data series with extra zeros relative to a Poisson distribution are
common in many biomedical applications. A score test is presented to
assess whether the zero-inflation problem is significant enough to warrant
analysis by the more complex zero-inflated Poisson autoregression model.
The score test is implemented as a computer program on the S-Plus platform.
For illustration, the test procedure is applied to a workplace injury
series where many zero counts are observed due to the heterogeneity in
injury risk and the dynamic population involved.
Journal: Journal of Applied Statistics
Pages: 1353-1359
Issue: 12
Volume: 36
Year: 2009
Keywords: occupational health, random effects, score test, workplace injuries, zero-inflation,
X-DOI: 10.1080/02664760802653586
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802653586
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1353-1359
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher Gjestvang
Author-X-Name-First: Christopher
Author-X-Name-Last: Gjestvang
Author-Name: Sarjinder Singh
Author-X-Name-First: Sarjinder
Author-X-Name-Last: Singh
Title: An improved randomized response model: estimation of mean
Abstract:
In this paper, we suggest a new randomized response model useful for
collecting information on quantitative sensitive variables such as drug
use and income. The resulting estimator is found to perform better than
the estimator under the usual additive randomized response model. An interesting feature of
the proposed model is that it is free from the known parameters of the
scrambling variable unlike the additive model due to Himmelfarb and Edgell
[S. Himmelfarb and S.E. Edgell, Additive constant model: a randomized
response technique for eliminating evasiveness to quantitative response
questions, Psychol. Bull. 87(1980), 525-530]. The relative efficiency of
the proposed model with respect to its competitors has also been studied.
Finally, an application of the proposed model is discussed.
Journal: Journal of Applied Statistics
Pages: 1361-1367
Issue: 12
Volume: 36
Year: 2009
Keywords: sensitive variable, estimation of mean, randomized response model, scrambling variables,
X-DOI: 10.1080/02664760802684151
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802684151
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1361-1367
Template-Type: ReDIF-Article 1.0
Author-Name: J. Fredrik Lindstrom
Author-X-Name-First: J. Fredrik
Author-X-Name-Last: Lindstrom
Author-Name: H. E. T. Holgersson
Author-X-Name-First: H. E. T.
Author-X-Name-Last: Holgersson
Title: Forecast mean squared error reduction in the VAR(1) process
Abstract:
When VAR models are used to predict future outcomes, the forecast error
can be substantial. Through imposition of restrictions on the off-diagonal
elements of the parameter matrix, however, the information in the process
may be condensed to the marginal processes. In particular, if the
cross-autocorrelations in the system are small and only a small sample is
available, then such a restriction may reduce the forecast mean squared
error considerably. In this paper, we propose three different techniques
to decide whether to use the restricted or unrestricted model, i.e. the
full VAR(1) model or only marginal AR(1) models. In a Monte Carlo
simulation study, all three proposed tests have been found to behave quite
differently depending on the parameter setting. One of the proposed tests
stands out, however, as the preferred one and is shown to outperform other
estimators for a wide range of parameter settings.
Journal: Journal of Applied Statistics
Pages: 1369-1384
Issue: 12
Volume: 36
Year: 2009
Keywords: VAR models, prediction error, pre-test, linear hypothesis, selection criteria,
X-DOI: 10.1080/02664760802715898
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802715898
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1369-1384
Template-Type: ReDIF-Article 1.0
Author-Name: J. L. Alfaro
Author-X-Name-First: J. L.
Author-X-Name-Last: Alfaro
Author-Name: J. Fco. Ortega
Author-X-Name-First: J. Fco.
Author-X-Name-Last: Ortega
Title: A comparison of robust alternatives to Hotelling's T2 control chart
Abstract:
Control charts are one of the widest used techniques in statistical
process control. In Phase I, historical observations are analysed in order
to construct a control chart. Because of the existence of multiple
outliers that are undetected by control charts such as Hotelling's T2 due
to the masking effect, robust alternatives to Hotelling's T2 have been
developed based on minimum volume ellipsoid (MVE) estimators, minimum
covariance determinant (MCD) estimators, reweighted MCD estimators or
trimmed estimators. In this paper, we use a simulation study to analyse
the performance of each alternative in various situations and offer
guidance for the correct use of each estimator.
Journal: Journal of Applied Statistics
Pages: 1385-1396
Issue: 12
Volume: 36
Year: 2009
Keywords: statistical quality control, multivariate control chart, outliers, masking effect, robust estimators,
X-DOI: 10.1080/02664760902810813
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902810813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:36:y:2009:i:12:p:1385-1396
Template-Type: ReDIF-Article 1.0
Author-Name: Kahadawala Cooray
Author-X-Name-First: Kahadawala
Author-X-Name-Last: Cooray
Author-Name: Malwane Ananda
Author-X-Name-First: Malwane
Author-X-Name-Last: Ananda
Title: Analyzing survival data with highly negatively skewed distribution: The Gompertz-sinh family
Abstract:
In this article, we explore a new two-parameter family of distributions,
derived by suitably replacing the exponential term in the Gompertz
distribution with a hyperbolic sine term. The resulting family, referred
to as the Gompertz-sinh distribution, possesses a thicker and longer lower
tail than the Gompertz family, which is often used to model highly
negatively skewed data. Moreover, we introduce a useful generalization of
this model by adding a second shape parameter to accommodate a variety of
density shapes as well as nondecreasing hazard shapes. The flexibility and
superior fit of the new family, as well as its generalization, are
demonstrated with well-known examples involving complete, grouped, and
censored data.
Journal: Journal of Applied Statistics
Pages: 1-11
Issue: 1
Volume: 37
Year: 2010
Keywords: goodness-of-fit, gompertz distribution, maximum likelihood,
X-DOI: 10.1080/02664760802663072
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802663072
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:1-11
Template-Type: ReDIF-Article 1.0
Author-Name: A. Karagrigoriou
Author-X-Name-First: A.
Author-X-Name-Last: Karagrigoriou
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Author-Name: K. Mylona
Author-X-Name-First: K.
Author-X-Name-Last: Mylona
Title: On the advantages of the non-concave penalized likelihood model selection method with minimum prediction errors in large-scale medical studies
Abstract:
Variable and model selection problems are fundamental to high-dimensional
statistical modeling in diverse fields of science. Especially in health
studies, many potential factors are usually introduced to determine an
outcome variable. This paper deals with the problem of high-dimensional
statistical modeling through the analysis of the trauma annual data in
Greece for 2005. The data set is divided into the experiment and control
sets and consists of 6334 observations and 112 factors that include
demographic, transport and intrahospital data used to detect possible risk
factors of death. In our study, different model selection techniques are
applied to the experiment set and the notion of deviance is used on the
control set to assess the fit of the overall selected model. The
statistical methods employed in this work were the non-concave penalized
likelihood methods (smoothly clipped absolute deviation, least absolute
shrinkage and selection operator, and Hard), generalized linear logistic
regression, and best subset variable selection. The way of
identifying the significant variables in large medical data sets along
with the performance and the pros and cons of the various statistical
techniques used are discussed. The performed analysis reveals the distinct
advantages of the non-concave penalized likelihood methods over the
traditional model selection techniques.
Journal: Journal of Applied Statistics
Pages: 13-24
Issue: 1
Volume: 37
Year: 2010
Keywords: model selection, generalized linear model, non-concave penalized likelihood, high-dimensional data set, deviance, trauma,
X-DOI: 10.1080/02664760802638116
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802638116
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:13-24
Template-Type: ReDIF-Article 1.0
Author-Name: M. Saleem
Author-X-Name-First: M.
Author-X-Name-Last: Saleem
Author-Name: M. Aslam
Author-X-Name-First: M.
Author-X-Name-Last: Aslam
Author-Name: P. Economou
Author-X-Name-First: P.
Author-X-Name-Last: Economou
Title: On the Bayesian analysis of the mixture of power function distribution using the complete and the censored sample
Abstract:
The power function distribution is often used to study the electrical
component reliability. In this paper, we model a heterogeneous population
using the two-component mixture of the power function distribution. A
comprehensive simulation scheme including a large number of parameter
points is followed to highlight the properties and behavior of the
estimates in terms of sample size, censoring rate, parameter size and the
proportion of the components of the mixture. The parameters of the power
function mixture are estimated and compared using the Bayes estimates. A
simulated mixture data set with censored observations is generated by
probabilistic mixing for computational purposes. Elegant closed-form
expressions for the Bayes estimators and their variances are derived for
the censored sample as well as for the complete sample. Some interesting
comparison and properties of the estimates are observed and presented. The
system of three non-linear equations, required to be solved iteratively
for the computations of maximum likelihood (ML) estimates, is derived. The
complete sample expressions for the ML estimates and for their variances
are also given. The components of the information matrix are constructed
as well. Uninformative as well as informative priors are assumed for the
derivation of the Bayes estimators. A real-life mixture data example has
also been discussed. The posterior predictive distribution with the
informative Gamma prior is derived, and the equations required to find the
lower and upper limits of the predictive intervals are constructed. The
Bayes estimates are evaluated under the squared error loss function.
Journal: Journal of Applied Statistics
Pages: 25-40
Issue: 1
Volume: 37
Year: 2010
Keywords: information matrix, censored sampling, inverse transform method, squared error loss function, predictive distribution,
X-DOI: 10.1080/02664760902914557
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914557
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:25-40
Template-Type: ReDIF-Article 1.0
Author-Name: Athanasios Micheas
Author-X-Name-First: Athanasios
Author-X-Name-Last: Micheas
Author-Name: Yuqiang Peng
Author-X-Name-First: Yuqiang
Author-X-Name-Last: Peng
Title: Bayesian Procrustes analysis with applications to hydrology
Abstract:
In this paper, we introduce Procrustes analysis in a Bayesian framework,
by treating the classic Procrustes regression equation from a Bayesian
perspective, while modeling shapes in two dimensions. The Bayesian
approach allows us to compute point estimates and credible sets for the
full Procrustes fit parameters. The methods are illustrated through an
application to radar data from short-term weather forecasts (nowcasts), a
very important problem in hydrology and meteorology.
Journal: Journal of Applied Statistics
Pages: 41-55
Issue: 1
Volume: 37
Year: 2010
Keywords: Bayesian computation and estimation, complex normal distribution, full Procrustes fit, Procrustes analysis, shape analysis,
X-DOI: 10.1080/02664760802653560
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802653560
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:41-55
Template-Type: ReDIF-Article 1.0
Author-Name: Andrea Beccarini
Author-X-Name-First: Andrea
Author-X-Name-Last: Beccarini
Title: Eliminating the omitted variable bias by a regime-switching approach
Abstract:
This work presents a procedure that aims to eliminate or reduce the bias
caused by omitted variables by means of so-called regime-switching
regressions. Estimation bias arises whenever the statistical (linear)
model is under-specified, that is, when some variables are omitted
and correlated with the regressors. This work shows how an
appropriate specification of a regime-switching model (independent or
Markov-switching) can eliminate or reduce this correlation, and hence the
estimation bias. A demonstration is given, together with some Monte Carlo
simulations. An empirical verification, based on Fisher's equation, is
also provided.
Journal: Journal of Applied Statistics
Pages: 57-75
Issue: 1
Volume: 37
Year: 2010
Keywords: omitted variable bias, regime-switching model, EM algorithm, Monte Carlo simulations, Fisher's equation,
X-DOI: 10.1080/02664760902914474
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914474
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:57-75
Template-Type: ReDIF-Article 1.0
Author-Name: A. Parchami
Author-X-Name-First: A.
Author-X-Name-Last: Parchami
Author-Name: M. Mashinchi
Author-X-Name-First: M.
Author-X-Name-Last: Mashinchi
Title: A new generation of process capability indices
Abstract:
In quality control, we may confront imprecise concepts. One case is a
situation in which upper and lower specification limits (SLs) are
imprecise. If we introduce vagueness into SLs, we face quite new,
reasonable and interesting processes, and the ordinary capability indices
are not appropriate for measuring the capability of these processes. In
this paper, similar to the traditional process capability indices (PCIs),
we develop a fuzzy analogue by a distance defined on a fuzzy limit space
and introduce PCIs, where instead of precise SLs we have two membership
functions for upper and lower SLs. These indices are necessary when SLs
are fuzzy, and they are helpful for comparing manufacturing processes with
fuzzy SLs. Some interesting relations among the introduced indices are
proved. Numerical examples are given to clarify the method.
Journal: Journal of Applied Statistics
Pages: 77-89
Issue: 1
Volume: 37
Year: 2010
Keywords: fuzzy quality, specification limits, process capability index, fuzzy statistics,
X-DOI: 10.1080/02664760802695785
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802695785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:77-89
Template-Type: ReDIF-Article 1.0
Author-Name: Biagio Simonetti
Author-X-Name-First: Biagio
Author-X-Name-Last: Simonetti
Author-Name: Eric Beh
Author-X-Name-First: Eric
Author-X-Name-Last: Beh
Author-Name: Luigi D'Ambra
Author-X-Name-First: Luigi
Author-X-Name-Last: D'Ambra
Title: The analysis of dependence for three-way contingency tables with ordinal variables: A case study of patient satisfaction data
Abstract:
For many questionnaires and surveys in the marketing, business, and
health disciplines, items often involve ordinal scales (such as the Likert
scale and rating scale) that are associated in sometimes complex ways.
Techniques such as classical correspondence analysis provide a simple
graphical means of describing the nature of the association. However, the
procedure does not allow the researcher to specify how one item may be
associated with another, nor does the analysis allow for the ordinal
structure of the scales to be reflected. This article presents a graphical
approach that can help the researcher to study in depth the complex
association of the items and reflect their ordinal structure. We will
demonstrate the applicability of this approach using data collected from a
study that involves identifying major factors that influence the level of
patient satisfaction in a Neapolitan hospital.
Journal: Journal of Applied Statistics
Pages: 91-103
Issue: 1
Volume: 37
Year: 2010
Keywords: correspondence analysis, orthogonal polynomials, patient satisfaction evaluation,
X-DOI: 10.1080/02664760802653552
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802653552
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:91-103
Template-Type: ReDIF-Article 1.0
Author-Name: L.H.A. Dal Bello
Author-X-Name-First: L.H.A.
Author-X-Name-Last: Dal Bello
Author-Name: A. F. C. Vieira
Author-X-Name-First: A. F. C.
Author-X-Name-Last: Vieira
Title: Optimization of a product performance using mixture experiments
Abstract:
This article presents a case study of a chemical compound acting as a
subsystem of a delay mechanism for starting a rocket engine. The objective
of this study was to investigate the proportions of mix components that
enable a previously specified burning time. Thus, a linear regression
model with normal responses was fitted, but later considered inadequate,
as there was evidence that the response variance was not constant. Models
fitted by the quasi-likelihood method were then tried. Through the
developed model, it was possible to determine the proportion of each
component to accomplish the process optimization. For the process
optimization, besides considering a specific burning time, it was possible
to consider the variance minimization for this time prediction
as well.
Journal: Journal of Applied Statistics
Pages: 105-117
Issue: 1
Volume: 37
Year: 2010
Keywords: mixture experiments, optimization, quality, quasi-likelihood, regression,
X-DOI: 10.1080/02664760802647976
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802647976
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:105-117
Template-Type: ReDIF-Article 1.0
Author-Name: Marcus Perry
Author-X-Name-First: Marcus
Author-X-Name-Last: Perry
Author-Name: Joseph Pignatiello
Author-X-Name-First: Joseph
Author-X-Name-Last: Pignatiello
Title: Identifying the time of step change in the mean of autocorrelated processes
Abstract:
Control charts are used to detect changes in a process. Once a change is
detected, knowledge of the change point would simplify the search for and
identification of the special cause. Consequently, having an estimate of
the process change point following a control chart signal would be useful
to process analysts. Change-point methods for the uncorrelated process
have been studied extensively in the literature; however, less attention
has been given to change-point methods for autocorrelated processes.
Autocorrelation is common in practice and is often modeled via the class
of autoregressive moving average (ARMA) models. In this article, a maximum
likelihood estimator for the time of step change in the mean of
covariance-stationary processes that fall within the general ARMA
framework is developed. The estimator is intended to be used as an
“add-on” following a signal from a phase II control chart.
Considering first-order pure and mixed ARMA processes, Monte Carlo
simulation is used to evaluate the performance of the proposed
change-point estimator across a range of step change magnitudes following
a genuine signal from a control chart. Results indicate that the estimator
provides process analysts with an accurate and useful estimate of the last
sample obtained from the unchanged process. Additionally, results indicate
that if a change-point estimator designed for the uncorrelated process is
applied to an autocorrelated process, the performance of the estimator can
suffer dramatically.
Journal: Journal of Applied Statistics
Pages: 119-136
Issue: 1
Volume: 37
Year: 2010
Keywords: ARMA(p, q) models, autocorrelated processes, change-point estimation, stationary processes, statistical process control (SPC),
X-DOI: 10.1080/02664760802663080
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802663080
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:119-136
Template-Type: ReDIF-Article 1.0
Author-Name: R. Fuentes-Garcia
Author-X-Name-First: R.
Author-X-Name-Last: Fuentes-Garcia
Author-Name: S. G. Walker
Author-X-Name-First: S. G.
Author-X-Name-Last: Walker
Title: A new approach to classification
Abstract:
Clustering is a common and important issue, and finite mixture models
based on the normal distribution are frequently used to address the
problem. In this article, we consider a classification model and build a
mixture model around it. A good assessment of the allocation of
observations and number of clusters is easily obtained from this approach.
Journal: Journal of Applied Statistics
Pages: 137-146
Issue: 1
Volume: 37
Year: 2010
Keywords: clustering, classification, latent variable, normal mixture-model, random histogram,
X-DOI: 10.1080/02664760802698987
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802698987
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:137-146
Template-Type: ReDIF-Article 1.0
Author-Name: Michele Scagliarini
Author-X-Name-First: Michele
Author-X-Name-Last: Scagliarini
Title: Inference on Cpk for autocorrelated data in the presence of random measurement errors
Abstract:
The present paper examines the properties of the Cpk estimator when
observations are autocorrelated and affected by measurement errors. The
underlying reason for this choice of subject matter is that in industrial
applications, process data are often autocorrelated, especially when
sampling frequency is not particularly low, and even with the most
advanced measuring instruments, gauge imprecision needs to be taken into
consideration. In the case of a first-order stationary autoregressive
process, we compare the statistical properties of the estimator in the
error case with those of the estimator in the error-free case. Results
indicate that the presence of gauge measurement errors leads the estimator
to behave differently depending on the extent of error variability.
Journal: Journal of Applied Statistics
Pages: 147-158
Issue: 1
Volume: 37
Year: 2010
Keywords: process capability indices, gauge measurement errors, autocorrelation, estimator, process specifications,
X-DOI: 10.1080/02664760902914482
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914482
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:147-158
Template-Type: ReDIF-Article 1.0
Author-Name: K. Ben-Ahmed
Author-X-Name-First: K.
Author-X-Name-Last: Ben-Ahmed
Author-Name: A. Bouratbine
Author-X-Name-First: A.
Author-X-Name-Last: Bouratbine
Author-Name: M. -A. El-Aroui
Author-X-Name-First: M. -A.
Author-X-Name-Last: El-Aroui
Title: Generalized linear spatial models in epidemiology: A case study of zoonotic cutaneous leishmaniasis in Tunisia
Abstract:
Generalized linear spatial models (GLSM) are used here to study spatial
characteristics of zoonotic cutaneous leishmaniasis (ZCL) in Tunisia. The
response variable is the number of affected individuals per district
during the period 2001-2002. The model covariates are climate (temperature
and rainfall), humidity and surrounding vegetation status. As the
environmental and weather data are not available for all the studied
districts, Kriging based on linear interpolation was used to estimate the
missing data. To account for unexplained spatial variation in the model,
we include a stationary Gaussian process S with a powered exponential
spatial correlation function. Moran coefficient, DIC criterion and
residuals variograms are used to show the high goodness-of-fit of the
GLSM. When compared with the statistical tools used in the previous ZCL
studies, the optimal GLSM found here yields a better assessment of the
impact of the risk factors, a better prediction of ZCL evolution and a
better comprehension of the disease transmission. The statistical results
show a progressive increase in the number of affected individuals in zones
with high temperature, low rainfall and a high surrounding vegetation index. Relative
humidity does not seem to affect the distribution of the disease in
Tunisia. The results of the statistical analyses stress the important risk
of misleading epidemiological conclusions when non-spatial models are used
to analyse spatially structured data.
Journal: Journal of Applied Statistics
Pages: 159-170
Issue: 1
Volume: 37
Year: 2010
Keywords: generalized linear spatial model, Leishmania major, spatial variation, Tunisia, zoonotic cutaneous leishmaniasis,
X-DOI: 10.1080/02664760802684169
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802684169
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:159-170
Template-Type: ReDIF-Article 1.0
Author-Name: Kahadawala Cooray
Author-X-Name-First: Kahadawala
Author-X-Name-Last: Cooray
Title: Generalized Gumbel distribution
Abstract:
A generalization of the Gumbel distribution is presented to deal with
general situations in modeling univariate data with a broad range of
skewness in the density function. This generalization is derived by
considering a logarithmic transformation of an odd Weibull random
variable. As a result, the generalized Gumbel distribution is not only
useful for testing the goodness-of-fit of the Gumbel and reverse-Gumbel
distributions as submodels, but is also convenient for modeling and
fitting a wide variety of data sets that cannot be modeled by
well-known distributions. Skewness and kurtosis shapes of the generalized
Gumbel distribution are illustrated by constructing the Galton's skewness
and Moors' kurtosis plane. Parameters are estimated using the maximum
likelihood method in two different ways, because the reverse
transformation of the proposed distribution does not change its density
function. To illustrate the flexibility of this generalization, a
wave and surge height data set is analyzed, and the fit is compared
with the Gumbel and generalized extreme value distributions.
Journal: Journal of Applied Statistics
Pages: 171-179
Issue: 1
Volume: 37
Year: 2010
Keywords: coverage probabilities, generalized extreme value distribution, Gumbel distribution, odd Weibull distribution, skewness and kurtosis,
X-DOI: 10.1080/02664760802698995
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802698995
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:1:p:171-179
Template-Type: ReDIF-Article 1.0
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Title: A Bayesian approach in differential equation dynamic models incorporating clinical factors and covariates
Abstract:
A virologic marker, the number of HIV RNA copies or viral load, is
currently used to evaluate antiretroviral (ARV) therapies in AIDS clinical
trials. This marker can be used to assess the antiviral potency of
therapies, but may be easily affected by clinical factors such as drug
exposures and drug resistance as well as baseline characteristics during
the long-term treatment evaluation process. HIV dynamic studies have
significantly contributed to the understanding of HIV pathogenesis and ARV
treatment strategies. Viral dynamic models can be formulated through
differential equations, but there has been only limited development of
statistical methodologies for estimating such models or assessing their
agreement with observed data. This paper develops mechanism-based
nonlinear differential equation models for characterizing long-term viral
dynamics with ARV therapy. In this model, we incorporate not only clinical
factors (drug exposures and susceptibility) but also baseline covariates
(baseline viral load, CD4 count, weight, or age) into a function of
treatment efficacy. A Bayesian nonlinear mixed-effects modeling approach
is investigated with application to an AIDS clinical trial study. The
effects of the confounding interaction of clinical factors with
covariate-based models are compared using the deviance information
criterion (DIC), a Bayesian version of the classical deviance for model
assessment designed for complex hierarchical model settings.
Relationships between baseline covariates combined with confounding
clinical factors and drug efficacy are explored. In addition, we compare
models incorporating each of the four baseline covariates through the DIC
and present some interesting findings. Our results suggest that modeling
HIV dynamics and virologic responses with consideration of time-varying
clinical factors as well as baseline characteristics may play an important
role in understanding HIV pathogenesis and in designing new treatment
strategies for the long-term care of AIDS patients.
Journal: Journal of Applied Statistics
Pages: 181-199
Issue: 2
Volume: 37
Year: 2010
Keywords: AIDS, baseline characteristics, Bayesian nonlinear mixed-effects models, long-term HIV dynamics, longitudinal data, time-varying drug efficacy,
X-DOI: 10.1080/02664760802578320
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802578320
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:181-199
Template-Type: ReDIF-Article 1.0
Author-Name: Chin Wen Cheong
Author-X-Name-First: Chin Wen
Author-X-Name-Last: Cheong
Title: Estimating the Hurst parameter in financial time series via heuristic approaches
Abstract:
This research investigates long memory in financial equity markets using
three heuristic methodologies, namely a proposed modified variance
time-aggregated plot, a modified rescaled-range plot and a periodogram
approach. The intensity of the long memory process is quantified in
terms of the Hurst parameter (H). Five Malaysian equity market indices are
selected for the empirical studies, covering periods before and after
drastic economic events. Our empirical results evidence dissimilar
long memory behaviours in the different regimes of significant economic
events. It is also found that, after the short-memory adjustment, all the
equity markets exhibited substantial reductions in their long memory
estimates.
Journal: Journal of Applied Statistics
Pages: 201-214
Issue: 2
Volume: 37
Year: 2010
Keywords: heuristic methodology, long-range dependence, Hurst parameter, quantile regression,
X-DOI: 10.1080/02664760802582280
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802582280
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:201-214
Template-Type: ReDIF-Article 1.0
Author-Name: Chee Kian Leong
Author-X-Name-First: Chee Kian
Author-X-Name-Last: Leong
Author-Name: Weihong Huang
Author-X-Name-First: Weihong
Author-X-Name-Last: Huang
Title: Testing for spurious and cointegrated regressions: A wavelet approach
Abstract:
This paper proposes a wavelet-based approach to analyze spurious and
cointegrated regressions in time series. The approach is based on the
properties of the wavelet covariance and correlation in Monte Carlo
studies of spurious and cointegrated regression. In the case of the
spurious regression, the null hypotheses of zero wavelet covariance and
correlation for these series across the scales fail to be rejected.
Conversely, these null hypotheses across the scales are rejected for the
cointegrated bivariate time series. These nonresidual-based tests are then
applied to analyze whether any relationship exists between the extraterrestrial
phenomenon of sunspots and the earthly economic time series of oil prices.
Conventional residual-based tests appear sensitive to the specification in
both the cointegrating regression and the lag order in the augmented
Dickey-Fuller tests on the residuals. In contrast, the wavelet tests, with
their bootstrap t-statistics and confidence intervals, detect the
spuriousness of this relationship.
Journal: Journal of Applied Statistics
Pages: 215-233
Issue: 2
Volume: 37
Year: 2010
Keywords: spurious regression, cointegration, wavelet covariance and correlation, Monte Carlo simulations, bootstrap,
X-DOI: 10.1080/02664760802638082
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802638082
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:215-233
Template-Type: ReDIF-Article 1.0
Author-Name: Nizar Bouguila
Author-X-Name-First: Nizar
Author-X-Name-Last: Bouguila
Author-Name: Jian Han Wang
Author-X-Name-First: Jian Han
Author-X-Name-Last: Wang
Author-Name: A. Ben Hamza
Author-X-Name-First: A. Ben
Author-X-Name-Last: Hamza
Title: Software modules categorization through likelihood and Bayesian analysis of finite Dirichlet mixtures
Abstract:
In this paper, we examine deterministic and Bayesian methods for
analyzing finite Dirichlet mixtures. The deterministic method is based on
the likelihood approach, and the Bayesian approach is implemented using
the Gibbs sampler. The selection of the number of clusters for both
approaches is based on the Bayesian information criterion, which is
equivalent to the minimum description length. Experimental results are
presented using simulated data, and a real application for software
modules classification is also included.
Journal: Journal of Applied Statistics
Pages: 235-252
Issue: 2
Volume: 37
Year: 2010
Keywords: Dirichlet distribution, mixture modeling, maximum likelihood, EM, MDL, BIC, Bayesian analysis, Gibbs sampling, Metropolis-Hastings, software modules,
X-DOI: 10.1080/02664760802684185
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802684185
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:235-252
Template-Type: ReDIF-Article 1.0
Author-Name: Gianluca Baio
Author-X-Name-First: Gianluca
Author-X-Name-Last: Baio
Author-Name: Marta Blangiardo
Author-X-Name-First: Marta
Author-X-Name-Last: Blangiardo
Title: Bayesian hierarchical model for the prediction of football results
Abstract:
The problem of modelling football data has become increasingly popular in
the last few years and many different models have been proposed with the
aim of estimating the characteristics that bring a team to lose or win a
game, or to predict the score of a particular match. We propose a Bayesian
hierarchical model to fulfil both these aims and test its predictive
strength based on data about the Italian Serie A 1991-1992 championship.
To overcome the issue of overshrinkage produced by the Bayesian
hierarchical model, we specify a more complex mixture model that results
in a better fit to the observed data. We test its performance using an
example of the Italian Serie A 2007-2008 championship.
Journal: Journal of Applied Statistics
Pages: 253-264
Issue: 2
Volume: 37
Year: 2010
Keywords: Bayesian hierarchical models, overshrinkage, football data, bivariate Poisson distribution, Poisson-log normal model,
X-DOI: 10.1080/02664760802684177
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802684177
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:253-264
Template-Type: ReDIF-Article 1.0
Author-Name: Zongrun Wang
Author-X-Name-First: Zongrun
Author-X-Name-Last: Wang
Author-Name: Weitao Wu
Author-X-Name-First: Weitao
Author-X-Name-Last: Wu
Author-Name: Chao Chen
Author-X-Name-First: Chao
Author-X-Name-Last: Chen
Author-Name: Yanju Zhou
Author-X-Name-First: Yanju
Author-X-Name-Last: Zhou
Title: The exchange rate risk of Chinese yuan: Using VaR and ES based on extreme value theory
Abstract:
This paper applies extreme value theory (EVT) to estimate the tails of
return series of Chinese yuan (CNY) exchange rates. We find that the
Pareto distribution fits the tail data of the return series extremely
well. The empirical results indicate that expected shortfall cannot
improve the tail risk problem of value-at-risk (VaR). The backtesting
evidence indicates that EVT-based VaR values underestimate the risks of
exchange rates such as USD/CNY and HKD/CNY, which may be caused by the
continuous appreciation of the CNY against the USD and HKD. However,
compared with the VaR values calculated by historical simulation and the
variance-covariance method, the VaR values calculated by EVT measure the
risk more accurately for the JPY/CNY and EUR/CNY exchange rates.
Journal: Journal of Applied Statistics
Pages: 265-282
Issue: 2
Volume: 37
Year: 2010
Keywords: expected shortfall, extreme value theory, historical simulation, value-at-risk, variance-covariance,
X-DOI: 10.1080/02664760902846114
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902846114
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:265-282
Template-Type: ReDIF-Article 1.0
Author-Name: Hamdi Akcakoca
Author-X-Name-First: Hamdi
Author-X-Name-Last: Akcakoca
Author-Name: Nurhak Sutcu
Author-X-Name-First: Nurhak
Author-X-Name-Last: Sutcu
Title: Investigation of the diesel consumption for trucks at an overburden stripping area by SPC study
Abstract:
This study examined whether the diesel consumption of trucks at a
stripping area is under control. The factors affecting diesel
consumption were also investigated and some necessary solutions were
presented. Diesel consumption was monitored with the aid of control graphs.
Abnormal situations in diesel consumption were explored by means of
Shewhart control graphs. The factors that are out of control were also
presented in a cause-and-effect diagram, and suggestions for improvement
were proposed. It was determined that the main factor affecting diesel
consumption is the daily number of truck runs. The main factors
affecting the daily run number were also investigated.
Journal: Journal of Applied Statistics
Pages: 283-298
Issue: 2
Volume: 37
Year: 2010
Keywords: statistical process control (SPC), quality control, control graphs, strip mining, diesel consumption,
X-DOI: 10.1080/02664760902914540
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914540
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:283-298
Template-Type: ReDIF-Article 1.0
Author-Name: A. Erhan Mergen
Author-X-Name-First: A. Erhan
Author-X-Name-Last: Mergen
Author-Name: Z. Seyda Deligonul
Author-X-Name-First: Z. Seyda
Author-X-Name-Last: Deligonul
Title: Assessment of acceptance sampling plans using posterior distribution for a dependent process
Abstract:
In this study, the performance of single acceptance sampling plans by
attributes is investigated by using the distribution of the fraction
nonconforming (i.e. the lot quality distribution) for a dependent production
process. It is the aim of this study to demonstrate that, in order to
emphasize consumer risk (i.e. the risk of accepting a bad lot), it is
better to evaluate a sampling plan based upon its performance as assessed
by the posterior distribution of fractions nonconforming in accepted lots.
Similarly, it is the desired posterior distribution that sets the basis
for designing a sampling plan. The prior distribution used in this study
is derived from a Markovian model of dependence.
Journal: Journal of Applied Statistics
Pages: 299-307
Issue: 2
Volume: 37
Year: 2010
Keywords: acceptance sampling, dependent production processes, lot quality distribution, posterior distribution, mean squared nonconformance,
X-DOI: 10.1080/02664760902998451
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902998451
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:299-307
Template-Type: ReDIF-Article 1.0
Author-Name: Walid Gani
Author-X-Name-First: Walid
Author-X-Name-Last: Gani
Author-Name: Hassen Taleb
Author-X-Name-First: Hassen
Author-X-Name-Last: Taleb
Author-Name: Mohamed Limam
Author-X-Name-First: Mohamed
Author-X-Name-Last: Limam
Title: Support vector regression based residual control charts
Abstract:
Control charts for residuals, based on the regression model, require a
robust fitting technique for minimizing the error resulting from the
fitted model. However, in the multivariate case, when the number of
variables is high and data become complex, traditional fitting techniques,
such as ordinary least squares (OLS), lose efficiency. In this paper,
support vector regression (SVR) is used to construct robust control charts
for residuals, called SVR-chart. This choice is based on the fact that the
SVR is designed to minimize the structural error whereas other techniques
minimize the empirical error. An application shows that the SVR method
gives competitive results in comparison with the OLS and partial least
squares methods, in terms of the standard deviation of the prediction
error and the standard error of performance. A sensitivity study is
conducted to evaluate the SVR-chart performance based on the average run
length (ARL), and shows that the SVR-chart has the best ARL behaviour in
comparison with the other residual control charts.
Journal: Journal of Applied Statistics
Pages: 309-324
Issue: 2
Volume: 37
Year: 2010
Keywords: SVR-chart, multivariate regression, SDEP, SEP, ARL,
X-DOI: 10.1080/02664760903002667
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903002667
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:309-324
Template-Type: ReDIF-Article 1.0
Author-Name: Jing Wang
Author-X-Name-First: Jing
Author-X-Name-Last: Wang
Title: Gibbs sampling in DP-based nonlinear mixed effects models
Abstract:
This article uses several approaches to deal with the difficulty involved
in evaluating the intractable integral when using Gibbs sampling to
estimate the nonlinear mixed effects model (NLMM) based on the Dirichlet
process (DP). For illustration, we applied these approaches to real data
and simulations. Comparisons are then made between these methods with
respect to estimation accuracy and computing efficiency.
Journal: Journal of Applied Statistics
Pages: 325-340
Issue: 2
Volume: 37
Year: 2010
Keywords: nonlinear mixed effects model, Dirichlet process, Laplace's approximation, adaptive Gaussian quadrature approximation, No-gaps algorithm, EM algorithm, Monte Carlo approximations, Markov chain,
X-DOI: 10.1080/02664760903117721
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903117721
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:325-340
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Serroyen
Author-X-Name-First: Jan
Author-X-Name-Last: Serroyen
Author-Name: Liesbeth Bruckers
Author-X-Name-First: Liesbeth
Author-X-Name-Last: Bruckers
Author-Name: Geert Rogiers
Author-X-Name-First: Geert
Author-X-Name-Last: Rogiers
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Title: Characterizing persistent disturbing behavior using longitudinal and multivariate techniques
Abstract:
Persistent disturbing behavior (PDB) refers to a chronic condition in
therapy-resistant psychiatric patients. Since these patients are highly
unstable and difficult to maintain in their natural living environment and
even in hospital wards, it is important to properly characterize this
group. Previous studies in the Belgian province of Limburg indicated that
the size of this group was larger than anticipated. Here, using a score
calculated from longitudinal psychiatric registration data in 611
patients, we characterize the difference between PDB patients and a set of
control patients. These differences are studied both at a given point in
time, using discriminant analysis, and in terms of the evolution of
the score over time, using longitudinal data analysis methods. Further,
using clustering techniques, the group of PDB patients is split into two
subgroups, characterized in terms of a number of ordinal scores. Such
findings are useful from a scientific as well as from an organizational
point of view.
Journal: Journal of Applied Statistics
Pages: 341-355
Issue: 2
Volume: 37
Year: 2010
Keywords: cluster analysis, discriminant analysis, longitudinal data, multivariate methods, psychiatry,
X-DOI: 10.1080/02664760802688673
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802688673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:341-355
Template-Type: ReDIF-Article 1.0
Author-Name: Ana Militino
Author-X-Name-First: Ana
Author-X-Name-Last: Militino
Title: Statistics and data with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 357-358
Issue: 2
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902811589
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902811589
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:2:p:357-358
Template-Type: ReDIF-Article 1.0
Author-Name: Jung Hsien Chang
Author-X-Name-First: Jung Hsien
Author-X-Name-Last: Chang
Author-Name: Mao Wei Hung
Author-X-Name-First: Mao Wei
Author-X-Name-Last: Hung
Title: Liquidity spreads in the corporate bond market: Estimation using a semi-parametric model
Abstract:
This study utilizes the liquidity risk associated with Treasury bonds to
directly determine the degree to which liquidity spreads account for
corporate bond spreads. This enhances understanding of their relative
contributions to the yield spreads of corporate bonds. To capture time
variation on instantaneous spreads and volatility and to reduce modeling
bias, semi-parametric techniques are applied to estimate the time-varying
intensity process. Empirical results indicate that our semi-parametric
model is good at capturing the time variation in default and liquidity
intensity processes. The credit spreads are due to default risk and
reflect the relative liquidity of the corporate bond market, indicating
that liquidity risk plays an important role in corporate bond valuation.
Journal: Journal of Applied Statistics
Pages: 359-374
Issue: 3
Volume: 37
Year: 2010
Keywords: liquidity risk, on-the-run, off-the-run, semi-parameter model, reduced-form model,
X-DOI: 10.1080/02664760802688681
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802688681
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:359-374
Template-Type: ReDIF-Article 1.0
Author-Name: D. Senthilkumar
Author-X-Name-First: D.
Author-X-Name-Last: Senthilkumar
Author-Name: D. Muthuraj
Author-X-Name-First: D.
Author-X-Name-Last: Muthuraj
Title: Construction and selection of tightened-normal-tightened variables sampling scheme of type TNTVSS (n1, n2; k)
Abstract:
This paper provides tables for the construction and selection of
tightened-normal-tightened variables sampling scheme of type TNTVSS (n1,
n2; k). The method of designing the scheme indexed by (AQL, α) and
(LQL, β) is indicated. The TNTVSS (nT, nN; k) is compared with
conventional single sampling plans for variables and with TNT (n1, n2; c)
scheme for attributes, and it is shown that the TNTVSS is more efficient.
Journal: Journal of Applied Statistics
Pages: 375-390
Issue: 3
Volume: 37
Year: 2010
Keywords: variables sampling, tightened-normal-tightened scheme, AQL, LQL, switching rules, producer's risk, consumer's risk and ASN,
X-DOI: 10.1080/02664760802695777
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802695777
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:375-390
Template-Type: ReDIF-Article 1.0
Author-Name: Fawziah Alshunnar
Author-X-Name-First: Fawziah
Author-X-Name-Last: Alshunnar
Author-Name: Mohammad Raqab
Author-X-Name-First: Mohammad
Author-X-Name-Last: Raqab
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Title: On the comparison of the Fisher information of the log-normal and generalized Rayleigh distributions
Abstract:
Surles and Padgett recently considered the two-parameter Burr Type X
distribution by introducing a scale parameter and called it the
generalized Rayleigh distribution. It is observed that the generalized
Rayleigh and log-normal distributions have many common properties and both
distributions can be used quite effectively to analyze skewed data sets. In
this paper, we mainly compare the Fisher information matrices of the two
distributions for complete and censored observations. Although both
distributions may provide similar data fits and are quite similar in nature
in many aspects, the corresponding Fisher information matrices can be
quite different. We compute the total information measures of the two
distributions for different parameter ranges and also compare the loss of
information due to censoring. A real data analysis is performed for
illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 391-404
Issue: 3
Volume: 37
Year: 2010
Keywords: Fisher information matrix, Burr Type X distribution, generalized Rayleigh distribution, log-normal distribution, left censoring, right censoring,
X-DOI: 10.1080/02664760802698961
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802698961
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:391-404
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Title: A double acceptance sampling plan for generalized log-logistic distributions with known shape parameters
Abstract:
A double acceptance sampling plan for the truncated life test is
developed assuming that the lifetime of a product follows a generalized
log-logistic distribution with known shape parameters. The zero and one
failure scheme is mainly considered, where the lot is accepted if no
failures are observed from the first sample and it is rejected if two or
more failures occur. When there is one failure from the first sample, the
second sample is drawn and tested for the same duration as the first
sample. The minimum sample sizes of the first and second samples are
determined to ensure that the true median life is longer than the given
life at the specified consumer's confidence level. The operating
characteristics are analyzed according to various ratios of the true
median life to the specified life. The minimum such ratios are also
obtained so as to lower the producer's risk at the specified level. The
results are explained with examples.
Journal: Journal of Applied Statistics
Pages: 405-414
Issue: 3
Volume: 37
Year: 2010
Keywords: consumer's confidence, double acceptance sampling, log-logistic distribution, producer's risk, single acceptance sampling,
X-DOI: 10.1080/02664760802698979
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802698979
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:405-414
Template-Type: ReDIF-Article 1.0
Author-Name: Christophe Demattei
Author-X-Name-First: Christophe
Author-X-Name-Last: Demattei
Title: Le Cam theorem on interval division by randomly chosen points: Pedagogical explanations and application to temporal cluster detection
Abstract:
The aim of this paper is to propose a pedagogical explanation of the Le
Cam theorem and to illustrate its use, through a practical application,
for temporal cluster detection. This theorem concerns the division of an
interval by randomly chosen points. Its aim is to characterize the
asymptotic behavior of a certain category of sums of functions applied to
the lengths of the successive intervals between points. It is not very
intuitive and understanding it requires some deepening. After stating the
theorem, its different aspects are explained and detailed in as
pedagogical a way as possible. Theoretical applications are proposed
through the proof of two propositions. Then a very concrete application of
this theorem to temporal cluster detection is presented, tested by a
power study, and compared with other global cluster detection tests.
Finally, this approach is applied to the well-known Knox temporal data
set.
Journal: Journal of Applied Statistics
Pages: 415-424
Issue: 3
Volume: 37
Year: 2010
Keywords: Le Cam theorem, uniform spacings, cluster detection, temporal cluster, Knox data set,
X-DOI: 10.1080/02664760802715872
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802715872
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:415-424
Template-Type: ReDIF-Article 1.0
Author-Name: Ling-Yau Chan
Author-X-Name-First: Ling-Yau
Author-X-Name-Last: Chan
Author-Name: Rahul Mukerjee
Author-X-Name-First: Rahul
Author-X-Name-Last: Mukerjee
Title: Interval estimation of a small proportion via inverse sampling
Abstract:
On the basis of a negative binomial sampling scheme, we consider a
uniformly most accurate upper confidence limit for a small but unknown
proportion, such as the proportion of defectives in a manufacturing
process. The optimal stopping rule, with reference to the twin criteria of
the expected length of the confidence interval and the expected sample
size, is investigated. The proposed confidence interval has also been
compared with several others that have received attention in the recent
literature.
Journal: Journal of Applied Statistics
Pages: 425-433
Issue: 3
Volume: 37
Year: 2010
Keywords: expected length, posterior quantile, negative binomial, sample size, score interval, uniformly most accurate, upper confidence limit,
X-DOI: 10.1080/02664760802715880
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802715880
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:425-433
Template-Type: ReDIF-Article 1.0
Author-Name: C. C. Figueiredo
Author-X-Name-First: C. C.
Author-X-Name-Last: Figueiredo
Author-Name: H. Bolfarine
Author-X-Name-First: H.
Author-X-Name-Last: Bolfarine
Author-Name: M. C. Sandoval
Author-X-Name-First: M. C.
Author-X-Name-Last: Sandoval
Author-Name: C. R. O. P. Lima
Author-X-Name-First: C. R. O. P.
Author-X-Name-Last: Lima
Title: On the skew-normal calibration model
Abstract:
In this article, we present the EM-algorithm for performing maximum
likelihood estimation of an asymmetric linear calibration model under the
assumption of skew-normally distributed errors. A simulation study is
conducted to evaluate the performance of the calibration estimator in
interpolation and extrapolation situations. As an application to a real
data set, we fitted the model to a dimensional measurement method used
for calculating testicular volume with a caliper, with its calibration
using ultrasonography as the standard method. By applying this
methodology, we do not need to transform the variables to obtain
symmetrical errors. Another interesting aspect of the approach is that the
transformation developed to make the information matrix nonsingular, when
the skewness parameter is near zero, leaves the parameter of interest
unchanged. Model fitting is implemented, and the best choice between the
usual calibration model and the model proposed in this article is
evaluated using the Akaike information criterion, Schwarz's Bayesian
information criterion and the Hannan-Quinn criterion.
Journal: Journal of Applied Statistics
Pages: 435-451
Issue: 3
Volume: 37
Year: 2010
Keywords: linear calibration model, EM-algorithm, skewness coefficient, skew-normal distribution, singularity of the information matrix, bias prevention,
X-DOI: 10.1080/02664760802715906
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802715906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:435-451
Template-Type: ReDIF-Article 1.0
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Author-Name: Hyeseon Lee
Author-X-Name-First: Hyeseon
Author-X-Name-Last: Lee
Author-Name: Sang-Ho Lee
Author-X-Name-First: Sang-Ho
Author-X-Name-Last: Lee
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Title: A variables repetitive group sampling plan under failure-censored reliability tests for Weibull distribution
Abstract:
We propose a variables repetitive group sampling plan under type-II or
failure-censored life testing when the lifetime of a part follows a
Weibull distribution with a known shape parameter. Unlike the existing
plans, the acceptance criteria do not involve the unknown scale
parameter. To determine the design parameters of the proposed plan, the
usual approach of using two points on the operating characteristic curve
is adopted and an optimization problem is formulated so as to minimize
the average number of failures observed. Tables of design parameters are
constructed when the quality of parts is represented by the unreliability
or by the ratio of the mean lifetime to the specified life. It is found
that the proposed sampling plan can reduce the sample size significantly
compared with the single sampling plan.
Journal: Journal of Applied Statistics
Pages: 453-460
Issue: 3
Volume: 37
Year: 2010
Keywords: acceptance sampling, consumer's risk, failure censoring, OC curve, producer's risk, progressive censoring,
X-DOI: 10.1080/02664760802715914
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802715914
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:453-460
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-Ho Chen
Author-X-Name-First: Chung-Ho
Author-X-Name-Last: Chen
Title: A note on some modified Pulak and Al-Sultan's model
Abstract:
Pulak and Al-Sultan presented a rectifying inspection plan applied to
the determination of the optimum process mean. However, they did not point out
whether the non-conforming items in the sample of accepted lot are
replaced or eliminated from the lot and neglected the quality loss within
specification limits. In this paper, we further propose the modified Pulak
and Al-Sultan model with quadratic quality loss function. There are four
cases considered in the modified model: (1) the non-conforming items in
the sample of accepted lot are neither replaced nor eliminated from the
lot; (2) the non-conforming items in the sample of accepted lot are not
replaced but are eliminated from the lot; (3) the non-conforming
items in the sample of accepted lot are replaced by conforming ones; (4)
the non-conforming items in the sample of accepted lot are replaced by
non-inspected items. The numerical results and sensitivity analysis of
parameters show that their solutions are slightly different.
Journal: Journal of Applied Statistics
Pages: 461-472
Issue: 3
Volume: 37
Year: 2010
Keywords: rectifying inspection plan, process mean, quadratic quality loss function, specification limits,
X-DOI: 10.1080/02664760902729658
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902729658
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:461-472
Template-Type: ReDIF-Article 1.0
Author-Name: Angela Montanari
Author-X-Name-First: Angela
Author-X-Name-Last: Montanari
Author-Name: Cinzia Viroli
Author-X-Name-First: Cinzia
Author-X-Name-Last: Viroli
Title: A skew-normal factor model for the analysis of student satisfaction towards university courses
Abstract:
Classical factor analysis relies on the assumption of normally
distributed factors that guarantees the model to be estimated via the
maximum likelihood method. Even when the assumption of Gaussian factors is
not explicitly formulated and estimation is performed via the iterated
principal factors' method, the interest is actually mainly focussed on the
linear structure of the data, since only moments up to the second ones are
involved. In many real situations, the factors may not be adequately
described by the first two moments only. For example, skewness
characterizing most latent variables in social analysis can be properly
measured by the third moment: the factors are not normally distributed and
covariance is no longer a sufficient statistic. In this work we propose a
factor model characterized by skew-normally distributed factors.
Skew-normal refers to a parametric class of probability distributions
that extends the normal distribution by an additional shape parameter
regulating the skewness. The model estimation can be solved by the
generalized EM algorithm, in which the iterative Newton-Raphson procedure
is needed in the M-step to estimate the factor shape parameter. The
proposed skew-normal factor analysis is applied to the study of student
satisfaction towards university courses, in order to identify the factors
representing different aspects of the latent overall satisfaction.
Journal: Journal of Applied Statistics
Pages: 473-487
Issue: 3
Volume: 37
Year: 2010
Keywords: factor analysis, skew-normal distribution, latent variables, orthogonal rotations, EM algorithm, Gauss-Hermite quadrature points,
X-DOI: 10.1080/02664760902736737
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902736737
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:473-487
Template-Type: ReDIF-Article 1.0
Author-Name: Hakan Demirtas
Author-X-Name-First: Hakan
Author-X-Name-Last: Demirtas
Title: A distance-based rounding strategy for post-imputation ordinal data
Abstract:
Multiple imputation has emerged as a widely used model-based approach in
dealing with incomplete data in many application areas. Gaussian and
log-linear imputation models are fairly straightforward to implement for
continuous and discrete data, respectively. However, in missing data
settings which include a mix of continuous and discrete variables, correct
specification of the imputation model could be a daunting task owing to
the lack of flexible models for the joint distribution of variables of
different nature. This complication, along with accessibility to software
packages that are capable of carrying out multiple imputation under the
assumption of joint multivariate normality, appears to encourage applied
researchers to pragmatically treat the discrete variables as continuous
for imputation purposes and subsequently round the imputed values to
the nearest observed category. In this article, I introduce a
distance-based rounding approach for ordinal variables in the presence of
continuous ones. The first step of the proposed rounding process is
predicated upon creating indicator variables that correspond to the
ordinal levels, followed by jointly imputing all variables under the
assumption of multivariate normality. The imputed values are then
converted to the ordinal scale based on their Euclidean distances to a set
of indicators, with minimal distance corresponding to the closest match. I
compare the performance of this technique to crude rounding via commonly
accepted accuracy and precision measures with simulated data sets.
Journal: Journal of Applied Statistics
Pages: 489-500
Issue: 3
Volume: 37
Year: 2010
Keywords: multiple imputation, rounding, bias, precision, ordinal data,
X-DOI: 10.1080/02664760902744954
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902744954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:489-500
Template-Type: ReDIF-Article 1.0
Author-Name: Martin Rios
Author-X-Name-First: Martin
Author-X-Name-Last: Rios
Author-Name: Toni Monleon-Getino
Author-X-Name-First: Toni
Author-X-Name-Last: Monleon-Getino
Title: Application of a Markovian process to the calculation of mean time equilibrium in a genetic drift model
Abstract:
The most common phenomena in the evolution process are natural selection
and genetic drift. In this article, we propose a probabilistic method to
calculate the mean and variance time for random genetic drift equilibrium,
measured as number of generations, based on Markov process and a complex
probabilistic model. We studied the case of a constant, panmictic
population of diploid organisms, with no mutation, selection or migration
at a given autosomal locus, and two possible alleles, H and h. The
calculations presented in this article were
based on a Markov process. They explain how genetic and genotypic
frequencies change across generations and how the heterozygote
alleles become extinct after many generations. This calculation could
be used in further evolutionary applications. Finally, some simulations
are presented to illustrate the theoretical calculations under different
baseline situations.
Journal: Journal of Applied Statistics
Pages: 501-513
Issue: 3
Volume: 37
Year: 2010
Keywords: Markovian process, probabilistic model, genetic drift, population genetics, mean at equilibrium,
X-DOI: 10.1080/02664760902889981
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902889981
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:501-513
Template-Type: ReDIF-Article 1.0
Author-Name: Kanti Mardia
Author-X-Name-First: Kanti
Author-X-Name-Last: Mardia
Title: Bayesian analysis for bivariate von Mises distributions
Abstract:
There has been renewed interest in the directional Bayesian analysis for
the bivariate case especially in view of its fundamental new and
challenging applications to bioinformatics. The previous work had
concentrated on Bayesian analysis for the univariate von Mises distribution.
Here, we give the description of the general bivariate von Mises (BVM)
distribution and its properties. There are various submodels of this
distribution which have become important and we give a review of these
submodels. Also, we derive the normalizing constant for the general BVM
distribution in a compact way. Conjugate priors and posteriors for the
general case and the submodels are obtained. The conjugate prior for a
multivariate von Mises distribution is also examined.
Journal: Journal of Applied Statistics
Pages: 515-528
Issue: 3
Volume: 37
Year: 2010
Keywords: bioinformatics, bivariate angular data, conjugate priors, cosine model, directional statistics, distributions on torus, sine model,
X-DOI: 10.1080/02664760903551267
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903551267
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:3:p:515-528
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Baxter
Author-X-Name-First: Paul
Author-X-Name-Last: Baxter
Author-Name: Paul Marchant
Author-X-Name-First: Paul
Author-X-Name-Last: Marchant
Title: The cross-product ratio in bivariate lognormal and gamma distributions, with an application to non-randomized trials
Abstract:
Non-randomized trials can give a biased impression of the effectiveness
of any intervention. We consider trials in which incidence rates are
compared in two areas over two periods. Typically, one area receives an
intervention, whereas the other does not. We outline and illustrate a
method to estimate the bias in such trials under two different bivariate
models. The illustrations use data in which no particular intervention is
operating. The purpose is to illustrate the size of the bias that could be
observed purely due to regression towards the mean (RTM). The
illustrations show that the bias can be appreciably different from zero,
and even when centred on zero, the variance of the bias can be large. We
conclude that the results of non-randomized trials should be treated with
caution, as interventions which show small effects could be explained as
artefacts of RTM.
Journal: Journal of Applied Statistics
Pages: 529-536
Issue: 4
Volume: 37
Year: 2010
Keywords: bivariate lognormal distribution, crime reduction interventions, Kibble bivariate gamma distribution, non-randomized trials, regression towards the mean,
X-DOI: 10.1080/02664760902744962
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902744962
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:529-536
Template-Type: ReDIF-Article 1.0
Author-Name: Georgia Kourlaba
Author-X-Name-First: Georgia
Author-X-Name-Last: Kourlaba
Author-Name: Demosthenes Panagiotakos
Author-X-Name-First: Demosthenes
Author-X-Name-Last: Panagiotakos
Title: The diagnostic accuracy of a composite index increases as the number of partitions of the components increases and when specific weights are assigned to each component
Abstract:
The aim of this work was to evaluate whether the number of partitions of
index components and the use of specific weights for each component
influence the diagnostic accuracy of a composite index. Simulation studies
were conducted in order to compare the sensitivity, specificity and area
under the ROC curve (AUC) of indices constructed using an equal number of
components but different numbers of partitions for all components.
Moreover, the odds ratio obtained from the univariate logistic regression
model for each component was proposed as potential weight. The current
simulation results showed that the sensitivity, specificity and AUC of an
index increase as the number of partitions of components increases.
However, the rate at which the diagnostic accuracy increases diminishes
as the number of partitions increases. In addition, it was found that the
diagnostic accuracy of the weighted index developed using the proposed
weights is higher compared with that of the corresponding un-weighted
index. The use of large-scale index components and the use of effect size
measures (i.e. odds ratios, ORs) of index components as potential weights
are proposed in order to obtain indices with high diagnostic accuracy for
a particular binary outcome.
Journal: Journal of Applied Statistics
Pages: 537-554
Issue: 4
Volume: 37
Year: 2010
Keywords: weights, indices, specificity, AUC, simulations, application,
X-DOI: 10.1080/02664760902751876
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902751876
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:537-554
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Author-Name: Munir Ahmad
Author-X-Name-First: Munir
Author-X-Name-Last: Ahmad
Title: Time truncated acceptance sampling plans for generalized exponential distribution
Abstract:
Acceptance sampling plans for generalized exponential distribution when
the lifetime experiment is truncated at a pre-determined time are provided
in this article. The tables are provided for the minimum sample size
required to ensure a certain median life of the experimental unit when the
shape parameter is two. The operating characteristic function values of
the sampling plans and the associated producer's risks are also presented.
It is shown that the tables presented here can be used if, instead of the
median life, another percentile life is chosen as the criterion, or if the
shape parameter is not two. Examples are provided for illustrative
purposes.
Journal: Journal of Applied Statistics
Pages: 555-566
Issue: 4
Volume: 37
Year: 2010
Keywords: acceptance sampling plan, operating characteristic function value, median and percentile points, consumer and producer's risks,
X-DOI: 10.1080/02664760902769787
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902769787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:555-566
Template-Type: ReDIF-Article 1.0
Author-Name: M. M. Nassar
Author-X-Name-First: M. M.
Author-X-Name-Last: Nassar
Author-Name: S. M. Khamis
Author-X-Name-First: S. M.
Author-X-Name-Last: Khamis
Author-Name: S. S. Radwan
Author-X-Name-First: S. S.
Author-X-Name-Last: Radwan
Title: Geometric sample size determination in Bayesian analysis
Abstract:
The problem of sample size determination in the context of Bayesian
analysis is considered. For the familiar and practically important
parameter of a geometric distribution with a beta prior, three different
Bayesian approaches based on the highest posterior density intervals are
discussed. A computer program handles all computational complexities and
is available upon request.
Journal: Journal of Applied Statistics
Pages: 567-575
Issue: 4
Volume: 37
Year: 2010
Keywords: Bayesian analysis, average coverage criterion (ACC), average length criterion (ALC), worst-outcome criterion (WOC),
X-DOI: 10.1080/02664760902803248
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902803248
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:567-575
Template-Type: ReDIF-Article 1.0
Author-Name: Jorge Cadima
Author-X-Name-First: Jorge
Author-X-Name-Last: Cadima
Author-Name: Francisco Lage Calheiros
Author-X-Name-First: Francisco Lage
Author-X-Name-Last: Calheiros
Author-Name: Isabel Preto
Author-X-Name-First: Isabel
Author-X-Name-Last: Preto
Title: The eigenstructure of block-structured correlation matrices and its implications for principal component analysis
Abstract:
Block-structured correlation matrices are correlation matrices in which
the p variables are subdivided into homogeneous groups, with equal
correlations for variables within each group, and equal correlations
between any given pair of variables from different groups.
Block-structured correlation matrices arise as approximations for certain
data sets' true correlation matrices. A block structure in a correlation
matrix entails a certain number of properties regarding its
eigendecomposition and, therefore, a principal component analysis of the
underlying data. This paper explores these properties, both from an
algebraic and a geometric perspective, and discusses their robustness.
Suggestions are also made regarding the choice of variables to be
subjected to a principal component analysis, when in the presence of
(approximately) block-structured variables.
Journal: Journal of Applied Statistics
Pages: 577-589
Issue: 4
Volume: 37
Year: 2010
Keywords: block-structured correlation matrices, eigendecomposition, principal component analysis, within-group eigenpairs, between-group eigenpairs,
X-DOI: 10.1080/02664760902803263
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902803263
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:577-589
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Cribari-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Cribari-Neto
Author-Name: Maria da Gloria Lima
Author-X-Name-First: Maria da Gloria
Author-X-Name-Last: Lima
Title: Approximate inference in heteroskedastic regressions: A numerical evaluation
Abstract:
The commonly made assumption that all stochastic error terms in the
linear regression model share the same variance (homoskedasticity) is
oftentimes violated in practical applications, especially when they are
based on cross-sectional data. As a precaution, a number of practitioners
choose to base inference on the parameters that index the model on tests
whose statistics employ asymptotically correct standard errors, i.e.
standard errors that are asymptotically valid whether or not the errors
are homoskedastic. In this paper, we use numerical integration methods to
evaluate the finite-sample performance of tests based on different
(alternative) heteroskedasticity-consistent standard errors. Emphasis is
placed on a few recently proposed heteroskedasticity-consistent covariance
matrix estimators. Overall, the results favor the HC4 and HC5
heteroskedasticity-robust standard errors. We also consider the use of
restricted residuals when constructing asymptotically valid standard
errors. Our results show that the only test that clearly benefits from
such a strategy is the HC0 test.
Journal: Journal of Applied Statistics
Pages: 591-615
Issue: 4
Volume: 37
Year: 2010
Keywords: covariance matrix estimation, heteroskedasticity, leverage point, linear regression, quasi-t test,
X-DOI: 10.1080/02664760902803271
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902803271
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:591-615
Template-Type: ReDIF-Article 1.0
Author-Name: H. Jiang
Author-X-Name-First: H.
Author-X-Name-Last: Jiang
Author-Name: M. Xie
Author-X-Name-First: M.
Author-X-Name-Last: Xie
Author-Name: L. C. Tang
Author-X-Name-First: L. C.
Author-X-Name-Last: Tang
Title: On MLEs of the parameters of a modified Weibull distribution for progressively type-2 censored samples
Abstract:
Lifetimes of modern mechanical or electronic units usually exhibit
bathtub-shaped failure rates. An appropriate probability distribution to
model such data is the modified Weibull distribution proposed by Lai et
al. [15]. This distribution has both the two-parameter Weibull and type-1
extreme value distribution as special cases. It is able to model lifetime
data with monotonic and bathtub-shaped failure rates, and thus attracts
some interest among researchers because of this property. In this paper,
the procedure of obtaining the maximum likelihood estimates (MLEs) of the
parameters for progressively type-2 censored and complete samples is
studied. Existence and uniqueness of the MLEs are proved.
Journal: Journal of Applied Statistics
Pages: 617-627
Issue: 4
Volume: 37
Year: 2010
Keywords: modified Weibull distribution, bathtub-shaped failure rate, maximum likelihood estimation, Hessian matrix, uniqueness and existence,
X-DOI: 10.1080/02664760902803289
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902803289
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:617-627
Template-Type: ReDIF-Article 1.0
Author-Name: Hani Samawi
Author-X-Name-First: Hani
Author-X-Name-Last: Samawi
Author-Name: Mohammed Al-Haj Ebrahem
Author-X-Name-First: Mohammed
Author-X-Name-Last: Al-Haj Ebrahem
Author-Name: Noha Al-Zubaidin
Author-X-Name-First: Noha
Author-X-Name-Last: Al-Zubaidin
Title: An optimal sign test for one-sample bivariate location model using an alternative bivariate ranked-set sample
Abstract:
The aim of this paper is to find an optimal alternative bivariate
ranked-set sample for the one-sample location model bivariate sign test. Our
numerical and theoretical results indicated that the optimal designs for
the bivariate sign test are the alternative designs with quantifying order
statistics with labels {((r+1)/2, (r+1)/2)}, when the set size r is odd
and {(r/2+1, r/2), (r/2, r/2+1)} when the set size r is even. The
asymptotic distribution and Pitman efficiencies of these designs are
derived. A simulation study is conducted to investigate the power of the
proposed optimal designs. An illustration using real data, with a
bootstrap algorithm for p-value estimation, is also provided.
Journal: Journal of Applied Statistics
Pages: 629-650
Issue: 4
Volume: 37
Year: 2010
Keywords: bivariate ranked-set sample, location model, median ranked-set sample, Pitman efficiencies, ranked-set sample, simple random sample, sign test,
X-DOI: 10.1080/02664760902810805
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902810805
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:629-650
Template-Type: ReDIF-Article 1.0
Author-Name: Luis Nobre Pereira
Author-X-Name-First: Luis Nobre
Author-X-Name-Last: Pereira
Author-Name: Pedro Simoes Coelho
Author-X-Name-First: Pedro Simoes
Author-X-Name-Last: Coelho
Title: Small area estimation of mean price of habitation transaction using time-series and cross-sectional area-level models
Abstract:
In this paper, a new small domain estimator for area-level data is
proposed. The proposed estimator is driven by a real problem of estimating
the mean price of habitation transaction at a regional level in a European
country, using data collected from a longitudinal survey conducted by a
national statistical office. At the desired level of inference, it is not
possible to provide accurate direct estimates because the sample sizes in
these domains are very small. An area-level model with a heterogeneous
covariance structure of random effects assists the proposed combined
estimator. This model is an extension of a model due to Fay and Herriot
[5], but it integrates information across domains and over several periods
of time. In addition, a modified method of estimation of variance
components for time-series and cross-sectional area-level models is
proposed by including the design weights. A Monte Carlo simulation, based
on real data, is conducted to investigate the performance of the proposed
estimators in comparison with other estimators frequently used in small
area estimation problems. In particular, we compare the performance of
these estimators with the estimator based on the Rao-Yu model [23]. The
simulation study also assesses the performance of the modified variance
component estimators in comparison with the traditional ANOVA method.
Simulation results show that the estimators proposed perform better than
the other estimators in terms of both precision and bias.
Journal: Journal of Applied Statistics
Pages: 651-666
Issue: 4
Volume: 37
Year: 2010
Keywords: linear mixed models, chronological autocorrelation, estimation of variance components, empirical best linear unbiased predictor, estimation of mean price of habitation,
X-DOI: 10.1080/02664760902810821
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902810821
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:651-666
Template-Type: ReDIF-Article 1.0
Author-Name: Karl Majeske
Author-X-Name-First: Karl
Author-X-Name-Last: Majeske
Author-Name: Terri Lynch-Caris
Author-X-Name-First: Terri
Author-X-Name-Last: Lynch-Caris
Author-Name: Janet Brelin-Fornari
Author-X-Name-First: Janet
Author-X-Name-Last: Brelin-Fornari
Title: Quantifying R2 bias in the presence of measurement error
Abstract:
Measurement error (ME) is the difference between the true unknown value
of a variable and the data assigned to that variable during the measuring
process. The multiple correlation coefficient quantifies the strength of
the relationship between the dependent and independent variable(s) in
regression modeling. In this paper, we show that ME in the dependent
variable results in a negative bias in the multiple correlation
coefficient, making the relationship appear weaker than it should. The
adjusted R2 provides regression modelers an unbiased estimate of the
multiple correlation coefficient. However, due to the ME induced bias in
the multiple correlation coefficient, the otherwise unbiased adjusted R2
under-estimates the variance explained by a regression model. This paper
proposes two statistics for estimating the multiple correlation
coefficient, both of which take into account the ME in the dependent
variable. The first statistic uses all unbiased estimators, but may
produce values outside the [0,1] interval. The second statistic requires
modeling a single data set, created by including descriptive variables on
the subjects used in a gage study. Based on sums of squares, the statistic
has the properties of an R2: it measures the proportion of variance
explained; has values restricted to the [0,1] interval; and the endpoints
indicate no variance explained and all variance explained respectively. We
demonstrate the methodology using data from a study of cervical spine
range of motion in children.
Journal: Journal of Applied Statistics
Pages: 667-677
Issue: 4
Volume: 37
Year: 2010
Keywords: measurement error, regression analysis, R2, bias correction, gage R&R,
X-DOI: 10.1080/02664760902814542
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902814542
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:667-677
Template-Type: ReDIF-Article 1.0
Author-Name: Anne-Line Balduck
Author-X-Name-First: Anne-Line
Author-X-Name-Last: Balduck
Author-Name: Anita Prinzie
Author-X-Name-First: Anita
Author-X-Name-Last: Prinzie
Author-Name: Marc Buelens
Author-X-Name-First: Marc
Author-X-Name-Last: Buelens
Title: The effectiveness of coach turnover and the effect on home team advantage, team quality and team ranking
Abstract:
The effectiveness of coach turnover on team performance is widely
discussed in the literature due to the indirect impact of a team's
performance on a club's revenues. This study examines the effect of coach
turnover within a competition season by focusing on the change in team
quality and the change in home team advantage under the new coach. The
change in team quality or home team advantage can vary according to the
team (team specific) or might be an independent quantity (non-team
specific). We estimated nine possible regression models, given no change,
team-specific change and non-team-specific change in quality or home team
advantage. The data are the match results of Belgian male soccer teams
playing in the highest national division during seven seasons. Results
point to a team-specific effect of a new coach on a team's quality. This
article further contributes by evaluating the new coach's success with
regard to whether his ability to improve team quality also results in a
better position of the team in the final ranking. A new coach will be able
to improve the ranking of the team if the change in team quality under
the new coach is positive.
Journal: Journal of Applied Statistics
Pages: 679-689
Issue: 4
Volume: 37
Year: 2010
Keywords: managerial change, home team advantage, team performance, team quality, regression model, individual match data, team ranking,
X-DOI: 10.1080/02664760902824731
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902824731
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:679-689
Template-Type: ReDIF-Article 1.0
Author-Name: Shola Adeyemi
Author-X-Name-First: Shola
Author-X-Name-Last: Adeyemi
Author-Name: Thierry Chaussalet
Author-X-Name-First: Thierry
Author-X-Name-Last: Chaussalet
Author-Name: Haifeng Xie
Author-X-Name-First: Haifeng
Author-X-Name-Last: Xie
Author-Name: Md Asaduzaman
Author-X-Name-First: Md
Author-X-Name-Last: Asaduzaman
Title: Random effects models for operational patient pathways
Abstract:
Patient flow modeling is a growing field of interest in health services
research. Several techniques have been applied to model movement of
patients within and between health-care facilities. However, individual
patient experience during the delivery of care has always been overlooked.
In this work, a random effects model is introduced to patient flow
modeling and applied to a London Hospital Neonatal unit data. In
particular, a random effects multinomial logit model is used to capture
individual patient trajectories in the process of care with patient
frailties modeled as random effects. Intuitively, both operational and
clinical patient flow are modeled, the former being physical and the
latter latent. Two variants of the model are proposed, one based solely on
patient pathways and the other based on patient characteristics. Our
technique could identify interesting pathways such as those that result in
high probability of death (survival), pathways incurring the least
(highest) cost of care or pathways with the least (highest) length of
stay. Patient-specific discharge probabilities from the health care system
could also be predicted. These are of interest to health-care managers in
planning the scarce resources needed to run health-care institutions.
Journal: Journal of Applied Statistics
Pages: 691-701
Issue: 4
Volume: 37
Year: 2010
Keywords: patient flow, frailty, pathways, transition,
X-DOI: 10.1080/02664760902873951
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902873951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:691-701
Template-Type: ReDIF-Article 1.0
Author-Name: Søren Feodor Nielsen
Author-X-Name-First: Søren Feodor
Author-X-Name-Last: Nielsen
Title: Generalized linear models for insurance data
Abstract:
Journal: Journal of Applied Statistics
Pages: 703-703
Issue: 4
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902811571
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902811571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:703-703
Template-Type: ReDIF-Article 1.0
Author-Name: Z.Q. John Lu
Author-X-Name-First: Z.Q.
Author-X-Name-Last: John Lu
Title: Bayesian methods for data analysis, third edition
Abstract:
Journal: Journal of Applied Statistics
Pages: 705-706
Issue: 4
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902811621
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902811621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:4:p:705-706
Template-Type: ReDIF-Article 1.0
Author-Name: Y. H. Michlin
Author-X-Name-First: Y. H.
Author-X-Name-Last: Michlin
Author-Name: G. Grabarnik
Author-X-Name-First: G.
Author-X-Name-Last: Grabarnik
Title: Search boundaries of the truncated discrete sequential test
Abstract:
The theme of this paper is improved planning of binomial sequential
probability ratio tests in the context of comparing two objects as to
their time between failures or time to failure, assumed to be exponentially
distributed. The authors' earlier works established that the probabilities
of type-I and type-II errors (α and β) are discrete in character
and do not lend themselves to analytical expression. Accordingly, the
choice of the optimal parameters for the decision boundaries necessitates
a search for extrema in discrete sets. The present work outlines a
procedure that involves application of the continued-fractions theory, and
permits finding the set of boundary positions in which the test
characteristics undergo changes. It was established that, in the domains
described in the earlier papers, the relationships of α and β
versus these positions are close to planar and - within narrow limits -
stepwise. The step sizes are highly variable, so that the standard minimum
search procedures are either cumbersome or actually useless. On the basis
of these relationships - and others - a search algorithm is proposed for
the optimal test boundaries. An example is presented - planning and
implementation of this test in the integrated-circuit industry.
Journal: Journal of Applied Statistics
Pages: 707-724
Issue: 5
Volume: 37
Year: 2010
Keywords: reliability, test planning, time between failures, time to failure, exponential distribution, binomial distribution,
X-DOI: 10.1080/02664760903254078
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903254078
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:707-724
Template-Type: ReDIF-Article 1.0
Author-Name: M. I. Rolfe
Author-X-Name-First: M. I.
Author-X-Name-Last: Rolfe
Author-Name: K. Mengersen
Author-X-Name-First: K.
Author-X-Name-Last: Mengersen
Author-Name: G. Beadle
Author-X-Name-First: G.
Author-X-Name-Last: Beadle
Author-Name: K. Vearncombe
Author-X-Name-First: K.
Author-X-Name-Last: Vearncombe
Author-Name: B. Andrew
Author-X-Name-First: B.
Author-X-Name-Last: Andrew
Author-Name: H. L. Johnson
Author-X-Name-First: H. L.
Author-X-Name-Last: Johnson
Author-Name: C. Walsh
Author-X-Name-First: C.
Author-X-Name-Last: Walsh
Title: Latent class piecewise linear trajectory modelling for short-term cognition responses after chemotherapy for breast cancer patients
Abstract:
This paper investigates the impact of chemotherapy on cognitive function
of breast cancer patients and whether this response is homogeneous for all
patients. Latent class piecewise linear trajectory (growth) models were
employed to describe changes and identify subgroups in three Auditory
Verbal Learning Test measures (learning, immediate retention and delayed
recall) in 130 breast cancer patients taken at three time periods: before
chemotherapy and 1 and 6 months post-chemotherapy. Two distinct subgroups
of women exhibiting different patterns of response were identified for
learning and delayed recall and three for immediate retention. The groups
differed in level (intercept) at 1 month post-chemotherapy and patterns of
decline and recovery. Binomial and multinomial logistic regressions on the
latent classes found age, initial National Adult Reading Test
(NART)-predicted IQ, stage of cancer and the initial Functional Assessment
of Cancer Therapy-Breast subscale (or subsets thereof) to be significant
predictors of class membership.
Journal: Journal of Applied Statistics
Pages: 725-738
Issue: 5
Volume: 37
Year: 2010
Keywords: latent class, piecewise linear, trajectory, cognition, breast cancer, chemotherapy, growth models, mixtures,
X-DOI: 10.1080/02664760902729641
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902729641
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:725-738
Template-Type: ReDIF-Article 1.0
Author-Name: Marco Morales
Author-X-Name-First: Marco
Author-X-Name-Last: Morales
Title: Lag order selection for an optimal autoregressive covariance matrix estimator
Abstract:
A good parametric spectral estimator requires an accurate estimate of the
sum of AR coefficients; however, a criterion that minimizes the innovation
variance does not necessarily yield the best spectral estimate. This paper
develops an alternative information criterion considering the bias in the
sum of the parameters for the autoregressive estimator of the spectral
density at frequency zero.
Journal: Journal of Applied Statistics
Pages: 739-748
Issue: 5
Volume: 37
Year: 2010
Keywords: Spectral density, covariance matrix, autoregressive, lag-order selection, statistical inference,
X-DOI: 10.1080/02664760902873969
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902873969
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:739-748
Template-Type: ReDIF-Article 1.0
Author-Name: J. Jacques
Author-X-Name-First: J.
Author-X-Name-Last: Jacques
Author-Name: C. Biernacki
Author-X-Name-First: C.
Author-X-Name-Last: Biernacki
Title: Extension of model-based classification for binary data when training and test populations differ
Abstract:
Standard discriminant analysis supposes that both the training sample and
the test sample are derived from the same population. When these samples
arise from populations differing in their descriptive parameters, a
generalization of discriminant analysis consists of adapting the
classification rule related to the training population to another rule
related to the test population, by estimating a link map between both
populations. This paper extends an existing work in the multinormal
context to the case of binary data. In order to solve the problem of
defining a link map between the two binary populations, it is assumed that
the binary data result from the discretization of latent Gaussian data. An
estimation method and a robustness study are presented, and two
applications in a biological context illustrate this work.
Journal: Journal of Applied Statistics
Pages: 749-766
Issue: 5
Volume: 37
Year: 2010
Keywords: Biological application, discriminant analysis, EM algorithm, latent class model, Stochastic link,
X-DOI: 10.1080/02664760902889957
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902889957
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:749-766
Template-Type: ReDIF-Article 1.0
Author-Name: Hyo-Il Park
Author-X-Name-First: Hyo-Il
Author-X-Name-Last: Park
Author-Name: Seung-Man Hong
Author-X-Name-First: Seung-Man
Author-X-Name-Last: Hong
Title: A permutation test for multivariate data with grouped components
Abstract:
In this paper, we consider a nonparametric test procedure for
multivariate data with grouped components under the two sample problem
setting. For the construction of the test statistic, we use linear rank
statistics which were derived by applying the likelihood ratio principle
for each component. For the null distribution of the test statistic, we
apply the permutation principle for small or moderate sample sizes and
derive the limiting distribution for the large sample case. We also
illustrate our test procedure with an example and compare it with other
procedures through a simulation study. Finally, we discuss some additional
interesting features as concluding remarks.
Journal: Journal of Applied Statistics
Pages: 767-778
Issue: 5
Volume: 37
Year: 2010
Keywords: grouped data, linear rank statistic, multivariate data, nonparametric test, permutation principle,
X-DOI: 10.1080/02664760902889973
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902889973
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:767-778
Template-Type: ReDIF-Article 1.0
Author-Name: Hyonggin An
Author-X-Name-First: Hyonggin
Author-X-Name-Last: An
Author-Name: Roderick Little
Author-X-Name-First: Roderick
Author-X-Name-Last: Little
Author-Name: Andrea Bozoki
Author-X-Name-First: Andrea
Author-X-Name-Last: Bozoki
Title: A statistical algorithm for detecting cognitive plateaus in Alzheimer's disease
Abstract:
Repeated neuropsychological measurements, such as mini-mental state
examination (MMSE) scores, are frequently used in Alzheimer's disease (AD)
research to study change in cognitive function of AD patients. A question
of interest among dementia researchers is whether some AD patients exhibit
transient “plateaus” of cognitive function in the course of
the disease. We consider a statistical approach to this question, based on
irregularly spaced repeated MMSE scores. We propose an algorithm that
formalizes the measurement of an apparent cognitive plateau, and a
procedure to evaluate the evidence for plateaus in AD by applying this
algorithm to the observed data and to data sets
simulated from a linear mixed model. We apply these methods to repeated
MMSE data from the Michigan Alzheimer's Disease Research Center, finding a
high rate of apparent plateaus and also a high rate of false discovery.
Simulation studies are also conducted to assess the performance of the
algorithm. In general, the false discovery rate of the algorithm is high
unless the rate of decline is high compared with the measurement error of
the cognitive test. It is argued that the results are not a problem of the
specific algorithm chosen, but reflect a lack of information concerning
the presence of plateaus in the data.
Journal: Journal of Applied Statistics
Pages: 779-789
Issue: 5
Volume: 37
Year: 2010
Keywords: Alzheimer's disease, longitudinal data, linear mixed model, nonlinear model, false discovery rate, cognitive plateau,
X-DOI: 10.1080/02664760902889999
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902889999
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:779-789
Template-Type: ReDIF-Article 1.0
Author-Name: Zhiguo Xiao
Author-X-Name-First: Zhiguo
Author-X-Name-Last: Xiao
Author-Name: Jun Shao
Author-X-Name-First: Jun
Author-X-Name-Last: Shao
Author-Name: Mari Palta
Author-X-Name-First: Mari
Author-X-Name-Last: Palta
Title: GMM in linear regression for longitudinal data with multiple covariates measured with error
Abstract:
Griliches and Hausman [5] and Wansbeek [11] proposed using the generalized
method of moments (GMM) to obtain consistent estimators in linear
regression models for longitudinal data with measurement error in one
covariate, without requiring additional validation or replicate data. For
this methodology to be useful, it must be extended to the more realistic
situation where more than one covariate is measured with error. Such an
extension is not straightforward, since measurement errors across
different covariates may be correlated. By a careful construction of the
measurement error correlation structure, we are able to extend Wansbeek's
GMM and show that the extended Griliches and Hausman's GMM is equivalent
to the extended Wansbeek's GMM. For illustration, we apply the extended
GMM to data from two medical studies, and compare it with the naive method
and the method assuming only one covariate having measurement error.
Journal: Journal of Applied Statistics
Pages: 791-805
Issue: 5
Volume: 37
Year: 2010
Keywords: longitudinal data, multiple covariates, measurement error, generalized method of moments,
X-DOI: 10.1080/02664760902890005
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902890005
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:791-805
Template-Type: ReDIF-Article 1.0
Author-Name: Fu-Kwun Wang
Author-X-Name-First: Fu-Kwun
Author-X-Name-Last: Wang
Author-Name: Yung-Fu Cheng
Author-X-Name-First: Yung-Fu
Author-X-Name-Last: Cheng
Title: Robust regression for estimating the Burr XII parameters with outliers
Abstract:
The Burr XII distribution offers a more flexible alternative to the
lognormal, log-logistic and Weibull distributions. Outliers can occur
during reliability life testing. Thus, we need an efficient method to
estimate the parameters of the Burr XII distribution for censored data
with outliers. The objective of this paper is to present a robust
regression (RR) method called M-estimator to estimate the parameters of a
two-parameter Burr XII distribution based on the probability plotting
procedure for both the complete and multiply-censored data with outliers.
The simulation results show that the RR method outperforms the unweighted
least squares and maximum likelihood methods in most cases in terms of
bias and root mean square error.
Journal: Journal of Applied Statistics
Pages: 807-819
Issue: 5
Volume: 37
Year: 2010
Keywords: Burr XII distribution, robust regression, M-estimator, least squares, maximum likelihood,
X-DOI: 10.1080/02664760902906231
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902906231
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:807-819
Template-Type: ReDIF-Article 1.0
Author-Name: J. Anzures-Cabrera
Author-X-Name-First: J.
Author-X-Name-Last: Anzures-Cabrera
Author-Name: J. L. Hutton
Author-X-Name-First: J. L.
Author-X-Name-Last: Hutton
Title: Competing risks, left truncation and late entry effect in A-bomb survivors cohort
Abstract:
The cohort under study comprises A-bomb survivors residing in Hiroshima
Prefecture since 1968. After this year, thousands of survivors were newly
recognized every year. The aim of this study is to determine whether the
survival experience of the late entrants differs significantly from that
of the population registered in 1968. Parametric models that
account for left truncation and competing risks were developed by using
sub-hazard functions. A Weibull distribution was used to determine the
possible existence of a late entry effect in Hiroshima A-bomb survivors.
The competing risks framework shows that there might be a late entry
effect in the male and female groups. Our findings are congruent with
previous studies analysing similar populations.
Journal: Journal of Applied Statistics
Pages: 821-831
Issue: 5
Volume: 37
Year: 2010
Keywords: competing risks, late entry effect, left truncation, sub-hazard function, Weibull distribution,
X-DOI: 10.1080/02664760902914417
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914417
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:821-831
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Serroyen
Author-X-Name-First: Jan
Author-X-Name-Last: Serroyen
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Marc Aerts
Author-X-Name-First: Marc
Author-X-Name-Last: Aerts
Author-Name: Ellen Vloeberghs
Author-X-Name-First: Ellen
Author-X-Name-Last: Vloeberghs
Author-Name: Peter Paul De Deyn
Author-X-Name-First: Peter Paul
Author-X-Name-Last: De Deyn
Author-Name: Geert Verbeke
Author-X-Name-First: Geert
Author-X-Name-Last: Verbeke
Title: Flexible estimation of serial correlation in nonlinear mixed models
Abstract:
In the conventional linear mixed-effects model, four structures can be
distinguished: fixed effects, random effects, measurement error and serial
correlation. The latter captures the phenomenon that the correlation
structure within a subject depends on the time lag between two
measurements. While the general linear mixed model is rather flexible, the
need has arisen to further increase flexibility. In addition to work done
in the area, we propose the use of spline-based modeling of the serial
correlation function, so as to allow for additional flexibility. This
approach is applied to data from a pre-clinical experiment in dementia
which studied the eating and drinking behavior in mice.
Journal: Journal of Applied Statistics
Pages: 833-846
Issue: 5
Volume: 37
Year: 2010
Keywords: Alzheimer's disease, dementia, ordinary least squares, random effect,
X-DOI: 10.1080/02664760902914425
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914425
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:833-846
Template-Type: ReDIF-Article 1.0
Author-Name: Adriana Bortoluzzo
Author-X-Name-First: Adriana
Author-X-Name-Last: Bortoluzzo
Author-Name: Pedro Morettin
Author-X-Name-First: Pedro
Author-X-Name-Last: Morettin
Author-Name: Clelia Toloi
Author-X-Name-First: Clelia
Author-X-Name-Last: Toloi
Title: Time-varying autoregressive conditional duration model
Abstract:
The main goal of this work is to generalize the autoregressive
conditional duration (ACD) model applied to times between trades to the
case of time-varying parameters. The use of wavelets allows the
parameters to vary through time and makes it possible to model
non-stationary processes without preliminary data transformations. The
time-varying ACD model was estimated by maximum likelihood with
standard exponentially distributed errors. The properties of the estimators
were assessed via bootstrap. We present a simulation exercise for a
non-stationary process and an empirical application to a real series,
namely the TELEMAR stock. Diagnostic and goodness of fit analysis suggest
that the time-varying ACD model simultaneously modeled the dependence
between durations, intra-day seasonality and volatility.
Journal: Journal of Applied Statistics
Pages: 847-864
Issue: 5
Volume: 37
Year: 2010
Keywords: ACD model, bootstrap, durations, non-stationarity, time-varying parameters, wavelet,
X-DOI: 10.1080/02664760902914458
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914458
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:847-864
Template-Type: ReDIF-Article 1.0
Author-Name: Emilio Augusto Coelho-Barros
Author-X-Name-First: Emilio Augusto
Author-X-Name-Last: Coelho-Barros
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Author-Name: Josmar Mazucheli
Author-X-Name-First: Josmar
Author-X-Name-Last: Mazucheli
Title: Longitudinal Poisson modeling: an application for CD4 counting in HIV-infected patients
Abstract:
In this paper, we present different “frailty” models to
analyze longitudinal data in the presence of covariates. These models
incorporate the extra-Poisson variability and the possible correlation
among the repeated counting data for each individual. Assuming a CD4
counting data set in HIV-infected patients, we develop a hierarchical
Bayesian analysis considering the different proposed models and using
Markov Chain Monte Carlo methods. We also discuss some Bayesian
discrimination aspects for the choice of the best model.
Journal: Journal of Applied Statistics
Pages: 865-880
Issue: 5
Volume: 37
Year: 2010
Keywords: longitudinal Poisson data, “frailty” models, hierarchical Bayesian analysis, Winbugs software, clinical data,
X-DOI: 10.1080/02664760902914466
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:5:p:865-880
Template-Type: ReDIF-Article 1.0
Author-Name: B. D. McCullough
Author-X-Name-First: B. D.
Author-X-Name-Last: McCullough
Author-Name: Thomas McWilliams
Author-X-Name-First: Thomas
Author-X-Name-Last: McWilliams
Title: Baseball players with the initial “K” do not strike out more often
Abstract:
It has been claimed that baseball players whose first or last name begins
with the letter K have a tendency to strike out more than players whose
initials do not contain the letter K. This “result” was
achieved by a naive application of statistical methods. We show that this
result is a spurious statistical artifact that can be reversed by the use
of only slightly less naive statistical methods. We also show that other
letters have larger and/or more significant effects than the letter K.
Finally, we show that the original study applied the wrong
statistical test and tested the hypothesis incorrectly. When these errors
are corrected, most of the letters of the alphabet have a statistically
significant strikeout effect.
Journal: Journal of Applied Statistics
Pages: 881-891
Issue: 6
Volume: 37
Year: 2010
Keywords: name-letter effect, spurious correlation,
X-DOI: 10.1080/02664760902889965
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902889965
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:881-891
Template-Type: ReDIF-Article 1.0
Author-Name: Jurate Saltyte Benth
Author-X-Name-First: Jurate Saltyte
Author-X-Name-Last: Benth
Author-Name: Fred Espen Benth
Author-X-Name-First: Fred Espen
Author-X-Name-Last: Benth
Title: Analysis and modelling of wind speed in New York
Abstract:
In this paper we propose an ARMA time-series model for the wind speed at
a single spatial location, and estimate it on in-sample data recorded in
three different wind farm regions in New York state. The data have a
three-hour granularity, but based on applications to financial wind
derivatives contracts, we also consider daily average wind speeds. We
demonstrate that there are large discrepancies in the behaviour of daily
average and three-hourly wind speed records. The validation procedure
based on out-of-sample observations shows that the proposed model is
reliable and can be used for various practical applications, such as
weather prediction, pricing of financial wind contracts and wind-generated
power. Furthermore, we discuss some striking resemblances
with temperature dynamics.
Journal: Journal of Applied Statistics
Pages: 893-909
Issue: 6
Volume: 37
Year: 2010
Keywords: wind speed, time series, ARMA, seasonality, seasonal variance,
X-DOI: 10.1080/02664760902914490
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914490
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:893-909
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Copas
Author-X-Name-First: Andrew
Author-X-Name-Last: Copas
Author-Name: Shaun Seaman
Author-X-Name-First: Shaun
Author-X-Name-Last: Seaman
Title: Bias from the use of generalized estimating equations to analyze incomplete longitudinal binary data
Abstract:
Patient dropout is a common problem in studies that collect repeated
binary measurements. Generalized estimating equations (GEE) are often used
to analyze such data. The dropout mechanism may be plausibly missing at
random (MAR), i.e. unrelated to future measurements given covariates and
past measurements. In this case, various authors have recommended weighted
GEE with weights based on an assumed dropout model, or an imputation
approach, or a doubly robust approach based on weighting and imputation.
These approaches provide asymptotically unbiased inference, provided the
dropout or imputation model (as appropriate) is correctly specified. Other
authors have suggested that, provided the working correlation structure is
correctly specified, GEE using an improved estimator of the correlation
parameters ('modified GEE') show minimal bias. These modified GEE have not
been thoroughly examined. In this paper, we study the asymptotic bias
under MAR dropout of these modified GEE, the standard GEE, and also GEE
using the true correlation. We demonstrate that all three methods are
biased in general. The modified GEE may be preferred to the standard GEE
and are subject to only minimal bias in many MAR scenarios but in others
are substantially biased. Hence, we recommend the modified GEE be used
with caution.
Journal: Journal of Applied Statistics
Pages: 911-922
Issue: 6
Volume: 37
Year: 2010
Keywords: binary data, generalized estimating equations, missing data, missing at random, repeated measures,
X-DOI: 10.1080/02664760902939604
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902939604
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:911-922
Template-Type: ReDIF-Article 1.0
Author-Name: M. Qamarul Islam
Author-X-Name-First: M. Qamarul
Author-X-Name-Last: Islam
Author-Name: Moti Tiku
Author-X-Name-First: Moti
Author-X-Name-Last: Tiku
Title: Multiple linear regression model with stochastic design variables
Abstract:
In a simple multiple linear regression model, the design variables have
traditionally been assumed to be non-stochastic. In numerous real-life
situations, however, they are stochastic and non-normal. Estimators of
parameters applicable to such situations are developed. It is shown that
these estimators are efficient and robust. A real-life example is given.
Journal: Journal of Applied Statistics
Pages: 923-943
Issue: 6
Volume: 37
Year: 2010
Keywords: correlation coefficient, least squares, linear regression, modified maximum likelihood, multivariate distributions, non-normality, random design,
X-DOI: 10.1080/02664760902939612
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902939612
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:923-943
Template-Type: ReDIF-Article 1.0
Author-Name: K. M. Matawie
Author-X-Name-First: K. M.
Author-X-Name-Last: Matawie
Author-Name: A. Assaf
Author-X-Name-First: A.
Author-X-Name-Last: Assaf
Title: Bayesian and DEA efficiency modelling: an application to hospital foodservice operations
Abstract:
The significant impact of health foodservice operations on the total
operational cost of the hospital sector has increased the need to improve
the efficiency of these operations. Although important studies on the
performance of foodservice operations have been published in various
academic journals and industrial reports, the findings and implications
remain simple and limited in scope and methodology. This paper
investigates two popular methodologies in the efficiency literature:
Bayesian “stochastic frontier analysis” (SFA) and
“data envelopment analysis” (DEA). The paper discusses the
statistical advantages of the Bayesian SFA and compares it with an
extended DEA model. The results from a sample of 101 hospital foodservice
operations show the existence of inefficiency in the sample, and indicate
significant differences between the average efficiency generated by the
Bayesian SFA and DEA models. The ranking of efficiency is, however,
statistically independent of the methodologies.
Journal: Journal of Applied Statistics
Pages: 945-953
Issue: 6
Volume: 37
Year: 2010
Keywords: Bayesian SFA, DEA, efficiency, hospitals,
X-DOI: 10.1080/02664760902949058
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902949058
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:945-953
Template-Type: ReDIF-Article 1.0
Author-Name: Chau-Chen Torng
Author-X-Name-First: Chau-Chen
Author-X-Name-Last: Torng
Author-Name: Chun-Chieh Tseng
Author-X-Name-First: Chun-Chieh
Author-X-Name-Last: Tseng
Author-Name: Pei-Hsi Lee
Author-X-Name-First: Pei-Hsi
Author-X-Name-Last: Lee
Title: Non-normality and combined double sampling and variable sampling interval [image omitted] control charts
Abstract:
A combination of the double sampling [image omitted] chart and the
variable sampling interval [image omitted] chart (DSVSI
[image omitted] chart) increases sensitivity in detecting small
shifts. A usual assumption of control charts is that the process
observations are normally distributed. However, this assumption may not
hold for some processes in industry. This paper presents the
performance of the DSVSI [image omitted] chart under non-normality and
compares it with the Shewhart [image omitted] chart and the variable
parameters [image omitted] chart. The comparison shows that the
DSVSI [image omitted] chart has the best performance in detecting
small process mean shifts.
Journal: Journal of Applied Statistics
Pages: 955-967
Issue: 6
Volume: 37
Year: 2010
Keywords: double sampling X chart, variable sampling intervals X chart, variable parameters X chart, non-normality,
X-DOI: 10.1080/02664760902984634
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902984634
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:955-967
Template-Type: ReDIF-Article 1.0
Author-Name: Felix Famoye
Author-X-Name-First: Felix
Author-X-Name-Last: Famoye
Title: On the bivariate negative binomial regression model
Abstract:
In this paper, a new bivariate negative binomial regression (BNBR) model
allowing any type of correlation is defined and studied. The marginal
means of the bivariate model are functions of the explanatory variables.
The parameters of the bivariate regression model are estimated by using
the maximum likelihood method. Some test statistics including
goodness-of-fit are discussed. Two numerical data sets are used to
illustrate the techniques. The BNBR model tends to perform better than the
bivariate Poisson regression model, but compares well with the bivariate
Poisson log-normal regression model.
Journal: Journal of Applied Statistics
Pages: 969-981
Issue: 6
Volume: 37
Year: 2010
Keywords: correlated count data, over-dispersion, goodness-of-fit, estimation,
X-DOI: 10.1080/02664760902984618
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902984618
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:969-981
Template-Type: ReDIF-Article 1.0
Author-Name: Ilmari Juutilainen
Author-X-Name-First: Ilmari
Author-X-Name-Last: Juutilainen
Author-Name: Juha Roning
Author-X-Name-First: Juha
Author-X-Name-Last: Roning
Title: How to compare interpretatively different models for the conditional variance function
Abstract:
This study considers regression-type models with heteroscedastic Gaussian
errors. The conditional variance is assumed to depend on the explanatory
variables via a parametric or non-parametric variance function. The
variance function has usually been selected on the basis of the
log-likelihoods of fitted models. However, log-likelihood is a difficult
quantity to interpret - the practical importance of differences in
log-likelihoods has been difficult to assess. This study overcomes these
difficulties by transforming the difference in log-likelihood into an
easily interpretable difference in the error of predicted deviation. In
addition, methods for testing the statistical significance of the observed
difference in test data log-likelihood are proposed.
Journal: Journal of Applied Statistics
Pages: 983-997
Issue: 6
Volume: 37
Year: 2010
Keywords: conditional variance, variance function, predictive likelihood, log-scoring rule, predictive density, out-of-sample testing, model performance measure,
X-DOI: 10.1080/02664760902984642
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902984642
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:983-997
Template-Type: ReDIF-Article 1.0
Author-Name: Nursel Koyuncu
Author-X-Name-First: Nursel
Author-X-Name-Last: Koyuncu
Author-Name: Cem Kadilar
Author-X-Name-First: Cem
Author-X-Name-Last: Kadilar
Title: On improvement in estimating population mean in stratified random sampling
Abstract:
Gupta and Shabbir have suggested an alternative form of ratio-type
estimators for estimating the population mean. In this paper, we obtain
a corrected version of the mean square error (MSE) of the Gupta-Shabbir
estimator, up to the first order of approximation, and discuss the
optimum case. We extend this estimator to stratified random sampling and
propose general classes of combined and separate estimators. Also, an
empirical study is carried out to show the properties of the proposed
estimators.
Journal: Journal of Applied Statistics
Pages: 999-1013
Issue: 6
Volume: 37
Year: 2010
Keywords: ratio estimator, auxiliary information, mean square error, efficiency, stratified random sampling,
X-DOI: 10.1080/02664760903002675
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903002675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:999-1013
Template-Type: ReDIF-Article 1.0
Author-Name: Young-Ju Kim
Author-X-Name-First: Young-Ju
Author-X-Name-Last: Kim
Title: Semiparametric analysis for case-control studies: a partial smoothing spline approach
Abstract:
Case-control data are often used in medical-related applications, and
most studies have applied parametric logistic regression to analyze such
data. In this study, we investigated a semiparametric model for the
analysis of case-control data by relaxing the linearity assumption of risk
factors by using a partial smoothing spline model. A faster computation
method for the model, extending the lower-dimensional approximation
approach of Gu and Kim developed in penalized likelihood regression, is
considered for application to case-control studies. Simulations were conducted to
evaluate the performance of the method with selected smoothing parameters
and to compare the method with existing methods. The method was applied to
Korean gastric cancer case-control data to estimate the nonparametric
probability function of age and regression parameters for other
categorical risk factors simultaneously. The method could be used in
preliminary studies to identify whether there is a flexible function form
of risk factors in the semiparametric logistic regression analysis
involving a large data set.
Journal: Journal of Applied Statistics
Pages: 1015-1025
Issue: 6
Volume: 37
Year: 2010
Keywords: case-control data, partial smoothing spline, penalized likelihood, smoothing parameter, semiparametric,
X-DOI: 10.1080/02664760903008979
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903008979
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:1015-1025
Template-Type: ReDIF-Article 1.0
Author-Name: Sandra De Iaco
Author-X-Name-First: Sandra
Author-X-Name-Last: De Iaco
Title: Space-time correlation analysis: a comparative study
Abstract:
Space-time correlation modelling is one of the crucial steps of
traditional structural analysis, since space-time models are used for
prediction purposes. A comparative study among some classes of space-time
covariance functions is proposed. The relevance of choosing a suitable
model by taking into account the characteristic behaviour of the models is
demonstrated by using a space-time data set of ozone daily averages, and the
flexibility of the product-sum model is also highlighted through simulated
data sets.
Journal: Journal of Applied Statistics
Pages: 1027-1041
Issue: 6
Volume: 37
Year: 2010
Keywords: space-time random field, space-time covariance, characteristic behaviour, product-sum model, structural analysis,
X-DOI: 10.1080/02664760903019422
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903019422
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:1027-1041
Template-Type: ReDIF-Article 1.0
Author-Name: Miao-Yu Tsai
Author-X-Name-First: Miao-Yu
Author-X-Name-Last: Tsai
Title: Extended Bayesian model averaging for heritability in twin studies
Abstract:
Family studies are often conducted to examine the existence of familial
aggregation. Particularly, twin studies can model separately the genetic
and environmental contribution. Here we estimate the heritability of
quantitative traits via variance components of random-effects in linear
mixed models (LMMs). The motivating example was a myopia twin study
containing complex nesting data structures: twins and siblings in the same
family and observations on both eyes for each individual. Three models are
considered for this nesting structure. Our proposal takes into account the
model uncertainty in both covariates and model structures via an extended
Bayesian model averaging (EBMA) procedure. We estimate the heritability
using EBMA under three suggested model structures. When compared with the
results under the model with the highest posterior model probability, the
EBMA estimate has smaller variation and is slightly conservative.
Simulation studies are conducted to evaluate the performance of
variance-components estimates, as well as the selections of risk factors,
under the correct or incorrect structure. The results indicate that EBMA,
with consideration of uncertainties in both covariates and model
structures, is more robust to model misspecification than the usual
Bayesian model averaging (BMA), which considers only uncertainty in
covariate selection.
Journal: Journal of Applied Statistics
Pages: 1043-1058
Issue: 6
Volume: 37
Year: 2010
Keywords: Bayesian model averaging, boundary Laplace approximation, heritability, linear mixed models, model uncertainty,
X-DOI: 10.1080/02664760903093625
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903093625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:1043-1058
Template-Type: ReDIF-Article 1.0
Author-Name: Mukesh Srivastava
Author-X-Name-First: Mukesh
Author-X-Name-Last: Srivastava
Title: Analysis of variance and covariance: how to choose and construct models for the life sciences
Abstract:
Journal: Journal of Applied Statistics
Pages: 1059-1060
Issue: 6
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902885203
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902885203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:1059-1060
Template-Type: ReDIF-Article 1.0
Author-Name: Juana Sanchez
Author-X-Name-First: Juana
Author-X-Name-Last: Sanchez
Title: International migration in Europe
Abstract:
Journal: Journal of Applied Statistics
Pages: 1061-1061
Issue: 6
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902899733
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902899733
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:1061-1061
Template-Type: ReDIF-Article 1.0
Author-Name: Juana Sanchez
Author-X-Name-First: Juana
Author-X-Name-Last: Sanchez
Title: Introduction to modern time series analysis
Abstract:
Journal: Journal of Applied Statistics
Pages: 1063-1063
Issue: 6
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902899766
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902899766
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:6:p:1063-1063
Template-Type: ReDIF-Article 1.0
Author-Name: Shih-Chou Kao
Author-X-Name-First: Shih-Chou
Author-X-Name-Last: Kao
Title: Normalization of the origin-shifted exponential distribution for control chart construction
Abstract:
This study demonstrates that a location parameter of an exponential
distribution significantly influences normalization of the exponential.
The Kullback-Leibler information number is shown to be an appropriate
index for measuring data normality using a location parameter. Control
charts based on probability limits and transformation are compared for
known and estimated location parameters. The probabilities of type II
error (β-risks) and average run length (ARL) without a location
parameter indicate an ability to detect an out-of-control signal of an
individual chart using a power transformation similar to using probability
limits. The β-risks and ARL of control charts with an estimated
location parameter deviate significantly from their theoretical values
when a small sample size of n≤50 is used. Therefore, without taking
into account the existence of a location parameter, the control charts
result in inaccurate detection of an out-of-control signal regardless of
whether a power or natural logarithmic transformation is used. The effects
of a location parameter should be eliminated before transformation. Two
examples are presented to illustrate these findings.
Journal: Journal of Applied Statistics
Pages: 1067-1087
Issue: 7
Volume: 37
Year: 2010
Keywords: location parameter, exponential distribution, power transformation, natural logarithmic transformation, Kullback-Leibler information number,
X-DOI: 10.1080/02664760802571333
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802571333
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1067-1087
Template-Type: ReDIF-Article 1.0
Author-Name: Dong Han
Author-X-Name-First: Dong
Author-X-Name-Last: Han
Author-Name: Fugee Tsung
Author-X-Name-First: Fugee
Author-X-Name-Last: Tsung
Author-Name: Yanting Li
Author-X-Name-First: Yanting
Author-X-Name-Last: Li
Author-Name: Jinguo Xian
Author-X-Name-First: Jinguo
Author-X-Name-Last: Xian
Title: Detection of changes in a random financial sequence with a stable distribution
Abstract:
Quick detection of unanticipated changes in a financial sequence is a
critical problem for practitioners in the finance industry. Based on
refined logarithmic moment estimators for the four parameters of a stable
distribution, this article presents a stable-distribution-based
multi-CUSUM chart that consists of several CUSUM charts and detects
changes in the four parameters in an independent and identically
distributed random sequence with the stable distribution. Numerical
results of the average run lengths show that the multi-CUSUM chart is
superior (robust and quick) on the whole to a single CUSUM chart in
detecting shifts in the four parameters. A real example that
monitors changes in IBM's stock returns is used to demonstrate the
performance of the proposed method.
Journal: Journal of Applied Statistics
Pages: 1089-1111
Issue: 7
Volume: 37
Year: 2010
Keywords: logarithmic moment estimators, multi-CUSUM charts, detection of changes, random sequence with stable distribution,
X-DOI: 10.1080/02664760902914433
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914433
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1089-1111
Template-Type: ReDIF-Article 1.0
Author-Name: David Scollnik
Author-X-Name-First: David
Author-X-Name-Last: Scollnik
Title: Bayesian statistical inference for start-up demonstration tests with rejection of units upon observing d failures
Abstract:
This paper is concerned with Bayesian estimation and prediction in the
context of start-up demonstration tests in which rejection of a unit is
possible when a pre-specified number of failures is observed prior to
obtaining the number of consecutive successes required for acceptance of
the unit. A method for implementing Bayesian inference on the probability
of success is developed for use when the test result of each start-up is
not reported or even recorded, and only the number of trials until
termination of the testing is available. Some errors in the related
literature on the Bayesian analysis of start-up demonstration tests are
corrected. The method developed in this paper is a Markov chain Monte
Carlo (MCMC) method incorporating data augmentation, and it additionally
enables Bayesian posterior inference on the number of failures given the
number of start-up trials until termination to be made, along with
Bayesian predictive inferences on the number of start-up trials and the
number of failures until termination for any future run of the start-up
demonstration test. An illustrative example is also included.
Journal: Journal of Applied Statistics
Pages: 1113-1121
Issue: 7
Volume: 37
Year: 2010
Keywords: start-up demonstration test, Bayesian estimation, MCMC, data augmentation,
X-DOI: 10.1080/02664760902914516
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1113-1121
Template-Type: ReDIF-Article 1.0
Author-Name: Kosei Fukuda
Author-X-Name-First: Kosei
Author-X-Name-Last: Fukuda
Title: Parameter changes in GARCH model
Abstract:
A new method for detecting parameter changes in the generalized
autoregressive conditional heteroskedasticity (GARCH(1,1)) model is proposed. In the
proposed method, time series observations are divided into several
segments and a GARCH (1,1) model is fitted to each segment. The
goodness-of-fit of the global model composed of these local GARCH (1,1)
models is evaluated using the corresponding information criterion (IC).
The division that minimizes IC defines the best model. Furthermore, since
the simultaneous estimation of all possible models requires huge
computational time, a new time-saving algorithm is proposed. Simulation
results and empirical results both indicate that the proposed method is
useful in analysing financial data.
Journal: Journal of Applied Statistics
Pages: 1123-1135
Issue: 7
Volume: 37
Year: 2010
Keywords: GARCH(1,1), information criterion, model selection, parameter change,
X-DOI: 10.1080/02664760902914524
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1123-1135
Template-Type: ReDIF-Article 1.0
Author-Name: Eva Fiserova
Author-X-Name-First: Eva
Author-X-Name-Last: Fiserova
Author-Name: Karel Hron
Author-X-Name-First: Karel
Author-X-Name-Last: Hron
Title: Total least squares solution for compositional data using linear models
Abstract:
The restrictive properties of compositional data, that is, multivariate
data with positive parts that carry only relative information in their
components, call for special care to be taken while performing standard
statistical methods, for example, regression analysis. Among the special
methods suitable for handling this problem is the total least squares
procedure (TLS, orthogonal regression, regression with errors in
variables, calibration problem), performed after an appropriate log-ratio
transformation. The difficulty or even impossibility of deeper statistical
analysis (confidence regions, hypothesis testing) using the standard TLS
techniques can be overcome by a calibration solution based on linear
regression. This approach can be combined with standard statistical
inference, for example, confidence and prediction regions and bounds,
hypothesis testing, etc., suitable for interpretation of results. Here, we
deal with the simplest TLS problem where we assume a linear relationship
between two errorless measurements of the same object (substance,
quantity). We propose an iterative algorithm for estimating the
calibration line and also give confidence ellipses for the location of
unknown errorless results of measurement. Moreover, illustrative examples
from the fields of geology, geochemistry and medicine are included. It is
shown that the iterative algorithm converges to the same values as those
obtained using the standard TLS techniques. Fitted lines and confidence
regions are presented for both original and transformed compositional
data. The paper contains basic principles of linear models and addresses
many related problems.
Journal: Journal of Applied Statistics
Pages: 1137-1152
Issue: 7
Volume: 37
Year: 2010
Keywords: isometric log-ratio transformation, total least squares, linear regression model, calibration line, estimation, confidence ellipse, multivariate outliers,
X-DOI: 10.1080/02664760902914532
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902914532
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1137-1152
Template-Type: ReDIF-Article 1.0
Author-Name: Elizabeth Heron
Author-X-Name-First: Elizabeth
Author-X-Name-Last: Heron
Author-Name: Cathal Walsh
Author-X-Name-First: Cathal
Author-X-Name-Last: Walsh
Title: Bayesian discrete latent spatial modeling of crack initiation in orthopaedic hip replacement bone cement
Abstract:
In this paper, we propose a spatial model for the initiation of cracks in
the bone cement of hip replacement specimens. The failure of hip
replacements can be attributed mainly to damage accumulation, consisting
of crack initiation and growth, occurring in the cement mantle that
interlocks the hip prosthesis and the femur bone. Since crack initiation
is an important factor in determining the lifetime of a replacement, the
understanding of the reasons for crack initiation is vital in attempting
to prolong the life of the hip replacement. The data consist of crack
location coordinates from five laboratory experimental models, together
with stress measurements. It is known that stress plays a major role in
the initiation of cracks, and it is also known that other unmeasurable
factors such as air bubbles (pores) in the cement mantle are also
influential. We propose an identity-link spatial Poisson regression model
for the counts of cracks in discrete regions of the cement, incorporating
both the measured (stress), and through a latent process, any unmeasured
factors (possibly pores) that may be influential. All analysis is carried
out in a Bayesian framework, allowing for the inclusion of prior
information obtained from engineers, and parameter estimation for the
model is done via Markov chain Monte Carlo techniques.
Journal: Journal of Applied Statistics
Pages: 1153-1171
Issue: 7
Volume: 37
Year: 2010
Keywords: orthopaedic hip replacement, crack initiation, identity-link Poisson regression, latent spatial process, Bayesian analysis, Markov chain Monte Carlo, zero-inflated Poisson,
X-DOI: 10.1080/02664760902939620
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902939620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1153-1171
Template-Type: ReDIF-Article 1.0
Author-Name: Ming-Yuan Leon Li
Author-X-Name-First: Ming-Yuan Leon
Author-X-Name-Last: Li
Author-Name: Chun-Nan Chen
Author-X-Name-First: Chun-Nan
Author-X-Name-Last: Chen
Title: Examining the interrelation dynamics between option and stock markets using the Markov-switching vector error correction model
Abstract:
This study examines the dynamics of the interrelation between option and
stock markets using the Markov-switching vector error correction model.
Specifically, we calculate the implied stock prices from the Black-Scholes
model and establish a statistical framework in which the parameter of the
price discrepancy between the observed and implied prices switches
according to the phase of the volatility regime. The model is tested in
the US S&P 500 stock market. The empirical findings of this work are
consistent with the following notions. First, while option markets react
more quickly to the newest stock-option disequilibrium shocks than spot
markets, as found by earlier studies, we further indicate that the price
adjustment process occurring in option markets is pronounced when the high
variance condition is concerned, but less so during the stable period.
Second, the degree of the co-movement between the observed and implied
prices is significantly reduced during the high variance state. Last, the
lagged price deviation between the observed and implied prices functions
as an indicator of the variance-turning process.
Journal: Journal of Applied Statistics
Pages: 1173-1191
Issue: 7
Volume: 37
Year: 2010
Keywords: option market, Markov-switching, error correction model, volatility,
X-DOI: 10.1080/02664760902939638
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902939638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1173-1191
Template-Type: ReDIF-Article 1.0
Author-Name: Claudia Castro-Kuriss
Author-X-Name-First: Claudia
Author-X-Name-Last: Castro-Kuriss
Author-Name: Diana Kelmansky
Author-X-Name-First: Diana
Author-X-Name-Last: Kelmansky
Author-Name: Victor Leiva
Author-X-Name-First: Victor
Author-X-Name-Last: Leiva
Author-Name: Elena Martinez
Author-X-Name-First: Elena
Author-X-Name-Last: Martinez
Title: On a goodness-of-fit test for normality with unknown parameters and type-II censored data
Abstract:
We propose a new goodness-of-fit test for normal and lognormal
distributions with unknown parameters and type-II censored data. This test
is a generalization of Michael's test for censored samples, which is based
on the empirical distribution and a variance stabilizing transformation.
We estimate the parameters of the model by using maximum likelihood and
Gupta's methods. The quantiles of the distribution of the test statistic
under the null hypothesis are obtained through Monte Carlo simulations.
The power of the proposed test is estimated and compared to that of the
Kolmogorov-Smirnov test also using simulations. The new test is more
powerful than the Kolmogorov-Smirnov test in most of the studied cases.
Acceptance regions for the PP, QQ and Michael's stabilized probability
plots are derived, making it possible to visualize which data contribute
to the decision of rejecting the null hypothesis. Finally, an illustrative
example is presented.
Journal: Journal of Applied Statistics
Pages: 1193-1211
Issue: 7
Volume: 37
Year: 2010
Keywords: Kolmogorov-Smirnov test, maximum likelihood and Gupta's estimators, Monte Carlo simulation, PP, QQ and stabilized probability plots,
X-DOI: 10.1080/02664760902984626
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902984626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1193-1211
Template-Type: ReDIF-Article 1.0
Author-Name: Qin Yu
Author-X-Name-First: Qin
Author-X-Name-Last: Yu
Author-Name: Wan Tang
Author-X-Name-First: Wan
Author-X-Name-Last: Tang
Author-Name: Sue Marcus
Author-X-Name-First: Sue
Author-X-Name-Last: Marcus
Author-Name: Yan Ma
Author-X-Name-First: Yan
Author-X-Name-Last: Ma
Author-Name: Hui Zhang
Author-X-Name-First: Hui
Author-X-Name-Last: Zhang
Author-Name: Xin Tu
Author-X-Name-First: Xin
Author-X-Name-Last: Tu
Title: Modeling sensitivity and specificity with a time-varying reference standard within a longitudinal setting
Abstract:
Diagnostic tests are used in a wide range of behavioral, medical,
psychosocial, and healthcare-related research. Test sensitivity and
specificity are the most popular measures of accuracy for diagnostic
tests. Available methods for analyzing longitudinal study designs assume
fixed gold or reference standards and as such do not apply to studies with
dynamically changing reference standards, which are especially popular in
psychosocial research. In this article, we develop a novel approach to
address missing data and other related issues for modeling sensitivity and
specificity within such a time-varying reference standard setting. The
approach is illustrated with real as well as simulated data.
Journal: Journal of Applied Statistics
Pages: 1213-1230
Issue: 7
Volume: 37
Year: 2010
Keywords: augmented inverse probability weighted (AIPW) estimate, bivariate monotone missing data pattern (BMMDP), diagnostic test, double robust estimate, inverse probability weighted (IPW) estimate, missing data,
X-DOI: 10.1080/02664760902998444
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902998444
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1213-1230
Template-Type: ReDIF-Article 1.0
Author-Name: Demetrios Antzoulakos
Author-X-Name-First: Demetrios
Author-X-Name-Last: Antzoulakos
Author-Name: Athanasios Rakitzis
Author-X-Name-First: Athanasios
Author-X-Name-Last: Rakitzis
Title: Runs rules schemes for monitoring process variability
Abstract:
To increase the sensitivity of Shewhart control charts in detecting small
process shifts, sensitizing rules based on runs and scans are often used in
practice. Shewhart control charts supplemented with runs rules for
detecting shifts in process variance have not received as much attention
as their counterparts for detecting shifts in process mean. In this
article, we examine the performance of simple runs rules schemes for
monitoring increases and/or decreases in process variance based on the
sample standard deviation. We introduce one-sided S charts that overcome
the weakness of high false-alarm rates when runs rules are added to a
Shewhart control chart. The average run length performance and design
aspects of the charts are studied thoroughly. The performance of
associated two-sided control schemes is investigated as well.
Journal: Journal of Applied Statistics
Pages: 1231-1247
Issue: 7
Volume: 37
Year: 2010
Keywords: average run length, markov chain embedding technique, optimization, runs rules, Shewhart control charts, standard deviation,
X-DOI: 10.1080/02664760903002683
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903002683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:7:p:1231-1247
Template-Type: ReDIF-Article 1.0
Author-Name: Mahmoud Mahmoud
Author-X-Name-First: Mahmoud
Author-X-Name-Last: Mahmoud
Author-Name: J. P. Morgan
Author-X-Name-First: J. P.
Author-X-Name-Last: Morgan
Author-Name: William Woodall
Author-X-Name-First: William
Author-X-Name-Last: Woodall
Title: The monitoring of simple linear regression profiles with two observations per sample
Abstract:
We evaluate and compare the performance of Phase II simple linear
regression profile approaches when only two observations are used to
establish each profile. We propose an EWMA control chart based on average
squared deviations from the in-control line, to be used in conjunction
with two EWMA control charts based on the slope and Y-intercept
estimators, to monitor changes in the three regression model parameters,
i.e., the slope, intercept and variance. Simulations establish that the
performance of the proposed technique is generally better than that of
other approaches in detecting parameter shifts.
Journal: Journal of Applied Statistics
Pages: 1249-1263
Issue: 8
Volume: 37
Year: 2010
Keywords: change point, exponentially weighted moving average chart, likelihood ratio, phase II analysis, profile monitoring, statistical process control,
X-DOI: 10.1080/02664760903008995
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903008995
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1249-1263
Template-Type: ReDIF-Article 1.0
Author-Name: Alireza Akbarzadeh Bagheban
Author-X-Name-First: Alireza Akbarzadeh
Author-X-Name-Last: Bagheban
Author-Name: Farid Zayeri
Author-X-Name-First: Farid
Author-X-Name-Last: Zayeri
Title: A generalization of the uniform association model for assessing rater agreement in ordinal scales
Abstract:
Recently, data analysts have paid more attention to the assessment of rater
agreement, especially in the medical sciences. In this context,
statistical indices such as kappa and weighted kappa are the most common
choices. These indices are simple to calculate and interpret, although
they fail to describe the structure of agreement, particularly when the
available outcome has an ordinal nature. In previous decades,
statisticians suggested more efficient statistical tools such as diagonal
parameter, linear by linear association and agreement plus linear by
linear association models for describing the structure of rater agreement.
In these models, the equal interval scores are the common choice for the
levels of the ordinal scales. In this manuscript, we show that choosing
the common equal interval scores does not necessarily lead to the best fit
and propose a modification using a power transformation for the ordinal
scores. We also use two different data sets (IOTN and ovarian masses data)
to illustrate our suggestion more clearly. In addition, we utilize the
category distinguishability concept for interpreting the model parameter
estimates.
Journal: Journal of Applied Statistics
Pages: 1265-1273
Issue: 8
Volume: 37
Year: 2010
Keywords: rater agreement, association model, log-linear model, ordinal scales,
X-DOI: 10.1080/02664760903012666
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903012666
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1265-1273
Template-Type: ReDIF-Article 1.0
Author-Name: A. Bhattacharya
Author-X-Name-First: A.
Author-X-Name-Last: Bhattacharya
Title: Estimating crop yield via Gaussian quadrature
Abstract:
The present study proposes a method to estimate the yield of a crop. The
proposed Gaussian quadrature (GQ) method makes it possible to estimate the
crop yield from a smaller subsample. Identification of plots and
corresponding weights to be assigned to the yield of plots comprising a
subsample is done with the help of information about the full sample on
certain auxiliary variables relating to biometrical characteristics of the
plant. Computational experience reveals that the proposed method leads to
about 78% reduction in sample size with absolute percentage error of 2.7%.
Performance of the proposed method has been compared with that of random
sampling on the basis of the values of average absolute percentage error
and standard deviation of yield estimates obtained from 40 samples of
comparable size. Interestingly, average absolute percentage error as well
as standard deviation is considerably smaller for the GQ estimates than
for the random sample estimates. The proposed method is quite general and
can be applied to other crops as well, provided information on auxiliary
variables relating to yield-contributing biometrical characteristics is
available.
Journal: Journal of Applied Statistics
Pages: 1275-1281
Issue: 8
Volume: 37
Year: 2010
Keywords: estimation of crop yield, crop cutting experiments, Gaussian quadrature, polynomial approximation, moments,
X-DOI: 10.1080/02664760903012674
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903012674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1275-1281
Template-Type: ReDIF-Article 1.0
Author-Name: Ming-Hung Shu
Author-X-Name-First: Ming-Hung
Author-X-Name-Last: Shu
Author-Name: Hsien-Chung Wu
Author-X-Name-First: Hsien-Chung
Author-X-Name-Last: Wu
Title: Monitoring imprecise fraction of nonconforming items using p control charts
Abstract:
Quality characteristics known as attributes cannot be conveniently
represented numerically. Attribute data can generally be regarded as fuzzy
data, which are ubiquitous in manufacturing processes, cannot be measured
precisely, and are often collected by visual inspection. In this paper, we
construct a p control
chart for monitoring the fraction of nonconforming items in the process in
which fuzzy sample data are collected from the manufacturing process. The
resolution identity - a well-known theorem in the fuzzy set theory - is
invoked to construct the control limits of fuzzy-p control charts using
fuzzy data. In order to determine whether the plotted imprecise fraction
of nonconforming items is within the fuzzy lower and upper control limits,
we also propose a ranking method for a set of fuzzy numbers. Using the
fuzzy-p control charts and the proposed acceptability function to classify
the manufacturing process allows the decision-maker to make linguistic
decisions such as rather in control or rather out of control. A practical
example is provided to describe the applicability of the fuzzy set theory
to a conventional p control chart.
Journal: Journal of Applied Statistics
Pages: 1283-1297
Issue: 8
Volume: 37
Year: 2010
Keywords: acceptability function, fuzzy-p control chart, fuzzy number, LR-fuzzy number, resolution identity,
X-DOI: 10.1080/02664760903030205
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903030205
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1283-1297
Template-Type: ReDIF-Article 1.0
Author-Name: Marko Sarstedt
Author-X-Name-First: Marko
Author-X-Name-Last: Sarstedt
Author-Name: Christian Ringle
Author-X-Name-First: Christian
Author-X-Name-Last: Ringle
Title: Treating unobserved heterogeneity in PLS path modeling: a comparison of FIMIX-PLS with different data analysis strategies
Abstract:
In the social science disciplines, the assumption that the data stem from
a single homogeneous population is often unrealistic in empirical
research. When applying a causal modeling approach, such as
partial least squares path modeling, segmentation is a key issue in coping
with the problem of heterogeneity in the estimated cause-effect
relationships. This article uses the novel finite-mixture partial least
squares (FIMIX-PLS) method to uncover unobserved heterogeneity in a
complex path modeling example in the field of marketing. An evaluation of
the results includes a comparison with the outcomes of several data
analysis strategies based on a priori information or k-means cluster
analysis. The results of this article underpin the effectiveness and the
advantageous capabilities of FIMIX-PLS in general PLS path model set-ups
by means of empirical data and formative as well as reflective measurement
models. Consequently, this research substantiates the general
applicability of FIMIX-PLS to path modeling as a standard means of
evaluating PLS results by addressing the problem of unobserved
heterogeneity.
Journal: Journal of Applied Statistics
Pages: 1299-1318
Issue: 8
Volume: 37
Year: 2010
Keywords: partial least square (PLS), path modeling, heterogeneity, latent class, finite mixture, market segmentation, corporate reputation,
X-DOI: 10.1080/02664760903030213
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903030213
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1299-1318
Template-Type: ReDIF-Article 1.0
Author-Name: JrJung Lyu
Author-X-Name-First: JrJung
Author-X-Name-Last: Lyu
Author-Name: MingNan Chen
Author-X-Name-First: MingNan
Author-X-Name-Last: Chen
Title: Measurement of bivariate attributes using a novel statistical model
Abstract:
Reducing process variability is essential to many organisations.
According to the pertinent literature, a quality system that utilizes
quality techniques to reduce process variability is necessary. Quality
programs that respond to measurement precision are central to quality
systems, and the most common method of assessing the precision of a
measurement system is repeatability and reproducibility (R&R). Few studies
have investigated R&R using attribute data. In modern manufacturing
environments, automated manufacturing is becoming increasingly common;
however, a measurement resolution problem exists in automatic inspection
equipment, resulting in clusters and product defects. It is vital to
monitor these bivariate quality characteristics effectively. This study
presents a novel model for calculating R&R for bivariate attribute data.
An alloy manufacturing case is utilized to illustrate the process and
potential of the proposed model. Findings can be employed to evaluate and
improve measurement systems with bivariate attribute data.
Journal: Journal of Applied Statistics
Pages: 1319-1334
Issue: 8
Volume: 37
Year: 2010
Keywords: measurement system analysis, attribute data, repeatability, reproducibility,
X-DOI: 10.1080/02664760903030221
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903030221
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1319-1334
Template-Type: ReDIF-Article 1.0
Author-Name: Fabio Principato
Author-X-Name-First: Fabio
Author-X-Name-Last: Principato
Author-Name: Angela Vullo
Author-X-Name-First: Angela
Author-X-Name-Last: Vullo
Author-Name: Domenica Matranga
Author-X-Name-First: Domenica
Author-X-Name-Last: Matranga
Title: On implementation of the Gibbs sampler for estimating the accuracy of multiple diagnostic tests
Abstract:
Implementation of the Gibbs sampler for estimating the accuracy of
multiple binary diagnostic tests in one population has been investigated.
This method, proposed by Joseph, Gyorkos and Coupal, makes use of a
Bayesian approach and is used in the absence of a gold standard to
estimate the prevalence, the sensitivity and specificity of medical
diagnostic tests. The expressions that allow this method to be implemented
for an arbitrary number of tests are given. By using the convergence
diagnostics procedure of Raftery and Lewis, the relation between the
number of iterations of Gibbs sampling and the precision of the estimated
quantiles of the posterior distributions is derived. An example concerning
a data set of gastro-esophageal reflux disease patients collected to
evaluate the accuracy of the water siphon test compared with 24 h
pH-monitoring, endoscopy and histology tests is presented. The main
message that emerges from our analysis is that implementation of the Gibbs
sampler to estimate the parameters of multiple binary diagnostic tests can
be critical, and convergence diagnostics are advised for this method. The
factors which affect the convergence of the chains to the posterior
distributions and those that influence the precision of their quantiles
are analyzed.
Journal: Journal of Applied Statistics
Pages: 1335-1354
Issue: 8
Volume: 37
Year: 2010
Keywords: Gibbs sampler, Bayesian analysis, convergence diagnostics, diagnostic tests, gastro-esophageal reflux disease,
X-DOI: 10.1080/02664760903030239
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903030239
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1335-1354
Template-Type: ReDIF-Article 1.0
Author-Name: Aparecida Souza
Author-X-Name-First: Aparecida
Author-X-Name-Last: Souza
Author-Name: Helio Migon
Author-X-Name-First: Helio
Author-X-Name-Last: Migon
Title: Bayesian outlier analysis in binary regression
Abstract:
We propose alternative approaches to analyze residuals in binary
regression models based on random effect components. Our preferred model
does not depend upon any tuning parameter, being completely automatic.
Although the focus is mainly on accommodation of outliers, the proposed
methodology is also able to detect them. Our approach consists of
evaluating the posterior distribution of random effects included in the
linear predictor. The evaluation of the posterior distributions of
interest involves cumbersome integration, which is easily dealt with
through stochastic simulation methods. We also discuss different
specifications of prior distributions for the random effects. The
potential of these strategies is compared in a real data set. The main
finding is that the inclusion of extra variability accommodates the
outliers, improving the adjustment of the model substantially, besides
correctly indicating the possible outliers.
Journal: Journal of Applied Statistics
Pages: 1355-1368
Issue: 8
Volume: 37
Year: 2010
Keywords: binary regression models, Bayesian residual, random effect, mixture of normals, Markov chain Monte Carlo,
X-DOI: 10.1080/02664760903031153
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903031153
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1355-1368
Template-Type: ReDIF-Article 1.0
Author-Name: M. Aslam
Author-X-Name-First: M.
Author-X-Name-Last: Aslam
Author-Name: C. -H. Jun
Author-X-Name-First: C. -H.
Author-X-Name-Last: Jun
Author-Name: M. Ahmad
Author-X-Name-First: M.
Author-X-Name-Last: Ahmad
Title: Design of a time-truncated double sampling plan for a general life distribution
Abstract:
A double sampling plan based on truncated life tests is proposed and
designed under a general life distribution. The design parameters such as
sample sizes and acceptance numbers for the first and the second samples
are determined so as to minimize the average sample number subject to
satisfying the consumer's and producer's risks at the respectively
specified quality levels. The resultant tables can be used regardless of
the underlying distribution as long as the reliability requirements are
specified at two risks. In addition, Gamma and Weibull distributions are
particularly considered to report the design parameters according to the
quality levels in terms of the mean ratios.
Journal: Journal of Applied Statistics
Pages: 1369-1379
Issue: 8
Volume: 37
Year: 2010
Keywords: acceptance probability, average sample number, consumer's risk, life distribution, life test, producer's risk,
X-DOI: 10.1080/02664760903030247
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903030247
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1369-1379
Template-Type: ReDIF-Article 1.0
Author-Name: Matei Demetrescu
Author-X-Name-First: Matei
Author-X-Name-Last: Demetrescu
Author-Name: Uwe Hassler
Author-X-Name-First: Uwe
Author-X-Name-Last: Hassler
Author-Name: Adina Tarcolea
Author-X-Name-First: Adina
Author-X-Name-Last: Tarcolea
Title: Testing for stationarity in large panels with cross-dependence, and US evidence on unit labor cost
Abstract:
A new stationarity test for heterogeneous panel data with large
cross-sectional dimension is developed and used to examine a panel with
growth rates of unit labor cost in the USA. The test allows for strong
cross-unit dependence in the form of unbounded long-run correlation
matrices, for which a simple parameterization is proposed. A KPSS-type
distribution results asymptotically when T→∞ is followed by
N→∞. Some evidence against stationarity (short
memory) is found for the examined series.
Journal: Journal of Applied Statistics
Pages: 1381-1397
Issue: 8
Volume: 37
Year: 2010
Keywords: panel KPSS-type test, cross-correlation, inflation dynamics,
X-DOI: 10.1080/02664760903038398
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903038398
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1381-1397
Template-Type: ReDIF-Article 1.0
Author-Name: Emmanouil Androulakis
Author-X-Name-First: Emmanouil
Author-X-Name-Last: Androulakis
Author-Name: Christos Koukouvinos
Author-X-Name-First: Christos
Author-X-Name-Last: Koukouvinos
Author-Name: Kalliopi Mylona
Author-X-Name-First: Kalliopi
Author-X-Name-Last: Mylona
Author-Name: Filia Vonta
Author-X-Name-First: Filia
Author-X-Name-Last: Vonta
Title: A real survival analysis application via variable selection methods for Cox's proportional hazards model
Abstract:
Variable selection is fundamental to high-dimensional statistical
modeling in diverse fields of sciences. In our health study, different
statistical methods are applied to analyze trauma annual data, collected
by 30 General Hospitals in Greece. The dataset consists of 6334
observations and 111 factors that include demographic, transport, and
clinical data. The statistical methods employed in this work are the
nonconcave penalized likelihood methods (Smoothly Clipped Absolute
Deviation, Least Absolute Shrinkage and Selection Operator, and Hard
thresholding), the maximum partial likelihood estimation method, and the
best subset variable selection, adjusted to Cox's proportional hazards
model and used to detect
possible risk factors, which affect the length of stay in a hospital. A
variety of different statistical models are considered, with respect to
the combinations of factors while censored observations are present. A
comparative survey reveals several differences between results and
execution times of each method. Finally, we provide useful biological
justification of our results.
Journal: Journal of Applied Statistics
Pages: 1399-1406
Issue: 8
Volume: 37
Year: 2010
Keywords: variable selection, survival analysis, Cox's proportional hazards model, nonconcave penalized likelihood, high-dimensional dataset, trauma,
X-DOI: 10.1080/02664760903038406
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903038406
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1399-1406
Template-Type: ReDIF-Article 1.0
Author-Name: Terence Tai-Leung Chong
Author-X-Name-First: Terence Tai-Leung
Author-X-Name-Last: Chong
Author-Name: Zimu Li
Author-X-Name-First: Zimu
Author-X-Name-Last: Li
Author-Name: Haiqiang Chen
Author-X-Name-First: Haiqiang
Author-X-Name-Last: Chen
Author-Name: Melvin Hinich
Author-X-Name-First: Melvin
Author-X-Name-Last: Hinich
Title: An investigation of duration dependence in the American stock market cycle
Abstract:
This paper investigates the duration dependence of the US stock market
cycles. A new classification method for bull and bear market regimes based
on the crossing of the market index and its moving average is proposed. We
show evidence of duration dependence in whole cycles. The half cycles,
however, are found to be duration independent. More importantly, we find
that the degree of duration dependence of the US stock market cycles has
dropped after the launch of the NASDAQ index.
Journal: Journal of Applied Statistics
Pages: 1407-1416
Issue: 8
Volume: 37
Year: 2010
Keywords: duration dependence, stock market cycles, moving average,
X-DOI: 10.1080/02664760903039875
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903039875
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1407-1416
Template-Type: ReDIF-Article 1.0
Author-Name: Dougal Hutchison
Author-X-Name-First: Dougal
Author-X-Name-Last: Hutchison
Title: Handbook of multilevel analysis
Abstract:
Journal: Journal of Applied Statistics
Pages: 1417-1418
Issue: 8
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902899741
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902899741
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1417-1418
Template-Type: ReDIF-Article 1.0
Author-Name: Ilia Vonta
Author-X-Name-First: Ilia
Author-X-Name-Last: Vonta
Title: Model selection and model averaging
Abstract:
Journal: Journal of Applied Statistics
Pages: 1419-1420
Issue: 8
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902899774
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902899774
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1419-1420
Template-Type: ReDIF-Article 1.0
Author-Name: John Shade
Author-X-Name-First: John
Author-X-Name-Last: Shade
Title: Software for data analysis
Abstract:
Journal: Journal of Applied Statistics
Pages: 1421-1422
Issue: 8
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902899790
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902899790
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:8:p:1421-1422
Template-Type: ReDIF-Article 1.0
Author-Name: Paresh Kumar Narayan
Author-X-Name-First: Paresh Kumar
Author-X-Name-Last: Narayan
Author-Name: Stephan Popp
Author-X-Name-First: Stephan
Author-X-Name-Last: Popp
Title: A new unit root test with two structural breaks in level and slope at unknown time
Abstract:
In this paper, we propose a new augmented Dickey-Fuller-type test for
unit roots which accounts for two structural breaks. We consider two
different specifications: (a) two breaks in the level of a trending data
series and (b) two breaks in the level and slope of a trending data
series. The breaks, whose time of occurrence is assumed to be unknown, are
modeled as innovational outliers and thus take effect gradually. Using
Monte Carlo simulations, we show that our proposed test has correct size,
stable power, and identifies the structural breaks accurately.
Journal: Journal of Applied Statistics
Pages: 1425-1438
Issue: 9
Volume: 37
Year: 2010
Keywords: unit root test, multiple structural breaks, break date estimation, Monte Carlo simulations, US macroeconomic variables,
X-DOI: 10.1080/02664760903039883
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903039883
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1425-1438
Template-Type: ReDIF-Article 1.0
Author-Name: Yousung Park
Author-X-Name-First: Yousung
Author-X-Name-Last: Park
Author-Name: Bo-Seung Choi
Author-X-Name-First: Bo-Seung
Author-X-Name-Last: Choi
Title: Bayesian analysis for incomplete multi-way contingency tables with nonignorable nonresponse
Abstract:
We propose Bayesian methods with five types of priors to estimate cell
probabilities in an incomplete multi-way contingency table under
nonignorable nonresponse. In this situation, the maximum likelihood (ML)
estimates often fall in the boundary solution, causing the ML estimates to
become unstable. To deal with such a multi-way table, we present an EM
algorithm which generalizes the previous algorithm used for incomplete
one-way tables. Three of the five types of priors were previously
introduced while the other two are newly proposed to reflect different
response patterns between respondents and nonrespondents. Data analysis
and simulation studies show that Bayesian estimates based on the three
previously introduced priors can be worse than the ML estimates regardless
of whether a boundary solution occurs, contrary to previous studies. The
Bayesian estimates from the two new priors are most preferable when a
boundary solution occurs. We
provide an illustrating example using data for a study of the relationship
between a mother's smoking and her newborn's weight.
Journal: Journal of Applied Statistics
Pages: 1439-1453
Issue: 9
Volume: 37
Year: 2010
Keywords: Bayesian analysis, nonignorable nonresponse, priors, boundary solution, EM algorithm,
X-DOI: 10.1080/02664760903046078
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903046078
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1439-1453
Template-Type: ReDIF-Article 1.0
Author-Name: Marc Aerts
Author-X-Name-First: Marc
Author-X-Name-Last: Aerts
Author-Name: Niel Hens
Author-X-Name-First: Niel
Author-X-Name-Last: Hens
Author-Name: Jeffrey Simonoff
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Simonoff
Title: Model selection in regression based on pre-smoothing
Abstract:
In this paper, we investigate the effect of pre-smoothing on model
selection. Christobal et al. [6] showed the beneficial effect of
pre-smoothing on estimating the parameters in a linear regression model.
Here, in a regression setting, we show that smoothing the response data
prior to model selection by Akaike's information criterion can lead to an
improved selection procedure. The bootstrap is used to control the
magnitude of the random error structure in the smoothed data. The effect
of pre-smoothing on model selection is shown in simulations. The method is
illustrated in a variety of settings, including the selection of the best
fractional polynomial in a generalized linear model.
Journal: Journal of Applied Statistics
Pages: 1455-1472
Issue: 9
Volume: 37
Year: 2010
Keywords: Akaike information criterion, fractional polynomial, latent variable model, model selection, pre-smoothing,
X-DOI: 10.1080/02664760903046086
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903046086
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1455-1472
Template-Type: ReDIF-Article 1.0
Author-Name: Panagiotis Mantalos
Author-X-Name-First: Panagiotis
Author-X-Name-Last: Mantalos
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: The effect of spillover on the Granger causality test
Abstract:
In this paper, we investigate the properties of the Granger causality
test in stationary and stable vector autoregressive models under the
presence of spillover effects, that is, causality in variance. The Wald
test and the WW test (the Wald test with White's proposed
heteroskedasticity-consistent covariance matrix estimator imposed) are
analyzed. The investigation is undertaken by using Monte Carlo simulation
in which two different sample sizes and six different kinds of
data-generating processes are used. The results show that the Wald test
over-rejects the null hypothesis both with and without the spillover
effect, and that the over-rejection in the latter case is more severe in
larger samples. The size properties of the WW test are satisfactory when
there is spillover between the variables. Only when there is feedback in
the variance is the size of the WW test slightly affected. The Wald test
is shown to have higher power than the WW test when the errors follow a
GARCH(1,1) process without a spillover effect. When there is a spillover,
the power of both tests deteriorates, which implies that the spillover has
a negative effect on the causality tests.
Journal: Journal of Applied Statistics
Pages: 1473-1486
Issue: 9
Volume: 37
Year: 2010
Keywords: causality in variance, GARCH, Granger causality, volatility spillover,
X-DOI: 10.1080/02664760903046094
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903046094
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1473-1486
Template-Type: ReDIF-Article 1.0
Author-Name: Shiquan Ren
Author-X-Name-First: Shiquan
Author-X-Name-Last: Ren
Author-Name: Hong Lai
Author-X-Name-First: Hong
Author-X-Name-Last: Lai
Author-Name: Wenjing Tong
Author-X-Name-First: Wenjing
Author-X-Name-Last: Tong
Author-Name: Mostafa Aminzadeh
Author-X-Name-First: Mostafa
Author-X-Name-Last: Aminzadeh
Author-Name: Xuezhang Hou
Author-X-Name-First: Xuezhang
Author-X-Name-Last: Hou
Author-Name: Shenghan Lai
Author-X-Name-First: Shenghan
Author-X-Name-Last: Lai
Title: Nonparametric bootstrapping for hierarchical data
Abstract:
Nonparametric bootstrapping for hierarchical data is relatively
underdeveloped and not straightforward: certainly it does not make sense
to use simple nonparametric resampling, which treats all observations as
independent. We provide several resampling strategies for hierarchical
data and prove that nonparametric bootstrapping on the highest level
(randomly sampling the highest-level units with replacement, then randomly
sampling all lower-level units without replacement within each selected
highest-level unit) is better than bootstrapping on lower levels; we also
analyze real data and perform simulation studies.
Journal: Journal of Applied Statistics
Pages: 1487-1498
Issue: 9
Volume: 37
Year: 2010
Keywords: random effects model, hierarchical data, nonparametric bootstrapping, resampling schemes, unbalanced data,
X-DOI: 10.1080/02664760903046102
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903046102
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1487-1498
Template-Type: ReDIF-Article 1.0
Author-Name: Firoozeh Haghighi
Author-X-Name-First: Firoozeh
Author-X-Name-Last: Haghighi
Author-Name: Mikhail Nikulin
Author-X-Name-First: Mikhail
Author-X-Name-Last: Nikulin
Title: On the linear degradation model with multiple failure modes
Abstract:
The purpose of this work is to develop statistical methods for using
degradation measures to estimate a survival function for a linear
degradation model. In this paper, we review existing methods and then
describe a parametric approach. We focus on estimating the survival
function. A simulation study is conducted to evaluate the performance of
the estimating method and the method is illustrated using real data.
Journal: Journal of Applied Statistics
Pages: 1499-1507
Issue: 9
Volume: 37
Year: 2010
Keywords: degradation model, failure time, intensity function, survival function, parametric method,
X-DOI: 10.1080/02664760903055434
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903055434
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1499-1507
Template-Type: ReDIF-Article 1.0
Author-Name: Feng-Chang Xie
Author-X-Name-First: Feng-Chang
Author-X-Name-Last: Xie
Author-Name: Jin-Guan Lin
Author-X-Name-First: Jin-Guan
Author-X-Name-Last: Lin
Author-Name: Bo-Cheng Wei
Author-X-Name-First: Bo-Cheng
Author-X-Name-Last: Wei
Title: Testing for varying zero-inflation and dispersion in generalized Poisson regression models
Abstract:
Homogeneity of dispersion parameters and zero-inflation parameters is a
standard assumption in zero-inflated generalized Poisson regression
(ZIGPR) models. However, this assumption may not be appropriate in some
situations. This work develops a score test for varying dispersion and/or
zero-inflation parameter in the ZIGPR models, and corresponding test
statistics are obtained. Two numerical examples are given to illustrate
our methodology, and the properties of score test statistics are
investigated through Monte Carlo simulations.
Journal: Journal of Applied Statistics
Pages: 1509-1522
Issue: 9
Volume: 37
Year: 2010
Keywords: generalized Poisson regression models, zero-inflation, dispersion, score test, test of homogeneity, simulation study,
X-DOI: 10.1080/02664760903055442
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903055442
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1509-1522
Template-Type: ReDIF-Article 1.0
Author-Name: Claire Weston
Author-X-Name-First: Claire
Author-X-Name-Last: Weston
Author-Name: John Thompson
Author-X-Name-First: John
Author-X-Name-Last: Thompson
Title: Modeling survival in childhood cancer studies using two-stage non-mixture cure models
Abstract:
Non-mixture cure models (NMCMs) are derived from a simplified
representation of the biological process that takes place after treatment
for cancer. These models are intended to represent the time from the end
of treatment to the time of first recurrence of cancer in studies when a
proportion of those treated are completely cured. However, for many
studies overall survival is also of interest. A two-stage NMCM that
estimates the overall survival from a combination of two cure models, one
from end of treatment to first recurrence and one from first recurrence to
death, is proposed. The model is applied to two studies of Ewing's tumor
in young patients. Caution needs to be exercised when extrapolating from
cure models fitted to short follow-up times, but these data and associated
simulations show how, when follow-up is limited, a two-stage model can
give more stable estimates of the cure fraction than a one-stage model
applied directly to overall survival.
Journal: Journal of Applied Statistics
Pages: 1523-1535
Issue: 9
Volume: 37
Year: 2010
Keywords: non-mixture cure model, parametric survival, paediatric cancer,
X-DOI: 10.1080/02664760903055459
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903055459
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1523-1535
Template-Type: ReDIF-Article 1.0
Author-Name: Emilio Gomez-Deniz
Author-X-Name-First: Emilio
Author-X-Name-Last: Gomez-Deniz
Author-Name: Enrique Calderin-Ojeda
Author-X-Name-First: Enrique
Author-X-Name-Last: Calderin-Ojeda
Title: A study of Bayesian local robustness with applications in actuarial statistics
Abstract:
Local or infinitesimal Bayesian robustness is a powerful tool to study
the sensitivity of posterior magnitudes, which cannot be expressed in a
simple manner. For these expressions, the global Bayesian robustness
methodology does not seem adequate since the practitioner cannot avoid
using inappropriate classes of prior distributions in order to make the
model mathematically tractable. This situation occurs, for example, when
we compute some types of premiums in actuarial statistics in order to fix
the premium to be charged for an insurance policy. In this paper, simple
analytical expressions that allow us to study the sensitivity of the
premiums usually used in automobile insurance are provided by means of the
local Bayesian robustness methodology. Some examples are
examined by using real automobile claim insurance data.
Journal: Journal of Applied Statistics
Pages: 1537-1546
Issue: 9
Volume: 37
Year: 2010
Keywords: posterior, local robustness, norm, premium,
X-DOI: 10.1080/02664760903082156
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903082156
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1537-1546
Template-Type: ReDIF-Article 1.0
Author-Name: D. J. Best
Author-X-Name-First: D. J.
Author-X-Name-Last: Best
Author-Name: J. C. W. Rayner
Author-X-Name-First: J. C. W.
Author-X-Name-Last: Rayner
Author-Name: O. Thas
Author-X-Name-First: O.
Author-X-Name-Last: Thas
Title: Four tests of fit for the beta-binomial distribution
Abstract:
Tests based on the Anderson-Darling statistic, a third moment statistic
and the classical Pearson-Fisher X2 statistic, along with its third-order
component, are considered. A small critical value and power study is
given. Some examples illustrate important applications.
Journal: Journal of Applied Statistics
Pages: 1547-1554
Issue: 9
Volume: 37
Year: 2010
Keywords: Anderson-Darling statistic, multinomial distribution, Pearson's X2, third moment statistic, third-order component,
X-DOI: 10.1080/02664760903089664
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903089664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1547-1554
Template-Type: ReDIF-Article 1.0
Author-Name: Aristidis Nikoloulopoulos
Author-X-Name-First: Aristidis
Author-X-Name-Last: Nikoloulopoulos
Author-Name: Dimitris Karlis
Author-X-Name-First: Dimitris
Author-X-Name-Last: Karlis
Title: Regression in a copula model for bivariate count data
Abstract:
In many cases of modeling bivariate count data, the interest lies in
studying the association rather than the marginal properties. We form a
flexible copula-based regression model where covariates are used not only
for the marginal but also for the copula parameters. Since the copula
measures the association, the use of covariates in its parameters allows
for direct modeling of the association. A real-data application related
to transaction market basket data is used. Our goal is to refine and
understand whether the association between the numbers of purchases of
certain product categories depends on particular demographic
characteristics of the customers.
Such information is important for decision making for marketing purposes.
Journal: Journal of Applied Statistics
Pages: 1555-1568
Issue: 9
Volume: 37
Year: 2010
Keywords: dependence modeling, Kendall's tau, covariate function, negative binomial distribution,
X-DOI: 10.1080/02664760903093591
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903093591
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1555-1568
Template-Type: ReDIF-Article 1.0
Author-Name: Claudio Agostinelli
Author-X-Name-First: Claudio
Author-X-Name-Last: Agostinelli
Author-Name: Luisa Bisaglia
Author-X-Name-First: Luisa
Author-X-Name-Last: Bisaglia
Title: ARFIMA processes and outliers: a weighted likelihood approach
Abstract:
In this paper, we consider the problem of robust estimation of the
fractional parameter, d, in long memory autoregressive fractionally
integrated moving average processes, when two types of outliers, i.e.
additive and innovation, are taken into account without knowing their
number, position or intensity. The proposed method is a weighted
likelihood estimation (WLE) approach, for which the necessary definitions
and algorithm are given. By an extensive Monte Carlo simulation study, we
compare the performance of the WLE method with the performance of both the
approximated maximum likelihood estimation (MLE) and the robust
M-estimator proposed by Beran (Statistics for Long-Memory Processes,
Chapman & Hall, London, 1994). We find that robustness against the two
types of outliers considered can be achieved without loss of efficiency.
Moreover, as a byproduct of the procedure, we can classify suspicious
observations into different kinds of outliers. Finally, we apply the
proposed methodology to the Nile River annual minima time series.
Journal: Journal of Applied Statistics
Pages: 1569-1584
Issue: 9
Volume: 37
Year: 2010
Keywords: ARFIMA processes, outliers, robust estimation, weighted likelihood,
X-DOI: 10.1080/02664760903093609
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903093609
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1569-1584
Template-Type: ReDIF-Article 1.0
Author-Name: Xavier Puig
Author-X-Name-First: Xavier
Author-X-Name-Last: Puig
Author-Name: Josep Ginebra
Author-X-Name-First: Josep
Author-X-Name-Last: Ginebra
Author-Name: Marti Font
Author-X-Name-First: Marti
Author-X-Name-Last: Font
Title: The Sichel model and the mixing and truncation order
Abstract:
The analysis of word frequency count data can be very useful in
authorship attribution problems. Zero-truncated generalized inverse
Gaussian-Poisson mixture models are very helpful in the analysis of these
kinds of data because their model-mixing density estimates can be used as
estimates of the density of the word frequencies of the vocabulary. It is
found that this model provides excellent fits for the word frequency
counts of very long texts, where the truncated inverse Gaussian-Poisson
special case fails because it does not allow for the large degree of
over-dispersion in the data. The role played by the three parameters of
this truncated GIG-Poisson model is also explored. Our second goal is to
compare the fit of the truncated GIG-Poisson mixture model with the fit of
the model that results from switching the order of the mixing and
truncation stages. A heuristic interpretation of the mixing distribution
estimates obtained under this alternative GIG-truncated Poisson mixture
model is also provided.
Journal: Journal of Applied Statistics
Pages: 1585-1603
Issue: 9
Volume: 37
Year: 2010
Keywords: categorical data, generalized inverse Gaussian, mixture model, Poisson mixture, stylometry, truncated model, truncated mixture, word frequency,
X-DOI: 10.1080/02664760903093617
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903093617
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:9:p:1585-1603
Template-Type: ReDIF-Article 1.0
Author-Name: A. A. M. Nurunnabi
Author-X-Name-First: A. A. M.
Author-X-Name-Last: Nurunnabi
Author-Name: A.H.M. Rahmatullah Imon
Author-X-Name-First: A.H.M.
Author-X-Name-Last: Rahmatullah Imon
Author-Name: M. Nasser
Author-X-Name-First: M.
Author-X-Name-Last: Nasser
Title: Identification of multiple influential observations in logistic regression
Abstract:
The identification of influential observations in logistic regression has
drawn a great deal of attention in recent years. Most of the available
techniques, such as Cook's distance and the difference of fits (DFFITS),
are based on single-case deletion. However, there is evidence that these
techniques
suffer from masking and swamping problems and consequently fail to detect
multiple influential observations. In this paper, we have developed a new
measure for the identification of multiple influential observations in
logistic regression based on a generalized version of DFFITS. The
advantage of the proposed method is then investigated through several
well-referred data sets and a simulation study.
Journal: Journal of Applied Statistics
Pages: 1605-1624
Issue: 10
Volume: 37
Year: 2010
Keywords: generalized DFFITS, generalized Studentized Pearson residual, generalized weight, high leverage point, influential observation, masking, outlier, swamping,
X-DOI: 10.1080/02664760903104307
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903104307
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1605-1624
Template-Type: ReDIF-Article 1.0
Author-Name: Nandini Kannan
Author-X-Name-First: Nandini
Author-X-Name-Last: Kannan
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Author-Name: P. Nair
Author-X-Name-First: P.
Author-X-Name-Last: Nair
Author-Name: R. C. Tripathi
Author-X-Name-First: R. C.
Author-X-Name-Last: Tripathi
Title: The generalized exponential cure rate model with covariates
Abstract:
In this article, we consider a parametric survival model that is
appropriate when the population of interest contains long-term survivors
or immunes. The model referred to as the cure rate model was introduced by
Boag [1] in terms of a mixture model that included a component representing
the proportion of immunes and a distribution representing the life times
of the susceptible population. We propose a cure rate model based on the
generalized exponential distribution that incorporates the effects of risk
factors or covariates on the probability of an individual being a
long-time survivor. Maximum likelihood estimators of the model parameters
are obtained using the expectation-maximisation (EM) algorithm. A
graphical method is also provided for assessing the goodness-of-fit of the
model. We present an example to illustrate the fit of this model to data
that examines the effects of different risk factors on relapse time for
drug addicts.
Journal: Journal of Applied Statistics
Pages: 1625-1636
Issue: 10
Volume: 37
Year: 2010
Keywords: cure rate, long-term survivor, generalized exponential distribution, EM algorithm, goodness-of-fit,
X-DOI: 10.1080/02664760903117739
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903117739
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1625-1636
Template-Type: ReDIF-Article 1.0
Author-Name: Ralf Ostermark
Author-X-Name-First: Ralf
Author-X-Name-Last: Ostermark
Title: Concurrent processing of heteroskedastic vector-valued mixture density models
Abstract:
We introduce a combined two-stage least-squares (2SLS)-expectation
maximization (EM) algorithm for estimating vector-valued autoregressive
conditional heteroskedasticity models with standardized errors generated
by Gaussian mixtures. The procedure incorporates the identification of the
parametric settings as well as the estimation of the model parameters. Our
approach does not require a priori knowledge of the Gaussian densities.
The parametric settings of the 2SLS_EM algorithm are determined by the
genetic hybrid algorithm (GHA). We test the GHA-driven 2SLS_EM algorithm
on some simulated cases and on international asset pricing data. The
statistical properties of the estimated models and the derived mixture
densities indicate good performance of the algorithm. We conduct tests on
a massively parallel processor supercomputer to cope with situations
involving numerous mixtures. We show that the algorithm is scalable.
Journal: Journal of Applied Statistics
Pages: 1637-1659
Issue: 10
Volume: 37
Year: 2010
Keywords: vector-valued ARCH processes, mixture densities, geno-mathematical monitoring, parallel programming, high-performance computing,
X-DOI: 10.1080/02664760903121236
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903121236
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1637-1659
Template-Type: ReDIF-Article 1.0
Author-Name: D. Todem
Author-X-Name-First: D.
Author-X-Name-Last: Todem
Author-Name: Y. Zhang
Author-X-Name-First: Y.
Author-X-Name-Last: Zhang
Author-Name: A. Ismail
Author-X-Name-First: A.
Author-X-Name-Last: Ismail
Author-Name: W. Sohn
Author-X-Name-First: W.
Author-X-Name-Last: Sohn
Title: Random effects regression models for count data with excess zeros in caries research
Abstract:
We extend the family of Poisson and negative binomial models to derive
the joint distribution of clustered count outcomes with extra zeros. Two
random effects models are formulated. The first model assumes a shared
random effects term between the conditional probability of perfect zeros
and the conditional mean of the imperfect state. The second formulation
relaxes the shared random effects assumption by relating the conditional
probability of perfect zeros and the conditional mean of the imperfect
state to two different but correlated random effects variables. Under the
conditional independence and the missing at random assumptions, a
direct optimization of the marginal likelihood and an EM algorithm are
proposed to fit the models. The proposed models are fitted to
dental caries counts of children under the age of six in the city of
Detroit.
Journal: Journal of Applied Statistics
Pages: 1661-1679
Issue: 10
Volume: 37
Year: 2010
Keywords: adaptive Gaussian quadrature, dental caries scores, clustered data, factor loadings, inflated zeros, negative binomial/Poisson distribution, random effects,
X-DOI: 10.1080/02664760903127605
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903127605
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1661-1679
Template-Type: ReDIF-Article 1.0
Author-Name: Yi-Ting Hwang
Author-X-Name-First: Yi-Ting
Author-X-Name-Last: Hwang
Author-Name: Jia-Jung Lai
Author-X-Name-First: Jia-Jung
Author-X-Name-Last: Lai
Author-Name: Shyh-Tyan Ou
Author-X-Name-First: Shyh-Tyan
Author-X-Name-Last: Ou
Title: Evaluations of FWER-controlling methods in multiple hypothesis testing
Abstract:
Simultaneously testing a family of n null hypotheses can arise in many
applications. A common problem in multiple hypothesis testing is to
control the Type-I error. The probability of at least one false rejection,
referred to as the familywise error rate (FWER), is one of the earliest
error rate measures. Many FWER-controlling procedures have been proposed.
The ability to control the FWER and achieve higher power is often used to
evaluate the performance of a controlling procedure. However, when testing
multiple hypotheses, the FWER and power alone are not sufficient for
evaluating a controlling procedure's performance. Furthermore, the
performance of a
controlling procedure is also governed by experimental parameters such as
the number of hypotheses, sample size, the number of true null hypotheses
and data structure. This paper evaluates, under various experimental
settings, the performance of some FWER-controlling procedures in terms of
five indices, the FWER, the false discovery rate, the false non-discovery
rate, the sensitivity and the specificity. The results can provide
guidance on how to select an appropriate FWER-controlling procedure to
meet a study's objective.
Journal: Journal of Applied Statistics
Pages: 1681-1694
Issue: 10
Volume: 37
Year: 2010
Keywords: Bonferroni's method, false discovery rate, false non-discovery rate, familywise error rate, multiple hypothesis testing, sensitivity, specificity,
X-DOI: 10.1080/02664760903136960
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903136960
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1681-1694
Template-Type: ReDIF-Article 1.0
Author-Name: Jesper Ryden
Author-X-Name-First: Jesper
Author-X-Name-Last: Ryden
Author-Name: Sven Erick Alm
Author-X-Name-First: Sven Erick
Author-X-Name-Last: Alm
Title: The effect of interaction and rounding error in two-way ANOVA: example of impact on testing for normality
Abstract:
A key issue in various applications of analysis of variance (ANOVA) is
testing for the interaction and the interpretation of resulting ANOVA
tables. In this note it is demonstrated that for a two-way ANOVA, whether
interactions are incorporated or not may have a dramatic influence when
considering the usual statistical tests for normality of residuals. The
effect of numerical rounding is also discussed.
Journal: Journal of Applied Statistics
Pages: 1695-1701
Issue: 10
Volume: 37
Year: 2010
Keywords: analysis of variance, rounding error, Shapiro-Wilk test, factorial experiment, interaction,
X-DOI: 10.1080/02664760903143925
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903143925
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1695-1701
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas Jones
Author-X-Name-First: Douglas
Author-X-Name-Last: Jones
Author-Name: Francis Mendez Mediavilla
Author-X-Name-First: Francis
Author-X-Name-Last: Mendez Mediavilla
Title: A Bayesian method for query approximation
Abstract:
This study presents statistical techniques to obtain local approximate
query answers for aggregate multivariate materialized views, thus
eliminating the need for repetitive scanning of the source data. In widely
distributed management information systems, detailed data do not
necessarily reside in the same physical location as the decision-maker,
thus requiring the source data to be scanned as queries demand.
Decision-making, business intelligence and data analysis could involve
multiple data sources, data diversity, aggregates and large amounts of
data. Management often confronts delays in information acquisition from
remote sites. Management decisions usually involve analyses that require
the most precise summary data available. These summaries are readily
available from data warehouses and can be used to estimate or approximate
data in exchange for a quicker response. An approach to supporting
aggregate materialized view management is proposed that reconstructs data
sets locally using posterior parameter estimates based on sufficient
statistics in a log-linear model with a multinomial likelihood.
Journal: Journal of Applied Statistics
Pages: 1703-1715
Issue: 10
Volume: 37
Year: 2010
Keywords: query approximation, data reduction, materialized view management, BIPF,
X-DOI: 10.1080/02664760903148791
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903148791
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1703-1715
Template-Type: ReDIF-Article 1.0
Author-Name: Jing-Er Chiu
Author-X-Name-First: Jing-Er
Author-X-Name-Last: Chiu
Author-Name: Tsen-I Kuo
Author-X-Name-First: Tsen-I
Author-X-Name-Last: Kuo
Title: Control charts for fraction nonconforming in a bivariate binomial process
Abstract:
Many multivariate quality control techniques are used for multivariate
variable processes, but few work for multivariate attribute processes. To
monitor multivariate attributes, controlling the false alarms (type I
errors) and considering the correlation between attributes are two
important issues. By taking into account these two issues, a new control
chart is presented to monitor a bivariate binomial process. An example
illustrates the proposed method. To evaluate the performance of the
proposed method, a simulation study is conducted to compare the results
with those using both the multivariate np chart and skewness reduction
approaches. The results show that the correlation is taken into account in
the designed chart and the overall false alarm is controlled at the
nominal value. Moreover, the process shift can be quickly detected and the
variable that is responsible for a signal can be determined.
Journal: Journal of Applied Statistics
Pages: 1717-1728
Issue: 10
Volume: 37
Year: 2010
Keywords: multi-attribute control chart, control of bivariate binomial processes, average run length, multivariate process monitoring,
X-DOI: 10.1080/02664760903150698
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903150698
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1717-1728
Template-Type: ReDIF-Article 1.0
Author-Name: Assam Pryseley
Author-X-Name-First: Assam
Author-X-Name-Last: Pryseley
Author-Name: Koen Mintiens
Author-X-Name-First: Koen
Author-X-Name-Last: Mintiens
Author-Name: Katia Knapen
Author-X-Name-First: Katia
Author-X-Name-Last: Knapen
Author-Name: Yves Van der Stede
Author-X-Name-First: Yves
Author-X-Name-Last: Van der Stede
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Title: Estimating precision, repeatability, and reproducibility from Gaussian and non-Gaussian data: a mixed models approach
Abstract:
Quality control relies heavily on the use of formal assessment metrics.
In this paper, for the context of veterinary epidemiology, we review the
main proposals, precision, repeatability, reproducibility, and
intermediate precision, in agreement with ISO (International Organization
for Standardization) practice, generalize these by placing them within the
linear mixed model framework, which we then extend to the generalized
linear mixed model setting, so that both Gaussian as well as non-Gaussian
data can be employed. Similarities and differences are discussed between
the classical ANOVA (analysis of variance) approach and the proposed mixed
model settings, on the one hand, and between the Gaussian and non-Gaussian
cases, on the other hand. The new proposals are applied to five studies in
three diseases: Aujeszky's disease, enzootic bovine leucosis (EBL) and
bovine brucellosis. The mixed-models proposals are also discussed in the
light of their computational requirements.
Journal: Journal of Applied Statistics
Pages: 1729-1747
Issue: 10
Volume: 37
Year: 2010
Keywords: accuracy, analysis of variance, Aujeszky's disease, bias, bovine brucellosis, enzootic bovine leucosis, generalized linear mixed models, linear mixed models, quality control,
X-DOI: 10.1080/02664760903150706
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903150706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1729-1747
Template-Type: ReDIF-Article 1.0
Author-Name: Jie Wang
Author-X-Name-First: Jie
Author-X-Name-Last: Wang
Author-Name: James Stamey
Author-X-Name-First: James
Author-X-Name-Last: Stamey
Title: A Bayesian algorithm for sample size determination for equivalence and non-inferiority test
Abstract:
Bayesian sample size estimation for equivalence and non-inferiority tests
for diagnostic methods is considered. The goal of the study is to test
whether a new screening test of interest is equivalent to, or not inferior
to, the reference test, which may or may not be a gold standard. Sample
sizes are chosen by the model performance criteria of average posterior
variance, length and coverage probability. In the absence of a gold
standard, sample sizes are evaluated by the ratio of marginal
probabilities of the two screening tests; whereas in the presence of a
gold standard, sample sizes are evaluated by the measures of sensitivity
specificity.
Journal: Journal of Applied Statistics
Pages: 1749-1759
Issue: 10
Volume: 37
Year: 2010
Keywords: average length criterion, average posterior variance criteria, average coverage criterion, equivalence test, non-inferiority test, sample size determination,
X-DOI: 10.1080/02664760903150714
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903150714
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1749-1759
Template-Type: ReDIF-Article 1.0
Author-Name: Y. Samimi
Author-X-Name-First: Y.
Author-X-Name-Last: Samimi
Author-Name: A. Aghaie
Author-X-Name-First: A.
Author-X-Name-Last: Aghaie
Title: Monitoring heterogeneous serially correlated usage behavior in subscription-based services
Abstract:
Effective monitoring of usage behavior necessitates applying accurate
stochastic models to represent customers' heterogeneous, time-dependent
behavior. In this research, it is assumed that the sequence of customer
visits over a subscription period occurs according to a Poisson process,
while usage at each purchase occasion follows a first-order autoregressive
Bernoulli model. The autocorrelated observations are derived from a
describe heterogeneous behavior across customers. In order to monitor the
number of visits as well as the fraction of visits eventuated in a
purchase, control statistics are defined on the basis of the generalized
likelihood ratio (GLR) test. Furthermore, in the case of the marginal
logistic model for dependent observations, a chi-square test statistic
based on the asymptotic multivariate normal distribution of
quasi-likelihood estimates is employed. Performances of the monitoring
schemes are compared with an illustrative case provided by simulation.
Results indicate that the adjusted Shewhart c chart resembles the deviance
residual control chart for monitoring the frequency of customer visits. On
the other hand, the GLR statistic based on the conditional logistic
regression is more powerful in detecting unnatural usage behavior when
compared with the chi-square control statistic based on the marginal
logistic model.
Journal: Journal of Applied Statistics
Pages: 1761-1777
Issue: 10
Volume: 37
Year: 2010
Keywords: generalized linear models, autocorrelated Bernoulli process, quasi-likelihood, longitudinal data, customer usage behavior,
X-DOI: 10.1080/02664760903159103
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903159103
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1761-1777
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Modeling with Data
Abstract:
Journal: Journal of Applied Statistics
Pages: 1779-1779
Issue: 10
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902919721
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902919721
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1779-1779
Template-Type: ReDIF-Article 1.0
Author-Name: Mukesh Srivastava
Author-X-Name-First: Mukesh
Author-X-Name-Last: Srivastava
Title: Analysis of Variance and Covariance
Abstract:
Journal: Journal of Applied Statistics
Pages: 1781-1782
Issue: 10
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902919747
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902919747
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1781-1782
Template-Type: ReDIF-Article 1.0
Author-Name: Jennifer Klapper
Author-X-Name-First: Jennifer
Author-X-Name-Last: Klapper
Title: Discrete Fourier Analysis and Wavelets
Abstract:
Journal: Journal of Applied Statistics
Pages: 1783-1784
Issue: 10
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902919762
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902919762
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:10:p:1783-1784
Template-Type: ReDIF-Article 1.0
Author-Name: Stan Lipovetsky
Author-X-Name-First: Stan
Author-X-Name-Last: Lipovetsky
Title: Double logistic curve in regression modeling
Abstract:
The logistic sigmoid curve is widely used in nonlinear regression and in
binary response modeling. Some problems exhibit a double sigmoid behavior,
which consists of a first increase to an early saturation at an
intermediate level, followed by a second sigmoid reaching an eventual
plateau of saturation. A double sigmoid behavior is usually
achieved using additive or multiplicative combinations of logit and more
complicated functions with numerous parameters. In this work, double
sigmoid functions are constructed as logistic ones with a sign defining
the point of inflection and with an additional powering parameter. The
elaborated models describe rather complicated double saturation behavior
via only four or five parameters which can be efficiently estimated by
nonlinear optimization techniques. Theoretical features and practical
applications of the models are discussed.
Journal: Journal of Applied Statistics
Pages: 1785-1793
Issue: 11
Volume: 37
Year: 2010
Keywords: logistic function, double sigmoid function, two levels of saturation,
X-DOI: 10.1080/02664760903093633
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903093633
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1785-1793
Template-Type: ReDIF-Article 1.0
Author-Name: M. A. Graham
Author-X-Name-First: M. A.
Author-X-Name-Last: Graham
Author-Name: S. W. Human
Author-X-Name-First: S. W.
Author-X-Name-Last: Human
Author-Name: S. Chakraborti
Author-X-Name-First: S.
Author-X-Name-Last: Chakraborti
Title: A Phase I nonparametric Shewhart-type control chart based on the median
Abstract:
A nonparametric Shewhart-type control chart is proposed for monitoring
the location of a continuous variable in a Phase I process control
setting. The chart is based on the pooled median of the available Phase I
samples and the charting statistics are the counts (number of
observations) in each sample that are less than the pooled median. An
exact expression for the false alarm probability (FAP) is given in terms
of the multivariate hypergeometric distribution and this is used to
provide tables for the control limits for a specified nominal FAP value
(of 0.01, 0.05 and 0.10, respectively) and for some values of the sample
size (n) and the number of Phase I samples (m). Some approximations are
discussed in terms of the univariate hypergeometric and the normal
distributions. A simulation study shows that the proposed chart performs
as well as, and in some cases better than, an existing Shewhart-type chart
based on the normal distribution. Numerical examples are given to
demonstrate the implementation of the new chart.
Journal: Journal of Applied Statistics
Pages: 1795-1813
Issue: 11
Volume: 37
Year: 2010
Keywords: false alarm rate, false alarm probability, retrospective, prospective, distribution-free, multivariate hypergeometric,
X-DOI: 10.1080/02664760903164913
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903164913
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1795-1813
Template-Type: ReDIF-Article 1.0
Author-Name: Silvia Figini
Author-X-Name-First: Silvia
Author-X-Name-Last: Figini
Author-Name: Paolo Giudici
Author-X-Name-First: Paolo
Author-X-Name-Last: Giudici
Author-Name: Pierpaolo Uberti
Author-X-Name-First: Pierpaolo
Author-X-Name-Last: Uberti
Title: A threshold based approach to merge data in financial risk management
Abstract:
According to the last proposals by the Basel Committee, banks are allowed
to use statistical approaches for the computation of their capital charge
covering financial risks such as credit risk, market risk and operational
risk. It is widely recognized that internal loss data alone do not suffice
to provide an accurate capital charge in financial risk management,
especially for high-severity and low-frequency events. Financial
institutions typically use external loss data to augment the available
evidence and, therefore, provide more accurate risk estimates. Rigorous
statistical treatments are required to make internal and external data
comparable and to ensure that merging the two databases leads to unbiased
estimates. The goal of this paper is to propose a correct statistical
treatment to make the external and internal data comparable and,
therefore, mergeable. Such methodology augments internal losses with
relevant, rather than redundant, external loss data.
Journal: Journal of Applied Statistics
Pages: 1815-1824
Issue: 11
Volume: 37
Year: 2010
Keywords: data merging, threshold, financial risk management,
X-DOI: 10.1080/02664760903164921
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903164921
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1815-1824
Template-Type: ReDIF-Article 1.0
Author-Name: Claudio Conversano
Author-X-Name-First: Claudio
Author-X-Name-Last: Conversano
Author-Name: Domenico Vistocco
Author-X-Name-First: Domenico
Author-X-Name-Last: Vistocco
Title: Analysis of mutual funds' management styles: a modeling, ranking and visualizing approach
Abstract:
A method to rank mutual funds according to their investment style
measured with respect to the returns of a reference portfolio (benchmark)
is introduced. It is based on a style analysis model estimating a mutual
fund portfolio composition as well as the benchmark one. Starting from
such compositions, it computes a proximity measure based on the L1 or L2
norm to assess the similarity between each mutual fund's portfolio returns
and the benchmark returns, as well as between the returns of each benchmark
constituent and those of the corresponding mutual fund constituent. To this
purpose the mean integrated absolute error and the mean integrated squared
error are computed to derive both a global ranking of mutual fund
management styles and partial rankings expressing the over- (under-)
weighting of each portfolio constituent. A visual inspection of the
results emphasizing main differences in management styles is provided,
using a parallel coordinates plot. Since a modeling, a ranking and a
visualizing approach are integrated, the method is named MoRaViA. From the
practitioners' point of view, it allows the identification of a specific
management style for each mutual fund, discriminating active management
funds from passive management ones. To evaluate the effectiveness of
MoRaViA, many sets of artificial portfolios are generated and an
application on a set of equity funds operating in the European market is
presented.
Journal: Journal of Applied Statistics
Pages: 1825-1845
Issue: 11
Volume: 37
Year: 2010
Keywords: constrained linear regression, mean integrated squared error, mean integrated absolute error, parallel coordinates, subsampling, active vs. passive management, benchmarking,
X-DOI: 10.1080/02664760903166272
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903166272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1825-1845
Template-Type: ReDIF-Article 1.0
Author-Name: Steffen Unkel
Author-X-Name-First: Steffen
Author-X-Name-Last: Unkel
Author-Name: Nickolay Trendafilov
Author-X-Name-First: Nickolay
Author-X-Name-Last: Trendafilov
Author-Name: Abdel Hannachi
Author-X-Name-First: Abdel
Author-X-Name-Last: Hannachi
Author-Name: Ian Jolliffe
Author-X-Name-First: Ian
Author-X-Name-Last: Jolliffe
Title: Independent exploratory factor analysis with application to atmospheric science data
Abstract:
The independent exploratory factor analysis method is introduced for
recovering independent latent sources from their observed mixtures. The
new model is viewed as a method of factor rotation in exploratory factor
analysis (EFA). First, estimates for all EFA model parameters are obtained
simultaneously. Then, an orthogonal rotation matrix is sought that
minimizes the dependence between the common factors. The rotation of the
scores is compensated by a rotation of the initial loading matrix. The
proposed approach is applied to study winter monthly sea-level pressure
anomalies over the Northern Hemisphere. The North Atlantic Oscillation,
the North Pacific Oscillation, and the Scandinavian pattern are identified
among the rotated spatial patterns with a physically interpretable
structure.
Journal: Journal of Applied Statistics
Pages: 1847-1862
Issue: 11
Volume: 37
Year: 2010
Keywords: noisy independent component analysis, exploratory factor analysis, factor rotation, more variables than observations, rotated spatial patterns, gridded climate data, sea-level pressure anomalies,
X-DOI: 10.1080/02664760903166280
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903166280
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1847-1862
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Author-Name: Dong Wan Shin
Author-X-Name-First: Dong Wan
Author-X-Name-Last: Shin
Title: Bayesian tests for unit root and multiple breaks
Abstract:
A Bayesian approach is considered for identifying sources of
nonstationarity for models with a unit root and breaks. Different types of
multiple breaks are allowed through crash models, changing growth models,
and mixed models. All possible nonstationary models are represented by
combinations of zero or nonzero parameters associated with time trends,
dummy variables for breaks, or previous levels, for which Bayesian posterior
probabilities are computed. Multiple tests based on Markov chain Monte
Carlo procedures are implemented. The proposed method is applied to a real
data set, the Korean GDP data set, showing strong evidence for two
breaks rather than the usual unit root or one break.
Journal: Journal of Applied Statistics
Pages: 1863-1874
Issue: 11
Volume: 37
Year: 2010
Keywords: multiple breaks, unit root test, Markov chain Monte Carlo,
X-DOI: 10.1080/02664760903173450
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903173450
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1863-1874
Template-Type: ReDIF-Article 1.0
Author-Name: Giancarlo Diana
Author-X-Name-First: Giancarlo
Author-X-Name-Last: Diana
Author-Name: Pier Francesco Perri
Author-X-Name-First: Pier Francesco
Author-X-Name-Last: Perri
Title: New scrambled response models for estimating the mean of a sensitive quantitative character
Abstract:
Moving from the scrambling mechanism recently suggested by Saha [25],
three scrambled randomized response (SRR) models are introduced with the
intent to realize a right trade-off between efficiency and privacy
protection. The models perturb the true response on the sensitive variable
by resorting to the multiplicative and additive approaches in different
ways. Some analytical and numerical comparisons of efficiency are
performed to set up the conditions under which improvements upon Saha's
model can be obtained and to quantify the efficiency gain. The use of
auxiliary information is also discussed in a class of estimators for the
sensitive mean under a generic randomization scheme. The class includes
also the three proposed SRR models. Finally, some graphical comparisons
are carried out from the double perspective of the accuracy in the
estimates and respondents' privacy protection.
Journal: Journal of Applied Statistics
Pages: 1875-1890
Issue: 11
Volume: 37
Year: 2010
Keywords: auxiliary variable, class of estimators, privacy protection, sensitive variable,
X-DOI: 10.1080/02664760903186031
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903186031
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1875-1890
Template-Type: ReDIF-Article 1.0
Author-Name: Tai Vo Van
Author-X-Name-First: Tai
Author-X-Name-Last: Vo Van
Author-Name: T. Pham-Gia
Author-X-Name-First: T.
Author-X-Name-Last: Pham-Gia
Title: Clustering probability distributions
Abstract:
This article presents some theoretical results on the maximum of several
functions, and its use to define the joint distance of k probability
densities, which, in turn, serves to derive new algorithms for clustering
densities. Numerical examples are presented to illustrate the theory.
Journal: Journal of Applied Statistics
Pages: 1891-1910
Issue: 11
Volume: 37
Year: 2010
Keywords: maximum function, cluster, L1-distance, Bayes error, hierarchical approach,
X-DOI: 10.1080/02664760903186049
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903186049
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1891-1910
Template-Type: ReDIF-Article 1.0
Author-Name: Ross Sparks
Author-X-Name-First: Ross
Author-X-Name-Last: Sparks
Author-Name: Tim Keighley
Author-X-Name-First: Tim
Author-X-Name-Last: Keighley
Author-Name: David Muscatello
Author-X-Name-First: David
Author-X-Name-Last: Muscatello
Title: Early warning CUSUM plans for surveillance of negative binomial daily disease counts
Abstract:
Automated public health surveillance of disease counts for rapid
outbreak, epidemic or bioterrorism detection using conventional control
chart methods can be hampered by over-dispersion and background
('in-control') mean counts that vary over time. An adaptive cumulative sum
(CUSUM) plan is developed for signalling unusually high incidence in
prospectively monitored time series of over-dispersed daily disease counts
with a non-homogeneous mean. Negative binomial transitional regression is
used to prospectively model background counts and provide 'one-step-ahead'
forecasts of the next day's count. A CUSUM plan then accumulates
departures of observed counts from an offset (reference value) that is
dynamically updated using the modelled forecasts. The CUSUM signals
whenever the accumulated departures exceed a threshold. The amount of
memory of past observations retained by the CUSUM plan is determined by
the offset value; a smaller offset retains more memory and is efficient at
detecting smaller shifts. Our approach optimises early outbreak detection
by dynamically adjusting the offset value. We demonstrate the practical
application of the 'optimal' CUSUM plans to daily counts of
laboratory-notified influenza and Ross River virus diagnoses, with
particular emphasis on the steady-state situation (i.e. changes that occur
after the CUSUM statistic has run through several in-control counts).
Journal: Journal of Applied Statistics
Pages: 1911-1929
Issue: 11
Volume: 37
Year: 2010
Keywords: average run length, cumulative sum, monitoring, outbreak detection, surveillance,
X-DOI: 10.1080/02664760903186056
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903186056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1911-1929
Template-Type: ReDIF-Article 1.0
Author-Name: Govert Bijwaard
Author-X-Name-First: Govert
Author-X-Name-Last: Bijwaard
Title: Regularity in individual shopping trips: implications for duration models in marketing
Abstract:
Most models for purchase-timing behavior of households do not take into
account that many households have regular and non-shopping days. We
propose a statistical model for purchase timing that exploits information
on the shopping days of households. The model is formulated in a counting
process framework that counts the recurrent purchases for each household
over (calendar) time. In our empirical application of yogurt and detergent
purchases from the ERIM database, we show that calendar time effects and
regular and non-shopping days are important features to include in models
for purchase-timing behavior. We find, for instance, that for these
product categories the probability of purchasing is 50-60% higher on
Saturdays and 70% higher on regular shopping days. We highlight the
managerial implications of these model features by simulating some
promotional actions.
Journal: Journal of Applied Statistics
Pages: 1931-1945
Issue: 11
Volume: 37
Year: 2010
Keywords: purchase timing, regular shopping days, non-shopping days, counting process, mixed proportional hazard,
X-DOI: 10.1080/02664760903186064
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903186064
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1931-1945
Template-Type: ReDIF-Article 1.0
Author-Name: Rosa Bernardini Papalia
Author-X-Name-First: Rosa Bernardini
Author-X-Name-Last: Papalia
Title: Data disaggregation procedures within a maximum entropy framework
Abstract:
The aim of this paper is to formulate an
analytical-informational-theoretical approach which, given the incomplete
nature of the available micro-level data, can be used to provide
disaggregated values of a given variable. A functional relationship
between the variable to be disaggregated and the available
variables/indicators at the area level is specified through a combination
of different macro- and micro-data sources. Data disaggregation is
accomplished by considering two different cases. In the first case,
sub-area level information on the variable of interest is available, and a
generalized maximum entropy approach is employed to estimate the optimal
disaggregate model. In the second case, we assume that the sub-area level
information is partial and/or incomplete, and we estimate the model on a
smaller scale by developing a generalized cross-entropy-based formulation.
The proposed spatial-disaggregation approach is used in relation to an
Italian data set in order to compute the value-added per manufacturing
sector of local labour systems within the Umbria region, by combining the
available micro/macro-level data and by formulating a suitable set of
constraints for the optimization problem in the presence of errors in
micro-aggregates.
Journal: Journal of Applied Statistics
Pages: 1947-1959
Issue: 11
Volume: 37
Year: 2010
Keywords: data disaggregation, maximum entropy, cross-entropy,
X-DOI: 10.1080/02664760903199489
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903199489
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:11:p:1947-1959
Template-Type: ReDIF-Article 1.0
Author-Name: Martin Tanco
Author-X-Name-First: Martin
Author-X-Name-Last: Tanco
Author-Name: Elisabeth Viles
Author-X-Name-First: Elisabeth
Author-X-Name-Last: Viles
Author-Name: Maria Jesus Alvarez
Author-X-Name-First: Maria
Author-X-Name-Last: Jesus Alvarez
Author-Name: Laura Ilzarbe
Author-X-Name-First: Laura
Author-X-Name-Last: Ilzarbe
Title: Why is not design of experiments widely used by engineers in Europe?
Abstract:
An extensive literature review was carried out to detect why design of
experiments (DoE) is not widely used among engineers in Europe. Once 16
main barriers were identified, a survey was carried out to obtain
first-hand information about the significance of each. We obtained 101
responses from academics, consultants and practitioners interested in DoE.
A statistical analysis of the survey is introduced, including: (a) a
ranking of the barriers, (b) grouping of barriers using factorial
analysis, (c) differences between characteristics of respondents. This
exploratory analysis showed that the main barriers that hinder the
widespread use of DoE are low managerial commitment and engineers' general
weakness in statistics. Once the barriers were classified, the most
important resultant group was that related to business barriers.
Journal: Journal of Applied Statistics
Pages: 1961-1977
Issue: 12
Volume: 37
Year: 2010
Keywords: barriers, design of experiments, engineers, industry, survey,
X-DOI: 10.1080/02664760903207308
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903207308
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:1961-1977
Template-Type: ReDIF-Article 1.0
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Author-Name: David Cademartori
Author-X-Name-First: David
Author-X-Name-Last: Cademartori
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Title: The structural Sharpe model under t-distributions
Abstract:
In this paper we consider Sharpe's single-index model or Sharpe's model,
by assuming that the returns obtained follow a multivariate t elliptical
distribution. Also, given that the returns of the market are not
observable, the statistical analysis was made in the context of an
errors-in-variables model. In order to analyze the sensitivity of the
maximum likelihood estimators to possible outliers and/or atypical returns,
the local influence method [10] was implemented. The results are illustrated
by using a set of shares of companies belonging to the Chilean Stock
Market. The main conclusion is that the t model with small degrees of
freedom is able to incorporate possible outliers and influential returns
in the data.
Journal: Journal of Applied Statistics
Pages: 1979-1990
Issue: 12
Volume: 37
Year: 2010
Keywords: diagnostics, t-distribution, errors-in-variables models, portfolios, Sharpe model,
X-DOI: 10.1080/02664760903207316
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903207316
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:1979-1990
Template-Type: ReDIF-Article 1.0
Author-Name: Yafen Liu
Author-X-Name-First: Yafen
Author-X-Name-Last: Liu
Author-Name: Zhen He
Author-X-Name-First: Zhen
Author-X-Name-Last: He
Author-Name: M. Shamsuzzaman
Author-X-Name-First: M.
Author-X-Name-Last: Shamsuzzaman
Author-Name: Zhang Wu
Author-X-Name-First: Zhang
Author-X-Name-Last: Wu
Title: A combined control scheme for monitoring the frequency and size of an attribute event
Abstract:
A traffic accident can be considered as an example of an attribute
event, and the number of people injured in each accident is called the event
size. Some control charts have been developed for monitoring either the
time interval (T) between the occurrences of an event or the event size
(C) in each occurrence. This article studies the statistical monitoring of
attribute events in which T and C are monitored simultaneously and C
is an integer. Essentially, it integrates a T chart and a C chart, and is
therefore referred to as a T&C scheme. Our studies show that the new chart
is more effective than an individual T chart or C chart for detecting the
out-of-control status of the event, in particular for detecting downward
shifts (sparse occurrence and/or small size). Another desirable feature of
the T&C scheme is that its detection effectiveness is more consistent
across different types of shifts (i.e. T shift, C shift and joint shift
in T&C) compared with an individual T or C chart. The improvement in
performance is achieved due to the simultaneous monitoring of T and C. The
T&C scheme can be applied in manufacturing systems and especially in
non-manufacturing sectors (e.g. supply chain management, health care
industry, disaster management and security control).
Journal: Journal of Applied Statistics
Pages: 1991-2013
Issue: 12
Volume: 37
Year: 2010
Keywords: quality control, statistical process control, control chart, time between events, attribute event,
X-DOI: 10.1080/02664760903207324
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903207324
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:1991-2013
Template-Type: ReDIF-Article 1.0
Author-Name: Gavin Ross
Author-X-Name-First: Gavin
Author-X-Name-Last: Ross
Author-Name: C. Sarada
Author-X-Name-First: C.
Author-X-Name-Last: Sarada
Title: Reparameterization of nonlinear statistical models: a case study
Abstract:
The importance of finding appropriate parameterizations for nonlinear
statistical models is highlighted. The purpose of this paper is to explore
the principles of reparameterization, using an example from real data. It
is shown that stable parameterizations allow likelihood-based confidence
intervals to be computed. Further, it is noted that the choice of error
distribution may seriously affect the estimates and confidence intervals
of quantities of interest. The influence of each observation on the
estimation of each parameter is displayed for each error model.
Multidimensional likelihood contours may be displayed pairwise using
profile likelihood computations.
Journal: Journal of Applied Statistics
Pages: 2015-2026
Issue: 12
Volume: 37
Year: 2010
Keywords: aphid population growth model, normal distributed errors, Poisson distributed errors, profile likelihoods, reparameterization, stable parameters,
X-DOI: 10.1080/02664760903207332
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903207332
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2015-2026
Template-Type: ReDIF-Article 1.0
Author-Name: Jaromir Antoch
Author-X-Name-First: Jaromir
Author-X-Name-Last: Antoch
Author-Name: Lubos Prchal
Author-X-Name-First: Lubos
Author-X-Name-Last: Prchal
Author-Name: Maria Rosaria De Rosa
Author-X-Name-First: Maria
Author-X-Name-Last: Rosaria De Rosa
Author-Name: Pascal Sarda
Author-X-Name-First: Pascal
Author-X-Name-Last: Sarda
Title: Electricity consumption prediction with functional linear regression using spline estimators
Abstract:
A functional linear regression model linking observations of a functional
response variable with measurements of an explanatory functional variable
is considered. This model serves to analyse a real data set describing
electricity consumption in Sardinia. The interest lies in predicting
either oncoming weekends' or oncoming weekdays' consumption, provided
actual weekdays' consumption is known. A B-spline estimator of the
functional parameter is used. Selected computational issues are addressed
as well.
Journal: Journal of Applied Statistics
Pages: 2027-2041
Issue: 12
Volume: 37
Year: 2010
Keywords: functional linear regression, functional response, ARH(1), penalized least squares, B-spline, electricity consumption in Sardinia,
X-DOI: 10.1080/02664760903214395
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903214395
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2027-2041
Template-Type: ReDIF-Article 1.0
Author-Name: Chin Wen Cheong
Author-X-Name-First: Chin Wen
Author-X-Name-Last: Cheong
Title: Optimal choice of sample fraction in univariate financial tail index estimation
Abstract:
This study introduces a technique to estimate the Pareto distribution of
the stock exchange index by using the maximum-likelihood Hill estimator.
Recursive procedures based on the goodness-of-fit statistics are used to
determine the optimal threshold fraction of extreme values to be included
in tail estimation. These procedures are applied to three indices in the
Malaysian stock market which included the consideration of a drastic
economic event such as the Asian financial crisis. The empirical results
show alternating behavior of heavy-tailed distributions across the
regimes for both the upper and lower tails.
Journal: Journal of Applied Statistics
Pages: 2043-2056
Issue: 12
Volume: 37
Year: 2010
Keywords: heavy-tailed distribution, Hill estimator, goodness-of-fit test, structural change,
X-DOI: 10.1080/02664760903214403
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903214403
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2043-2056
Template-Type: ReDIF-Article 1.0
Author-Name: Kung-Jong Lui
Author-X-Name-First: Kung-Jong
Author-X-Name-Last: Lui
Author-Name: Kuang-Chao Chang
Author-X-Name-First: Kuang-Chao
Author-X-Name-Last: Chang
Title: Notes on odds ratio estimation for a randomized clinical trial with noncompliance and missing outcomes
Abstract:
The odds ratio (OR) has been recommended elsewhere to measure the
relative treatment efficacy in a randomized clinical trial (RCT), because
it possesses a few desirable statistical properties. In practice, it is
not uncommon to come across an RCT in which there are patients who do not
comply with their assigned treatments and patients whose outcomes are
missing. Under the compound exclusion restriction, latent ignorable and
monotonicity assumptions, we derive the maximum likelihood estimator (MLE)
of the OR and apply Monte Carlo simulation to compare its performance with
those of the other two commonly used estimators for missing completely at
random (MCAR) and for the intention-to-treat (ITT) analysis based on
patients with known outcomes, respectively. We note that both estimators
for MCAR and the ITT analysis may produce a misleading inference of the OR
even when the two treatments are equally effective. We further derive three
asymptotic interval estimators for the OR, including the interval
estimator using Wald's statistic, the interval estimator using the
logarithmic transformation, and the interval estimator using an ad hoc
procedure of combining the above two interval estimators. On the basis of
a Monte Carlo simulation, we evaluate the finite-sample performance of
these interval estimators in a variety of situations. Finally, we use the
data taken from a randomized encouragement design studying the effect of
flu shots on the flu-related hospitalization rate to illustrate the use of
the MLE and the asymptotic interval estimators for the OR
developed here.
Journal: Journal of Applied Statistics
Pages: 2057-2071
Issue: 12
Volume: 37
Year: 2010
Keywords: odds ratio, noncompliance, missing outcomes, interval estimators, ITT analysis,
X-DOI: 10.1080/02664760903214411
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903214411
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2057-2071
Template-Type: ReDIF-Article 1.0
Author-Name: M. P. Gadre
Author-X-Name-First: M. P.
Author-X-Name-Last: Gadre
Author-Name: K. A. Joshi
Author-X-Name-First: K. A.
Author-X-Name-Last: Joshi
Author-Name: R. N. Rattihalli
Author-X-Name-First: R. N.
Author-X-Name-Last: Rattihalli
Title: A side sensitive modified group runs control chart to detect shifts in the process mean
Abstract:
Gadre and Rattihalli [5] have introduced the Modified Group Runs (MGR)
control chart to identify the increases in fraction non-conforming and to
detect shifts in the process mean. The MGR chart reduces the
out-of-control average time-to-signal (ATS), as compared with most of the
well-known control charts. In this article, we develop the Side Sensitive
Modified Group Runs (SSMGR) chart to detect shifts in the process mean.
With the help of numerical examples, it is illustrated that the SSMGR
chart performs better than the Shewhart X chart, the synthetic chart
[12], the Group Runs chart [4], the Side Sensitive Group Runs chart [6],
as well as the MGR chart [5]. In some situations it is also superior to
the Cumulative Sum chart [9] and the exponentially weighted moving average
chart [10]. In the steady state, its performance is also better than the
above charts.
Journal: Journal of Applied Statistics
Pages: 2073-2087
Issue: 12
Volume: 37
Year: 2010
Keywords: average time-to-signal, CRL chart, EWMA chart, GR chart, MGR chart, SSGR chart, steady-state ATS, synthetic chart,
X-DOI: 10.1080/02664760903222190
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903222190
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2073-2087
Template-Type: ReDIF-Article 1.0
Author-Name: Marianne Frisen
Author-X-Name-First: Marianne
Author-X-Name-Last: Frisen
Author-Name: Eva Andersson
Author-X-Name-First: Eva
Author-X-Name-Last: Andersson
Author-Name: Linus Schioler
Author-X-Name-First: Linus
Author-X-Name-Last: Schioler
Title: Evaluation of multivariate surveillance
Abstract:
Multivariate surveillance is of interest in many areas such as industrial
production, bioterrorism detection, spatial surveillance, and financial
transaction strategies. Some of the suggested approaches to multivariate
surveillance have been multivariate counterparts to the univariate
Shewhart, EWMA, and CUSUM methods. Our emphasis is on the special
challenges of evaluating multivariate surveillance methods. Some new
measures are suggested and the properties of several measures are
demonstrated by applications to various situations. It is demonstrated
that zero-state and steady-state ARL, which are widely used in univariate
surveillance, should be used with care in multivariate surveillance.
Journal: Journal of Applied Statistics
Pages: 2089-2100
Issue: 12
Volume: 37
Year: 2010
Keywords: average run length, EWMA, false alarms, FDR, performance metrics, predictive value, steady state, zero state,
X-DOI: 10.1080/02664760903222208
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903222208
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2089-2100
Template-Type: ReDIF-Article 1.0
Author-Name: Rosaria Lombardo
Author-X-Name-First: Rosaria
Author-X-Name-Last: Lombardo
Author-Name: Eric Beh
Author-X-Name-First: Eric
Author-X-Name-Last: Beh
Title: Simple and multiple correspondence analysis for ordinal-scale variables using orthogonal polynomials
Abstract:
Correspondence analysis (CA) has gained a reputation for being a very
useful statistical technique for determining the nature of association
between two or more categorical variables. For simple and multiple CA, the
singular value decomposition (SVD) is the primary tool used and allows the
user to construct a low-dimensional space to visualize this association.
As an alternative to SVD, one may consider the bivariate moment
decomposition (BMD), a method of decomposition that involves using
orthogonal polynomials to reflect the structure of ordered categorical
responses. When the features of BMD are combined with SVD, a hybrid
decomposition (HD) is formed. The aim of this paper is to show the
applicability of HD when performing simple and multiple CA.
Journal: Journal of Applied Statistics
Pages: 2101-2116
Issue: 12
Volume: 37
Year: 2010
Keywords: multiple correspondence analysis, ordinal-scale variables, singular value decomposition, bivariate moment decomposition, orthogonal polynomials, hybrid decomposition,
X-DOI: 10.1080/02664760903247692
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903247692
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2101-2116
Template-Type: ReDIF-Article 1.0
Author-Name: Tobias Verbeke
Author-X-Name-First: Tobias
Author-X-Name-Last: Verbeke
Title: BOOK REVIEW
Abstract:
Journal: Journal of Applied Statistics
Pages: 2117-2118
Issue: 12
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760902931346
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760902931346
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2117-2118
Template-Type: ReDIF-Article 1.0
Author-Name: Faisel Yunus
Author-X-Name-First: Faisel
Author-X-Name-Last: Yunus
Title: Statistics Using SPSS: An Integrative Approach, second edition
Abstract:
Journal: Journal of Applied Statistics
Pages: 2119-2120
Issue: 12
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760903075515
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903075515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2119-2120
Template-Type: ReDIF-Article 1.0
Author-Name: A. C. Brooms
Author-X-Name-First: A. C.
Author-X-Name-Last: Brooms
Title: Data Manipulation with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 2121-2121
Issue: 12
Volume: 37
Year: 2010
X-DOI: 10.1080/02664760903075531
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903075531
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:37:y:2010:i:12:p:2121-2121
Template-Type: ReDIF-Article 1.0
Author-Name: Robert Aykroyd
Author-X-Name-First: Robert
Author-X-Name-Last: Aykroyd
Title: Editorial
Abstract:
Journal: Journal of Applied Statistics
Pages: 1-1
Issue: 1
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2011.537062
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2011.537062
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:1-1
Template-Type: ReDIF-Article 1.0
Author-Name: Laura Gosoniu
Author-X-Name-First: Laura
Author-X-Name-Last: Gosoniu
Author-Name: Penelope Vounatsou
Author-X-Name-First: Penelope
Author-X-Name-Last: Vounatsou
Title: Non-stationary partition modeling of geostatistical data for malaria risk mapping
Abstract:
The most common assumption in geostatistical modeling of malaria is
stationarity, that is, spatial correlation is a function of the separation
vector between locations. However, local factors (environmental or
human-related activities) may influence geographical dependence in malaria
transmission differently at different locations, introducing
non-stationarity. Ignoring this characteristic in malaria spatial modeling
may lead to inaccurate estimates of the standard errors for both the
covariate effects and the predictions. In this paper, a model based on
random Voronoi tessellation that takes into account non-stationarity was
developed. In particular, the spatial domain was partitioned into
sub-regions (tiles), a stationary spatial process was assumed within each
tile and between-tile correlation was taken into account. The number and
configuration of the sub-regions are treated as random parameters in the
model and inference is made using reversible jump Markov chain Monte Carlo
simulation. This methodology was applied to analyze malaria survey data
from Mali and to produce a country-level smooth map of malaria risk.
Journal: Journal of Applied Statistics
Pages: 3-13
Issue: 1
Volume: 38
Year: 2011
Keywords: Bayesian inference, geostatistics, kriging, malaria risk, prevalence data, non-stationarity, reversible jump Markov chain Monte Carlo, Voronoi tessellation,
X-DOI: 10.1080/02664760903008961
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903008961
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:3-13
Template-Type: ReDIF-Article 1.0
Author-Name: Chengjie Xiong
Author-X-Name-First: Chengjie
Author-X-Name-Last: Xiong
Author-Name: Gerald van Belle
Author-X-Name-First: Gerald
Author-X-Name-Last: van Belle
Author-Name: Kejun Zhu
Author-X-Name-First: Kejun
Author-X-Name-Last: Zhu
Author-Name: J. Philip Miller
Author-X-Name-First: J. Philip
Author-X-Name-Last: Miller
Author-Name: John Morris
Author-X-Name-First: John
Author-X-Name-Last: Morris
Title: A unified approach of meta-analysis: application to an antecedent biomarker study in Alzheimer's disease
Abstract:
This article provides a unified methodology of meta-analysis that
synthesizes medical evidence by using both available individual patient
data (IPD) and published summary statistics within the framework of
likelihood principle. Most up-to-date scientific evidence on medicine is
crucial information not only to consumers but also to decision makers, and
can only be obtained when existing evidence from the literature and the
most recent IPD are optimally synthesized. We propose a general linear
mixed effects model to conduct meta-analyses when IPD are only available
for some of the studies and summary statistics have to be used for the
rest of the studies. Our approach includes both the traditional
meta-analyses in which only summary statistics are available for all
studies and the other extreme case in which IPD are available for all
studies as special examples. We implement the proposed model with
statistical procedures from standard computing packages. We provide
measures of heterogeneity based on the proposed model. Finally, we
demonstrate the proposed methodology through a real-life example by
studying the cerebrospinal fluid biomarkers to identify individuals with a
high risk of developing Alzheimer's disease when they are still
cognitively normal.
Journal: Journal of Applied Statistics
Pages: 15-27
Issue: 1
Volume: 38
Year: 2011
Keywords: confidence interval, general linear mixed effects model, heterogeneity index, individual patient data, maximum likelihood estimate (MLE), meta-analyses,
X-DOI: 10.1080/02664760903008987
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903008987
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:15-27
Template-Type: ReDIF-Article 1.0
Author-Name: Fahimah Al-Awadhi
Author-X-Name-First: Fahimah
Author-X-Name-Last: Al-Awadhi
Author-Name: Merrilee Hurn
Author-X-Name-First: Merrilee
Author-X-Name-Last: Hurn
Author-Name: Christopher Jennison
Author-X-Name-First: Christopher
Author-X-Name-Last: Jennison
Title: Three-dimensional Bayesian image analysis and confocal microscopy
Abstract:
We report methods for tackling a challenging three-dimensional (3D)
deconvolution problem arising in confocal microscopy. We fit a marked
point process model for the set of cells in the sample using Bayesian
methods; this produces automatic or semi-automatic segmentations showing
the shape, size, orientation and spatial arrangement of objects in a
sample. Importantly, the methods also provide measures of uncertainty
about size and shape attributes. The 3D problem is considerably more
demanding computationally than the two-dimensional analogue considered in
Al-Awadhi et al. [2] due to the much larger data set and
higher-dimensional descriptors for objects in the image. In using Markov
chain Monte Carlo simulation to draw samples from the posterior
distribution, substantial computing effort can be consumed simply in
reaching the main area of support of the posterior distribution. For more
effective use of computation time, we use morphological techniques to help
construct an initial typical image under the posterior distribution.
Journal: Journal of Applied Statistics
Pages: 29-46
Issue: 1
Volume: 38
Year: 2011
Keywords: Bayesian statistics, confocal microscopy, image analysis, Markov chain Monte Carlo methods, mathematical morphology, object recognition, stochastic simulation, three-dimensional deconvolution,
X-DOI: 10.1080/02664760903117747
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903117747
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:29-46
Template-Type: ReDIF-Article 1.0
Author-Name: Hakan Demirtas
Author-X-Name-First: Hakan
Author-X-Name-Last: Demirtas
Author-Name: Donald Hedeker
Author-X-Name-First: Donald
Author-X-Name-Last: Hedeker
Title: Generating multivariate continuous data via the notion of nearest neighbors
Abstract:
Taylor and Thompson [15] introduced a clever algorithm for simulating
multivariate continuous data sets that resemble the original data. Their
approach is predicated upon determining a few nearest neighbors of a given
row of data through a statistical distance measure, and subsequently
combining the observations by stochastic multipliers that are drawn from a
uniform distribution to generate simulated data that essentially maintain
the original data trends. The newly drawn values are assumed to come from
the same underlying hypothetical process that governs the mechanism of how
the data are formed. This technique is appealing in that no density
estimation is required. We believe that this data-based simulation method
has substantial potential in multivariate data generation due to the local
nature of the generation scheme, which does not have strict specification
requirements as in most other algorithms. In this work, we provide two R
routines: one has a built-in simulator for finding the optimal number of
nearest neighbors for any given data set, and the other generates
pseudo-random data using this optimal number.
Journal: Journal of Applied Statistics
Pages: 47-55
Issue: 1
Volume: 38
Year: 2011
Keywords: simulation, random number generation, density estimation, bootstrap, nearest neighbors,
X-DOI: 10.1080/02664760903229260
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903229260
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:47-55
Template-Type: ReDIF-Article 1.0
Author-Name: Vicente Cancho
Author-X-Name-First: Vicente
Author-X-Name-Last: Cancho
Author-Name: Josemar Rodrigues
Author-X-Name-First: Josemar
Author-X-Name-Last: Rodrigues
Author-Name: Mario de Castro
Author-X-Name-First: Mario
Author-X-Name-Last: de Castro
Title: A flexible model for survival data with a cure rate: a Bayesian approach
Abstract:
In this paper we deal with a Bayesian analysis for right-censored
survival data suitable for populations with a cure rate. We consider a
cure rate model based on the negative binomial distribution, encompassing
as a special case the promotion time cure model. Bayesian analysis is
based on Markov chain Monte Carlo (MCMC) methods. We also present some
discussion on model selection and an illustration with a real
data set.
Journal: Journal of Applied Statistics
Pages: 57-70
Issue: 1
Volume: 38
Year: 2011
Keywords: survival analysis, cure rate models, long-term survival models, negative binomial distribution, Bayesian analysis, piecewise exponential distribution, Weibull distribution,
X-DOI: 10.1080/02664760903254052
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903254052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:57-70
Template-Type: ReDIF-Article 1.0
Author-Name: Guglielmo Maria Caporale
Author-X-Name-First: Guglielmo Maria
Author-X-Name-Last: Caporale
Author-Name: Luis Gil-Alana
Author-X-Name-First: Luis
Author-X-Name-Last: Gil-Alana
Title: Fractional integration and impulse responses: a bivariate application to real output in the USA and four Scandinavian countries
Abstract:
This article analyzes impulse response functions in the context of vector
fractionally integrated time series. We derive analytically the
restrictions required to identify the structural-form system. As an
illustration of the recommended procedure, we carry out an empirical
application based on a bivariate system including real output in the USA
and, in turn, in one of the four Scandinavian countries (Denmark, Finland,
Norway, and Sweden). The empirical results appear to be sensitive, to some
extent, to the specification of the stochastic process driving the
disturbances, but generally a positive shock to US output has a positive
effect on the Scandinavian countries, an effect which tends to disappear in
the long run.
Journal: Journal of Applied Statistics
Pages: 71-85
Issue: 1
Volume: 38
Year: 2011
Keywords: long memory, multivariate time series, impulse response functions,
X-DOI: 10.1080/02664760903254060
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903254060
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:71-85
Template-Type: ReDIF-Article 1.0
Author-Name: Aurelia De Araujo Rodrigues
Author-X-Name-First: Aurelia
Author-X-Name-Last: De Araujo Rodrigues
Author-Name: Eugenio Kahn Epprecht
Author-X-Name-First: Eugenio Kahn
Author-X-Name-Last: Epprecht
Author-Name: Maysa Sacramento De Magalhaes
Author-X-Name-First: Maysa Sacramento
Author-X-Name-Last: De Magalhaes
Title: Double-sampling control charts for attributes
Abstract:
In this article, we propose a double-sampling (DS) np control chart. We
assume that the time interval between samples is fixed. The choice of the
design parameters of the proposed chart and also comparisons between
charts are based on statistical properties, such as the average number of
samples until a signal. The optimal design parameters of the proposed
control chart are obtained. During the optimization procedure, constraints
are imposed on the in-control average sample size and on the in-control
average run length. In this way, required statistical properties can be
assured. Varying some input parameters, the proposed DS np chart is
compared with the single-sampling np chart, variable sample size np chart,
CUSUM np and EWMA np charts. The comparisons are carried out considering
the optimal design for each chart. For the ranges of parameters
considered, the DS scheme is the fastest one for the detection of
increases of 100% or more in the fraction non-conforming and, moreover,
the DS np chart is easy to operate.
Journal: Journal of Applied Statistics
Pages: 87-112
Issue: 1
Volume: 38
Year: 2011
Keywords: double sampling, np charts, statistical design, EWMA, CUSUM, variable sample size,
X-DOI: 10.1080/02664760903266007
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903266007
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:87-112
Template-Type: ReDIF-Article 1.0
Author-Name: Li-Chu Chien
Author-X-Name-First: Li-Chu
Author-X-Name-Last: Chien
Title: A robust diagnostic plot for explanatory variables under model mis-specification
Abstract:
A typical added variable plot is a commonly used plot in assessing the
accuracy of a normal linear model. This plot is often used to evaluate the
effect of adding an explanatory variable into the model and to detect
possibly high leverage points or influential observations on the added
variable. However, the validity of this type of plot is generally in doubt
once the normal distributional assumptions are violated. In this article, we extend
the robust likelihood technique introduced by Royall and Tsou [11] to
propose a robust added variable plot. The validity of this diagnostic plot
requires no knowledge of the true underlying distributions so long as
their second moments exist. The usefulness of the robust graphical
approach is demonstrated through a few illustrations and simulations.
Journal: Journal of Applied Statistics
Pages: 113-126
Issue: 1
Volume: 38
Year: 2011
Keywords: added variable plot, high leverage points, influential data, adjusted normal regression,
X-DOI: 10.1080/02664760903271940
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903271940
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:113-126
Template-Type: ReDIF-Article 1.0
Author-Name: Jeremy Balka
Author-X-Name-First: Jeremy
Author-X-Name-Last: Balka
Author-Name: Anthony Desmond
Author-X-Name-First: Anthony
Author-X-Name-Last: Desmond
Author-Name: Paul McNicholas
Author-X-Name-First: Paul
Author-X-Name-Last: McNicholas
Title: Bayesian and likelihood inference for cure rates based on defective inverse Gaussian regression models
Abstract:
Failure time models are considered when there is a subpopulation of
individuals that is immune, or not susceptible, to an event of interest.
Such models are of considerable interest in biostatistics. The most common
approach is to postulate a proportion p of immunes or long-term survivors
and to use a mixture model [5]. This paper introduces the defective
inverse Gaussian model as a cure model and examines the use of the Gibbs
sampler together with a data augmentation algorithm to study Bayesian
inferences both for the cured fraction and the regression parameters. The
results of the Bayesian and likelihood approaches are illustrated on two
real data sets.
Journal: Journal of Applied Statistics
Pages: 127-144
Issue: 1
Volume: 38
Year: 2011
Keywords: cure rates, defective inverse Gaussian, Gibbs sampler, survival analysis,
X-DOI: 10.1080/02664760903301127
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903301127
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:127-144
Template-Type: ReDIF-Article 1.0
Author-Name: E. J. Allen
Author-X-Name-First: E. J.
Author-X-Name-Last: Allen
Author-Name: V. T. Farewell
Author-X-Name-First: V. T.
Author-X-Name-Last: Farewell
Title: The use of variance components for the assessment of outcome measures in rheumatology
Abstract:
There is current interest in the development of new or improved outcome
measures for rheumatological diseases. In the early stages of development,
attention is usually directed to how well the measure distinguishes
between patients and whether different observers attach similar values of
the measure to the same patient. An approach, based on variance
components, to the assessment of outcome measures is presented. The need
to assess different aspects of variation associated with a measure is
stressed. The terms 'observer reliability' and 'agreement' are frequently
used in the evaluation of measurement instruments, and are often used
interchangeably. In this paper, we use the terms to refer to different
concepts assessing different aspects of variation. They are likely to
correspond well in heterogeneous populations, but not in homogeneous
populations where reliability will generally be low but agreement may well
be high. Results from a real patient exercise, designed to study a set of
tools for assessing myositis outcomes, are used to illustrate the approach
that examines both reliability and agreement, and the need to evaluate
both is demonstrated. A new measure of agreement, based on the ratio of
standard deviations, is presented and inference procedures are discussed.
To facilitate the interpretation of the combination of measures of
reliability and agreement, a classification system is proposed that
provides a summary of the performance of the tools. The approach is
demonstrated for discrete ordinal and continuous outcomes.
Journal: Journal of Applied Statistics
Pages: 145-159
Issue: 1
Volume: 38
Year: 2011
Keywords: agreement, reliability, variance components, myositis, measurement,
X-DOI: 10.1080/02664760903301135
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903301135
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:145-159
Template-Type: ReDIF-Article 1.0
Author-Name: Guillermo de Leon
Author-X-Name-First: Guillermo
Author-X-Name-Last: de Leon
Author-Name: Pere Grima
Author-X-Name-First: Pere
Author-X-Name-Last: Grima
Author-Name: Xavier Tort-Martorell
Author-X-Name-First: Xavier
Author-X-Name-Last: Tort-Martorell
Title: Comparison of normal probability plots and dot plots in judging the significance of effects in two level factorial designs
Abstract:
In this article, we present a study carried out to compare the
effectiveness of the normal probability plot (NPP) and a simple dot plot
in assessing the significance of the effects in experimental designs with
factors at two levels (2^(k-p) designs). Several groups of students who had
just completed a course that covered factorial designs were asked to
identify the significant effects in a total of 32 situations, 16 of which
were represented using NPPs and the other 16 using dot plots. Although the
32 scenarios were said to be different, there were really only 16
different situations, each of which was represented using the two methods
to be compared. A simple graphical analysis shows no evidence that there
is a difference between the two procedures. However, in designs with 16
runs there are some cases where NPP seems to give slightly
better results.
Journal: Journal of Applied Statistics
Pages: 161-174
Issue: 1
Volume: 38
Year: 2011
Keywords: normal probability plot, significance of effects, factorial design, teaching statistics, experimental design,
X-DOI: 10.1080/02664760903301143
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903301143
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:161-174
Template-Type: ReDIF-Article 1.0
Author-Name: Bin Li
Author-X-Name-First: Bin
Author-X-Name-Last: Li
Author-Name: R. S. Sanderlin
Author-X-Name-First: R. S.
Author-X-Name-Last: Sanderlin
Author-Name: Rebecca Melanson
Author-X-Name-First: Rebecca
Author-X-Name-Last: Melanson
Author-Name: Qingzhao Yu
Author-X-Name-First: Qingzhao
Author-X-Name-Last: Yu
Title: Spatio-temporal analysis of a plant disease in a non-uniform crop: a Monte Carlo approach
Abstract:
Identification of the type of disease pattern and spread in a field is
critical in epidemiological investigations of plant diseases. For example,
an aggregation pattern of infected plants suggests that, at the time of
observation, the pathogen is spreading from a proximal source. Conversely,
a random pattern suggests a lack of spread from a proximal source. Most of
the existing methods of spatial pattern analysis work with only one
variety of plant at each location and with uniform genetic disease
susceptibility across the field. Pecan orchards, used in this study, and
other orchard crops are usually composed of different varieties with
different levels of susceptibility to disease. A new measure is suggested
to characterize the spatio-temporal transmission patterns of disease; a
Monte Carlo test procedure is proposed to test whether the transmission of
disease is random or aggregated. In addition, we propose a
mixed-transmission model, which allows us to quantify the degree of
aggregation effect.
Journal: Journal of Applied Statistics
Pages: 175-182
Issue: 1
Volume: 38
Year: 2011
Keywords: hypothesis testing, lattice system, Monte Carlo, spatial, spatio-temporal analysis,
X-DOI: 10.1080/02664760903301150
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903301150
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:175-182
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. A. Andrade
Author-X-Name-First: J. A. A.
Author-X-Name-Last: Andrade
Author-Name: J. P. Gosling
Author-X-Name-First: J. P.
Author-X-Name-Last: Gosling
Title: Predicting rainy seasons: quantifying the beliefs of prophets
Abstract:
In general, meteorologists find it difficult to make seasonal predictions
in the north-east region of Brazil due to the contrasting atmospheric
phenomena that take place there. The rain prophets claim to be able to
predict the seasonal weather by observing the behavior of nature. Their
predictions have a strong degree of subjectivity; this makes science
(especially meteorology) disregard these predictions, which could be a
relevant source of information for prediction models. In this article, we
regard the prophets' knowledge from a subjectivist point of view: we apply
elicitation of expert knowledge techniques to extract their opinions and
convert them into probability densities that represent their predictions
of forthcoming rainy seasons.
Journal: Journal of Applied Statistics
Pages: 183-193
Issue: 1
Volume: 38
Year: 2011
Keywords: Brazilian climate, elicitation, Kumaraswamy distribution, rain prophets, seasonal weather,
X-DOI: 10.1080/02664760903301168
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903301168
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:183-193
Template-Type: ReDIF-Article 1.0
Author-Name: Ahmad Zubaidi Baharumshah
Author-X-Name-First: Ahmad Zubaidi
Author-X-Name-Last: Baharumshah
Author-Name: Nor Aishah Hamzah
Author-X-Name-First: Nor Aishah
Author-X-Name-Last: Hamzah
Author-Name: Shamsul Rijal Muhammad Sabri
Author-X-Name-First: Shamsul Rijal Muhammad
Author-X-Name-Last: Sabri
Title: Inflation uncertainty and economic growth: evidence from the LAD ARCH model
Abstract:
In this paper, we combined the panel data and least absolute deviation
autoregressive conditional heteroscedastic (ARCH) (L1-ARCH) model to infer
on the relationship between inflation uncertainty and economic growth in
five emerging market economies. Two interesting findings emerged from the
analysis: first, we confirmed that inflation uncertainty has a
significant and negative effect on economic growth. Second, inflation is
also an important variable and it is detrimental to economic prospects in
the fast-growing Association of Southeast Asian Nations (ASEAN) economies.
All in all, the empirical findings suggest that greater stability in the
economy may be desirable in order to stimulate economic growth in the
region.
Journal: Journal of Applied Statistics
Pages: 195-206
Issue: 1
Volume: 38
Year: 2011
Keywords: inflation uncertainty, economic growth, LAD ARCH,
X-DOI: 10.1080/02664760903406397
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903406397
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:195-206
Template-Type: ReDIF-Article 1.0
Author-Name: Paul Mielke
Author-X-Name-First: Paul
Author-X-Name-Last: Mielke
Author-Name: Kenneth Berry
Author-X-Name-First: Kenneth
Author-X-Name-Last: Berry
Author-Name: Janis Johnston
Author-X-Name-First: Janis
Author-X-Name-Last: Johnston
Title: Robustness without rank order statistics
Abstract:
An alternative to conventional rank tests based on a Euclidean distance
analysis space is described. Comparisons based on exact probability values
among classical two-sample t-tests and the Wilcoxon-Mann-Whitney test
illustrate the advantages of the Euclidean distance analysis space
alternative.
Journal: Journal of Applied Statistics
Pages: 207-214
Issue: 1
Volume: 38
Year: 2011
Keywords: analysis space, Euclidean distance, rank-order statistics, robustness,
X-DOI: 10.1080/02664760903406439
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664760903406439
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:1:p:207-214
Template-Type: ReDIF-Article 1.0
Author-Name: S. W. Human
Author-X-Name-First: S. W.
Author-X-Name-Last: Human
Author-Name: P. Kritzinger
Author-X-Name-First: P.
Author-X-Name-Last: Kritzinger
Author-Name: S. Chakraborti
Author-X-Name-First: S.
Author-X-Name-Last: Chakraborti
Title: Robustness of the EWMA control chart for individual observations
Abstract:
The traditional exponentially weighted moving average (EWMA) chart is one
of the most popular control charts used in practice today. The in-control
robustness is the key to the proper design and implementation of any
control chart, lack of which can render its out-of-control shift detection
capability almost meaningless. To this end, Borror et al. [5] studied the
performance of the traditional EWMA chart for the mean for i.i.d. data. We
use a more extensive simulation study to further investigate the
in-control robustness (to non-normality) of the three different EWMA
designs studied by Borror et al. [5]. Our study includes a much wider
collection of non-normal distributions including light- and heavy-tailed
and symmetric and asymmetric bi-modal as well as the contaminated normal,
which is particularly useful to study the effects of outliers. Also, we
consider two separate cases: (i) when the process mean and standard
deviation are both known and (ii) when they are both unknown and estimated
from an in-control Phase I sample. In addition, unlike in the study done
by Borror et al. [5], the average run-length (ARL) is not used as the sole
performance measure in our study; we also consider the standard deviation of
the run-length (SDRL), the median run-length (MDRL), and the first and the
third quartiles as well as the first and the 99th percentiles of the
in-control run-length distribution for a better overall assessment of the
traditional EWMA chart's in-control performance. Our findings sound a
cautionary note to the (over) use of the EWMA chart in practice, at least
with some types of non-normal data. A summary and recommendations are
provided.
Journal: Journal of Applied Statistics
Pages: 2071-2087
Issue: 10
Volume: 38
Year: 2011
Keywords: average run-length, boxplot, distribution-free, median run-length, non-parametric, percentile, run-length, simulation,
X-DOI: 10.1080/02664763.2010.545114
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545114
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2071-2087
Template-Type: ReDIF-Article 1.0
Author-Name: Yun Zhao
Author-X-Name-First: Yun
Author-X-Name-Last: Zhao
Author-Name: Andy Lee
Author-X-Name-First: Andy
Author-X-Name-Last: Lee
Author-Name: Kelvin Yau
Author-X-Name-First: Kelvin
Author-X-Name-Last: Yau
Author-Name: Geoffrey McLachlan
Author-X-Name-First: Geoffrey
Author-X-Name-Last: McLachlan
Title: Assessing the adequacy of Weibull survival models: a simulated envelope approach
Abstract:
The Weibull proportional hazards model is commonly used for analysing
survival data. However, formal tests of model adequacy are still lacking.
It is well known that residual-based goodness-of-fit measures are
inappropriate for censored data. In this paper, a graphical diagnostic
plot of Cox-Snell residuals with a simulated envelope added is proposed to
assess the adequacy of Weibull survival models. Both single component and
two-component mixture models with random effects are considered for
recurrent failure time data. The effectiveness of the diagnostic method is
illustrated using simulated data sets and data on recurrent urinary tract
infections of elderly women.
Journal: Journal of Applied Statistics
Pages: 2089-2097
Issue: 10
Volume: 38
Year: 2011
Keywords: goodness-of-fit, mixture models, model adequacy, simulated envelope, survival analysis,
X-DOI: 10.1080/02664763.2010.545115
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2089-2097
Template-Type: ReDIF-Article 1.0
Author-Name: Luis Gutierrez
Author-X-Name-First: Luis
Author-X-Name-Last: Gutierrez
Author-Name: Fernando Quintana
Author-X-Name-First: Fernando
Author-X-Name-Last: Quintana
Author-Name: Dietrich von Baer
Author-X-Name-First: Dietrich
Author-X-Name-Last: von Baer
Author-Name: Claudia Mardones
Author-X-Name-First: Claudia
Author-X-Name-Last: Mardones
Title: Multivariate Bayesian discrimination for varietal authentication of Chilean red wine
Abstract:
The process through which a food or beverage is verified as complying with
its label description is called food authentication. We propose to treat
the authentication process as a classification problem. We consider
multivariate observations and propose a multivariate Bayesian classifier
that extends results from the univariate linear mixed model to the
multivariate case. The model allows for correlation between wine samples
from the same valley. We apply the proposed model to concentration
measurements of nine chemical compounds named anthocyanins in 399 samples
of Chilean red wines of the varieties Merlot, Carmenere and Cabernet
Sauvignon, vintages 2001-2004. We find satisfactory results, with a
misclassification error rate based on a leave-one-out cross-validation
approach of about 4%. The multivariate extension can be generally applied
to authentication of food and beverages, where it is common to have
several dependent measurements per sample unit, and it would not be
appropriate to treat these as independent univariate versions of a common
model.
Journal: Journal of Applied Statistics
Pages: 2099-2109
Issue: 10
Volume: 38
Year: 2011
Keywords: Bayesian classifier, Gibbs sampling, hierarchical linear models, food authentication,
X-DOI: 10.1080/02664763.2010.545116
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545116
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2099-2109
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Smith
Author-X-Name-First: Thomas
Author-X-Name-Last: Smith
Author-Name: Cornelius McKenna
Author-X-Name-First: Cornelius
Author-X-Name-Last: McKenna
Title: A weighted test of internal symmetry
Abstract:
This study examines extensions of McNemar's Test with multinomial
responses, and proposes a linear weighting scheme, based on the distance
of the response change, that is applied to one of these extensions
(Bowker's test). This weighted version of Bowker's test is then
appropriate for ordinal response variables. A Monte Carlo simulation was
conducted to examine the Type I error rate of the weighted Bowker's test
for a cross-classification table based on a five-category ordinal response
scale. The weighted Bowker's test was also applied to a data set involving
change in student attitudes towards mathematics. The results of the
weighted Bowker's test were compared with the results of Bowker's test
applied to the same set of data.
Journal: Journal of Applied Statistics
Pages: 2111-2118
Issue: 10
Volume: 38
Year: 2011
Keywords: Bowker's test for internal symmetry, McNemar's test, homogeneity, contingency table, simultaneous inference,
X-DOI: 10.1080/02664763.2010.545117
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545117
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2111-2118
Template-Type: ReDIF-Article 1.0
Author-Name: Rosaria Lombardo
Author-X-Name-First: Rosaria
Author-X-Name-Last: Lombardo
Author-Name: Eric Beh
Author-X-Name-First: Eric
Author-X-Name-Last: Beh
Author-Name: Antonello D'Ambra
Author-X-Name-First: Antonello
Author-X-Name-Last: D'Ambra
Title: Studying the dependence between ordinal-nominal categorical variables via orthogonal polynomials
Abstract:
In situations where the structure of one of the variables of a
contingency table is ordered, recent theory involving the augmentation of
singular vectors and orthogonal polynomials has been shown to be applicable for
performing symmetric and non-symmetric correspondence analysis. Such an
approach has the advantage of allowing the user to identify the source of
variation between the categories in terms of components that reflect
linear, quadratic and higher-order trends. The purpose of this paper is to
focus on the study of two asymmetrically related variables
cross-classified to form a two-way contingency table where only one of the
variables has an ordinal structure.
Journal: Journal of Applied Statistics
Pages: 2119-2132
Issue: 10
Volume: 38
Year: 2011
Keywords: ordered categorical variables, non-symmetric correspondence analysis, bivariate moment decomposition, singular value decomposition, orthogonal polynomials,
X-DOI: 10.1080/02664763.2010.545118
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545118
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2119-2132
Template-Type: ReDIF-Article 1.0
Author-Name: G. Barbato
Author-X-Name-First: G.
Author-X-Name-Last: Barbato
Author-Name: E. M. Barini
Author-X-Name-First: E. M.
Author-X-Name-Last: Barini
Author-Name: G. Genta
Author-X-Name-First: G.
Author-X-Name-Last: Genta
Author-Name: R. Levi
Author-X-Name-First: R.
Author-X-Name-Last: Levi
Title: Features and performance of some outlier detection methods
Abstract:
A review of several statistical methods that are currently in use for
outlier identification is presented, and their performances are compared
theoretically for typical statistical distributions of experimental data,
considering values derived from the distribution of extreme order
statistics as reference terms. A simple modification of a popular, broadly
used method based upon the box-plot is introduced, in order to overcome a
major limitation concerning sample size. Examples are presented concerning
the application of the methods considered to two data sets: a historical one
concerning evaluation of an astronomical constant performed by a number of
leading observatories and a substantial database pertaining to an ongoing
investigation on absolute measurement of gravity acceleration, exhibiting
peculiar aspects concerning outliers. Some problems related to outlier
treatment are examined, and the requirement of both statistical analysis
and expert opinion for proper outlier management is underlined.
Journal: Journal of Applied Statistics
Pages: 2133-2149
Issue: 10
Volume: 38
Year: 2011
Keywords: exclusion rules, order statistics, outliers, robust statistics, statistical test,
X-DOI: 10.1080/02664763.2010.545119
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545119
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2133-2149
Template-Type: ReDIF-Article 1.0
Author-Name: I. Albarran
Author-X-Name-First: I.
Author-X-Name-Last: Albarran
Author-Name: P. J. Alonso
Author-X-Name-First: P. J.
Author-X-Name-Last: Alonso
Author-Name: J. M. Marin
Author-X-Name-First: J. M.
Author-X-Name-Last: Marin
Title: Nonlinear models of disability and age applied to census data
Abstract:
It is usually assumed that the proportion of disabled people grows with
age: the older the person, the greater the level of disability he or she
suffers. However, empirical evidence shows that this assumption is not
always true, or at least it does not hold in the Spanish population. This
study assesses the impact of age on disability in Spain. Each gender has
been treated separately because it can be shown that men and women have
their own patterns of behaviour. Three different methods of estimation
have been used to examine the link between these variables. The results
seem to support the idea that the relationship between age and the
intensity of disability is not always direct. One concluding remark of
this analysis is that the method of estimation has a great influence on
the final results, especially at central ages between 20 and 80 years.
Journal: Journal of Applied Statistics
Pages: 2151-2163
Issue: 10
Volume: 38
Year: 2011
Keywords: disability, local estimation, splines, neural networks, BARS,
X-DOI: 10.1080/02664763.2010.545120
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545120
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2151-2163
Template-Type: ReDIF-Article 1.0
Author-Name: R. S. Sparks
Author-X-Name-First: R. S.
Author-X-Name-Last: Sparks
Author-Name: T. Keighley
Author-X-Name-First: T.
Author-X-Name-Last: Keighley
Author-Name: D. Muscatello
Author-X-Name-First: D.
Author-X-Name-Last: Muscatello
Title: Optimal exponentially weighted moving average (EWMA) plans for detecting seasonal epidemics when faced with non-homogeneous negative binomial counts
Abstract:
Exponentially weighted moving average (EWMA) plans for non-homogeneous
negative binomial counts are developed for detecting the onset of seasonal
disease outbreaks in public health surveillance. These plans are robust to
changes in the in-control mean and over-dispersion parameter of the
negative binomial distribution, and therefore are referred to as adaptive
plans. They differ from the traditional approach of using standardized
forecast errors based on the normality assumption. Plans are investigated
in terms of early signal properties for seasonal epidemics. The paper
demonstrates that the proposed EWMA plan has efficient early detection
properties that can be useful to epidemiologists for communicable and
other disease control and is compared with the CUSUM plan.
Journal: Journal of Applied Statistics
Pages: 2165-2181
Issue: 10
Volume: 38
Year: 2011
Keywords: control charts, EWMA, monitoring, negative binomial counts, statistical process control,
X-DOI: 10.1080/02664763.2010.545184
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545184
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2165-2181
Template-Type: ReDIF-Article 1.0
Author-Name: Kerry Patterson
Author-X-Name-First: Kerry
Author-X-Name-Last: Patterson
Author-Name: Hossein Hassani
Author-X-Name-First: Hossein
Author-X-Name-Last: Hassani
Author-Name: Saeed Heravi
Author-X-Name-First: Saeed
Author-X-Name-Last: Heravi
Author-Name: Anatoly Zhigljavsky
Author-X-Name-First: Anatoly
Author-X-Name-Last: Zhigljavsky
Title: Multivariate singular spectrum analysis for forecasting revisions to real-time data
Abstract:
Real-time data on national accounts statistics typically undergo an
extensive revision process, leading to multiple vintages on the same
generic variable. The time between the publication of the initial and
final data is a lengthy one and raises the question of how to model and
forecast the final vintage of data - an issue that dates from seminal
articles by Mankiw et al. [51], Mankiw and Shapiro [52] and Nordhaus [57].
To solve this problem, we develop the non-parametric method of
multivariate singular spectrum analysis (MSSA) for multi-vintage data.
MSSA is much more flexible than the standard methods of modelling that
involve at least one of the restrictive assumptions of linearity,
normality and stationarity. The benefits are illustrated with data on the
UK index of industrial production: neither the preliminary vintages nor
the competing models are as accurate as the forecasts using MSSA.
Journal: Journal of Applied Statistics
Pages: 2183-2211
Issue: 10
Volume: 38
Year: 2011
Keywords: non-parametric methods, data revisions, trajectory matrix, reconstruction, Hankelisation, recurrence formula, forecasting,
X-DOI: 10.1080/02664763.2010.545371
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545371
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2183-2211
Template-Type: ReDIF-Article 1.0
Author-Name: Carlos dos Santos
Author-X-Name-First: Carlos
Author-X-Name-Last: dos Santos
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Title: A Bayesian analysis for the Block and Basu bivariate exponential distribution in the presence of covariates and censored data
Abstract:
In this paper, we introduce a Bayesian analysis for the Block and Basu
bivariate exponential distribution using Markov chain Monte Carlo (MCMC)
methods, considering lifetimes in the presence of covariates and censored
data. Posterior summaries of interest are obtained using the popular
WinBUGS software. Numerical illustrations are introduced considering a
medical data set related to the recurrence times of infection for kidney
patients and a medical data set related to bone marrow transplantation for
leukemia.
Journal: Journal of Applied Statistics
Pages: 2213-2223
Issue: 10
Volume: 38
Year: 2011
Keywords: Block and Basu exponential distribution, Bayesian analysis, MCMC methods, covariates, censored lifetimes,
X-DOI: 10.1080/02664763.2010.545372
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545372
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2213-2223
Template-Type: ReDIF-Article 1.0
Author-Name: F. Belloc
Author-X-Name-First: F.
Author-X-Name-Last: Belloc
Author-Name: A. Maruotti
Author-X-Name-First: A.
Author-X-Name-Last: Maruotti
Author-Name: L. Petrella
Author-X-Name-First: L.
Author-X-Name-Last: Petrella
Title: How individual characteristics affect university students drop-out: a semiparametric mixed-effects model for an Italian case study
Abstract:
University drop-out is a topic of increasing concern in Italy as well as
in other countries. In empirical analysis, university drop-out is
generally measured by means of a binary variable indicating the drop-out
versus retention. In this paper, we argue that the withdrawal decision is
one of the possible outcomes of a set of four alternatives: retention in
the same faculty, drop out, change of faculty within the same university,
and change of institution. We examine individual-level data collected by
the administrative offices of “Sapienza” University of Rome,
which cover 117 072 students enrolling full-time for a 3-year degree
in the academic years from 2001/2002 to 2006/2007. Relying on a
non-parametric maximum likelihood approach in a finite mixture context, we
introduce a multinomial latent effects model with endogeneity that
accounts for both heterogeneity and omitted covariates. Our estimation
results show that the decisions to change faculty or university have their
own peculiarities; thus, we suggest that caution should be used in
interpreting results obtained without modeling all the relevant
alternatives that students face.
Journal: Journal of Applied Statistics
Pages: 2225-2239
Issue: 10
Volume: 38
Year: 2011
Keywords: University drop-out, mixed effects models, multinomial regression, Italian university system,
X-DOI: 10.1080/02664763.2010.545373
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545373
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2225-2239
Template-Type: ReDIF-Article 1.0
Author-Name: Beibei Guo
Author-X-Name-First: Beibei
Author-X-Name-Last: Guo
Author-Name: Yuehua Wu
Author-X-Name-First: Yuehua
Author-X-Name-Last: Wu
Author-Name: Hong Xie
Author-X-Name-First: Hong
Author-X-Name-Last: Xie
Author-Name: Baiqi Miao
Author-X-Name-First: Baiqi
Author-X-Name-Last: Miao
Title: A segmented regime-switching model with its application to stock market indices
Abstract:
This paper evaluates the ability of a Markov regime-switching log-normal
(RSLN) model to capture the time-varying features of stock return and
volatility. The model displays a better ability to depict a fat-tailed
distribution than a log-normal model, which means that
the RSLN model can describe observed market behavior better. Our major
objective is to explore the capability of the model to capture stock
market behavior over time. By analyzing the behavior of calibrated
regime-switching parameters over different lengths of time intervals, the
change-point concept is introduced and an algorithm is proposed for
identifying the change-points in the series corresponding to the times
when there are changes in parameter estimates. This algorithm for
identifying change-points is tested on the Standard and Poor's 500 monthly
index data from 1971 to 2008, and the Nikkei 225 monthly index data from
1984 to 2008. It is evident that the change-points we identify match the
big events observed in the US stock market and the Japan stock market
(e.g., the October 1987 stock market crash), and that the segmentations of
stock index series, which are defined as the periods between
change-points, match the observed bear-bull market phases.
Journal: Journal of Applied Statistics
Pages: 2241-2252
Issue: 10
Volume: 38
Year: 2011
Keywords: algorithm, change-point, log-normal, log-returns, Markov process, maximum likelihood estimation, segmented regime-switching model, stock market index, time series,
X-DOI: 10.1080/02664763.2010.545374
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545374
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2241-2252
Template-Type: ReDIF-Article 1.0
Author-Name: Stephen Walters
Author-X-Name-First: Stephen
Author-X-Name-Last: Walters
Author-Name: C. Jane Morrell
Author-X-Name-First: C. Jane
Author-X-Name-Last: Morrell
Author-Name: Pauline Slade
Author-X-Name-First: Pauline
Author-X-Name-Last: Slade
Title: Analysing data from a cluster randomized trial (cRCT) in primary care: a case study
Abstract:
Health technology assessment often requires the evaluation of
interventions which are implemented at the level of the health service
organization unit (e.g. GP practice) for clusters of individuals. In a
cluster randomized controlled trial (cRCT), clusters of patients are
randomized, not each patient individually. The majority of statistical
analyses in individually randomized trials assume that the outcomes on different
patients are independent. In cRCTs there is doubt about the validity of
this assumption as the outcomes of patients, in the same cluster, may be
correlated. Hence, the analysis of data from cRCTs presents a number of
difficulties. The aim of this paper is to describe the statistical methods
of adjusting for clustering, in the context of cRCTs. There are
essentially four approaches to analysing cRCTs: (i) cluster-level analysis
using aggregate summary data; (ii) regression analysis with robust standard
errors; (iii) the random-effects/cluster-specific approach; and (iv) the
marginal/population-averaged approach. This paper will compare and
contrast the four approaches, using example data, with binary and
continuous outcomes, from a cRCT designed to evaluate the effectiveness of
training Health Visitors in psychological approaches to identify
post-natal depressive symptoms and support post-natal women compared with
usual care. The PoNDER Trial randomized 101 clusters (GP practices) and
collected data on 2659 new mothers with an 18-month follow-up.
Journal: Journal of Applied Statistics
Pages: 2253-2269
Issue: 10
Volume: 38
Year: 2011
Keywords: cluster randomized trial, GLM, marginal model, random-effects model, GEEs,
X-DOI: 10.1080/02664763.2010.545375
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545375
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2253-2269
Template-Type: ReDIF-Article 1.0
Author-Name: Peng Shi
Author-X-Name-First: Peng
Author-X-Name-Last: Shi
Author-Name: Wei Zhang
Author-X-Name-First: Wei
Author-X-Name-Last: Zhang
Title: A copula regression model for estimating firm efficiency in the insurance industry
Abstract:
This article considers the estimation of insurers' cost-efficiency in a
longitudinal context. The current practice ignores the tails of the cost
distribution, to which the most and least efficient insurers belong. To
address this issue, we propose a copula regression model to estimate
insurers' cost frontier. Both time-invariant and time-varying efficiency
are adapted to this framework and various temporal patterns are
considered. In our method, flexible distributions are allowed for the
marginals, and the subject heterogeneity is accommodated through an
association matrix. Specifically, when fitting to the insurance data, we
perform a GB2 regression on insurers' total cost and employ a t-copula to
capture their intertemporal dependencies. In doing so, we provide a
nonlinear formulation of the stochastic panel frontier and the parameters
are easily estimated by a likelihood-based method. Based on a translog cost
function, the X-efficiency is estimated for US property-casualty insurers.
An economic analysis provides evidence of economies of scale and the
consistency between the cost-efficiency and other performance measures.
Journal: Journal of Applied Statistics
Pages: 2271-2287
Issue: 10
Volume: 38
Year: 2011
Keywords: copula, long-tail regression, longitudinal data, GB2, cost-efficiency,
X-DOI: 10.1080/02664763.2010.545376
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.545376
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2271-2287
Template-Type: ReDIF-Article 1.0
Author-Name: Yinglei Lai
Author-X-Name-First: Yinglei
Author-X-Name-Last: Lai
Author-Name: Baolin Wu
Author-X-Name-First: Baolin
Author-X-Name-Last: Wu
Author-Name: Hongyu Zhao
Author-X-Name-First: Hongyu
Author-X-Name-Last: Zhao
Title: A permutation test approach to the choice of size k for the nearest neighbors classifier
Abstract:
The k nearest neighbors (k-NN) classifier is one of the most popular
methods for statistical pattern recognition and machine learning. In
practice, the size k, the number of neighbors used for classification, is
usually arbitrarily set to one or some other small numbers, or based on
the cross-validation procedure. In this study, we propose a novel
alternative approach to decide the size k. Based on a k-NN-based
multivariate multi-sample test, we assign each k a permutation test based
Z-score. The number of neighbors is set to the k with the highest Z-score. This
approach is computationally efficient since we have derived the formulas
for the mean and variance of the test statistic under permutation
distribution for multiple sample groups. Several simulation and real-world
data sets are analyzed to investigate the performance of our approach. The
usefulness of our approach is demonstrated through the evaluation of
prediction accuracies using Z-score as a criterion to select the size k.
We also compare our approach to the widely used cross-validation
approaches. The results show that the size k selected by our approach
yields high prediction accuracies when informative features are used for
classification, whereas the cross-validation approach may fail in some
cases.
Journal: Journal of Applied Statistics
Pages: 2289-2302
Issue: 10
Volume: 38
Year: 2011
Keywords: nearest neighbors classifier, number of neighbors, permutation test, prediction accuracy, cross-validation,
X-DOI: 10.1080/02664763.2010.547565
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.547565
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2289-2302
Template-Type: ReDIF-Article 1.0
Author-Name: A. Snoussi
Author-X-Name-First: A.
Author-X-Name-Last: Snoussi
Title: SPC for short-run multivariate autocorrelated processes
Abstract:
This paper discusses the development of a multivariate control charting
technique for a short-run autocorrelated-data manufacturing environment. The
proposed approach is a combination of the multivariate residual charts for
autocorrelated data and the multivariate transformation technique for
i.i.d. process observations of short lengths. The proposed approach
consists of fitting an adequate multivariate time-series model to the various
process outputs, computing the residuals, transforming them into
standard normal N(0, 1) data, and then using the standardized data as inputs to
plot conventional univariate i.i.d. control charts. The objective of
applying multivariate finite horizon techniques for autocorrelated
processes is to allow continuous process monitoring, since all process
outputs are controlled through the use of a single control chart with
constant control limits. Through simulated examples, it is shown that
the proposed short-run process monitoring technique provides approximately
the same shift-detection properties as VAR residual charts.
Journal: Journal of Applied Statistics
Pages: 2303-2312
Issue: 10
Volume: 38
Year: 2011
Keywords: time-series model, univariate statistical process control, multivariate statistical process control, SCC control charts, VAR Residual control charts, V statistics, T2 statistics, average run length,
X-DOI: 10.1080/02664763.2010.547566
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.547566
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2303-2312
Template-Type: ReDIF-Article 1.0
Author-Name: R. A. Hubbard
Author-X-Name-First: R. A.
Author-X-Name-Last: Hubbard
Author-Name: X. H. Zhou
Author-X-Name-First: X. H.
Author-X-Name-Last: Zhou
Title: A comparison of non-homogeneous Markov regression models with application to Alzheimer's disease progression
Abstract:
Markov regression models are useful tools for estimating risk factor
effects on transition rates between multiple disease states. Alzheimer's
disease (AD) is an example of a multi-state disease process where great
interest lies in identifying risk factors for transition. In this context,
non-homogeneous models are required because transition rates change as
subjects age. In this report we propose a non-homogeneous Markov
regression model that allows for reversible and recurrent states,
transitions among multiple states between observations, and unequally
spaced observation times. We conducted simulation studies to compare
performance of estimators for covariate effects from this model and
alternative models when the underlying non-homogeneous process was
correctly specified and under model misspecification. In simulation
studies, we found that covariate effects were biased if non-homogeneity of
the disease process was not accounted for. However, estimates from
non-homogeneous models were robust to misspecification of the form of the
non-homogeneity. We used our model to estimate risk factors for transition
to mild cognitive impairment (MCI) and AD in a longitudinal study of
subjects included in the National Alzheimer's Coordinating Center's
Uniform Data Set. We found that subjects with MCI affecting multiple
cognitive domains were significantly less likely to revert to normal
cognition.
Journal: Journal of Applied Statistics
Pages: 2313-2326
Issue: 10
Volume: 38
Year: 2011
Keywords: Alzheimer's disease, interval censoring, Markov process, mild cognitive impairment, non-homogeneous, panel data,
X-DOI: 10.1080/02664763.2010.547567
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.547567
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2313-2326
Template-Type: ReDIF-Article 1.0
Author-Name: Dipankor Coondoo
Author-X-Name-First: Dipankor
Author-X-Name-Last: Coondoo
Author-Name: Amita Majumder
Author-X-Name-First: Amita
Author-X-Name-Last: Majumder
Author-Name: Somnath Chattopadhyay
Author-X-Name-First: Somnath
Author-X-Name-Last: Chattopadhyay
Title: District-level poverty estimation: a proposed method
Abstract:
This paper develops a method of estimating micro-level poverty in cases
where data are scarce. The method is applied to estimate district-level
poverty using the household level Indian national sample survey data for
two states, viz., West Bengal and Madhya Pradesh. The method involves
estimation of state-level poverty indices from the data formed by pooling
data of all the districts (each time excluding one district) and
multiplying this poverty vector with a known weight matrix to obtain the
unknown district-level poverty vector. The proposed method is expected to
yield reliable estimates at the district level, because the district-level
estimate is now based on a much larger sample size obtained by pooling
data of several districts. This method can be an alternative to the
“small area estimation technique” for estimating poverty at
sub-state levels in developing countries.
Journal: Journal of Applied Statistics
Pages: 2327-2343
Issue: 10
Volume: 38
Year: 2011
Keywords: district-level poverty, scarce data, bootstrap, extraneous information, sub-sample estimate,
X-DOI: 10.1080/02664763.2010.547568
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.547568
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2327-2343
Template-Type: ReDIF-Article 1.0
Author-Name: Pao-Sheng Shen
Author-X-Name-First: Pao-Sheng
Author-X-Name-Last: Shen
Title: Empirical likelihood ratio with doubly truncated data
Abstract:
Doubly truncated data appear in a number of applications, including
astronomy and survival analysis. For doubly-truncated data, the lifetime T
is observable only when U≤T≤V, where U and V are the
left- and right-truncation times, respectively. Based on the
empirical likelihood approach of Zhou [21], we propose a modified EM
algorithm of Turnbull [19] to construct the interval estimator of the
distribution function of T. Simulation results indicate that the empirical
likelihood method can be more efficient than the bootstrap method.
Journal: Journal of Applied Statistics
Pages: 2345-2353
Issue: 10
Volume: 38
Year: 2011
Keywords: likelihood ratio, double truncation, maximization,
X-DOI: 10.1080/02664763.2010.549216
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.549216
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2345-2353
Template-Type: ReDIF-Article 1.0
Author-Name: Gregory Wilding
Author-X-Name-First: Gregory
Author-X-Name-Last: Wilding
Author-Name: Xueya Cai
Author-X-Name-First: Xueya
Author-X-Name-Last: Cai
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Author-Name: Zhangsheng Yu
Author-X-Name-First: Zhangsheng
Author-X-Name-Last: Yu
Title: A linear model-based test for the heterogeneity of conditional correlations
Abstract:
Current methods of testing the equality of conditional correlations of
bivariate data on a third variable of interest (covariate) are limited by
the need to discretize the covariate when it is continuous. In this study, we
propose a linear model approach for estimation and hypothesis testing of
the Pearson correlation coefficient, where the correlation itself can be
modeled as a function of continuous covariates. The restricted maximum
likelihood method is applied for parameter estimation, and the corrected
likelihood ratio test is performed for hypothesis testing. This approach
allows for flexible and robust inference and prediction of the conditional
correlations based on the linear model. Simulation studies show that the
proposed method is statistically more powerful and more flexible in
accommodating complex covariate patterns than the existing methods. In
addition, we illustrate the approach by analyzing the correlation between
the physical component summary and the mental component summary of the MOS
SF-36 form across a fair number of covariates in the national survey data.
Journal: Journal of Applied Statistics
Pages: 2355-2366
Issue: 10
Volume: 38
Year: 2011
Keywords: correlation coefficient, heterogeneity, linear model, multivariate normal distribution, MOS SF-36,
X-DOI: 10.1080/02664763.2011.559201
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2011.559201
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2355-2366
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Basic statistics
Abstract:
Journal: Journal of Applied Statistics
Pages: 2367-2367
Issue: 10
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2010.484892
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.484892
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2367-2367
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim Mwitondi
Author-X-Name-First: Kassim
Author-X-Name-Last: Mwitondi
Title: Bayesian computation with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 2367-2368
Issue: 10
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2010.484893
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.484893
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2367-2368
Template-Type: ReDIF-Article 1.0
Author-Name: Andrey Kostenko
Author-X-Name-First: Andrey
Author-X-Name-Last: Kostenko
Title: Picturing the uncertain world
Abstract:
Journal: Journal of Applied Statistics
Pages: 2368-2369
Issue: 10
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2010.517932
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.517932
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2368-2369
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Dynamic linear models with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 2369-2370
Issue: 10
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2010.517938
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.517938
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2369-2370
Template-Type: ReDIF-Article 1.0
Author-Name: Søren Feodor Nielsen
Author-X-Name-First: Søren Feodor
Author-X-Name-Last: Nielsen
Title: Introductory time series with R
Abstract:
Journal: Journal of Applied Statistics
Pages: 2370-2371
Issue: 10
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2010.517940
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.517940
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2370-2371
Template-Type: ReDIF-Article 1.0
Author-Name: Long Kang
Author-X-Name-First: Long
Author-X-Name-Last: Kang
Title: Volatility and time series econometrics: essays in honor of Robert Engle
Abstract:
Journal: Journal of Applied Statistics
Pages: 2371-2372
Issue: 10
Volume: 38
Year: 2011
X-DOI: 10.1080/02664763.2010.530388
File-URL: http://www.tandfonline.com/doi/abs/10.1080/02664763.2010.530388
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:10:p:2371-2372
Template-Type: ReDIF-Article 1.0
Author-Name: H. Zhang
Author-X-Name-First: H.
Author-X-Name-Last: Zhang
Author-Name: Y. Xia
Author-X-Name-First: Y.
Author-X-Name-Last: Xia
Author-Name: R. Chen
Author-X-Name-First: R.
Author-X-Name-Last: Chen
Author-Name: D. Gunzler
Author-X-Name-First: D.
Author-X-Name-Last: Gunzler
Author-Name: W. Tang
Author-X-Name-First: W.
Author-X-Name-Last: Tang
Author-Name: Xin Tu
Author-X-Name-First: Xin
Author-X-Name-Last: Tu
Title: Modeling longitudinal binomial responses: implications from two dueling paradigms
Abstract:
The generalized estimating equations (GEEs) and generalized linear
mixed-effects model (GLMM) are the two most popular paradigms to extend
models for cross-sectional data to a longitudinal setting. Although the
two approaches yield well-interpreted models for continuous outcomes, it
is quite a different story when applied to binomial responses. We discuss
major modeling differences between the GEE- and GLMM-derived models by
presenting new results regarding the model-driven differences. Our results
show that GLMM induces some artifacts in the marginal models at assessment
times, making it inappropriate when applied to such responses from real
study data. The different interpretations of parameters resulting from the
conceptual difference between the two modeling approaches also carry quite
significant implications and ramifications with respect to data and power
analyses. Although a special case involving a scale difference in
parameters between GEE and GLMM has been noted in the literature, its
implications in real data analysis have not been thoroughly addressed.
Further, this special case has a very limited covariate structure and does
not apply to most real studies, especially multi-center clinical trials.
The new results presented fill a substantial gap in the literature
regarding the model-driven differences between the two dueling paradigms.
Journal: Journal of Applied Statistics
Pages: 2373-2390
Issue: 11
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2010.550038
File-URL: http://hdl.handle.net/10.1080/02664763.2010.550038
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2373-2390
Template-Type: ReDIF-Article 1.0
Author-Name: I. Gijbels
Author-X-Name-First: I.
Author-X-Name-Last: Gijbels
Author-Name: I. Prosdocimi
Author-X-Name-First: I.
Author-X-Name-Last: Prosdocimi
Title: Smooth estimation of mean and dispersion function in extended generalized additive models with application to Italian induced abortion data
Abstract:
We analyse data on abortion rate (AR) in Italy with a particular focus on
different behaviours in different regions in Italy. The aim is to try to
reveal the relationship between the AR and several covariates that
describe in some way the modernity of the region and the condition of the
women there. The data are mostly underdispersed and the degree of
underdispersion also varies with the covariates. To analyse these data,
recent techniques for flexible modelling of a mean and dispersion function
in a double exponential family framework are further developed now in a
generalized additive model context for dealing with the multivariate
set-up. The appealing unified framework and approach even allow for
semi-parametric modelling of the covariates without any additional
effort. The methodology is illustrated on ozone-level data and leads to
interesting findings in the Italian abortion data.
Journal: Journal of Applied Statistics
Pages: 2391-2411
Issue: 11
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2010.550039
File-URL: http://hdl.handle.net/10.1080/02664763.2010.550039
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2391-2411
Template-Type: ReDIF-Article 1.0
Author-Name: Hukum Chandra
Author-X-Name-First: Hukum
Author-X-Name-Last: Chandra
Author-Name: Nicola Salvati
Author-X-Name-First: Nicola
Author-X-Name-Last: Salvati
Author-Name: U. C. Sud
Author-X-Name-First: U. C.
Author-X-Name-Last: Sud
Title: Disaggregate-level estimates of indebtedness in the state of Uttar Pradesh in India: an application of small-area estimation technique
Abstract:
The National Sample Survey Organisation (NSSO) surveys are the main
source of official statistics in India, and generate a range of invaluable
data at the macro level (e.g. state and national levels). However, the
NSSO data cannot be used directly to produce reliable estimates at the
micro level (e.g. district or further disaggregate level) due to small
sample sizes. There is a rapidly growing demand for such micro-level
statistics in India, as the country is moving from a centralized to a more
decentralized planning system. In this article, we employ small-area
estimation (SAE) techniques to derive model-based estimates of the
proportion of indebted households at district or at other small-area
levels in the state of Uttar Pradesh in India by linking data from the
Debt--Investment Survey 2002--2003 of NSSO and the Population Census 2001
and the Agriculture Census 2003. Our results show that the model-based
estimates are precise and representative. For many small areas, it is not
even possible to produce estimates using sample data alone; the model-based
estimates generated using SAE are still reliable for such areas. The
estimates are expected to provide invaluable information to policy
analysts and decision-makers.
Journal: Journal of Applied Statistics
Pages: 2413-2432
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559202
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559202
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2413-2432
Template-Type: ReDIF-Article 1.0
Author-Name: Ricardo S. Ehlers
Author-X-Name-First: Ricardo S.
Author-X-Name-Last: Ehlers
Title: Comparison of Bayesian models for production efficiency
Abstract:
In this paper, we use Markov Chain Monte Carlo (MCMC) methods in order to
estimate and compare stochastic production frontier models from a Bayesian
perspective. We consider a number of competing models in terms of
different production functions and the distribution of the asymmetric
error term. All MCMC simulations are done using the package
JAGS (Just Another Gibbs Sampler), a clone of the
classic BUGS package which works closely with the
R package where all the statistical computations
and graphics are done.
Journal: Journal of Applied Statistics
Pages: 2433-2443
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559203
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2433-2443
Template-Type: ReDIF-Article 1.0
Author-Name: Nicola Lama
Author-X-Name-First: Nicola
Author-X-Name-Last: Lama
Author-Name: Patrizia Boracchi
Author-X-Name-First: Patrizia
Author-X-Name-Last: Boracchi
Author-Name: Elia Biganzoli
Author-X-Name-First: Elia
Author-X-Name-Last: Biganzoli
Title: Partial logistic relevance vector machines in survival analysis
Abstract:
The use of relevance vector machines to flexibly model hazard rate
functions is explored. This technique is adapted to survival analysis
problems through the partial logistic approach. The method exploits the
Bayesian automatic relevance determination procedure to obtain sparse
solutions and it incorporates the flexibility of kernel-based models.
Example results are presented on literature data from a head-and-neck
cancer survival study using Gaussian and spline kernels. Sensitivity
analysis is conducted to assess the influence of hyperprior distribution
parameters. The proposed method is then contrasted with other flexible
hazard regression methods, in particular the HARE model proposed by
Kooperberg et al. [16]. A simulation study is conducted
to carry out the comparison. The model developed in this paper exhibited
good performance in the prediction of hazard rate. The application of this
sparse Bayesian technique to a real cancer data set demonstrated that the
proposed method can potentially reveal characteristics of the hazards,
associated with the dynamics of the studied diseases, which may be missed
by existing modeling approaches based on different perspectives on the
bias vs. variance balance.
Journal: Journal of Applied Statistics
Pages: 2445-2458
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559204
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559204
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2445-2458
Template-Type: ReDIF-Article 1.0
Author-Name: Asghar Seif
Author-X-Name-First: Asghar
Author-X-Name-Last: Seif
Author-Name: Alireza Faraz
Author-X-Name-First: Alireza
Author-X-Name-Last: Faraz
Author-Name: Cédric Heuchenne
Author-X-Name-First: Cédric
Author-X-Name-Last: Heuchenne
Author-Name: Erwin Saniga
Author-X-Name-First: Erwin
Author-X-Name-Last: Saniga
Author-Name: M. B. Moghadam
Author-X-Name-First: M. B.
Author-X-Name-Last: Moghadam
Title: A modified economic-statistical design of the T-super-2 control chart with variable sample sizes and control limits
Abstract:
Recent studies have shown that using variable sampling size and control
limits (VSSC) schemes results in charts with more statistical power than
variable sampling size (VSS) schemes when detecting small to moderate
shifts in the process mean vector. This paper presents an economic-statistical
design (ESD) of the VSSC T-super-2 control chart using the general model
of Lorenzen and Vance [22]. The genetic algorithm approach is then
employed to search for the optimal values of the six test parameters of
the chart. We then compare the expected cost per unit of time of the
optimally designed VSSC chart with optimally designed VSS and FRS (fixed
ratio sampling) T-super-2 charts as well as MEWMA charts.
Journal: Journal of Applied Statistics
Pages: 2459-2469
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559205
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559205
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2459-2469
Template-Type: ReDIF-Article 1.0
Author-Name: Sandra De Iaco
Author-X-Name-First: Sandra
Author-X-Name-Last: De Iaco
Title: A new space--time multivariate approach for environmental data analysis
Abstract:
Air quality control usually requires a monitoring system of multiple
indicators measured at various points in space and time. Hence, the use of
space--time multivariate techniques is of fundamental importance in this
context, where decisions and actions regarding environmental protection
should be supported by studies based on both inter-variable relations
and spatial--temporal correlations. This paper describes how canonical
correlation analysis can be combined with space--time geostatistical
methods for analysing two spatial--temporal correlated aspects, such as
air pollution concentrations and meteorological conditions. Hourly
averages of three pollutants (nitric oxide, nitrogen dioxide and ozone)
and three atmospheric indicators (temperature, humidity and wind speed)
taken for two critical months (February and August) at several monitoring
stations are considered and space--time variograms for the variables are
estimated. Simultaneous relationships between such sample space--time
variograms are determined through canonical correlation analysis. The most
correlated canonical variates are used to synthetically describe the
underlying space--time behaviour of the components of the two sets.
Journal: Journal of Applied Statistics
Pages: 2471-2483
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559206
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559206
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2471-2483
Template-Type: ReDIF-Article 1.0
Author-Name: Suzan Gazioğlu
Author-X-Name-First: Suzan
Author-X-Name-Last: Gazioğlu
Author-Name: E. Marian Scott
Author-X-Name-First: E. Marian
Author-X-Name-Last: Scott
Title: Sensitivity analysis of linear time-invariant compartmental models with steady-state constraint
Abstract:
Compartmental models have been widely used in modelling systems in
pharmaco-kinetics, engineering, biomedicine and ecology since 1943 and
turn out to be very good approximations for many different real-life
systems. Sensitivity analysis (SA) is commonly employed at a preliminary
stage of the model development process to increase confidence in the model
and its predictions by providing an understanding of how the model
response variables respond to changes in the inputs, data used to
calibrate it and model structures. This paper concerns the application of
some SA techniques to a linear, deterministic, time-invariant
compartmental model of global carbon cycle (GCC). The same approach is
also illustrated with a more complex GCC model which has some nonlinear
components. By focusing on these two structurally different models for
estimating the atmospheric CO2 content in the year 2100,
sensitivity of model predictions to uncertainty attached to the model
input factors is studied. The application/modification of SA techniques to
compartmental models with steady-state constraint is explored using the
8-compartment model, and computational methods developed to maintain the
initial steady-state condition are presented. In order to adjust the
values of model input factors to achieve an acceptable match between
observed and predicted model conditions, windowing analysis is used.
Journal: Journal of Applied Statistics
Pages: 2485-2509
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559207
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559207
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2485-2509
Template-Type: ReDIF-Article 1.0
Author-Name: N. T. Longford
Author-X-Name-First: N. T.
Author-X-Name-Last: Longford
Author-Name: Pierpaolo D'Urso
Author-X-Name-First: Pierpaolo
Author-X-Name-Last: D'Urso
Title: Mixture models with an improper component
Abstract:
A class of mixture models in which a component is associated with an
improper distribution is introduced. This component is intended mainly for
outliers. The models are motivated by the EM algorithm, and are fitted by
its simple adaptation. They are illustrated on several examples with large
samples, one of them about transactions of residential properties in
Wellington, New Zealand, in 2006.
Journal: Journal of Applied Statistics
Pages: 2511-2521
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559208
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559208
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2511-2521
Template-Type: ReDIF-Article 1.0
Author-Name: Melody S. Goodman
Author-X-Name-First: Melody S.
Author-X-Name-Last: Goodman
Author-Name: Yi Li
Author-X-Name-First: Yi
Author-X-Name-Last: Li
Author-Name: Ram C. Tiwari
Author-X-Name-First: Ram C.
Author-X-Name-Last: Tiwari
Title: Detecting multiple change points in piecewise constant hazard functions
Abstract:
Data from the National Cancer Institute (NCI) suggest a sudden reduction in
prostate cancer mortality rates, likely due to highly successful
treatments and screening methods for early diagnosis. We are interested in
understanding the impact of medical breakthroughs, treatments, or
interventions, on the survival experience for a population. For this
purpose, estimating the underlying hazard function, with possible time
change points, would be of substantial interest, as it will provide a
general picture of the survival trend and when this trend is disrupted.
Increasing attention has been given to testing the assumption of a
constant failure rate against a failure rate that changes at a single
point in time. We expand the set of alternatives to allow for the
consideration of multiple change points, and propose a model selection
algorithm using sequential testing for the piecewise constant hazard
model. These methods are data driven and allow us to estimate not only the
number of change points in the hazard function but where those changes
occur. Such an analysis allows for better understanding of how changing
medical practice affects the survival experience for a patient population.
We test for change points in prostate cancer mortality rates using the NCI
Surveillance, Epidemiology, and End Results dataset.
Journal: Journal of Applied Statistics
Pages: 2523-2532
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559209
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559209
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2523-2532
Template-Type: ReDIF-Article 1.0
Author-Name: Himadri Ghosh
Author-X-Name-First: Himadri
Author-X-Name-Last: Ghosh
Author-Name: Prajneshu
Author-X-Name-First:
Author-X-Name-Last: Prajneshu
Title: Statistical learning theory for fitting multimodal distribution to rainfall data: an application
Abstract:
The promising methodology of “Statistical Learning
Theory” for the estimation of a multimodal distribution is thoroughly
studied. The “tail” is estimated through Hill's, UH and
moment methods. The threshold value is determined by nonparametric
bootstrap and the minimum mean square error criterion. Further, the
“body” is estimated by the nonparametric structural risk
minimization method of the empirical distribution function under the
regression set-up. As an illustration, rainfall data for the
meteorological subdivision of Orissa, India during the period 1871--2006
are used. It is shown that Hill's method has performed the best for tail
density. Finally, the combined estimated “body” and
“tail” of the multimodal distribution is shown to capture
the multimodality present in the data.
Journal: Journal of Applied Statistics
Pages: 2533-2545
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559210
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559210
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2533-2545
Template-Type: ReDIF-Article 1.0
Author-Name: Rakhee Dinubhai Patel
Author-X-Name-First: Rakhee Dinubhai
Author-X-Name-Last: Patel
Author-Name: Frederic Paik Schoenberg
Author-X-Name-First: Frederic Paik
Author-X-Name-Last: Schoenberg
Title: A graphical test for local self-similarity in univariate data
Abstract:
The Pareto distribution, or power-law distribution, has long been used to
model phenomena in many fields, including wildfire sizes, earthquake
seismic moments and stock price changes. Recent observations have brought
the fit of the Pareto into question, however, particularly in the upper
tail where it often overestimates the frequency of the largest events.
This paper proposes a graphical self-similarity test specifically designed
to assess whether a Pareto distribution fits better than a tapered Pareto
or another alternative. Unlike some model selection methods, this
graphical test provides the advantage of highlighting where the model fits
well and where it breaks down. Specifically, for data that seem to be
better modeled by the tapered Pareto or other alternatives, the test
assesses the degree of local self-similarity at each value where the test
is computed. The basic properties of the graphical test and its
implementation are discussed, and applications of the test to
seismological, wildfire, and financial data are considered.
Journal: Journal of Applied Statistics
Pages: 2547-2562
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559211
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559211
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2547-2562
Template-Type: ReDIF-Article 1.0
Author-Name: Chia-Lin Chang
Author-X-Name-First: Chia-Lin
Author-X-Name-Last: Chang
Author-Name: Michael McAleer
Author-X-Name-First: Michael
Author-X-Name-Last: McAleer
Author-Name: Les Oxley
Author-X-Name-First: Les
Author-X-Name-Last: Oxley
Title: How are journal impact, prestige and article influence related? An application to neuroscience
Abstract:
The paper analyzes the leading journals in neurosciences using
quantifiable research assessment measures (RAM), highlights the
similarities and differences in alternative RAM, shows that several RAM
capture similar performance characteristics of highly cited journals, and
shows that some other RAM have low correlations with each other, and hence
add significant informational value. Alternative RAM are discussed for the
Thomson Reuters ISI Web of Science database (hereafter ISI). The RAM that
are calculated annually or updated daily include the classic 2-year impact
factor (2YIF), 5-year impact factor, immediacy (or zero-year impact
factor), Eigenfactor score, article influence score, C3PO (citation
performance per paper online), h-index, Zinfluence, PI-BETA (papers
ignored by even the authors), 2-year and historical self-citation
threshold approval ratings, impact factor inflation, and cited article
influence (CAI). The RAM are analyzed for 26 highly cited journals in the
ISI category of neurosciences. The paper finds that the Eigenfactor score
and PI-BETA are not highly correlated with the other RAM scores, so that
they convey additional information regarding journal rankings, that
article influence is highly correlated with some existing RAM, so that it
has little informative incremental value, and that CAI has additional
informational value to that of article influence. Harmonic mean rankings
of the 13 RAM criteria for the 26 highly cited journals are also
presented. Emphasizing the 2YIF of a journal to the exclusion of other
informative RAM criteria is shown to lead to a distorted evaluation of
journal performance and influence, especially given the informative value
of several other RAM.
Journal: Journal of Applied Statistics
Pages: 2563-2573
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559212
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559212
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2563-2573
Template-Type: ReDIF-Article 1.0
Author-Name: Chi-Shuan Liu
Author-X-Name-First: Chi-Shuan
Author-X-Name-Last: Liu
Author-Name: Fang-Chih Tien
Author-X-Name-First: Fang-Chih
Author-X-Name-Last: Tien
Title: A single-featured EWMA-X control chart for detecting shifts in process mean and standard deviation
Abstract:
The combined EWMA-X chart is a commonly used tool for
monitoring both large and small process shifts. However, this chart
requires calculating and monitoring two statistics along with two sets of
control limits. Thus, this study develops a single-featured
EWMA-X (called SFEWMA-X) control chart
which has the ability to simultaneously monitor both large and small
process shifts using only one statistic and one set of control limits. The
proposed SFEWMA-X chart is further extended to monitoring
the shifts in process standard deviation. A set of simulated data are used
to demonstrate the proposed chart's superior performance in terms of
average run length compared with that of the traditional charts. The
experimental examples also show that the SFEWMA-X chart
is neater and easier to visually interpret than the original
EWMA-X chart.
Journal: Journal of Applied Statistics
Pages: 2575-2596
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559213
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559213
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2575-2596
Template-Type: ReDIF-Article 1.0
Author-Name: Kadri Ulas Akay
Author-X-Name-First: Kadri Ulas
Author-X-Name-Last: Akay
Author-Name: Müjgan Tez
Author-X-Name-First: Müjgan
Author-X-Name-Last: Tez
Title: Alternative modeling techniques for the quantal response data in mixture experiments
Abstract:
Mixture experiments are commonly encountered in many fields including
chemical, pharmaceutical and consumer product industries. Due to their
wide applications, mixture experiments, a special study of response
surface methodology, have been given greater attention in both model
building and determination of designs compared with other experimental
studies. In this paper, some new approaches are suggested for model
building and selection in the analysis of data from mixture experiments
by using a special generalized linear model, the logistic regression model
proposed by Chen et al. [7]. Generally, the special
mixture models, which do not have a constant term, are highly affected by
collinearity in modeling the mixture experiments. For this reason, in
order to alleviate the undesired effects of collinearity in the analysis
of mixture experiments with logistic regression, a new mixture model is
defined with an alternative ratio variable. The deviance analysis table is
given for standard mixture polynomial models defined by transformations
and special mixture models used as linear predictors. The effects of
components on the response in the restricted experimental region are given
by using an alternative representation of Cox's direction approach. In
addition, odds ratio and the confidence intervals of odds ratio are
identified according to the chosen reference and control groups. To
compare the suggested models, some model selection criteria, graphical
odds ratio and the confidence intervals of the odds ratio are used. The
advantage of the suggested approaches is illustrated on a tumor incidence
data set.
Journal: Journal of Applied Statistics
Pages: 2597-2616
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.559214
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559214
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2597-2616
Template-Type: ReDIF-Article 1.0
Author-Name: Raymond Hubbard
Author-X-Name-First: Raymond
Author-X-Name-Last: Hubbard
Title: The widespread misinterpretation of p-values as error probabilities
Abstract:
The anonymous mixing of Fisherian (p-values) and
Neyman--Pearsonian (α levels) ideas about testing, distilled in the
customary but misleading p &lt; α criterion of
statistical significance, has led researchers in the social and management
sciences (and elsewhere) to commonly misinterpret the
p-value as a ‘data-adjusted’ Type I error
rate. Evidence substantiating this claim is provided from a number of
fronts, including comments by statisticians, articles judging the value of
significance testing, textbooks, surveys of scholars, and the statistical
reporting behaviours of applied researchers. That many investigators do
not know the difference between p’s and
α’s indicates much bewilderment over what those most ardently
sought research outcomes—statistically significant
results—mean. Statisticians can play a leading role in clearing
this confusion. A good starting point would be to abolish the
p &lt; α criterion of statistical significance.
Journal: Journal of Applied Statistics
Pages: 2617-2626
Issue: 11
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.567245
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567245
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2617-2626
Template-Type: ReDIF-Article 1.0
Author-Name: Alessio Farcomeni
Author-X-Name-First: Alessio
Author-X-Name-Last: Farcomeni
Author-Name: Alessandra Nardi
Author-X-Name-First: Alessandra
Author-X-Name-Last: Nardi
Author-Name: Elena Fabrizi
Author-X-Name-First: Elena
Author-X-Name-Last: Fabrizi
Title: Joint analysis of occurrence and time to stability after entrance into the Italian labour market: an approach based on a Bayesian cure model with structured stochastic search variable selection
Abstract:
Precarious employment is a serious social problem, especially in those
countries, such as Italy, where there are limited benefits from social
security. We investigate this phenomenon by analysing the initial part of
the career of employees starting with unstable contracts for a panel of
Italian workers. Our aim is to estimate the probability of getting a
stable job and to detect factors influencing both this probability and the
duration of precariousness. To answer these questions, we use an
ad hoc mixture cure rate model in a Bayesian framework.
Journal: Journal of Applied Statistics
Pages: 2627-2646
Issue: 11
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.567246
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567246
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2627-2646
Template-Type: ReDIF-Article 1.0
Author-Name: Shuen-Lin Jeng
Author-X-Name-First: Shuen-Lin
Author-X-Name-Last: Jeng
Author-Name: Yu-Te Liu
Author-X-Name-First: Yu-Te
Author-X-Name-Last: Liu
Title: Adaptive tangent distance classifier on recognition of handwritten digits
Abstract:
Simard et al. [16,17] proposed a transformation distance
called “tangent distance” (TD) which can make pattern
recognition efficient. The key idea is to construct a distance measure
which is invariant with respect to some chosen transformations. In this
research, we provide a method using adaptive TD based on an idea inspired
by “discriminant adaptive nearest neighbor” [7]. This method
is relatively simple compared with many more complicated ones. A real
handwritten recognition data set is used to illustrate our new method. Our
results demonstrate that the proposed method gives lower classification
error rates than those by standard implementation of neural networks and
support vector machines and is as good as several other complicated
approaches.
Journal: Journal of Applied Statistics
Pages: 2647-2659
Issue: 11
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.567247
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567247
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2647-2659
Template-Type: ReDIF-Article 1.0
Author-Name: K. Fernández-Aguirre
Author-X-Name-First: K.
Author-X-Name-Last: Fernández-Aguirre
Author-Name: M. I. Landaluce-Calvo
Author-X-Name-First: M. I.
Author-X-Name-Last: Landaluce-Calvo
Author-Name: A. Mart�n-Arroyuelos
Author-X-Name-First: A.
Author-X-Name-Last: Mart�n-Arroyuelos
Author-Name: J. I. Modroño-Herrán
Author-X-Name-First: J. I.
Author-X-Name-Last: Modroño-Herrán
Title: Knowledge extraction from a large on-line survey: a case study for a higher education corporate marketing
Abstract:
For a relatively young public higher education institution facing local
competition from a long-established and reputable private one, it is of
great importance to become a reference institution: better known,
identified with the society it belongs to, and ultimately well positioned
within the European Higher Education Area. These considerations have led
the university governors to set the objective of adequately managing the
university's institutional brand, focusing on its logo and on image
promotion, and to establish a university shop as a highly suitable
instrument for such promotion. In this
context, an on-line survey is launched on three different kinds of members
of the institution, resulting in a large data sample. Different kinds of
variables are analysed through appropriate exploratory multivariate
techniques (symmetrical methods) and regression-related techniques
(non-symmetrical methods). An advocacy for such combination is given as a
conclusion. The application of statistical techniques of data and text
mining provides us with empirical insights about the institution
members’ perceptions and helps us to extract some facts valuable to
establish policies that would improve the corporate identity and the
success of the corporate shop.
Journal: Journal of Applied Statistics
Pages: 2661-2679
Issue: 11
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.567248
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567248
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2661-2679
Template-Type: ReDIF-Article 1.0
Author-Name: Ron S. Kenett
Author-X-Name-First: Ron S.
Author-X-Name-Last: Kenett
Title: On the planning and design of sample surveys
Journal: Journal of Applied Statistics
Pages: 2681-2681
Issue: 11
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2011.616688
File-URL: http://hdl.handle.net/10.1080/02664763.2011.616688
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:11:p:2681-2681
Template-Type: ReDIF-Article 1.0
Author-Name: M. J. Faddy
Author-X-Name-First: M. J.
Author-X-Name-Last: Faddy
Author-Name: D. M. Smith
Author-X-Name-First: D. M.
Author-X-Name-Last: Smith
Title: Analysis of count data with covariate dependence in both mean and variance
Abstract:
Extended Poisson process modelling is generalised to allow for
covariate-dependent dispersion as well as a covariate-dependent mean
response. This is done by a re-parameterisation that uses approximate
expressions for the mean and variance. Such modelling allows under- and
over-dispersion, or a combination of both, in the same data set to be
accommodated within the same modelling framework. All the necessary
calculations can be done numerically, enabling maximum likelihood
estimation of all model parameters to be carried out. The modelling is
applied to re-analyse two published data sets, where there is evidence of
covariate-dependent dispersion, with the modelling leading to more
informative analyses of these data and more appropriate measures of the
precision of any estimates.
Journal: Journal of Applied Statistics
Pages: 2683-2694
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.567250
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567250
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2683-2694
Template-Type: ReDIF-Article 1.0
Author-Name: Ramesh C. Gupta
Author-X-Name-First: Ramesh C.
Author-X-Name-Last: Gupta
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Title: Weighted inverse Gaussian -- a versatile lifetime model
Abstract:
Jorgensen et al. [14] introduced a three-parameter
generalized inverse Gaussian distribution, which is a mixture of the
inverse Gaussian distribution and the length-biased inverse Gaussian
distribution. The Birnbaum--Saunders distribution is a special case for a
particular value of the mixing parameter p. It is observed that the
estimators of the unknown parameters can be obtained by solving a
three-dimensional optimization problem, which may not be a trivial issue.
Most of the iterative algorithms are quite sensitive to the initial
guesses. In this paper, we propose to use the EM algorithm to estimate the
unknown parameters for complete and censored samples. In the proposed EM
algorithm, at the M-step the optimization problem can be solved
analytically, and the observed Fisher information matrix can be obtained.
These can be used to construct asymptotic confidence intervals of the
unknown parameters. Some simulation experiments are conducted to examine
the performance of the proposed EM algorithm, and it is observed that the
performances are quite satisfactory. The methodology proposed here is
illustrated by three data sets.
Journal: Journal of Applied Statistics
Pages: 2695-2708
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.567251
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567251
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2695-2708
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: Yen-Chang Chang
Author-X-Name-First: Yen-Chang
Author-X-Name-Last: Chang
Title: Comparison between method of moments and entropy regularization algorithm applied to parameter estimation for mixed-Weibull distribution
Abstract:
The mixed-Weibull distribution has been used to model a wide range of failure
data sets, and in many practical situations the number of components in a
mixture model is unknown. Thus, the parameter estimation of a
mixed-Weibull distribution is considered and the important issue of how to
determine the number of components is discussed. Two approaches are
proposed to solve this problem. One is the method of moments and the other
is a regularization type of fuzzy clustering algorithm. Finally, numerical
examples and two real data sets are given to illustrate the features of
the proposed approaches.
Journal: Journal of Applied Statistics
Pages: 2709-2722
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.567252
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567252
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2709-2722
Template-Type: ReDIF-Article 1.0
Author-Name: G. B. Cybis
Author-X-Name-First: G. B.
Author-X-Name-Last: Cybis
Author-Name: S. R. C. Lopes
Author-X-Name-First: S. R. C.
Author-X-Name-Last: Lopes
Author-Name: H. P. Pinheiro
Author-X-Name-First: H. P.
Author-X-Name-Last: Pinheiro
Title: Power of the likelihood ratio test for models of DNA base substitution
Abstract:
The goal of this work is to study the properties of the likelihood ratio
(LR) tests comparing base substitution models. These are the most widely
used hypothesis tests. Under mild regularity conditions, we show that the
asymptotic distribution of the LR test statistic, under the alternative
hypothesis, is a non-central chi-square distribution.
The asymptotic normal distribution of the LR test is proved when the
sequence length S goes to infinity. We also propose a
consistent estimator for the non-centrality parameter D.
Through asymptotic theory and based on this consistent estimator for
D, we propose a low computational cost estimator for the
power of the LR test. The methodology is applied to 17 different gene
sequences of the ECP--EDN family in primates.
Journal: Journal of Applied Statistics
Pages: 2723-2737
Issue: 12
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.567253
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567253
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2723-2737
Template-Type: ReDIF-Article 1.0
Author-Name: Chang-Xing Ma
Author-X-Name-First: Chang-Xing
Author-X-Name-Last: Ma
Author-Name: Albert Vexler
Author-X-Name-First: Albert
Author-X-Name-Last: Vexler
Author-Name: Enrique F. Schisterman
Author-X-Name-First: Enrique F.
Author-X-Name-Last: Schisterman
Author-Name: Lili Tian
Author-X-Name-First: Lili
Author-X-Name-Last: Tian
Title: Cost-efficient designs based on linearly associated biomarkers
Abstract:
A major limiting factor in much epidemiological and environmental
research is the cost of measuring the biomarkers or analytes of
interest. Often, the number of specimens available for analysis is greater
than the number of assays that are budgeted for. These assays are then
performed on a random sample of specimens. Regression calibration is then
utilized to infer biomarker levels of expensive assays from other
correlated biomarkers that are relatively inexpensive to obtain and
analyze. In other contexts, use of pooled specimens has been shown to
increase efficiency in estimation. In this article, we examine two types
of pooling in lieu of a random sample. The first is
random (or traditional) pooling, and we characterize the second as
“optimal” pooling. The second, which we propose for
regression analysis, is pooling based on specimens ranked on the less
expensive biomarker. The more expensive assay is then performed on the
pool of relatively similar measurements. The optimal nature of this
technique is also exemplified via Monte Carlo evaluations and real
biomarker data. A Monte Carlo study also demonstrates the considerable
robustness of our method, showing that the proposed pooling design is a
viable option whenever expensive assays are considered.
Journal: Journal of Applied Statistics
Pages: 2739-2750
Issue: 12
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.567254
File-URL: http://hdl.handle.net/10.1080/02664763.2011.567254
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2739-2750
Template-Type: ReDIF-Article 1.0
Author-Name: Tina Žagar
Author-X-Name-First: Tina
Author-X-Name-Last: Žagar
Author-Name: Vesna Zadnik
Author-X-Name-First: Vesna
Author-X-Name-Last: Zadnik
Author-Name: Maja Primic Žakelj
Author-X-Name-First: Maja Primic
Author-X-Name-Last: Žakelj
Title: Local standardized incidence ratio estimates and comparison with other mapping methods for small geographical areas using Slovenian breast cancer data
Abstract:
Cancer maps are important tools in the descriptive presentation of the
cancer burden. The objective is to explore the advantages and
disadvantages of mapping methods based on point data in comparison with
maps based on aggregated data. Four types of maps were prepared based on
the same underlying data set on breast cancer incidence in Slovenian
females, 2002--2004. First, the standardized incidence ratios (SIR) by
municipalities are mapped in a traditional way. Second, two maps applying
widely used smoothing methods are presented, both based on aggregated
municipalities’ data: floating weighted averages and the Bayesian
hierarchical modelling. Finally, the new alternative method based on exact
cancer cases and population coordinates is applied -- called the local SIR
estimates. The decreasing west-to-east trend is visible on all map types.
Smoothing produced more stable and less noisy SIR estimates. The map of
the local SIR estimates emphasizes extremes but, unlike the map based on
the observed SIR, these estimates are statistically stable, enabling more
accurate evaluation. The main advantages of local SIR estimates over the
other three methods are the ability to reveal more localized patterns and
to ignore arbitrary administrative borders. The disadvantage is that
geocoded data are not always available.
Journal: Journal of Applied Statistics
Pages: 2751-2761
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570314
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570314
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2751-2761
Template-Type: ReDIF-Article 1.0
Author-Name: Solaiman Afroughi
Author-X-Name-First: Solaiman
Author-X-Name-Last: Afroughi
Author-Name: Soghrat Faghihzadeh
Author-X-Name-First: Soghrat
Author-X-Name-Last: Faghihzadeh
Author-Name: Majid Jafari Khaledi
Author-X-Name-First: Majid Jafari
Author-X-Name-Last: Khaledi
Author-Name: Mehdi Ghandehari Motlagh
Author-X-Name-First: Mehdi Ghandehari
Author-X-Name-Last: Motlagh
Author-Name: Ebrahim Hajizadeh
Author-X-Name-First: Ebrahim
Author-X-Name-Last: Hajizadeh
Title: Analysis of clustered spatially correlated binary data using autologistic model and Bayesian method with an application to dental caries of 3--5-year-old children
Abstract:
The autologistic model, first introduced by Besag, is a popular tool for
analyzing binary data in spatial lattices. However, no investigation has
been found that considers modeling binary data clustered in uncorrelated
lattices. Owing to the spatial dependency of responses, exact likelihood
estimation of the parameters is not possible. To circumvent this
difficulty, many studies have been designed to approximate the likelihood
and the related partition function of the model. Consequently, traditional
and Bayesian estimation methods based on the likelihood function are often
time-consuming and require heavy computations and recursive techniques.
Some investigators have introduced and implemented data augmentation and
latent variable models to reduce computational complications in parameter
estimation. In this work, spatially correlated binary data distributed in
uncorrelated lattices were modeled using autologistic regression, a
Bayesian inference was developed with the contribution of data
augmentation, and the proposed models were applied to the caries
experiences of deciduous teeth.
Journal: Journal of Applied Statistics
Pages: 2763-2774
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570315
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570315
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2763-2774
Template-Type: ReDIF-Article 1.0
Author-Name: Silvia Bacci
Author-X-Name-First: Silvia
Author-X-Name-Last: Bacci
Author-Name: Valeria Caviezel
Author-X-Name-First: Valeria
Author-X-Name-Last: Caviezel
Title: Multilevel IRT models for the university teaching evaluation
Abstract:
In this paper, a generalization of the two-parameter partial credit model
(2PL-PCM) and of two special cases, the partial credit model (PCM) and the
rating scale model (RSM), with a hierarchical data structure will be
presented. Having shown how the 2PL-PCM, like other item response theory
(IRT) models, may be read in terms of a generalized linear mixed model
(GLMM) with two aggregation levels, an extension is presented to the case
of measuring the latent trait of individuals aggregated in groups. The use
of this multilevel IRT model will be
illustrated via reference to the evaluation of university teaching by
students following the courses. The aim is to generate a ranking of
teaching on the basis of student satisfaction, so as to give teachers, and
those responsible for organizing study courses, a background of
information that takes the opinions of the direct target group for
university teaching (that is, the students) into account, in the context
of improving the teaching courses available.
Journal: Journal of Applied Statistics
Pages: 2775-2791
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570316
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570316
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2775-2791
Template-Type: ReDIF-Article 1.0
Author-Name: Manuel G. Scotto
Author-X-Name-First: Manuel G.
Author-X-Name-Last: Scotto
Author-Name: Susana M. Barbosa
Author-X-Name-First: Susana M.
Author-X-Name-Last: Barbosa
Author-Name: Andrés M. Alonso
Author-X-Name-First: Andrés M.
Author-X-Name-Last: Alonso
Title: Extreme value and cluster analysis of European daily temperature series
Abstract:
Time series of daily mean temperature obtained from the European Climate
Assessment data set are analyzed with respect to their extremal properties.
A time-series clustering approach which combines Bayesian methodology,
extreme value theory and classification techniques is adopted for the
analysis of the regional variability of temperature extremes. The daily
mean temperature records are clustered on the basis of their corresponding
predictive distributions for 25-, 50- and 100-year return values. The
results of the cluster analysis show a clear distinction between the
highest altitude stations, for which the return values are lowest, and the
remaining stations. Furthermore, a clear distinction is also found between
the northernmost stations in Scandinavia and the stations in central and
southern Europe. This spatial structure of the distributions of 25-, 50-
and 100-year return values seems to be consistent with projected changes
in the variability of temperature extremes over Europe pointing to a
different behavior in central Europe than in northern Europe and the
Mediterranean area, possibly related to the effect of soil moisture and
land-atmosphere coupling.
Journal: Journal of Applied Statistics
Pages: 2793-2804
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.570317
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570317
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2793-2804
Template-Type: ReDIF-Article 1.0
Author-Name: Peng Bai
Author-X-Name-First: Peng
Author-X-Name-Last: Bai
Author-Name: Wen Gan
Author-X-Name-First: Wen
Author-X-Name-Last: Gan
Author-Name: Lei Shi
Author-X-Name-First: Lei
Author-X-Name-Last: Shi
Title: Bayesian confidence interval for the risk ratio in a correlated 2 × 2 table with structural zero
Abstract:
This paper studies the construction of a Bayesian confidence interval for
the risk ratio (RR) in a 2 × 2 table with structural zero. Under a
Dirichlet prior distribution, the exact posterior distribution of the RR
is derived, and tail-based interval is suggested for constructing Bayesian
confidence interval. The frequentist performance of this confidence
interval is investigated by simulation and compared with the score-based
interval in terms of the mean coverage probability and mean expected width
of the interval. An advantage of the Bayesian confidence interval is that
it is well defined for all data structures and has a shorter expected width.
Our simulation shows that the Bayesian tail-based interval under
Jeffreys’ prior performs as well as or better than the score-based
confidence interval.
Journal: Journal of Applied Statistics
Pages: 2805-2817
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570318
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570318
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2805-2817
Template-Type: ReDIF-Article 1.0
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Emilia Athayde
Author-X-Name-First: Emilia
Author-X-Name-Last: Athayde
Author-Name: Cecilia Azevedo
Author-X-Name-First: Cecilia
Author-X-Name-Last: Azevedo
Author-Name: Carolina Marchant
Author-X-Name-First: Carolina
Author-X-Name-Last: Marchant
Title: Modeling wind energy flux by a Birnbaum--Saunders distribution with an unknown shift parameter
Abstract:
In this paper, we discuss a Birnbaum--Saunders distribution with an
unknown shift parameter and apply it to wind energy modeling. We describe
structural aspects of this distribution including properties, moments,
mode and hazard and shape analyses. We also discuss estimation, goodness
of fit and diagnostic methods for this distribution. A computational
implementation in R language of the obtained
results is provided. Finally, we apply these results to two unpublished
real wind speed data sets from Chile, which allows us to show the
characteristics of this statistical distribution and to model wind energy
flux.
Journal: Journal of Applied Statistics
Pages: 2819-2838
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570319
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570319
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2819-2838
Template-Type: ReDIF-Article 1.0
Author-Name: Athanasios C. Rakitzis
Author-X-Name-First: Athanasios C.
Author-X-Name-Last: Rakitzis
Author-Name: Demetrios L. Antzoulakos
Author-X-Name-First: Demetrios L.
Author-X-Name-Last: Antzoulakos
Title: On the improvement of one-sided S control charts
Abstract:
The most common charting procedure used for monitoring the variance of
the distribution of a quality characteristic is the S
control chart. As a Shewhart-type control chart, it is relatively slow in
detecting small and moderate shifts in process variance. The performance
of the S chart can be improved
by supplementing it with runs rules or by varying the sample size and the
sampling interval. In this work, we introduce and study one-sided adaptive
S control charts, supplemented or not with one powerful
runs rule, for detecting increases or decreases in process variation. The
properties of the proposed control schemes are obtained by using a Markov
chain approach. Furthermore, practical guidance for the choice of the
most suitable control scheme is also provided.
Journal: Journal of Applied Statistics
Pages: 2839-2858
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570320
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570320
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2839-2858
Template-Type: ReDIF-Article 1.0
Author-Name: P. G. Sankaran
Author-X-Name-First: P. G.
Author-X-Name-Last: Sankaran
Author-Name: P. Anisha
Author-X-Name-First: P.
Author-X-Name-Last: Anisha
Title: Shared frailty model for recurrent event data with multiple causes
Abstract:
The topic of heterogeneity in the analysis of recurrent event data has
received considerable attention in recent times. Frailty models are widely
employed in such situations, as they allow us to model the heterogeneity
through a common random effect. In this paper, we introduce a shared
frailty model for gap time distributions of recurrent events with multiple
causes. The parameters of the model are estimated using the EM algorithm.
An extensive simulation study is used to assess the performance of the
method. Finally, we apply the proposed model to real-life data.
Journal: Journal of Applied Statistics
Pages: 2859-2868
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570321
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570321
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2859-2868
Template-Type: ReDIF-Article 1.0
Author-Name: Jin Zhang
Author-X-Name-First: Jin
Author-X-Name-Last: Zhang
Title: Adaptive normal reference bandwidth based on quantile for kernel density estimation
Abstract:
Bandwidth selection is an important problem of kernel density estimation.
Traditional simple and quick bandwidth selectors usually oversmooth the
density estimate. Existing sophisticated selectors usually have
computational difficulties and occasionally do not exist. Moreover, they
may not be robust against outliers in the sample data, and some are highly
variable, tending to undersmooth the density. In this paper, a highly
robust simple and quick bandwidth selector is proposed, which adapts to
different types of densities.
Journal: Journal of Applied Statistics
Pages: 2869-2880
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.570322
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570322
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2869-2880
Template-Type: ReDIF-Article 1.0
Author-Name: Martin Huber
Author-X-Name-First: Martin
Author-X-Name-Last: Huber
Title: Testing for covariate balance using quantile regression and resampling methods
Abstract:
Consistency of propensity score matching estimators hinges on the
propensity score's ability to balance the distributions of covariates in
the pools of treated and non-treated units. Conventional balance tests
merely check for differences in covariates’ means, but cannot
account for differences in higher moments. For this reason, this paper
proposes balance tests which test for differences in the entire
distributions of continuous covariates based on quantile regression (to
derive Kolmogorov--Smirnov and Cramer--von-Mises--Smirnov-type test
statistics) and resampling methods (for inference). Simulations suggest
that these methods are very powerful and capture imbalances related to
higher moments when conventional balance tests fail to do so.
Journal: Journal of Applied Statistics
Pages: 2881-2899
Issue: 12
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664763.2011.570323
File-URL: http://hdl.handle.net/10.1080/02664763.2011.570323
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2881-2899
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Hung Chen
Author-X-Name-First: Chien-Hung
Author-X-Name-Last: Chen
Author-Name: Tsung-Shan Tsou
Author-X-Name-First: Tsung-Shan
Author-X-Name-Last: Tsou
Title: Robust likelihood inferences for multivariate correlated data
Abstract:
The multivariate normal distribution, owing to its well-established
theory, is commonly used to analyze correlated data of various types.
However, the resultant inference is, more often than not, invalid if the
model assumption fails. We present a modification that allows the
multivariate normal likelihood to adapt itself to general correlated
data. The modified likelihood is asymptotically legitimate for any true
underlying joint distributions so long as they have finite second moments.
One can, hence, acquire full likelihood inference without knowing the true
random mechanisms underlying the data. Simulations and real data analysis
are provided to demonstrate the merit of our proposed parametric robust
method.
Journal: Journal of Applied Statistics
Pages: 2901-2910
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.573539
File-URL: http://hdl.handle.net/10.1080/02664763.2011.573539
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2901-2910
Template-Type: ReDIF-Article 1.0
Author-Name: Nan Jia
Author-X-Name-First: Nan
Author-X-Name-Last: Jia
Author-Name: Thomas M. Braun
Author-X-Name-First: Thomas M.
Author-X-Name-Last: Braun
Title: The adaptive accelerated biased coin design for phase I clinical trials
Abstract:
Phase I clinical trials are designed to study several doses of the same
drug in a small group of patients to determine the maximum tolerated dose
(MTD), which is defined as the dose that is associated with dose-limiting
toxicity (DLT) in a desired fraction Γ of patients. Durham and
Flournoy [5] proposed the biased coin design (BCD), which is an
up-and-down design that assigns a new patient to a dose depending upon
whether or not the current patient experienced a DLT. However, the BCD in
its standard form requires the complete follow-up of the current patient
before the new patient can be assigned a dose. In situations where
patients’ follow-up times are relatively long compared to patient
inter-arrival times, the BCD will result in an impractically long trial
and cause patients to either have delayed entry into the trial or refusal
of entry altogether. We propose an adaptive accelerated BCD (aaBCD) that
generalizes the traditional BCD design algorithm by incorporating an
adaptive weight function based upon the amount of follow-up of each
enrolled patient. By doing so, the dose assignment for each eligible
patient can be determined immediately with no delay, leading to a shorter
trial overall. We show, via simulation, that the frequency of correctly
identifying the MTD at the end of the study with the aaBCD, as well as the
number of patients assigned to the MTD, are comparable to that of the
traditional BCD design. We also compare the performance of the aaBCD with
the accelerated BCD (ABCD) of Stylianou and Follman [19], as well as the
time-to-event continual reassessment method (TITE-CRM) of Cheung and
Chappell [4].
Journal: Journal of Applied Statistics
Pages: 2911-2924
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.573540
File-URL: http://hdl.handle.net/10.1080/02664763.2011.573540
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2911-2924
Template-Type: ReDIF-Article 1.0
Author-Name: Xinyu Tang
Author-X-Name-First: Xinyu
Author-X-Name-Last: Tang
Author-Name: Abdus S. Wahed
Author-X-Name-First: Abdus S.
Author-X-Name-Last: Wahed
Title: Comparison of treatment regimes with adjustment for auxiliary variables
Abstract:
Treatment regimes are algorithms for assigning treatments to patients
with complex diseases, where treatment consists of more than one episode
of therapy, potentially with different dosages of the same agent or
different agents. Sequentially randomized clinical trials are usually
designed to evaluate and compare the effect of different treatment
regimes. In such designs, eligible patients are first randomly assigned to
receive one of the initial treatments. Patients meeting some criteria
(e.g. no progressive disease) are then randomized to receive one of the
maintenance treatments. Usually, the procedure continues until all
treatment options are exhausted. Such multistage treatment assignment
results in treatment regimes consisting of initial treatments,
intermediate responses and second-stage treatments. However, methods for
efficient analysis of sequentially randomized trials have only been
developed very recently. As a result, earlier clinical trials reported
results based only on the comparison of stage-specific treatments. In this
article, we propose a model that applies to comparisons of any combination
of any number of treatment regimes, regardless of the number of stages of
treatment, adjusted for auxiliary variables. Contrasts of treatment regimes
are tested using the Wald chi-square method. Both the model and Wald
chi-square tests of contrasts are illustrated through a simulation study
and an application to a high-risk neuroblastoma study to complement the
earlier results reported on this study.
Journal: Journal of Applied Statistics
Pages: 2925-2938
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.573541
File-URL: http://hdl.handle.net/10.1080/02664763.2011.573541
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2925-2938
Template-Type: ReDIF-Article 1.0
Author-Name: Harry Haupt
Author-X-Name-First: Harry
Author-X-Name-Last: Haupt
Author-Name: Kathrin Kagerer
Author-X-Name-First: Kathrin
Author-X-Name-Last: Kagerer
Author-Name: Joachim Schnurbus
Author-X-Name-First: Joachim
Author-X-Name-Last: Schnurbus
Title: Cross-validating fit and predictive accuracy of nonlinear quantile regressions
Abstract:
The paper proposes a cross-validation method to address the question of
specification search in a multiple nonlinear quantile regression
framework. Linear parametric, spline-based partially linear and
kernel-based fully nonparametric specifications are contrasted as
competitors using cross-validated weighted L1-norm-based goodness-of-fit
and prediction error criteria. The
aim is to provide a fair comparison with respect to estimation accuracy
and/or predictive ability for different semi- and nonparametric
specification paradigms. This is challenging as the model dimension cannot
be estimated for all competitors and the meta-parameters such as kernel
bandwidths, spline knot numbers and polynomial degrees are difficult to
compare. General issues of specification comparability and automated
data-driven meta-parameter selection are discussed. The proposed method
further allows us to assess the balance between fit and model complexity.
An extensive Monte Carlo study and an application to a well-known data set
provide empirical illustration of the method.
Journal: Journal of Applied Statistics
Pages: 2939-2954
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.573542
File-URL: http://hdl.handle.net/10.1080/02664763.2011.573542
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2939-2954
Template-Type: ReDIF-Article 1.0
Author-Name: Roland Langrock
Author-X-Name-First: Roland
Author-X-Name-Last: Langrock
Title: Some applications of nonlinear and non-Gaussian state--space modelling by means of hidden Markov models
Abstract:
Nonlinear and non-Gaussian state--space models (SSMs) are fitted to
different types of time series. The applications include homogeneous and
seasonal time series, in particular earthquake counts, polio counts,
rainfall occurrence data, glacial varve data and daily returns on a share.
The considered SSMs comprise Poisson, Bernoulli, gamma and
Student-t distributions at the observation level.
Parameter estimations for the SSMs are carried out using a likelihood
approximation that is obtained after discretization of the state space.
The approximation can be made arbitrarily accurate, and the approximated
likelihood is precisely that of a finite-state hidden Markov model (HMM).
The proposed method enables us to apply standard HMM techniques. It is
easy to implement and can be extended to all kinds of SSMs in a
straightforward manner.
Journal: Journal of Applied Statistics
Pages: 2955-2970
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.573543
File-URL: http://hdl.handle.net/10.1080/02664763.2011.573543
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2955-2970
Template-Type: ReDIF-Article 1.0
Author-Name: Filippo Domma
Author-X-Name-First: Filippo
Author-X-Name-Last: Domma
Author-Name: Sabrina Giordano
Author-X-Name-First: Sabrina
Author-X-Name-Last: Giordano
Author-Name: Mariangela Zenga
Author-X-Name-First: Mariangela
Author-X-Name-Last: Zenga
Title: Maximum likelihood estimation in Dagum distribution with censored samples
Abstract:
In this work, we show that the Dagum distribution [3] may be a
competitive model for describing data which include censored observations
in lifetime and reliability problems. Maximum likelihood estimates of the
three parameters of the Dagum distribution are determined from samples
with type I right and type II doubly censored data. We perform an
empirical analysis using published censored data sets: in certain cases,
the Dagum distribution fits the data better than other parametric
distributions that are more commonly used in survival and reliability
analysis. Graphical comparisons confirm that the Dagum model behaves
better than a number of competitive distributions in describing the
empirical hazard rate of the analyzed data. A probability plot to provide
graphical check of the appropriateness of the Dagum model for right
censored data is constructed, and the details are given in the appendix.
Finally, a simulation study that shows the good performance of the maximum
likelihood estimators of the Dagum shape parameters for finite type II
doubly censored samples is carried out.
Journal: Journal of Applied Statistics
Pages: 2971-2985
Issue: 12
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2011.578613
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2971-2985
Template-Type: ReDIF-Article 1.0
Author-Name: Isaac Dialsingh
Author-X-Name-First: Isaac
Author-X-Name-Last: Dialsingh
Title: Multiple testing problems in pharmaceutical statistics
Journal: Journal of Applied Statistics
Pages: 2987-2987
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2010.536309
File-URL: http://hdl.handle.net/10.1080/02664763.2010.536309
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2987-2987
Template-Type: ReDIF-Article 1.0
Author-Name: Alex Karagrigoriou
Author-X-Name-First: Alex
Author-X-Name-Last: Karagrigoriou
Title: Frailty Models in Survival Analysis
Journal: Journal of Applied Statistics
Pages: 2988-2989
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.559371
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559371
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2988-2989
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Random Phenomena: Fundamentals of Probability and Statistics for Engineers
Journal: Journal of Applied Statistics
Pages: 2989-2990
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.559372
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559372
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2989-2990
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Bayesian Nonparametrics
Journal: Journal of Applied Statistics
Pages: 2990-2990
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.559374
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559374
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2990-2990
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Laberge
Author-X-Name-First: Yves
Author-X-Name-Last: Laberge
Title: Advising on Research Methods: A Consultant's Companion
Journal: Journal of Applied Statistics
Pages: 2991-2991
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.559375
File-URL: http://hdl.handle.net/10.1080/02664763.2011.559375
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2991-2991
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Non-Parametric Econometrics
Journal: Journal of Applied Statistics
Pages: 2992-2992
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.575999
File-URL: http://hdl.handle.net/10.1080/02664763.2011.575999
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2992-2992
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim S. Mwitondi
Author-X-Name-First: Kassim S.
Author-X-Name-Last: Mwitondi
Title: Data Analysis Using SAS ENTERPRISE GUIDE
Journal: Journal of Applied Statistics
Pages: 2993-2993
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.576807
File-URL: http://hdl.handle.net/10.1080/02664763.2011.576807
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2993-2993
Template-Type: ReDIF-Article 1.0
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Title: Introduction to Time-Series Modelling
Journal: Journal of Applied Statistics
Pages: 2993-2994
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.583725
File-URL: http://hdl.handle.net/10.1080/02664763.2011.583725
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2993-2994
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim S. Mwitondi
Author-X-Name-First: Kassim S.
Author-X-Name-Last: Mwitondi
Title: Interpreting Economic and Social Data
Journal: Journal of Applied Statistics
Pages: 2994-2995
Issue: 12
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2011.583726
File-URL: http://hdl.handle.net/10.1080/02664763.2011.583726
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:12:p:2994-2995
Template-Type: ReDIF-Article 1.0
Author-Name: Tony Vangeneugden
Author-X-Name-First: Tony
Author-X-Name-Last: Vangeneugden
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Geert Verbeke
Author-X-Name-First: Geert
Author-X-Name-Last: Verbeke
Author-Name: Clarice G.B. Demétrio
Author-X-Name-First: Clarice G.B.
Author-X-Name-Last: Demétrio
Title: Marginal correlation from an extended random-effects model for repeated and overdispersed counts
Abstract:
Vangeneugden et al. [15] derived approximate correlation
functions for longitudinal sequences of general data type, Gaussian and
non-Gaussian, based on generalized linear mixed-effects models (GLMM).
Their focus was on binary sequences, as well as on a combination of binary
and Gaussian sequences. Here, we focus on the specific case of repeated
count data, important in two respects. First, we employ the model proposed
by Molenberghs et al. [13], which generalizes at the same
time the Poisson-normal GLMM and the conventional overdispersion models,
in particular the negative-binomial model. The model flexibly accommodates
data hierarchies, intra-sequence correlation, and overdispersion. Second,
means, variances, and joint probabilities can be expressed in closed form,
allowing for exact intra-sequence correlation expressions. Next to the
general situation, some important special cases such as exchangeable
clustered outcomes are considered, producing insightful expressions. The
closed-form expressions are contrasted with the generic approximate
expressions of Vangeneugden et al. [15]. Data from an
epileptic-seizures trial are analyzed and correlation functions derived.
It is shown that the proposed extension strongly outperforms the classical
GLMM.
Journal: Journal of Applied Statistics
Pages: 215-232
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406405
File-URL: http://hdl.handle.net/10.1080/02664760903406405
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:215-232
Template-Type: ReDIF-Article 1.0
Author-Name: A. F.B. Costa
Author-X-Name-First: A. F.B.
Author-X-Name-Last: Costa
Author-Name: M. A.G. Machado
Author-X-Name-First: M. A.G.
Author-X-Name-Last: Machado
Title: A control chart based on sample ranges for monitoring the covariance matrix of the multivariate processes
Abstract:
For the univariate case, the R chart and the S² chart are the most common
charts used for monitoring the process dispersion. With the usual sample
sizes of 4 and 5, the R chart is slightly inferior to the S² chart in
terms of efficiency in detecting process shifts. In this article, we show
that for the multivariate case, the chart based on the standardized sample
ranges, which we call the RMAX chart, is substantially inferior in terms
of efficiency in detecting shifts in the covariance matrix to the VMAX
chart, which is based on the standardized sample variances. The user's
familiarity with sample ranges is a point in favor of the RMAX chart. An
example is presented to illustrate the application of the proposed chart.
Journal: Journal of Applied Statistics
Pages: 233-245
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406413
File-URL: http://hdl.handle.net/10.1080/02664760903406413
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:233-245
Template-Type: ReDIF-Article 1.0
Author-Name: Kuo-Chin Lin
Author-X-Name-First: Kuo-Chin
Author-X-Name-Last: Lin
Title: Assessing cumulative logit models via a score test in random effect models
Abstract:
The purpose of this article is to develop a goodness-of-fit test based on
score test statistics for cumulative logit models with extra variation of
random effects. Two main theorems for the proposed score test statistics
are derived. In simulation studies, the powers of the proposed tests are
discussed and the power curve against a variety of dispersion parameters
and bandwidths is depicted. The proposed method is illustrated by an
ordinal data set from Mosteller and Tukey [23].
Journal: Journal of Applied Statistics
Pages: 247-259
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406421
File-URL: http://hdl.handle.net/10.1080/02664760903406421
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:247-259
Template-Type: ReDIF-Article 1.0
Author-Name: N. Crato
Author-X-Name-First: N.
Author-X-Name-Last: Crato
Author-Name: R. R. Linhares
Author-X-Name-First: R. R.
Author-X-Name-Last: Linhares
Author-Name: S. R.C. Lopes
Author-X-Name-First: S. R.C.
Author-X-Name-Last: Lopes
Title: α-stable laws for noncoding regions in DNA sequences
Abstract:
In this work, we analyze the long-range dependence
parameter for a nucleotide sequence under several different transformations.
The long-range dependence parameter is estimated by the approximated
maximum likelihood method, by a novel estimator based on the spectral
envelope theory, by a regression method based on the periodogram function,
and also by the detrended fluctuation analysis method. We
study the length distribution of coding and noncoding regions for all
Homo sapiens chromosomes available from the European
Bioinformatics Institute. The parameter of the tail rate decay is
estimated by the Hill estimator α̂. We show that the tail rate
decay is greater than 2 for coding regions, while for almost all noncoding
regions it is less than 2.
Journal: Journal of Applied Statistics
Pages: 261-271
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406447
File-URL: http://hdl.handle.net/10.1080/02664760903406447
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:261-271
Template-Type: ReDIF-Article 1.0
Author-Name: T. Banerjee
Author-X-Name-First: T.
Author-X-Name-Last: Banerjee
Author-Name: G. Grover
Author-X-Name-First: G.
Author-X-Name-Last: Grover
Author-Name: T. Pensi
Author-X-Name-First: T.
Author-X-Name-Last: Pensi
Author-Name: D. Banerjee
Author-X-Name-First: D.
Author-X-Name-Last: Banerjee
Title: Estimation of hazard of death in vertically transmitted HIV-1-infected children for doubly censored failure times and fixed covariates
Abstract:
This work estimates the effect of covariates on survival data when the
times of both the originating and failure events are interval-censored. A
proportional hazards model [16], along with log-linear models, was applied
to data on 130 vertically HIV-1-infected children visiting a paediatrics
clinic.
The covariates considered for the analysis were antiretroviral (ARV)
therapy, age at diagnosis, and change in CD4+T cell count. Change in CD4+T
cell count was the difference in the last and first count in non-ARV
therapy group, while in the ARV therapy group the same was considered
after the start of the treatment. Our findings suggest that children on
ARV therapy had a significantly lower risk of death
(p<0.001). We further investigated the effect of age and
change in CD4+T cell count on the risk of death. These covariates
exhibited a possible association with the risk of death by both procedures
(p<0.0001). The effect of the number of years under ARV
therapy with diagnosis year as a confounding factor was directly related
to longevity. The results obtained by the two procedures gave reasonable
estimates. We conclude that when the lengths of intervals are narrow, we
can opt for parametric modeling which is less computationally intensive.
Journal: Journal of Applied Statistics
Pages: 273-285
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406454
File-URL: http://hdl.handle.net/10.1080/02664760903406454
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:273-285
Template-Type: ReDIF-Article 1.0
Author-Name: Aquiles E.G. Kalatzis
Author-X-Name-First: Aquiles E.G.
Author-X-Name-Last: Kalatzis
Author-Name: Camila F. Bassetto
Author-X-Name-First: Camila F.
Author-X-Name-Last: Bassetto
Author-Name: Carlos R. Azzoni
Author-X-Name-First: Carlos R.
Author-X-Name-Last: Azzoni
Title: Multicollinearity and financial constraint in investment decisions: a Bayesian generalized ridge regression
Abstract:
This paper addresses the investment decisions of 373 large Brazilian firms
from 1997 to 2004 in the presence of financial constraints, using panel
data. A Bayesian econometric model with ridge regression was used to
address multicollinearity problems among the variables in the
model. Prior distributions are assumed for the parameters, classifying the
model into random or fixed effects. We used a Bayesian approach to
estimate the parameters, considering normal and Student t
distributions for the error and assumed that the initial values for the
lagged dependent variable are not fixed, but generated by a random
process. The recursive predictive density criterion was used for model
comparisons. Twenty models were tested and the results indicated that
multicollinearity does influence the value of the estimated parameters.
Controlling for capital intensity, financial constraints are found to be
more important for capital-intensive firms, probably due to their lower
profitability indexes, higher fixed costs and higher degree of property
diversification.
Journal: Journal of Applied Statistics
Pages: 287-299
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406462
File-URL: http://hdl.handle.net/10.1080/02664760903406462
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:287-299
Template-Type: ReDIF-Article 1.0
Author-Name: Xu Xiaosi
Author-X-Name-First: Xu
Author-X-Name-Last: Xiaosi
Author-Name: Chen Ying
Author-X-Name-First: Chen
Author-X-Name-Last: Ying
Author-Name: Zheng Haitao
Author-X-Name-First: Zheng
Author-X-Name-Last: Haitao
Title: The comparison of enterprise bankruptcy forecasting method
Abstract:
Enterprise bankruptcy forecasting is vital to the management of credit
risk and can be treated as a classification problem. There are three
typical classification methods for forecasting enterprise bankruptcy:
statistical methods, artificial neural networks and kernel-based learning
methods. The paper briefly introduces the first two methods, then
introduces the support vector machine (SVM), a kernel-based learning
method, and finally compares the bankruptcy forecasting accuracies of the
three methods by building the corresponding models with data from China's
stock exchanges. From the empirical analysis, we conclude that the SVM
method offers higher adaptability and precision in forecasting enterprise
bankruptcy.
Journal: Journal of Applied Statistics
Pages: 301-308
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406470
File-URL: http://hdl.handle.net/10.1080/02664760903406470
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:301-308
Template-Type: ReDIF-Article 1.0
Author-Name: Heonsang Lim
Author-X-Name-First: Heonsang
Author-X-Name-Last: Lim
Author-Name: Bong-Jin Yum
Author-X-Name-First: Bong-Jin
Author-X-Name-Last: Yum
Title: Optimal design of accelerated degradation tests based on Wiener process models
Abstract:
Optimal accelerated degradation test (ADT) plans are developed assuming
that the constant-stress loading method is employed and the degradation
characteristic follows a Wiener process. Unlike the previous works on
planning ADTs based on stochastic process models, this article determines
the test stress levels and the proportion of test units allocated to each
stress level such that the asymptotic variance of the maximum-likelihood
estimator of the qth quantile of the lifetime
distribution at the use condition is minimized. In addition, compromise
plans are also developed for checking the validity of the relationship
between the model parameters and the stress variable. Finally, using an
example, sensitivity analysis procedures are presented for evaluating the
robustness of optimal and compromise plans against uncertainty in the
pre-estimated parameter value, and the importance of optimally determining
the test stress levels and the proportion of units allocated to each
stress level is illustrated.
Journal: Journal of Applied Statistics
Pages: 309-325
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406488
File-URL: http://hdl.handle.net/10.1080/02664760903406488
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:309-325
Template-Type: ReDIF-Article 1.0
Author-Name: P. P. Balestrassi
Author-X-Name-First: P. P.
Author-X-Name-Last: Balestrassi
Author-Name: A. P. Paiva
Author-X-Name-First: A. P.
Author-X-Name-Last: Paiva
Author-Name: A. C. Zambroni de Souza
Author-X-Name-First: A. C. Zambroni
Author-X-Name-Last: de Souza
Author-Name: J. B. Turrioni
Author-X-Name-First: J. B.
Author-X-Name-Last: Turrioni
Author-Name: Elmira Popova
Author-X-Name-First: Elmira
Author-X-Name-Last: Popova
Title: A multivariate descriptor method for change-point detection in nonlinear time series
Abstract:
The purpose of this paper is to present a novel method that is applied to
detect dynamic changes in nonlinear time series. The method combines a
multivariate control chart that monitors the variation of three normalized
descriptors -- Hjorth's descriptors of activity, mobility and complexity
-- and is applied to the change-point detection problem of nonlinear time
series. The approach is evaluated using six simulated nonlinear time
series. In addition, a case study of six time series of short-term
electricity load consumption was used to illustrate the power of the
method.
Journal: Journal of Applied Statistics
Pages: 327-342
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406496
File-URL: http://hdl.handle.net/10.1080/02664760903406496
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:327-342
Template-Type: ReDIF-Article 1.0
Author-Name: C. B. Zeller
Author-X-Name-First: C. B.
Author-X-Name-Last: Zeller
Author-Name: V. H. Lachos
Author-X-Name-First: V. H.
Author-X-Name-Last: Lachos
Author-Name: F. E. Vilca-Labra
Author-X-Name-First: F. E.
Author-X-Name-Last: Vilca-Labra
Title: Local influence analysis for regression models with scale mixtures of skew-normal distributions
Abstract:
The robust estimation and the local influence analysis for linear
regression models with scale mixtures of multivariate skew-normal
distributions have been developed in this article. The main virtue of
considering the linear regression model under the class of scale mixtures
of skew-normal distributions is that they have a nice hierarchical
representation which allows an easy implementation of inference. Inspired
by the expectation maximization algorithm, we have developed a local
influence analysis based on the conditional expectation of the
complete-data log-likelihood function, a measure that is invariant
under reparametrizations. This is because the observed-data log-likelihood
function associated with the proposed model is somewhat complex and with
Cook's well-known approach it can be very difficult to obtain measures of
the local influence. Some useful perturbation schemes are discussed. In
order to examine the robust aspect of this flexible class against outlying
and influential observations, some simulation studies have also been
presented. Finally, a real data set has been analyzed, illustrating the
usefulness of the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 343-368
Issue: 2
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664760903406504
File-URL: http://hdl.handle.net/10.1080/02664760903406504
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:343-368
Template-Type: ReDIF-Article 1.0
Author-Name: K. Triantafyllopoulos
Author-X-Name-First: K.
Author-X-Name-Last: Triantafyllopoulos
Title: Time-varying vector autoregressive models with stochastic volatility
Abstract:
The purpose of this paper is to propose a time-varying vector
autoregressive model (TV-VAR) for forecasting multivariate time series.
The model is cast into a state-space form that allows flexible
description and analysis. The volatility covariance matrix of the time
series is modelled via inverted Wishart and singular multivariate beta
distributions allowing a fully conjugate Bayesian inference. Model
assessment and model comparison are performed via the log-posterior
function, sequential Bayes factors, the mean of squared standardized
forecast errors, the mean of absolute forecast errors (known also as mean
absolute deviation), and the mean forecast error. Bayes factors are also
used in order to choose the autoregressive (AR) order of the model.
Multi-step forecasting is discussed in detail and a flexible formula is
proposed to approximate the forecast function. Two examples, consisting of
bivariate data of IBM and Microsoft shares and of a 30-dimensional asset
selection problem, illustrate the methods. For the IBM and Microsoft data
we discuss model performance and multi-step forecasting in some detail.
For the basket of 30 assets we discuss sequential portfolio allocation;
for both data sets our empirical findings suggest that the TV-VAR models
outperform the widely used vector AR models.
Journal: Journal of Applied Statistics
Pages: 369-382
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406512
File-URL: http://hdl.handle.net/10.1080/02664760903406512
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:369-382
Template-Type: ReDIF-Article 1.0
Author-Name: Christian H. Weiß
Author-X-Name-First: Christian H.
Author-X-Name-Last: Weiß
Title: Detecting mean increases in Poisson INAR(1) processes with EWMA control charts
Abstract:
Processes of serially dependent Poisson counts are commonly observed in
real-world applications and can often be modeled by the first-order
integer-valued autoregressive (INAR) model. For detecting positive shifts
in the mean of a Poisson INAR(1) process, we propose a one-sided
exponentially weighted moving average (EWMA) control chart, the s-EWMA
chart, which is based on a new type of rounding operation. The s-EWMA
chart allows average run lengths (ARLs) to be computed exactly and
efficiently with a Markov chain approach. Using an implementation of this
procedure for ARL computation, the s-EWMA chart is easily designed, as
demonstrated with a
real-data example. Based on an extensive study of ARLs, the out-of-control
performance of the chart is analyzed and compared with that of a
c chart and a one-sided cumulative sum (CUSUM) chart. We
also investigate the robustness of the chart against departures from the
assumed Poisson marginal distribution.
Journal: Journal of Applied Statistics
Pages: 383-398
Issue: 2
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664760903406520
File-URL: http://hdl.handle.net/10.1080/02664760903406520
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:383-398
Template-Type: ReDIF-Article 1.0
Author-Name: Amitava Mukherjee
Author-X-Name-First: Amitava
Author-X-Name-Last: Mukherjee
Author-Name: Barendra Purkait
Author-X-Name-First: Barendra
Author-X-Name-Last: Purkait
Title: Simultaneous semi-sequential testing of dual alternatives for pattern recognition
Abstract:
In this paper, we propose a new nonparametric simultaneous test for dual
alternatives. Simultaneous tests for dual alternatives are used for
pattern detection of arsenic contamination level in
ground water. We consider two possible patterns, namely, monotone shift
and an umbrella-type location alternative, as the dual alternatives.
Pattern recognition problems of this nature are addressed in Bandyopadhyay
et al. [5], extending the idea of multiple hypothesis
testing as in Benjamini and Hochberg [6]. In the present context, we
develop an alternative approach based on contrasts that helps us to detect
the underlying patterns much more efficiently. We illustrate the new
methodology through a motivating example related to highly sensitive issue
of arsenic contamination in ground water. We provide some
Monte-Carlo studies related to the proposed technique and give a
comparative study between different detection procedures. We also obtain
some related asymptotic results.
Journal: Journal of Applied Statistics
Pages: 399-419
Issue: 2
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664760903456392
File-URL: http://hdl.handle.net/10.1080/02664760903456392
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:399-419
Template-Type: ReDIF-Article 1.0
Author-Name: Jun Yang
Author-X-Name-First: Jun
Author-X-Name-Last: Yang
Author-Name: Min Xie
Author-X-Name-First: Min
Author-X-Name-Last: Xie
Author-Name: Thong Ngee Goh
Author-X-Name-First: Thong Ngee
Author-X-Name-Last: Goh
Title: Outlier identification and robust parameter estimation in a zero-inflated Poisson model
Abstract:
The zero-inflated Poisson distribution has been used in the modeling of
count data in different contexts. This model tends to be influenced by
outliers because of the excessive occurrence of zeroes; thus, outlier
identification and robust parameter estimation are important for such a
distribution. Some outlier identification methods are studied in this
paper, and their applications and results are also presented with an
example. To eliminate the effect of outliers, two robust parameter
estimates are proposed based on the trimmed mean and the Winsorized mean.
Simulation results show the robustness of our proposed parameter
estimates.
Journal: Journal of Applied Statistics
Pages: 421-430
Issue: 2
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664760903456426
File-URL: http://hdl.handle.net/10.1080/02664760903456426
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:421-430
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Robinson
Author-X-Name-First: Andrew
Author-X-Name-Last: Robinson
Title: BOOK REVIEW
Journal: Journal of Applied Statistics
Pages: 431-431
Issue: 2
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664760903075556
File-URL: http://hdl.handle.net/10.1080/02664760903075556
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:431-431
Template-Type: ReDIF-Article 1.0
Author-Name: A. C. Brooms
Author-X-Name-First: A. C.
Author-X-Name-Last: Brooms
Title: BOOK REVIEW
Journal: Journal of Applied Statistics
Pages: 433-434
Issue: 2
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664760903075572
File-URL: http://hdl.handle.net/10.1080/02664760903075572
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:433-434
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim S. Mwitondi
Author-X-Name-First: Kassim S.
Author-X-Name-Last: Mwitondi
Title: BOOK REVIEW
Journal: Journal of Applied Statistics
Pages: 435-435
Issue: 2
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664760903075580
File-URL: http://hdl.handle.net/10.1080/02664760903075580
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:2:p:435-435
Template-Type: ReDIF-Article 1.0
Author-Name: Wojtek J. Krzanowski
Author-X-Name-First: Wojtek J.
Author-X-Name-Last: Krzanowski
Author-Name: David J. Hand
Author-X-Name-First: David J.
Author-X-Name-Last: Hand
Title: Testing the difference between two Kolmogorov--Smirnov values in the context of receiver operating characteristic curves
Abstract:
The maximum vertical distance between a receiver operating characteristic
(ROC) curve and its chance diagonal is a common measure of effectiveness
of the classifier that gives rise to this curve. This measure is known to
be equivalent to a two-sample Kolmogorov--Smirnov statistic; so the
absolute difference D between two such statistics is
often used informally as a measure of difference between the corresponding
classifiers. A significance test of D is of great
practical interest, but the available Kolmogorov--Smirnov distribution
theory precludes easy analytical construction of such a significance test.
We, therefore, propose a Monte Carlo procedure for conducting the test,
using the binormal model for the underlying ROC curves. We provide Splus/R
routines for the computation, tabulate the results for a number of
illustrative cases, apply the methods to some practical examples and
discuss some implications.
Journal: Journal of Applied Statistics
Pages: 437-450
Issue: 3
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664760903456400
File-URL: http://hdl.handle.net/10.1080/02664760903456400
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:437-450
Template-Type: ReDIF-Article 1.0
Author-Name: Qingzhao Yu
Author-X-Name-First: Qingzhao
Author-X-Name-Last: Yu
Title: Weighted bagging: a modification of AdaBoost from the perspective of importance sampling
Abstract:
We motivate the success of AdaBoost (ADA) in classification problems by
appealing to an importance sampling perspective. Based on this insight, we
propose the Weighted Bagging (WB) algorithm, a regularization method that
naturally extends ADA to solve both classification and regression
problems. WB uses a part of the available data to build models, and a
separate part to modify the weights of observations. The method is used
with categorical and regression trees and is compared with ADA, Boosting,
Bagging, Random Forest and Support Vector Machine. We apply these methods
to some real data sets and report some results of simulations. These
applications and simulations show the effectiveness of WB.
Journal: Journal of Applied Statistics
Pages: 451-463
Issue: 3
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664760903456418
File-URL: http://hdl.handle.net/10.1080/02664760903456418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:451-463
Template-Type: ReDIF-Article 1.0
Author-Name: Vasyl Golosnoy
Author-X-Name-First: Vasyl
Author-X-Name-Last: Golosnoy
Author-Name: Roman Liesenfeld
Author-X-Name-First: Roman
Author-X-Name-Last: Liesenfeld
Title: Interval shrinkage estimators
Abstract:
This paper considers estimation of an unknown distribution parameter in
situations where we believe that the parameter belongs to a finite
interval. We propose for such situations an interval shrinkage approach
which combines in a coherent way an unbiased conventional estimator and
non-sample information about the range of plausible parameter values. The
approach is based on an infeasible interval shrinkage estimator which
uniformly dominates the underlying conventional estimator with respect to
the mean square error criterion. This infeasible estimator allows us to
obtain useful feasible counterparts. The properties of these feasible
interval shrinkage estimators are illustrated both in a simulation study
and in empirical examples.
Journal: Journal of Applied Statistics
Pages: 465-477
Issue: 3
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664760903456434
File-URL: http://hdl.handle.net/10.1080/02664760903456434
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:465-477
Template-Type: ReDIF-Article 1.0
Author-Name: Shande Chen
Author-X-Name-First: Shande
Author-X-Name-Last: Chen
Title: A class of confidence intervals for sequential phase II clinical trials with binary outcome
Abstract:
Phase II clinical trials often use a binary outcome. Thus,
assessing the success rate of the treatment is a primary objective for
phase II clinical trials. Reporting confidence intervals is common
practice for clinical trials. Due to the group sequential design and
relatively small sample sizes, many existing confidence intervals for phase
II trials are very conservative. In this paper, we propose a class of
confidence intervals for binary outcomes. We also provide a general theory
to assess the coverage of confidence intervals for discrete distributions,
and hence make recommendations for choosing the parameter in calculating
the confidence interval. The proposed method is applied to Simon's [14]
optimal two-stage design with numerical studies. The proposed method can
be viewed as an alternative approach for the confidence interval for
discrete distributions in general.
Journal: Journal of Applied Statistics
Pages: 479-489
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903456442
File-URL: http://hdl.handle.net/10.1080/02664760903456442
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:479-489
Template-Type: ReDIF-Article 1.0
Author-Name: Himadri Ghosh
Author-X-Name-First: Himadri
Author-X-Name-Last: Ghosh
Author-Name: M. A. Iquebal
Author-X-Name-First: M. A.
Author-X-Name-Last: Iquebal
Author-Name: Prajneshu
Author-X-Name-First:
Author-X-Name-Last: Prajneshu
Title: Bootstrap study of parameter estimates for nonlinear Richards growth model through genetic algorithm
Abstract:
The Richards nonlinear growth model, which is a generalization of the
well-known logistic and Gompertz models, generally provides a realistic
description of many phenomena. However, this model is very rarely used,
as it is extremely difficult to fit by nonlinear estimation
procedures. To this end, the utility of a powerful optimization
technique, the genetic algorithm, is advocated. Parametric bootstrap
methodology is then used to obtain standard errors of the estimates.
Subsequently, bootstrap confidence intervals are constructed by two
methods, viz. the percentile method and the bias-corrected and accelerated
method. The methodology is illustrated by applying it to India's total
annual foodgrain production time-series data.
Journal: Journal of Applied Statistics
Pages: 491-500
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521401
File-URL: http://hdl.handle.net/10.1080/02664760903521401
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:491-500
Template-Type: ReDIF-Article 1.0
Author-Name: J. López Fidalgo
Author-X-Name-First: J.
Author-X-Name-Last: López Fidalgo
Author-Name: I. M. Ortiz Rodríguez
Author-X-Name-First: I. M.
Author-X-Name-Last: Ortiz Rodríguez
Author-Name: Weng Kee Wong
Author-X-Name-First: Weng Kee
Author-X-Name-Last: Wong
Title: Design issues for population growth models
Abstract:
We briefly review and discuss design issues for population growth and
decline models. We then use a flexible growth and decline model as an
illustrative example and apply optimal design theory to find optimal
sampling times for estimating model parameters, specific parameters and
interesting functions of the model parameters for the model with two real
applications. Robustness properties of the optimal designs are
investigated when nominal values or the model is mis-specified, and also
under a different optimality criterion. To facilitate use of optimal
design ideas in practice, we also introduce a website for generating a
variety of optimal designs for popular models from different disciplines.
Journal: Journal of Applied Statistics
Pages: 501-512
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521419
File-URL: http://hdl.handle.net/10.1080/02664760903521419
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:501-512
Template-Type: ReDIF-Article 1.0
Author-Name: Marco Riquelme
Author-X-Name-First: Marco
Author-X-Name-Last: Riquelme
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Author-Name: Antonio Sanhueza
Author-X-Name-First: Antonio
Author-X-Name-Last: Sanhueza
Title: Influence diagnostics on the coefficient of variation of elliptically contoured distributions
Abstract:
In this article, we study the behavior of the coefficient of variation
(CV) of a random variable that follows a symmetric distribution in the
real line. Specifically, we estimate this coefficient using the
maximum-likelihood (ML) method. In addition, we provide asymptotic
inference for this parameter, which allows us to test hypotheses and
construct confidence intervals. Furthermore, we produce influence
diagnostics to evaluate the sensitivity of the ML estimate of this
coefficient when atypical data are present. Moreover, we illustrate the
obtained results by using financial real data. Finally, we carry out a
simulation study to detect the potential influence of atypical
observations on the ML estimator of the CV of a symmetric distribution.
The illustration and simulation demonstrate the robustness of the ML
estimation of this coefficient.
Journal: Journal of Applied Statistics
Pages: 513-532
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521427
File-URL: http://hdl.handle.net/10.1080/02664760903521427
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:513-532
Template-Type: ReDIF-Article 1.0
Author-Name: Altaf Hossain
Author-X-Name-First: Altaf
Author-X-Name-Last: Hossain
Author-Name: Mohammed Nasser
Author-X-Name-First: Mohammed
Author-X-Name-Last: Nasser
Title: Comparison of the finite mixture of ARMA-GARCH, back propagation neural networks and support-vector machines in forecasting financial returns
Abstract:
The use of GARCH-type models and computational-intelligence-based
techniques for forecasting financial time series has proved extremely
successful in recent times. In this article, we apply the finite mixture
of ARMA-GARCH model instead of AR or ARMA models to compare with the
standard BP and SVM in forecasting financial time series (daily stock
market index returns and exchange rate returns). We do not apply the pure
GARCH model as the finite mixture of the ARMA-GARCH model outperforms the
pure GARCH model. These models are evaluated on five performance metrics
or criteria. Our experiment shows that the SVM model outperforms both the
finite mixture of ARMA-GARCH and BP models in deviation performance
criteria. In direction performance criteria, the finite mixture of
ARMA-GARCH model performs better. The memory property of these forecasting
techniques is also examined using the behavior of forecasted values
vis-à-vis the original values. Only the SVM model shows long memory
property in forecasting financial returns.
Journal: Journal of Applied Statistics
Pages: 533-551
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521435
File-URL: http://hdl.handle.net/10.1080/02664760903521435
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:533-551
Template-Type: ReDIF-Article 1.0
Author-Name: Yu-Chang Lin
Author-X-Name-First: Yu-Chang
Author-X-Name-Last: Lin
Author-Name: Chao-Yu Chou
Author-X-Name-First: Chao-Yu
Author-X-Name-Last: Chou
Title: Robustness of the EWMA and the combined X¯--EWMA control charts with variable sampling intervals to non-normality
Abstract:
The exponentially weighted moving average (EWMA) control charts with
variable sampling intervals (VSIs) have been shown to be substantially
quicker than the fixed sampling intervals (FSI) EWMA control charts in
detecting process mean shifts. The usual assumption for designing a
control chart is that the data or measurements are normally distributed.
However, this assumption may not be true for some processes. In the
present paper, the performances of the EWMA and combined
X¯--EWMA control charts with VSIs are evaluated
under non-normality. It is shown that adding the VSI feature to the EWMA
control charts results in very substantial decreases in the expected time
to detect shifts in process mean under both normality and non-normality.
However, both the false alarm rate and the detection ability of the
combined X¯--EWMA chart are affected if the process data are
not normally distributed.
Journal: Journal of Applied Statistics
Pages: 553-570
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521443
File-URL: http://hdl.handle.net/10.1080/02664760903521443
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:553-570
Template-Type: ReDIF-Article 1.0
Author-Name: Nasir Abbas
Author-X-Name-First: Nasir
Author-X-Name-Last: Abbas
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Title: Extending the Bradley--Terry model for paired comparisons to accommodate weights
Abstract:
In the method of paired comparisons (PCs), treatments are compared on the
basis of qualitative characteristics they possess, in the light of their
sensory evaluations made by judges. However, situations may emerge
where, in addition to qualitative merits or worths, judges may
assign quantitative weights to reflect the relative importance of
the treatments. In this study, an attempt is made to reconcile the
qualitative and the quantitative PCs through assigning quantitative
weights to treatments having qualitative merits using/extending the
Bradley--Terry (BT) model. Behaviors of the existing BT model and the
proposed weighted BT model are studied through the test of
goodness-of-fit. Experimental and simulated data sets are used for
illustration.
Journal: Journal of Applied Statistics
Pages: 571-580
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521450
File-URL: http://hdl.handle.net/10.1080/02664760903521450
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:571-580
Template-Type: ReDIF-Article 1.0
Author-Name: Reza Pakyari
Author-X-Name-First: Reza
Author-X-Name-Last: Pakyari
Title: Nonparametric mixture analysis of rock crab of the genus Leptograpsus
Abstract:
A nonparametric mixture analysis has been applied to study morphological
characteristics of the Leptograpsus crab. Two Gaussian models
were also considered, one assuming independent components
and one allowing an arbitrary relationship between the components. The three
models were then fitted to several combinations of variables based on
selecting different morphological characteristics. The nonparametric
method was observed to give the best results overall.
Journal: Journal of Applied Statistics
Pages: 581-589
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521468
File-URL: http://hdl.handle.net/10.1080/02664760903521468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:581-589
Template-Type: ReDIF-Article 1.0
Author-Name: Jiajia Zhang
Author-X-Name-First: Jiajia
Author-X-Name-Last: Zhang
Author-Name: Andrew B. Lawson
Author-X-Name-First: Andrew B.
Author-X-Name-Last: Lawson
Title: Bayesian parametric accelerated failure time spatial model and its application to prostate cancer
Abstract:
Prostate cancer (PrCA) is the most common cancer diagnosed in American
men and the second leading cause of death from malignancies. Large
geographical variations and racial disparities exist in the
survival rate of PrCA. Much work on spatial survival models is based on
the proportional hazards (PH) model, but few studies have focused on the
accelerated failure time (AFT) model. In this paper, we investigate the PrCA data of
Louisiana from the Surveillance, Epidemiology, and End Results program and
the violation of the PH assumption suggests that the spatial survival
model based on the AFT model is more appropriate for this data set. To
account for the possible extra-variation, we consider spatially referenced
independent or dependent spatial structures. The deviance information
criterion is used to select a best-fitting model within the Bayesian
framework. The results from our study indicate that age, race, stage, and
geographical distribution are significant in evaluating PrCA survival.
Journal: Journal of Applied Statistics
Pages: 591-603
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521476
File-URL: http://hdl.handle.net/10.1080/02664760903521476
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:591-603
Template-Type: ReDIF-Article 1.0
Author-Name: Jianwen Xu
Author-X-Name-First: Jianwen
Author-X-Name-Last: Xu
Author-Name: Hu Yang
Author-X-Name-First: Hu
Author-X-Name-Last: Yang
Title: On the restricted almost unbiased estimators in linear regression
Abstract:
In this paper, the restricted almost unbiased ridge regression estimator
and restricted almost unbiased Liu estimator are introduced for the vector
of parameters in a multiple linear regression model with linear
restrictions. The bias, variance matrices and mean square error (MSE) of
the proposed estimators are derived and compared. It is shown that the
proposed estimators will have smaller quadratic bias but larger variance
than the corresponding competitors in the literature. However, they will
respectively outperform the latter according to the MSE criterion under
certain conditions. Finally, a simulation study and a numerical example
are given to illustrate some of the theoretical results.
Journal: Journal of Applied Statistics
Pages: 605-617
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521484
File-URL: http://hdl.handle.net/10.1080/02664760903521484
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:605-617
Template-Type: ReDIF-Article 1.0
Author-Name: Suely Ruiz Giolo
Author-X-Name-First: Suely Ruiz
Author-X-Name-Last: Giolo
Author-Name: Clarice Garcia Borges Demétrio
Author-X-Name-First: Clarice Garcia Borges
Author-X-Name-Last: Demétrio
Title: A frailty modeling approach for parental effects in animal breeding
Abstract:
Survival models involving frailties are commonly applied in studies where
correlated event time data arise due to natural or artificial clustering.
In this paper we present an application of such models in the animal
breeding field. Specifically, a mixed survival model with a multivariate
correlated frailty term is proposed for the analysis of data from over
3611 Brazilian Nellore cattle. The primary aim is to evaluate parental
genetic effects on the trait 'length in days that progeny need to
achieve a commercially specified standard weight gain'. This trait is not
measured directly but can be estimated from growth data. Results point to
the importance of genetic effects and suggest that these models constitute
a valuable data analysis tool for beef cattle breeding.
Journal: Journal of Applied Statistics
Pages: 619-629
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903521492
File-URL: http://hdl.handle.net/10.1080/02664760903521492
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:619-629
Template-Type: ReDIF-Article 1.0
Author-Name: R. K. Mishra
Author-X-Name-First: R. K.
Author-X-Name-Last: Mishra
Author-Name: Zillur Rahman
Author-X-Name-First: Zillur
Author-X-Name-Last: Rahman
Title: Nonparametric approach to rank global petroleum business opportunities
Abstract:
Crude oil continues to be one of the significant energy sources. Several
countries do not have enough indigenous oil and gas resources. These
countries resort to overseas business of Exploration and Production (E&P)
of oil to secure a stable supply. Profitability, risk and growth guide
overseas investment decisions. Selection of overseas investment
opportunities is critical for a firm because of uncertainty in
identifying and quantifying the attendant geological, commercial, social
and political risks, as well as the return on investment. To secure overseas
oil acreage, business entities seek to invest in overseas E&P
destinations having reasonable petroleum reserves, favorable contract
(fiscal) terms, well-developed infrastructure, a sound legal system, minimum
country risk (CR) (economic, social and political) and relative
ease of doing business in that country. Countries have a varied mix of
these parameters, which makes it a growing concern to screen and rank
overseas investment opportunities. Methodologies to rank global
opportunities should take into consideration the risk factors such as
petroleum potential, infrastructure, geo-political scenario, contract
terms, etc. We coin the term Globalization Index (GI) for the numerical
rank, which is a function of the factors considered to affect the decision
of a business entity in screening global destinations for venturing
into the E&P business of crude oil. This paper is an attempt to model these
factors by invoking Alternating Conditional Expectation methodology to
find GI.
Journal: Journal of Applied Statistics
Pages: 631-646
Issue: 3
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563601
File-URL: http://hdl.handle.net/10.1080/02664760903563601
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:3:p:631-646
Template-Type: ReDIF-Article 1.0
Author-Name: George Saridakis
Author-X-Name-First: George
Author-X-Name-Last: Saridakis
Title: Violent crime and incentives in the long-run: evidence from England and Wales
Abstract:
This study uses recent advances in time-series econometrics to
investigate the non-stationarity and co-integration properties of violent
crime series in England and Wales. In particular, we estimate the long-run
impact of economic conditions, beer consumption and various deterrents on
different categories of recorded violent crime. The results suggest that a
long-run causal model exists for only minor crimes of violence, with beer
consumption being a predominant factor.
Journal: Journal of Applied Statistics
Pages: 647-660
Issue: 4
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563619
File-URL: http://hdl.handle.net/10.1080/02664760903563619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:647-660
Template-Type: ReDIF-Article 1.0
Author-Name: Antonio F. B. Costa
Author-X-Name-First: Antonio F. B.
Author-X-Name-Last: Costa
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Title: Effect of measurement error and autocorrelation on the X¯ chart
Abstract:
Measurement error and autocorrelation often exist in quality control
applications. Both have an adverse effect on the X¯
chart's performance. To counteract the undesired effect of
autocorrelation, we build up the samples with non-neighbouring items,
according to the time they were produced. To counteract the undesired
effect of measurement error, we measure the quality characteristic of each
item of the sample several times. The chart's performance is assessed when
multiple measurements are applied and the samples are built by taking one
item from the production line and skipping one, two or more before
selecting the next.
Journal: Journal of Applied Statistics
Pages: 661-673
Issue: 4
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664760903563627
File-URL: http://hdl.handle.net/10.1080/02664760903563627
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:661-673
Template-Type: ReDIF-Article 1.0
Author-Name: Pao-Sheng Shen
Author-X-Name-First: Pao-Sheng
Author-X-Name-Last: Shen
Title: Semiparametric analysis of transformation models with doubly censored data
Abstract:
Double censoring arises when T represents an outcome
variable that can only be accurately measured within a certain range,
[L, U], where L and U
are the left- and right-censoring variables, respectively. In this note,
using martingale arguments of Chen et al. [3], we propose
an estimator (denoted by ˜β) for estimating the regression
coefficients of the transformation model when L is always
observed. Under the Cox proportional hazards model, the proposed estimator is
equivalent to the partial likelihood estimator for left-truncated and
right-censored data if the left-censoring variables L
were regarded as left-truncated variables. In this case, the estimator
˜β can be obtained with standard software. A simulation
study is conducted to investigate the performance of ˜β. For
the purpose of comparison, the simulation study also includes the
estimator proposed by Cai and Cheng [2] for the case when
L and U are always observed.
Journal: Journal of Applied Statistics
Pages: 675-682
Issue: 4
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563635
File-URL: http://hdl.handle.net/10.1080/02664760903563635
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:675-682
Template-Type: ReDIF-Article 1.0
Author-Name: Ehab F. Abd-Elfattah
Author-X-Name-First: Ehab F.
Author-X-Name-Last: Abd-Elfattah
Author-Name: Ronald W. Butler
Author-X-Name-First: Ronald W.
Author-X-Name-Last: Butler
Title: Tests for symmetry with right censoring
Abstract:
Permutation tests for symmetry are suggested using data that are subject
to right censoring. Such tests are directly relevant to the assumptions
that underlie the generalized Wilcoxon test since the symmetric logistic
distribution for log-errors has been used to motivate Wilcoxon scores in
the censored accelerated failure time model. Its principal competitor is
the log-rank (LGR) test motivated by an extreme value error distribution
that is positively skewed. The proposed one-sided tests for symmetry
against the alternative of positive skewness are directly relevant to the
choice between usage of these two tests. The permutation tests use
statistics from the weighted LGR class normally used for making two-sample
comparisons. From this class, the test using LGR weights (all weights
equal) showed the greatest discriminatory power in simulations that
compared the possibility of logistic errors versus extreme value errors.
In the test construction, a median estimate, determined by inverting the
Kaplan--Meier estimator, is used to divide the data into a
“control” group to its left that is compared with a
“treatment” group to its right. As an unavoidable
consequence of testing symmetry, data in the control group that have been
censored become uninformative in performing this two-sample test. Thus,
early heavy censoring of data can reduce the effective sample size of the
control group and result in diminished power for discriminating symmetry
in the population distribution.
Journal: Journal of Applied Statistics
Pages: 683-693
Issue: 4
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664760903563643
File-URL: http://hdl.handle.net/10.1080/02664760903563643
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:683-693
Template-Type: ReDIF-Article 1.0
Author-Name: Liliane Bel
Author-X-Name-First: Liliane
Author-X-Name-Last: Bel
Author-Name: Avner Bar-Hen
Author-X-Name-First: Avner
Author-X-Name-Last: Bar-Hen
Author-Name: Rémy Petit
Author-X-Name-First: Rémy
Author-X-Name-Last: Petit
Author-Name: Rachid Cheddadi
Author-X-Name-First: Rachid
Author-X-Name-Last: Cheddadi
Title: Spatio-temporal functional regression on paleoecological data
Abstract:
There is much interest in predicting the impact of global warming on the
genetic diversity of natural populations and the influence of climate on
biodiversity is an important ecological question. Since the Holocene,
there have been many climate perturbations, and the geographical ranges
of plant taxa have changed substantially. The current genetic diversity
of plants is a result of these processes, and a first step in studying
the impact of future climate change is to understand the influence of
reconstructed climate variables, such as temperature or precipitation
over the last 15,000 years, on the current genetic diversity of
forests. We model the relationship between
genetic diversity in the European beech (Fagus sylvatica) forests and
curves of temperature and precipitation reconstructed from pollen
databases. Our model links the genetic measure to the climate curves. We
adapt the classical functional linear model to take into account interactions
between climate variables as a bilinear form. Since the data are
georeferenced, our extensions also account for the spatial dependence
among the observations. The practical issues of these methodological
extensions are discussed.
Journal: Journal of Applied Statistics
Pages: 695-704
Issue: 4
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563650
File-URL: http://hdl.handle.net/10.1080/02664760903563650
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:695-704
Template-Type: ReDIF-Article 1.0
Author-Name: G. Muniz Terrera
Author-X-Name-First: G.
Author-X-Name-Last: Muniz Terrera
Author-Name: A. van den Hout
Author-X-Name-First: A.
Author-X-Name-Last: van den Hout
Author-Name: F. E. Matthews
Author-X-Name-First: F. E.
Author-X-Name-Last: Matthews
Title: Random change point models: investigating cognitive decline in the presence of missing data
Abstract:
With the aim of identifying the age of onset of change in the rate of
cognitive decline while accounting for the missing observations, we
considered a selection modelling framework. A random change point model
was fitted to data from a population-based longitudinal study of ageing
(the Cambridge City over 75 Cohort Study) to model the longitudinal
process. A missing at random mechanism was modelled using logistic
regression. Random effects such as initial cognitive status, rate of
decline before and after the change point, and the age of onset of change
in rate of decline were estimated after adjustment for risk factors for
cognitive decline. Among other possible predictors, the last observed
cognitive score was used to adjust the probability of death and dropout.
Individuals with less variability in their cognitive scores experienced a
change in their rate of decline at older ages than individuals whose
cognitive scores varied more.
Journal: Journal of Applied Statistics
Pages: 705-716
Issue: 4
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563668
File-URL: http://hdl.handle.net/10.1080/02664760903563668
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:705-716
Template-Type: ReDIF-Article 1.0
Author-Name: Bi-Min Hsu
Author-X-Name-First: Bi-Min
Author-X-Name-Last: Hsu
Author-Name: Ming-Hung Shu
Author-X-Name-First: Ming-Hung
Author-X-Name-Last: Shu
Title: A two-phase method for controlling Erlang-failure processes with high reliability
Abstract:
Monitoring a failure process and measuring its performance are important
issues for complex nonrepairable and repairable systems. For a highly
reliable process, traditional methods for reliability monitoring and
performance measuring become inapplicable. This paper proposes a new
two-phase controlling method for monitoring and measuring an
Erlang-failure process (EFP). In the first-phase controlling method, a
control chart is used to monitor the EFP condition. When special causes of
variation have been removed from the EFP and all of the failure times
plotted on the control chart lie within the control limits, the EFP is
considered to be in control. However, an in-control EFP may still exhibit
poor or out-of-lifetime-specification conditions. Thus, its
lifetime-specification limit is taken into consideration in the
second-phase controlling method for measuring the in-control EFP
performance. We propose a lifetime-capability index whose value has a
one-to-one correspondence with the lifetime-conforming rate, which
indicates the lifetime performance of the EFP. Without additional
data-collection effort, the in-control data gathered from the control
chart in the first phase are employed to estimate the lifetime-capability
index. To assess the lifetime capability of the EFP as it affects
downstream customers, the lower confidence bound of the estimated
lifetime-capability index, capturing its minimum lifetime capability, is
considered. The advantage of this two-phase method for controlling
failure processes is that it can motivate the manufacturer to develop a
reliability-monitoring technique, establish an adequate reliability
improvement program and implement an appropriate analysis to ensure that
the lifetime performance meets the customers' requirements.
Journal: Journal of Applied Statistics
Pages: 717-734
Issue: 4
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664760903563676
File-URL: http://hdl.handle.net/10.1080/02664760903563676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:717-734
Template-Type: ReDIF-Article 1.0
Author-Name: Darryl Holden
Author-X-Name-First: Darryl
Author-X-Name-Last: Holden
Title: Testing for heteroskedasticity in the tobit and probit models
Abstract:
Non-constant variance across observations (heteroskedasticity) results in
the maximum likelihood estimators of tobit and probit model parameters
being inconsistent. Some of the available tests for constant variance
across observations (homoskedasticity) are discussed and examined in a
small Monte Carlo experiment.
Journal: Journal of Applied Statistics
Pages: 735-744
Issue: 4
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563684
File-URL: http://hdl.handle.net/10.1080/02664760903563684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:735-744
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Congdon
Author-X-Name-First: Peter
Author-X-Name-Last: Congdon
Title: Structural equation models for area health outcomes with model selection
Abstract:
Recent analyses seeking to explain variation in area health outcomes
often consider the impact on them of latent measures (i.e. unobserved
constructs) of population health risk. The latter are typically obtained
by forms of multivariate analysis, with a small set of latent constructs
derived from a collection of observed indicators, and a few recent area
studies take such constructs to be spatially structured rather than
independent over areas. A confirmatory approach is often applicable to the
model linking indicators to constructs, based on substantive knowledge of
relevant risks for particular diseases or outcomes. In this paper,
population constructs relevant to a particular set of health outcomes are
derived using an integrated model containing all the manifest variables,
namely health outcome variables, as well as indicator variables underlying
the latent constructs. A further feature of the approach is the use of
variable selection techniques to select significant loadings and factors
(especially in terms of the effects of constructs on health outcomes),
thus ensuring that parsimonious models are selected. A case study considers suicide
mortality and self-harm contrasts in the East of England in relation to
three latent constructs: deprivation, fragmentation and urbanicity.
Journal: Journal of Applied Statistics
Pages: 745-767
Issue: 4
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664760903563692
File-URL: http://hdl.handle.net/10.1080/02664760903563692
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:745-767
Template-Type: ReDIF-Article 1.0
Author-Name: Albert Vexler
Author-X-Name-First: Albert
Author-X-Name-Last: Vexler
Author-Name: Shuling Liu
Author-X-Name-First: Shuling
Author-X-Name-Last: Liu
Author-Name: Enrique F. Schisterman
Author-X-Name-First: Enrique F.
Author-X-Name-Last: Schisterman
Title: Nonparametric-likelihood inference based on cost-effectively-sampled-data
Abstract:
Costs associated with the evaluation of biomarkers can restrict the
number of relevant biological samples to be measured. This common problem
has been dealt with extensively in the epidemiologic and biostatistical
literature that proposes to apply different cost-efficient procedures,
including pooling and random sampling strategies. The pooling design has
been widely addressed as a very efficient sampling method under certain
parametric assumptions regarding data distribution. When cost is not a
main factor in the evaluation of biomarkers but measurement is subject to
a limit of detection, a common instrument limitation on the measurement
process, the pooling design can partially overcome this instrumental
limitation. In certain situations, the pooling design can provide data
that are less informative than a simple random sample; however, this is
not always the case. Pooled-data-based nonparametric inference has not
been well addressed in the literature. In this article, a
distribution-free method based on the empirical likelihood technique is
proposed to replace the traditional parametric-likelihood approach,
providing true coverage, confidence interval estimation and powerful
tests based on data obtained under the cost-efficient designs. We also
consider several
nonparametric tests to compare with the proposed procedure. We examine the
proposed methodology via a broad Monte Carlo study and a real data
example.
Journal: Journal of Applied Statistics
Pages: 769-783
Issue: 4
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692290
File-URL: http://hdl.handle.net/10.1080/02664761003692290
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:769-783
Template-Type: ReDIF-Article 1.0
Author-Name: Jimin Lee
Author-X-Name-First: Jimin
Author-X-Name-Last: Lee
Author-Name: Seunggeun Hyun
Author-X-Name-First: Seunggeun
Author-X-Name-Last: Hyun
Title: Confidence bands for the difference of two survival functions under the additive risk model
Abstract:
In many clinical studies, a commonly encountered problem is to compare
the survival probabilities of two treatments for a given patient with a
certain set of covariates, and there is often a need to make adjustments
for other covariates that may affect outcomes. One approach is to plot the
difference between the two subject-specific predicted survival estimates
with a simultaneous confidence band. Such a band will provide useful
information about when these two treatments differ and which treatment has
a better survival probability. In this paper, we show how to construct
such a band based on the additive risk model and we use the martingale
central limit theorem to derive its asymptotic distribution. The proposed
method is evaluated through a simulation study and is illustrated with two
real examples.
Journal: Journal of Applied Statistics
Pages: 785-797
Issue: 4
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692308
File-URL: http://hdl.handle.net/10.1080/02664761003692308
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:785-797
Template-Type: ReDIF-Article 1.0
Author-Name: E. Silva
Author-X-Name-First: E.
Author-X-Name-Last: Silva
Author-Name: V. M. Guerrero
Author-X-Name-First: V. M.
Author-X-Name-Last: Guerrero
Author-Name: D. Peña
Author-X-Name-First: D.
Author-X-Name-Last: Peña
Title: Temporal disaggregation and restricted forecasting of multiple population time series
Abstract:
This article presents some applications of time-series procedures to
solve two typical problems that arise when analyzing demographic
information in developing countries: (1) unavailability of annual time
series of population growth rates (PGRs) and their corresponding
population time series and (2) inappropriately defined population growth
goals in official population programs. These problems are considered as
situations that require combining information of population time series.
Firstly, we suggest the use of temporal disaggregation techniques to
combine census data with vital statistics information in order to estimate
annual PGRs. Secondly, we apply multiple restricted forecasting to combine
the official targets on future PGRs with the disaggregated series. Then,
we propose a mechanism to evaluate the compatibility of the demographic
goals with the annual data. We apply the aforementioned procedures to data
of the Mexico City Metropolitan Zone divided by concentric rings and
conclude that the targets established in the official program are not
feasible. Hence, we derive future PGRs that are both in line with the
official targets and with the historical demographic behavior. We conclude
that population growth programs should be based on this kind of analysis
to be supported empirically. Thus, through specialized multivariate
time-series techniques, we propose first obtaining an optimal estimate of
a disaggregated vector of population time series and then producing
restricted forecasts in agreement with the data-based population policies
derived here.
Journal: Journal of Applied Statistics
Pages: 799-815
Issue: 4
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664761003692316
File-URL: http://hdl.handle.net/10.1080/02664761003692316
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:799-815
Template-Type: ReDIF-Article 1.0
Author-Name: Z. Rezaei Ghahroodi
Author-X-Name-First: Z. Rezaei
Author-X-Name-Last: Ghahroodi
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Author-Name: F. Harandi
Author-X-Name-First: F.
Author-X-Name-Last: Harandi
Author-Name: D. Berridge
Author-X-Name-First: D.
Author-X-Name-Last: Berridge
Title: Bivariate transition model for analysing ordinal and nominal categorical responses: an application to the Labour Force Survey data
Abstract:
In many panel studies, bivariate ordinal--nominal responses are measured
and the aim is to investigate the effects of explanatory variables on
these responses. A regression analysis for these types of data must allow
for the correlation among responses of the same individual. To analyse
such ordinal--nominal responses using a proper weighting approach, an
ordinal--nominal bivariate transition model is proposed and maximum
likelihood is used to find the parameter estimates. We propose a method in
which the likelihood function can be partitioned to make possible the use
of existing software. The approach is applied to the Labour Force Survey
data in Iran, where the ordinal response, in the first period, is the
duration of unemployment for unemployed people and the nominal response,
in the second period, is the economic activity status of these individuals.
The interest is to find the reasons for staying unemployed or moving to
another status of economic activity.
Journal: Journal of Applied Statistics
Pages: 817-832
Issue: 4
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692324
File-URL: http://hdl.handle.net/10.1080/02664761003692324
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:817-832
Template-Type: ReDIF-Article 1.0
Author-Name: A. Schörgendorfer
Author-X-Name-First: A.
Author-X-Name-Last: Schörgendorfer
Author-Name: L. V. Madden
Author-X-Name-First: L. V.
Author-X-Name-Last: Madden
Author-Name: A. C. Bathke
Author-X-Name-First: A. C.
Author-X-Name-Last: Bathke
Title: Choosing appropriate covariance matrices in a nonparametric analysis of factorials in block designs
Abstract:
The standard nonparametric, rank-based approach to the analysis of
dependent data from factorial designs is based on an estimated
unstructured (UN) variance--covariance matrix, but the large number of
variance--covariance terms in many designs can seriously affect test
performance. In a simulation study for a factorial arranged in blocks, we
compared estimates of type-I error probability and power based on the UN
structure with the estimates obtained with a more parsimonious
heterogeneous-compound-symmetry structure (CSH). Although tests based on
the UN structure were anti-conservative with small numbers of factor
levels, especially with four or six blocks, they became conservative at
higher numbers of factor levels. Tests based on the CSH structure were
anti-conservative, and results did not depend on the number of factor
levels. When both tests were anti-conservative, tests based on the CSH
structure were less so. Although use of the CSH structure is concluded to
be more suitable than use of the UN structure for the small number of
blocks typical in agricultural experiments, results suggest that further
improvement of test statistics is needed for such situations.
Journal: Journal of Applied Statistics
Pages: 833-850
Issue: 4
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664761003692332
File-URL: http://hdl.handle.net/10.1080/02664761003692332
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:833-850
Template-Type: ReDIF-Article 1.0
Author-Name: Diego F. de Bernardini
Author-X-Name-First: Diego F.
Author-X-Name-Last: de Bernardini
Author-Name: Laura L.R. Rifo
Author-X-Name-First: Laura L.R.
Author-X-Name-Last: Rifo
Title: Full Bayesian significance test for extremal distributions
Abstract:
A new Bayesian measure of evidence is used for model choice within the
generalized extreme value family of distributions, given an absolutely
continuous posterior distribution on the related parametric space. This
criterion allows quantitative measurement of the evidence for any sharp
hypothesis, with no need to assign a prior distribution to it. We
apply this methodology to the testing of the precise hypothesis given by
the Gumbel model using real data. Performance is compared with usual
evidence measures, such as Bayes factor, Bayesian information criterion,
deviance information criterion and descriptive level for deviance
statistic.
Journal: Journal of Applied Statistics
Pages: 851-863
Issue: 4
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664761003692340
File-URL: http://hdl.handle.net/10.1080/02664761003692340
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:851-863
Template-Type: ReDIF-Article 1.0
Author-Name: Miroslav M. Ristić
Author-X-Name-First: Miroslav M.
Author-X-Name-Last: Ristić
Title: Statistics: A Very Short Introduction
Journal: Journal of Applied Statistics
Pages: 865-865
Issue: 4
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664760903075598
File-URL: http://hdl.handle.net/10.1080/02664760903075598
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:865-865
Template-Type: ReDIF-Article 1.0
Author-Name: Andreas Rosenblad
Author-X-Name-First: Andreas
Author-X-Name-Last: Rosenblad
Title: The Concise Encyclopedia of Statistics
Journal: Journal of Applied Statistics
Pages: 867-868
Issue: 4
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664760903075614
File-URL: http://hdl.handle.net/10.1080/02664760903075614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:867-868
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Asymptotic Theory of Statistics and Probability
Journal: Journal of Applied Statistics
Pages: 869-869
Issue: 4
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664760903075630
File-URL: http://hdl.handle.net/10.1080/02664760903075630
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:4:p:869-869
Template-Type: ReDIF-Article 1.0
Author-Name: Guoqing Wu
Author-X-Name-First: Guoqing
Author-X-Name-Last: Wu
Author-Name: Chao Chen
Author-X-Name-First: Chao
Author-X-Name-Last: Chen
Author-Name: Xuefeng Yan
Author-X-Name-First: Xuefeng
Author-X-Name-Last: Yan
Title: Modified minimum covariance determinant estimator and its application to outlier detection of chemical process data
Abstract:
To overcome the main flaw of the minimum covariance determinant (MCD)
estimator, namely the difficulty of determining its main parameter
h, a modified-MCD (M-MCD) algorithm is proposed. In
M-MCD, a self-adaptive iteration adjusts the parameter h
of MCD to minimize the deviation between the standard deviation of the
squared robust Mahalanobis distances, calculated by MCD with parameter
h on the sample, and the standard deviation of the
theoretical squared Mahalanobis distances. Thus, the optimal parameter
h of M-MCD is determined when the minimum deviation is
obtained. The results of a convergence analysis demonstrate that M-MCD
has good convergence properties. Further, M-MCD and MCD were applied to
detect outliers in two typical data sets and in chemical process data,
respectively. The results show that M-MCD obtains the optimal parameter
h through the self-adaptive iteration and thus its
outlier-detection performance is better than that of MCD.
Journal: Journal of Applied Statistics
Pages: 1007-1020
Issue: 5
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664761003692456
File-URL: http://hdl.handle.net/10.1080/02664761003692456
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1007-1020
Template-Type: ReDIF-Article 1.0
Author-Name: A. R. de Leon
Author-X-Name-First: A. R.
Author-X-Name-Last: de Leon
Author-Name: A. Soo
Author-X-Name-First: A.
Author-X-Name-Last: Soo
Author-Name: T. Williamson
Author-X-Name-First: T.
Author-X-Name-Last: Williamson
Title: Classification with discrete and continuous variables via general mixed-data models
Abstract:
We study the problem of classifying an individual into one of several
populations based on mixed nominal, continuous, and ordinal data.
Specifically, we obtain a classification procedure as an extension to the
so-called location linear discriminant function, by specifying a general
mixed-data model for the joint distribution of the mixed discrete and
continuous variables. We outline methods for estimating misclassification
error rates. Results of simulations of the performance of the proposed
classification rules in various settings vis-à-vis a robust
mixed-data discrimination method are reported as well. We give an example
utilizing data on croup in children.
Journal: Journal of Applied Statistics
Pages: 1021-1032
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003758976
File-URL: http://hdl.handle.net/10.1080/02664761003758976
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1021-1032
Template-Type: ReDIF-Article 1.0
Author-Name: Camillo Cammarota
Author-X-Name-First: Camillo
Author-X-Name-Last: Cammarota
Title: The difference-sign runs length distribution in testing for serial independence
Abstract:
We investigate the sequence of difference-sign runs length of a time
series in the context of non-parametric tests for serial independence.
This sequence is, under suitable conditioning, a stationary sequence and
we prove that the normalized correlation of two consecutive run lengths is
small (≈0.0427). We use this result in a test based on the relative
entropy of the empirical distribution of the runs length. We investigate
the performance of the test in simulated series and test serial
independence of cardiac data series in atrial fibrillation.
Journal: Journal of Applied Statistics
Pages: 1033-1043
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003758984
File-URL: http://hdl.handle.net/10.1080/02664761003758984
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1033-1043
Template-Type: ReDIF-Article 1.0
Author-Name: M. M. Nassar
Author-X-Name-First: M. M.
Author-X-Name-Last: Nassar
Author-Name: S. M. Khamis
Author-X-Name-First: S. M.
Author-X-Name-Last: Khamis
Author-Name: S. S. Radwan
Author-X-Name-First: S. S.
Author-X-Name-Last: Radwan
Title: On Bayesian sample size determination
Abstract:
Three Bayesian methods are considered for the determination of sample
sizes for sampling from the Laplace distribution -- the distribution of
time between rare events -- with a normal prior. These methods are applied
to the sizing of aircraft mid-air collisions in a navigation system or
large flight path deviations of aircraft in air traffic management
scenarios. A computer program handles all computations and gives good
insight into the best suggested method.
Journal: Journal of Applied Statistics
Pages: 1045-1054
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003758992
File-URL: http://hdl.handle.net/10.1080/02664761003758992
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1045-1054
Template-Type: ReDIF-Article 1.0
Author-Name: Bidemi Yusuf
Author-X-Name-First: Bidemi
Author-X-Name-Last: Yusuf
Author-Name: Olayinka Omigbodun
Author-X-Name-First: Olayinka
Author-X-Name-Last: Omigbodun
Author-Name: Babatunde Adedokun
Author-X-Name-First: Babatunde
Author-X-Name-Last: Adedokun
Author-Name: Odunayo Akinyemi
Author-X-Name-First: Odunayo
Author-X-Name-Last: Akinyemi
Title: Identifying predictors of violent behaviour among students using the conventional logistic and multilevel logistic models
Abstract:
Analysing individual-, school- and class-level observations is a good and
efficient approach in epidemiologic research. Using data on violent
behaviour among secondary school students, we compared results from
conventional logistic modelling with a multilevel logistic modelling
approach using the gllamm command in Stata, illustrating the advantage
of multilevel modelling over the conventional approach. We constructed a
logistic model with random intercepts at the
school and class levels to account for unexplained heterogeneity between
schools and classes. In the multilevel model, we estimated that, in an
average school, the odds of experiencing violence are about 3 times
(OR=2.99, 95% CI: 1.86, 4.81, p&lt;0.0001) higher for students who use
drugs than for students who do not use drugs. However, the estimates in
the conventional logistic model are slightly lower. The normally
distributed random intercepts for schools and classes, which account for
any unexplained heterogeneity between schools and classes, have estimated
variances of 0.017 and 0.035, respectively. We therefore recommend
multilevel logistic modelling when data are clustered.
Journal: Journal of Applied Statistics
Pages: 1055-1061
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003759008
File-URL: http://hdl.handle.net/10.1080/02664761003759008
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1055-1061
Template-Type: ReDIF-Article 1.0
Author-Name: Juvêncio S. Nobre
Author-X-Name-First: Juvêncio S.
Author-X-Name-Last: Nobre
Author-Name: Julio M. Singer
Author-X-Name-First: Julio M.
Author-X-Name-Last: Singer
Title: Leverage analysis for linear mixed models
Abstract:
We consider a generalized leverage matrix useful for the identification
of influential units and observations in linear mixed models and show how
a decomposition of this matrix may be employed to identify high leverage
points for both the marginal fitted values and the random effect component
of the conditional fitted values. We illustrate the different uses of the
two components of the decomposition with a simulated example as well as
with a real data set.
Journal: Journal of Applied Statistics
Pages: 1063-1072
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003759016
File-URL: http://hdl.handle.net/10.1080/02664761003759016
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1063-1072
Template-Type: ReDIF-Article 1.0
Author-Name: Arthur Pewsey
Author-X-Name-First: Arthur
Author-X-Name-Last: Pewsey
Author-Name: Kunio Shimizu
Author-X-Name-First: Kunio
Author-X-Name-Last: Shimizu
Author-Name: Rolando de la Cruz
Author-X-Name-First: Rolando
Author-X-Name-Last: de la Cruz
Title: On an extension of the von Mises distribution due to Batschelet
Abstract:
This paper considers the three-parameter family of symmetric unimodal
circular distributions proposed by Batschelet in [1], an extension of the
von Mises distribution containing distributional forms ranging from the
highly leptokurtic to the very platykurtic. The family's fundamental
properties are given, and likelihood-based techniques described which can
be used to perform estimation and hypothesis testing. Analyses are
presented of two data sets which illustrate how the family and three of
its most direct competitors can be applied in the search for parsimonious
models for circular data.
Journal: Journal of Applied Statistics
Pages: 1073-1085
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003759024
File-URL: http://hdl.handle.net/10.1080/02664761003759024
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1073-1085
Template-Type: ReDIF-Article 1.0
Author-Name: Leakemariam Berhe
Author-X-Name-First: Leakemariam
Author-X-Name-Last: Berhe
Author-Name: Göran Arnoldsson
Author-X-Name-First: Göran
Author-X-Name-Last: Arnoldsson
Title: Ds-optimal designs for Kozak's tree taper model
Abstract:
In this work, we study Ds-optimal design for Kozak's tree taper model.
The approximate Ds-optimal designs are found to be invariant to tree
size and hence provide a basis for constructing a general
replication-free Ds-optimal design. Even though the designs are found
not to depend on the parameter value p of Kozak's model, they are
sensitive to the s×1 subset parameter vector values of the model. The
12-point replication-free design (with 91% efficiency) suggested in this
study is believed to reduce the cost and time of data collection and,
more importantly, to precisely estimate the subset parameters of
interest.
Journal: Journal of Applied Statistics
Pages: 1087-1102
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003759925
File-URL: http://hdl.handle.net/10.1080/02664761003759925
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1087-1102
Template-Type: ReDIF-Article 1.0
Author-Name: Artur J. Lemonte
Author-X-Name-First: Artur J.
Author-X-Name-Last: Lemonte
Author-Name: Alexandre G. Patriota
Author-X-Name-First: Alexandre G.
Author-X-Name-Last: Patriota
Title: Influence diagnostics in Birnbaum--Saunders nonlinear regression models
Abstract:
We consider the issue of assessing influence of observations in the class
of Birnbaum--Saunders nonlinear regression models, which is useful in
lifetime data analysis. Our results generalize those in Galea et
al. [8] which are confined to Birnbaum--Saunders linear
regression models. Some influence methods, such as the local influence,
total local influence of an individual and generalized leverage are
discussed. Additionally, the normal curvatures for studying local
influence are derived under some perturbation schemes. We also give an
application to a real fatigue data set.
Journal: Journal of Applied Statistics
Pages: 871-884
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692357
File-URL: http://hdl.handle.net/10.1080/02664761003692357
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:871-884
Template-Type: ReDIF-Article 1.0
Author-Name: Hani M. Samawi
Author-X-Name-First: Hani M.
Author-X-Name-Last: Samawi
Author-Name: Amal Helu
Author-X-Name-First: Amal
Author-X-Name-Last: Helu
Author-Name: Robert Vogel
Author-X-Name-First: Robert
Author-X-Name-Last: Vogel
Title: A nonparametric test of symmetry based on the overlapping coefficient
Abstract:
In this paper, we introduce a new nonparametric test of symmetry based on
the empirical overlap coefficient using kernel density estimation. Our
investigation reveals that the new test is more powerful than the runs
test of symmetry proposed by McWilliams [31]. Intensive simulation is
conducted to examine the power of the proposed test. Data from a level I
Trauma center are used to illustrate the procedures developed in this
paper.
Journal: Journal of Applied Statistics
Pages: 885-898
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692365
File-URL: http://hdl.handle.net/10.1080/02664761003692365
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:885-898
Template-Type: ReDIF-Article 1.0
Author-Name: Luis Mariano Esteban
Author-X-Name-First: Luis Mariano
Author-X-Name-Last: Esteban
Author-Name: Gerardo Sanz
Author-X-Name-First: Gerardo
Author-X-Name-Last: Sanz
Author-Name: Angel Borque
Author-X-Name-First: Angel
Author-X-Name-Last: Borque
Title: A step-by-step algorithm for combining diagnostic tests
Abstract:
Combining data of several tests or markers for the classification of
patients according to their health status for assigning better treatments
is a major issue in the study of diseases such as cancer. In order to
tackle this problem, several approaches have been proposed in the
literature. In this paper, a step-by-step algorithm for estimating the
parameters of a linear classifier that combines several measures is
considered. The optimization criterion is to maximize the area under the
receiver operating characteristic curve. The algorithm is applied to
different simulated data sets and its performance is evaluated. Finally,
the method is illustrated with a prostate cancer staging database.
Journal: Journal of Applied Statistics
Pages: 899-911
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692373
File-URL: http://hdl.handle.net/10.1080/02664761003692373
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:899-911
Template-Type: ReDIF-Article 1.0
Author-Name: Andrés Farall
Author-X-Name-First: Andrés
Author-X-Name-Last: Farall
Author-Name: Ricardo Maronna
Author-X-Name-First: Ricardo
Author-X-Name-Last: Maronna
Author-Name: Tomás Tetzlaff
Author-X-Name-First: Tomás
Author-X-Name-Last: Tetzlaff
Title: A mixture model for the detection of Neosporosis without a gold standard
Abstract:
Neosporosis is a bovine disease caused by the parasite
Neospora caninum. The disease is not yet sufficiently studied, and it is believed to cause a substantial number of abortions. Its clinical symptoms do not yet allow reliable identification of infected animals.
Its study and treatment would improve if a test based on antibody counts
were available. Knowing the distribution functions of observed counts of
uninfected and infected cows would allow the determination of a cutoff
value. These distributions cannot be estimated directly. This paper deals
with the indirect estimation of these distributions based on a data set
consisting of the antibody counts for some 200 pairs of cows and their
calves. The desired distributions are estimated through a mixture model
based on simple assumptions that describe the relationship between each
cow and its calf. The model then allows the estimation of the cutoff value
and of the error probabilities.
Journal: Journal of Applied Statistics
Pages: 913-926
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692381
File-URL: http://hdl.handle.net/10.1080/02664761003692381
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:913-926
Template-Type: ReDIF-Article 1.0
Author-Name: Baisuo Jin
Author-X-Name-First: Baisuo
Author-X-Name-Last: Jin
Author-Name: Mong-Na Lo Huang
Author-X-Name-First: Mong-Na Lo
Author-X-Name-Last: Huang
Author-Name: Baiqi Miao
Author-X-Name-First: Baiqi
Author-X-Name-Last: Miao
Title: Testing for variance changes in autoregressive models with unknown order
Abstract:
The problem of change points in autoregressive processes is studied in this article. We propose a Bayesian information criterion-iterated cumulative sums of squares algorithm to detect variance changes in an autoregressive series of unknown order. Simulation results and two examples are presented, showing that the algorithm performs well even when the sample size is relatively small.
Journal: Journal of Applied Statistics
Pages: 927-936
Issue: 5
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664761003692399
File-URL: http://hdl.handle.net/10.1080/02664761003692399
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:927-936
Template-Type: ReDIF-Article 1.0
Author-Name: Suzy Van Sanden
Author-X-Name-First: Suzy
Author-X-Name-Last: Van Sanden
Author-Name: Tomasz Burzykowski
Author-X-Name-First: Tomasz
Author-X-Name-Last: Burzykowski
Title: Evaluation of Laplace distribution-based ANOVA models applied to microarray data
Abstract:
In a microarray experiment, intensity measurements tend to vary due to
various systematic and random effects, which enter at the different stages
of the measurement process. Common test statistics do not take these
effects into account. An alternative is to use, for example, ANOVA models. In many cases, however, we cannot assume normally distributed error terms. Purdom and Holmes [6] have concluded that the
distribution of microarray intensity measurements can often be better
approximated by a Laplace distribution. In this paper, we consider the
analysis of microarray data by using ANOVA models under the assumption of
Laplace-distributed error terms. We explain the methodology and discuss problems related to fitting this type of model. In addition to
evaluating the models using several real-life microarray experiments, we
conduct a simulation study to investigate different aspects of the models
in detail. We find that, while the normal model is less sensitive to model
misspecifications, the Laplace model has more power when the data are
truly Laplace distributed. However, in the latter situation, neither of
the models is able to control the false discovery rate at the
pre-specified significance level. This problem is most likely related to
sample size issues.
Journal: Journal of Applied Statistics
Pages: 937-950
Issue: 5
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664761003692407
File-URL: http://hdl.handle.net/10.1080/02664761003692407
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:937-950
Template-Type: ReDIF-Article 1.0
Author-Name: Eren Demir
Author-X-Name-First: Eren
Author-X-Name-Last: Demir
Author-Name: Thierry Chaussalet
Author-X-Name-First: Thierry
Author-X-Name-Last: Chaussalet
Title: Capturing the re-admission process: focus on time window
Abstract:
In the majority of studies on patient re-admissions, a re-admission is
deemed to have occurred if a patient is admitted within a time window of
the previous discharge date. However, these time windows have rarely been
objectively justified. We capture the re-admission process from the
community using a special case of a Coxian phase-type distribution,
expressed as a mixture of two generalized Erlang distributions. Using Bayes' theorem, we compute the optimal time windows for defining re-admission. From the national data set in England, we defined
re-admission for chronic obstructive pulmonary disease (COPD), stroke,
congestive heart failure, and hip- and thigh-fractured patients as 41, 9,
37, and 8 days, respectively. These time windows could be used to classify patients into two groups (a binary response): those at high risk of re-admission (e.g. within 41 days for COPD) and those at low risk (more than 41 days). The generality of the modelling framework and its capability of supporting a broad class of distributions enable its application in other domains, to capture the process within the field of interest and to determine an appropriate time window (a cut-off value) based on evidence objectively derived from operational data.
Journal: Journal of Applied Statistics
Pages: 951-960
Issue: 5
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664761003692415
File-URL: http://hdl.handle.net/10.1080/02664761003692415
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:951-960
Template-Type: ReDIF-Article 1.0
Author-Name: S. McKay Curtis
Author-X-Name-First: S.
Author-X-Name-Last: McKay Curtis
Author-Name: Sujit K. Ghosh
Author-X-Name-First: Sujit K.
Author-X-Name-Last: Ghosh
Title: A variable selection approach to monotonic regression with Bernstein polynomials
Abstract:
One of the standard problems in statistics consists of determining the
relationship between a response variable and a single predictor variable
through a regression function. Background scientific knowledge is often
available that suggests that the regression function should have a certain
shape (e.g. monotonically increasing or concave) but not necessarily a
specific parametric form. Bernstein polynomials have been used to impose
certain shape restrictions on regression functions. The Bernstein
polynomials are known to provide a smooth estimate over equidistant knots.
Bernstein polynomials are used in this paper due to their ease of
implementation, continuous differentiability, and theoretical properties.
In this work, we demonstrate a connection between the monotonic regression
problem and the variable selection problem in the linear model. We develop
a Bayesian procedure for fitting the monotonic regression model by
adapting currently available variable selection procedures. We demonstrate
the effectiveness of our method through simulations and the analysis of
real data.
Journal: Journal of Applied Statistics
Pages: 961-976
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692423
File-URL: http://hdl.handle.net/10.1080/02664761003692423
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:961-976
Template-Type: ReDIF-Article 1.0
Author-Name: Gunnar Taraldsen
Author-X-Name-First: Gunnar
Author-X-Name-Last: Taraldsen
Title: Analysis of rounded exponential data
Abstract:
The problem of inference based on a rounded random sample from the
exponential distribution is treated. The main results are given by an
explicit expression for the maximum-likelihood estimator, a confidence
interval with a guaranteed level of confidence, and a conjugate class of
distributions for Bayesian analysis. These results are exemplified on two
concrete examples. The large and increasing body of results on the topic
of grouped data has been mostly focused on the effect on the estimators.
The methods and results for the derivation of confidence intervals here
are hence of some general theoretical value as a model approach for other
parametric models. The Bayesian credibility interval recommended in cases
with a lack of other prior information follows by letting the prior equal
the inverted exponential with a scale equal to one divided by the
resolution. It is shown that this corresponds to the standard
non-informative prior for the scale in the case of non-rounded data. For
cases with the absence of explicit prior information it is argued that the
inverted exponential prior with a scale given by the resolution is a
reasonable choice for more general digitized scale families also.
Journal: Journal of Applied Statistics
Pages: 977-986
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692431
File-URL: http://hdl.handle.net/10.1080/02664761003692431
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:977-986
Template-Type: ReDIF-Article 1.0
Author-Name: Ao Yuan
Author-X-Name-First: Ao
Author-X-Name-Last: Yuan
Author-Name: Guanjie Chen
Author-X-Name-First: Guanjie
Author-X-Name-Last: Chen
Author-Name: Juan Xiong
Author-X-Name-First: Juan
Author-X-Name-Last: Xiong
Author-Name: Wenqing He
Author-X-Name-First: Wenqing
Author-X-Name-Last: He
Author-Name: Wen Jin
Author-X-Name-First: Wen
Author-X-Name-Last: Jin
Author-Name: Charles Rotimi
Author-X-Name-First: Charles
Author-X-Name-Last: Rotimi
Title: Bayesian--frequentist hybrid model with application to the analysis of gene copy number changes
Abstract:
Gene copy number (GCN) changes are common characteristics of many genetic
diseases. Comparative genomic hybridization (CGH) is a new technology
widely used today to screen the GCN changes in mutant cells with high
resolution genome-wide. Statistical methods for analyzing such CGH data
have been evolving. Existing methods are either frequentist or fully Bayesian. The former often has a computational advantage, while the latter can incorporate prior information into the model but could be misleading when sound prior information is not available. In an attempt to take full advantage of both approaches, we develop a Bayesian-frequentist hybrid approach, in which a subset of the model parameters is inferred by the Bayesian method and the remaining parameters by the frequentist method. This new hybrid approach provides advantages over either the Bayesian or the frequentist method used alone, especially when sound prior information is available on part of the parameters and the sample size is relatively small. Spatial dependence and the false discovery rate are also discussed, and the parameter estimation is efficient. As an illustration, we use the proposed hybrid approach to analyze a real CGH data set.
Journal: Journal of Applied Statistics
Pages: 987-1005
Issue: 5
Volume: 38
Year: 2011
Month: 2
X-DOI: 10.1080/02664761003692449
File-URL: http://hdl.handle.net/10.1080/02664761003692449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:987-1005
Template-Type: ReDIF-Article 1.0
Author-Name: E. Bahrami Samani
Author-X-Name-First: E.
Author-X-Name-Last: Bahrami Samani
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Title: Bayesian latent variable model for mixed continuous and ordinal responses with possibility of missing responses
Abstract:
A general framework is proposed for joint modelling of mixed correlated
ordinal and continuous responses with missing values for responses, where
the missing mechanism for both kinds of responses is also considered.
Given the posterior distribution of the unknowns conditional on all available information, a Markov chain Monte Carlo sampling algorithm via WinBUGS is used for estimating the posterior distribution of the parameters. For
sensitivity analysis to investigate the perturbation from missing at
random to not missing at random, it is shown how one can use some elements
of covariance structure. These elements associate responses and their
missing mechanisms. Influence of small perturbation of these elements on
posterior displacement and posterior estimates is also studied. The model
is illustrated using data from a foreign language achievement study.
Journal: Journal of Applied Statistics
Pages: 1103-1116
Issue: 6
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2010.484485
File-URL: http://hdl.handle.net/10.1080/02664763.2010.484485
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1103-1116
Template-Type: ReDIF-Article 1.0
Author-Name: Pablo Martínez-Camblor
Author-X-Name-First: Pablo
Author-X-Name-Last: Martínez-Camblor
Title: Testing the equality among distribution functions from independent and right censored samples via Cramér--von Mises criterion
Abstract:
The traditional Cramér--von Mises criterion is used in order to develop a
test to compare the equality of the underlying lifetime distributions in
the presence of independent censoring times. Its asymptotic distribution
is proved and a resampling plan, which is valid for unbalanced data
situations, is proposed. Its statistical power is studied and compared
with commonly used linear rank tests by Monte Carlo simulations and a real
data analysis is also considered. It is observed that the new test is clearly more powerful than the traditional ones when there is no uniform dominance among the involved distributions and in the presence of late differences. Its statistical power is also good in
the other considered scenarios.
Journal: Journal of Applied Statistics
Pages: 1117-1131
Issue: 6
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2010.484486
File-URL: http://hdl.handle.net/10.1080/02664763.2010.484486
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1117-1131
Template-Type: ReDIF-Article 1.0
Author-Name: Chi Tim Ng
Author-X-Name-First: Chi Tim
Author-X-Name-Last: Ng
Author-Name: Johan Lim
Author-X-Name-First: Johan
Author-X-Name-Last: Lim
Author-Name: Kyu S. Hahn
Author-X-Name-First: Kyu S.
Author-X-Name-Last: Hahn
Title: Testing stochastic orders in tails of contingency tables
Abstract:
Testing for the difference in the strength of bivariate association in
two independent contingency tables is an important issue that finds
applications in various disciplines. Currently, many of the commonly used
tests are based on single-index measures of association. More
specifically, one obtains single-index measurements of association from
two tables and compares them based on asymptotic theory. Although they are
usually easy to understand and use, often much of the information
contained in the data is lost with single-index measures. Accordingly,
they fail to fully capture the association in the data. To remedy this
shortcoming, we introduce a new summary statistic measuring various types
of association in a contingency table. Based on this new summary
statistic, we propose a likelihood ratio test comparing the strength of
association in two independent contingency tables. The proposed test
examines the stochastic order between summary statistics. We derive its
asymptotic null distribution and demonstrate that the least favorable
distributions are chi-bar distributions. We numerically compare the power
of the proposed test to that of the tests based on single-index measures.
Finally, we provide two examples illustrating the new summary statistics
and the related tests.
Journal: Journal of Applied Statistics
Pages: 1133-1149
Issue: 6
Volume: 38
Year: 2011
Month: 3
X-DOI: 10.1080/02664763.2010.484487
File-URL: http://hdl.handle.net/10.1080/02664763.2010.484487
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1133-1149
Template-Type: ReDIF-Article 1.0
Author-Name: Jūratė Šaltytė Benth
Author-X-Name-First: Jūratė
Author-X-Name-Last: Šaltytė Benth
Author-Name: Laura Šaltytė
Author-X-Name-First: Laura
Author-X-Name-Last: Šaltytė
Title: Spatial--temporal model for wind speed in Lithuania
Abstract:
In this paper, we propose a spatial--temporal model for wind speed (WS). We first estimate the model at each meteorological station separately, independently of spatial correlations. The temporal model contains seasonality, a higher-order autoregressive component and a variance describing the remaining heteroskedasticity in the residuals. We then model spatial dependencies by a Gaussian random field. The model is estimated on
daily WS records from 18 meteorological stations in Lithuania. The
validation procedure based on out-of-sample observations shows that the
proposed model is reliable and can be used for various practical
applications.
Journal: Journal of Applied Statistics
Pages: 1151-1168
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491857
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491857
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1151-1168
Template-Type: ReDIF-Article 1.0
Author-Name: Malin Albing
Author-X-Name-First: Malin
Author-X-Name-Last: Albing
Author-Name: Kerstin Vännman
Author-X-Name-First: Kerstin
Author-X-Name-Last: Vännman
Title: Elliptical safety region plots for C_pk
Abstract:
The process capability index C_pk is widely used when measuring the capability of a manufacturing process. A process is defined to be capable if the capability index exceeds a stated threshold value, e.g. C_pk > 4/3. This inequality can be expressed graphically using a process capability plot, which is a plot in the plane defined by the process mean and the process standard deviation, showing the region for a capable process. In the process capability plot, a safety region can be plotted to obtain a simple graphical decision rule to assess process capability at a given significance level. We consider safety regions to be used for the index C_pk. Under the assumption of normality, we derive
elliptical safety regions so that, using a random sample, conclusions
about the process capability can be drawn at a given significance level.
This simple graphical tool is helpful when trying to understand whether it
is the variability, the deviation from target, or both that need to be
reduced to improve the capability. Furthermore, using safety regions,
several characteristics with different specification limits and different
sample sizes can be monitored in the same plot. The proposed graphical
decision rule is also investigated with respect to power.
Journal: Journal of Applied Statistics
Pages: 1169-1187
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491858
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491858
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1169-1187
Template-Type: ReDIF-Article 1.0
Author-Name: Guillermo Villa
Author-X-Name-First: Guillermo
Author-X-Name-Last: Villa
Author-Name: Isabel Molina
Author-X-Name-First: Isabel
Author-X-Name-Last: Molina
Author-Name: Roland Fried
Author-X-Name-First: Roland
Author-X-Name-Last: Fried
Title: Modeling attendance at Spanish professional football league
Abstract:
Prediction of demand for professional sports is increasingly drawing the
attention of economists. We apply linear mixed models for modeling
attendance figures at Spanish professional football. We investigate
economic variables, such as the price of the tickets or the size of the
market, and sporting variables, such as the quality of a team or the level
of competition within the league, as potential predictors of attendance.
It turns out that a model with temporally correlated random team effects
provides good forecasts of attendance at a time horizon of two seasons.
Results from this model agree with economic theory.
Journal: Journal of Applied Statistics
Pages: 1189-1206
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491859
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491859
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1189-1206
Template-Type: ReDIF-Article 1.0
Author-Name: Frederico Z. Poleto
Author-X-Name-First: Frederico Z.
Author-X-Name-Last: Poleto
Author-Name: Julio M. Singer
Author-X-Name-First: Julio M.
Author-X-Name-Last: Singer
Author-Name: Carlos Daniel Paulino
Author-X-Name-First: Carlos Daniel
Author-X-Name-Last: Paulino
Title: Comparing diagnostic tests with missing data
Abstract:
When missing data occur in studies designed to compare the accuracy of
diagnostic tests, a common, though naive, practice is to base the
comparison of sensitivity, specificity, as well as of positive and
negative predictive values on some subset of the data that fits into
methods implemented in standard statistical packages. Such methods are
usually valid only under the strong missing completely at random (MCAR)
assumption and may generate biased and less precise estimates. We review
some models that use the dependence structure of the completely observed
cases to incorporate the information of the partially categorized
observations into the analysis and show how they may be fitted via a
two-stage hybrid process involving maximum likelihood in the first stage
and weighted least squares in the second. We indicate how computational
subroutines written in R may be used to fit the
proposed models and illustrate the different analysis strategies with
observational data collected to compare the accuracy of three distinct
non-invasive diagnostic methods for endometriosis. The results indicate
that even when the MCAR assumption is plausible, the naive partial
analyses should be avoided.
Journal: Journal of Applied Statistics
Pages: 1207-1222
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491860
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491860
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1207-1222
Template-Type: ReDIF-Article 1.0
Author-Name: Steen Magnussen
Author-X-Name-First: Steen
Author-X-Name-Last: Magnussen
Author-Name: Ron McRoberts
Author-X-Name-First: Ron
Author-X-Name-Last: McRoberts
Title: A modified bootstrap procedure for cluster sampling variance estimation of species richness
Abstract:
Variance estimators for probability sample-based predictions of species
richness (S) are typically conditional on the sample
(expected variance). In practical applications, sample sizes are typically
small, and the variance of input parameters to a richness estimator should
not be ignored. We propose a modified bootstrap variance estimator that
attempts to capture the sampling variance by generating B
replications of the richness prediction from stochastically resampled data
of species incidence. The variance estimator is demonstrated for the observed richness (S_O), five richness estimators, and with simulated cluster sampling (without replacement) in 11 finite populations of forest tree species. A key feature of the bootstrap procedure is a probabilistic
tree species. A key feature of the bootstrap procedure is a probabilistic
augmentation of a species incidence matrix by the number of species
expected to be ‘lost’ in a conventional bootstrap resampling
scheme. In Monte-Carlo (MC) simulations, the modified bootstrap procedure
performed well in terms of tracking the average MC estimates of richness
and standard errors. Bootstrap-based estimates of standard errors were as
a rule conservative. Extensions to other sampling designs, estimators of
species richness and diversity, and estimates of change are possible.
Journal: Journal of Applied Statistics
Pages: 1223-1238
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491861
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491861
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1223-1238
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Louzada-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada-Neto
Author-Name: Vicente G. Cancho
Author-X-Name-First: Vicente G.
Author-X-Name-Last: Cancho
Author-Name: Gladys D.C. Barriga
Author-X-Name-First: Gladys D.C.
Author-X-Name-Last: Barriga
Title: The Poisson--exponential distribution: a Bayesian approach
Abstract:
In this paper, we propose a new two-parameter lifetime distribution with increasing failure rate. The new distribution arises in a latent complementary risk scenario. The properties of the proposed distribution are discussed, including a formal proof of its density function and explicit algebraic formulae for its quantiles and survival and hazard functions. We also discuss inference aspects of the proposed model via Bayesian inference using Markov chain Monte Carlo simulation. A simulation study investigates the frequentist properties of the proposed estimators obtained under the assumption of non-informative priors. Further, some discussion of model selection criteria is given. The developed methodology is illustrated on a real data set.
Journal: Journal of Applied Statistics
Pages: 1239-1248
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491862
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491862
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1239-1248
Template-Type: ReDIF-Article 1.0
Author-Name: Sara B. Crawford
Author-X-Name-First: Sara B.
Author-X-Name-Last: Crawford
Author-Name: John J. Hanfelt
Author-X-Name-First: John J.
Author-X-Name-Last: Hanfelt
Title: Testing for qualitative interaction of multiple sources of informative dropout in longitudinal data
Abstract:
Longitudinal studies suffer from patient dropout. The dropout process may
be informative if there exists an association between dropout patterns and
the rate of change in the response over time. Multiple patterns are
plausible in that different causes of dropout might contribute to
different patterns. These multiple patterns can be dichotomized into two
groups: quantitative and qualitative interaction. Quantitative interaction
indicates that each of the multiple sources is biasing the estimate of the
rate of change in the same direction, although with differing magnitudes.
Alternatively, qualitative interaction results in the multiple sources
biasing the estimate of the rate of change in opposing directions.
Qualitative interaction is of special concern, since it is less likely to
be detected by conventional methods and can lead to highly misleading
slope estimates. We explore a test for qualitative interaction based on
simultaneous confidence intervals. The test accommodates the realistic
situation where reasons for dropout are not fully understood, or even
entirely unknown. It allows for an additional level of clustering among
participating subjects. We apply these methods to a study exploring tumor
growth rates in mice as well as a longitudinal study exploring rates of
change in cognitive functioning for Alzheimer's patients.
Journal: Journal of Applied Statistics
Pages: 1249-1264
Issue: 6
Volume: 38
Year: 2011
Month: 4
X-DOI: 10.1080/02664763.2010.491969
File-URL: http://hdl.handle.net/10.1080/02664763.2010.491969
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1249-1264
Template-Type: ReDIF-Article 1.0
Author-Name: Zhensheng Huang
Author-X-Name-First: Zhensheng
Author-X-Name-Last: Huang
Title: Empirical likelihood for generalized partially linear varying-coefficient models
Abstract:
Generalized partially linear varying-coefficient models (GPLVCM) are
frequently used in statistical modeling. However, the statistical
inference of the GPLVCM, such as confidence region/interval construction,
has not been very well developed. In this article, empirical
likelihood-based inference for the parametric components in the GPLVCM is
investigated. Based on the local linear estimators of the GPLVCM, an
estimated empirical likelihood-based statistic is proposed. We show that
the resulting statistic is asymptotically non-standard chi-squared. By the
proposed empirical likelihood method, the confidence regions for the
parametric components are constructed. In addition, when some components
of the parameter are of particular interest, the construction of their
confidence intervals is also considered. A simulation study is undertaken
to compare the empirical likelihood and the other existing methods in
terms of coverage accuracies and average lengths. The proposed method is
applied to a real example.
Journal: Journal of Applied Statistics
Pages: 1265-1275
Issue: 6
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498500
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498500
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1265-1275
Template-Type: ReDIF-Article 1.0
Author-Name: Arzu Altin Yavuz
Author-X-Name-First: Arzu Altin
Author-X-Name-Last: Yavuz
Author-Name: Birdal Senoglu
Author-X-Name-First: Birdal
Author-X-Name-Last: Senoglu
Title: Comparison of estimation methods for the finite population mean in simple random sampling: symmetric super-populations
Abstract:
In this paper, a new estimator combined estimator (CE) is proposed for
estimating the finite population mean ¯ Y
N in simple random sampling assuming a long-tailed
symmetric super-population model. The efficiency and robustness properties
of the CE is compared with the widely used and well-known estimators of
the finite population mean ¯ Y
N by Monte Carlo simulation. The parameter
estimators considered in this study are the classical least squares
estimator, trimmed mean, winsorized mean, trimmed L-mean, modified
maximum-likelihood estimator, Huber estimator (W24) and the non-parametric
Hodges--Lehmann estimator. The mean square error criteria are used to
compare the performance of the estimators. We show that the CE is overall
more efficient than the other estimators. The CE is also shown to be more
robust for estimating the finite population mean Ȳ_N,
since it is insensitive to outliers and
to misspecification of the distribution. We give a real-life example.
Journal: Journal of Applied Statistics
Pages: 1277-1288
Issue: 6
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498501
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498501
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1277-1288
Template-Type: ReDIF-Article 1.0
Author-Name: Hamid Shahriari
Author-X-Name-First: Hamid
Author-X-Name-Last: Shahriari
Author-Name: Orod Ahmadi
Author-X-Name-First: Orod
Author-X-Name-Last: Ahmadi
Author-Name: Amir H. Shokouhi
Author-X-Name-First: Amir H.
Author-X-Name-Last: Shokouhi
Title: A two-step robust estimation of the process mean using M-estimator
Abstract:
Parameter estimation is the first step in constructing control charts.
One of these parameters is the process mean. The classical estimators of
the process mean are sensitive to the presence of outlying data and
subgroups, which contaminate the whole data set. Existing robust estimators
for the process mean account for the effects of
individual outliers, while, in this paper, a robust estimator is
proposed to reduce the effect of outlying subgroups as well as of
individual outliers within a subgroup. The proposed estimator was compared
with some classical and robust estimators of the process mean. Although
its relative efficiency ranks fourth among the estimators tested, its
robustness and efficiency are high when outlying subgroups are
present. Evaluation of the results indicated that the proposed estimator
is less sensitive to the presence of outliers and estimates the process
mean well when there are no individual outliers or outlying subgroups.
Journal: Journal of Applied Statistics
Pages: 1289-1301
Issue: 6
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498502
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498502
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1289-1301
Template-Type: ReDIF-Article 1.0
Author-Name: Edson Zangiacomi Martinez
Author-X-Name-First: Edson Zangiacomi
Author-X-Name-Last: Martinez
Author-Name: Davi Casale Aragon
Author-X-Name-First: Davi Casale
Author-X-Name-Last: Aragon
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Title: A Bayesian model for estimating the malaria transition probabilities considering individuals lost to follow-up
Abstract:
It is known that patients may cease participating in a longitudinal study
and become lost to follow-up. The objective of this article is to present
a Bayesian model to estimate the malaria transition probabilities
considering individuals lost to follow-up. We consider a homogeneous
population, and it is assumed that the considered period of time is small
enough to avoid two or more transitions from one state of health to
another. The proposed model is based on a Gibbs sampling algorithm that
uses information of lost to follow-up at the end of the longitudinal
study. To simulate the unknown number of individuals with positive and
negative states of malaria at the end of the study and lost to follow-up,
two latent variables were introduced in the model. We used a real data set
and a simulated data set to illustrate the application of the methodology. The
proposed model showed a good fit to these data sets, and the algorithm did
not show problems of convergence or lack of identifiability. We conclude
that the proposed model is a good alternative to estimate probabilities of
transitions from one state of health to the other in studies with low
adherence to follow-up.
Journal: Journal of Applied Statistics
Pages: 1303-1309
Issue: 6
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498503
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498503
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1303-1309
Template-Type: ReDIF-Article 1.0
Author-Name: John Pemberton
Author-X-Name-First: John
Author-X-Name-Last: Pemberton
Title: Time Series Analysis with Applications in R, Second edition
Journal: Journal of Applied Statistics
Pages: 1311-1312
Issue: 6
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664760903075663
File-URL: http://hdl.handle.net/10.1080/02664760903075663
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1311-1312
Template-Type: ReDIF-Article 1.0
Author-Name: Jennifer H. Klapper
Author-X-Name-First: Jennifer H.
Author-X-Name-Last: Klapper
Title: Introductory Statistics with R, second edition
Journal: Journal of Applied Statistics
Pages: 1312-1313
Issue: 6
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664760903230516
File-URL: http://hdl.handle.net/10.1080/02664760903230516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1312-1313
Template-Type: ReDIF-Article 1.0
Author-Name: Miroslav M. Ristić
Author-X-Name-First: Miroslav M.
Author-X-Name-Last: Ristić
Title: Guide to Teaching Statistics
Journal: Journal of Applied Statistics
Pages: 1313-1314
Issue: 6
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664760903230524
File-URL: http://hdl.handle.net/10.1080/02664760903230524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1313-1314
Template-Type: ReDIF-Article 1.0
Author-Name: Boran Gazi
Author-X-Name-First: Boran
Author-X-Name-Last: Gazi
Title: Credit Risk Management
Journal: Journal of Applied Statistics
Pages: 1314-1314
Issue: 6
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664760903335083
File-URL: http://hdl.handle.net/10.1080/02664760903335083
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1314-1314
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter
Author-X-Name-Last: Bastiaan Ober
Title: Modern Regression Methods
Journal: Journal of Applied Statistics
Pages: 1315-1315
Issue: 6
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664760903370791
File-URL: http://hdl.handle.net/10.1080/02664760903370791
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:6:p:1315-1315
Template-Type: ReDIF-Article 1.0
Author-Name: Pablo Martínez-Camblor
Author-X-Name-First: Pablo
Author-X-Name-Last: Martínez-Camblor
Author-Name: Carlos Carleos
Author-X-Name-First: Carlos
Author-X-Name-Last: Carleos
Author-Name: Norberto Corral
Author-X-Name-First: Norberto
Author-X-Name-Last: Corral
Title: Powerful nonparametric statistics to compare k independent ROC curves
Abstract:
The authors deal with the problem of comparing receiver operating
characteristic (ROC) curves from independent samples. From a nonparametric
approach, they propose and study three different statistics. Their
asymptotic distributions are obtained and a resampling plan is considered.
In order to study the statistical power of the introduced statistics, a
simulation study is carried out. The results suggest that, for
the considered models, the new statistics are more powerful than the
usually employed ones (the Venkatraman test and the usual area under the
ROC curve criterion) in non-uniform dominance situations and quite good
otherwise.
Journal: Journal of Applied Statistics
Pages: 1317-1332
Issue: 7
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498504
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498504
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1317-1332
Template-Type: ReDIF-Article 1.0
Author-Name: Albert Vexler
Author-X-Name-First: Albert
Author-X-Name-Last: Vexler
Author-Name: Jihnhee Yu
Author-X-Name-First: Jihnhee
Author-X-Name-Last: Yu
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Title: Likelihood testing populations modeled by autoregressive process subject to the limit of detection in applications to longitudinal biomedical data
Abstract:
Dependent and often incomplete outcomes are commonly found in
longitudinal biomedical studies. We develop a likelihood function, which
implements the autoregressive process of outcomes, incorporating the limit
of detection problem and the probability of drop-out. The proposed
approach incorporates the characteristics of longitudinal data in
biomedical research, allowing us to carry out powerful tests to detect a
difference between study populations in terms of the growth rate and
drop-out rate. The formal notation of the likelihood function is
developed, making it possible to adapt the proposed method easily to
various scenarios in terms of the number of groups to compare
and a variety of growth trend patterns. Useful inferential properties for
the proposed method are established, which take advantage of many
well-developed theorems regarding the likelihood approach. A broad
Monte Carlo study confirms the asymptotic results and illustrates the
good power properties of the proposed method. We apply the proposed method
to three data sets obtained from mouse tumor experiments.
Journal: Journal of Applied Statistics
Pages: 1333-1346
Issue: 7
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498505
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498505
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1333-1346
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick D. Gerard
Author-X-Name-First: Patrick D.
Author-X-Name-Last: Gerard
Author-Name: Julia L. Sharp
Author-X-Name-First: Julia L.
Author-X-Name-Last: Sharp
Title: Testing for co-directional interactions using union--intersection and intersection--union methods
Abstract:
When interaction terms exist in a two-factor, factorial experiment, the
consideration and analysis of main effects are often restricted to those
situations where the interaction between factors is not significant.
Hinkelmann and Kempthorne [4] softened that stance somewhat and advocated
testing main effects when the interaction is deemed co-directional but not
anti-directional. A test for the main effects in that situation may be
pragmatic to the practitioner and appealing to researchers in other
disciplines. Intersection--union and union--intersection methods are
examined for assessing the directional nature of significant interactions
so that the main effects in a two-factor factorial may be evaluated. The
tests suggested are conceptually straightforward and practical and
maintain the nominal Type-I error rate. Examples are provided to
illustrate the methods.
Journal: Journal of Applied Statistics
Pages: 1347-1358
Issue: 7
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498506
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498506
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1347-1358
Template-Type: ReDIF-Article 1.0
Author-Name: Rand R. Wilcox
Author-X-Name-First: Rand R.
Author-X-Name-Last: Wilcox
Author-Name: Tian S. Tian
Author-X-Name-First: Tian S.
Author-X-Name-Last: Tian
Title: Measuring effect size: a robust heteroscedastic approach for two or more groups
Abstract:
Motivated by involvement in an intervention study, the paper proposes a
robust, heteroscedastic generalization of what is popularly known as
Cohen's d. The approach has the additional advantage of being readily
extended to situations where the goal is to compare more than two groups.
The method arises quite naturally from a regression perspective in
conjunction with a robust version of explanatory power. Moreover, it
provides a single numeric summary of how the groups compare, in contrast to
other strategies aimed at dealing with heteroscedasticity. Kulinskaya and
Staudte [16] studied a heteroscedastic measure of effect size similar to
the one proposed here, but their measure of effect size depends on the
sample sizes, making it difficult for applied researchers to interpret the
results. The approach used here is based on a generalization of Cohen's d
that obviates the issue of unequal sample sizes. Simulations and
illustrations demonstrate that the new measure of effect size can make a
practical difference regarding the conclusions reached.
Journal: Journal of Applied Statistics
Pages: 1359-1368
Issue: 7
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498507
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498507
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1359-1368
Template-Type: ReDIF-Article 1.0
Author-Name: Ardian Harri
Author-X-Name-First: Ardian
Author-X-Name-Last: Harri
Author-Name: Keith H. Coble
Author-X-Name-First: Keith H.
Author-X-Name-Last: Coble
Title: Normality testing: two new tests using L-moments
Abstract:
Establishing that there is no compelling evidence that some population is
not normally distributed is fundamental to many statistical inferences,
and numerous approaches to testing the null hypothesis of normality have
been proposed. Fundamentally, the power of a test depends on which
specific deviation from normality is present in a distribution.
Knowledge of the potential nature of the deviation from normality should
reasonably guide the researcher's selection of a test for non-normality.
In most settings, little is known aside from the data available for
analysis, so that selection of a test based on general applicability is
typically necessary. This research proposes and reports the power of two
new tests of normality. One of the new tests is a version of the
R-test that uses the L-moments, namely L-skewness
and L-kurtosis, and the other test is based on normalizing transformations
of L-skewness and L-kurtosis. Both tests have high power relative to
alternatives. The test based on normalized transformations, in particular,
shows consistently high power and outperforms other normality tests
against a variety of distributions.
Journal: Journal of Applied Statistics
Pages: 1369-1379
Issue: 7
Volume: 38
Year: 2011
Month: 5
X-DOI: 10.1080/02664763.2010.498508
File-URL: http://hdl.handle.net/10.1080/02664763.2010.498508
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1369-1379
Template-Type: ReDIF-Article 1.0
Author-Name: Katsuyuki Takahashi
Author-X-Name-First: Katsuyuki
Author-X-Name-Last: Takahashi
Author-Name: Isao Shoji
Author-X-Name-First: Isao
Author-X-Name-Last: Shoji
Title: An empirical analysis of the volatility of the Japanese stock price index: a non-parametric approach
Abstract:
This paper presents an empirical analysis of stochastic features of
volatility in the Japanese stock price index, or TOPIX, using
high-frequency data sampled every 5 min. The process of TOPIX is
modeled by a stochastic differential equation with the time-homogeneous
drift and diffusion coefficients. To avoid the risk of misspecification
for the volatility function, which is defined by the squared diffusion
coefficient, the local polynomial model is applied to the data, producing
estimates of the volatility function together with their
confidence intervals. The estimation results suggest that the
volatility function shows similar patterns within one period but
changes drastically in another.
Journal: Journal of Applied Statistics
Pages: 1381-1394
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505947
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505947
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1381-1394
Template-Type: ReDIF-Article 1.0
Author-Name: Gleici Castro Perdoná
Author-X-Name-First: Gleici Castro
Author-X-Name-Last: Perdoná
Author-Name: Francisco Louzada-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada-Neto
Title: A general hazard model for lifetime data in the presence of cure rate
Abstract:
Historically, the cure rate model has been used for modeling
time-to-event data within which a significant proportion of patients are
assumed to be cured of illnesses, including breast cancer, non-Hodgkin
lymphoma, leukemia, prostate cancer, melanoma, and head and neck cancer.
Perhaps the most popular type of cure rate model is the mixture model
introduced by Berkson and Gage [1]. In this model, it is assumed that a
certain proportion of the patients are cured, in the sense that they do
not present the event of interest during a long period of time and can be
found to be immune to the cause of failure under study. In this paper, we
propose a general hazard model which accommodates comprehensive families
of cure rate models as particular cases, including the model proposed by
Berkson and Gage. The maximum-likelihood-estimation procedure is
discussed. A simulation study analyzes the coverage probabilities of the
asymptotic confidence intervals for the parameters. A real data set on
children exposed to HIV by vertical transmission illustrates the
methodology.
Journal: Journal of Applied Statistics
Pages: 1395-1405
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505948
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505948
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1395-1405
Template-Type: ReDIF-Article 1.0
Author-Name: Cheolwoo Park
Author-X-Name-First: Cheolwoo
Author-X-Name-Last: Park
Author-Name: Félix Hernández-Campos
Author-X-Name-First: Félix
Author-X-Name-Last: Hernández-Campos
Author-Name: Long Le
Author-X-Name-First: Long
Author-X-Name-Last: Le
Author-Name: J. S. Marron
Author-X-Name-First: J. S.
Author-X-Name-Last: Marron
Author-Name: Juhyun Park
Author-X-Name-First: Juhyun
Author-X-Name-Last: Park
Author-Name: Vladas Pipiras
Author-X-Name-First: Vladas
Author-X-Name-Last: Pipiras
Author-Name: F. D. Smith
Author-X-Name-First: F. D.
Author-X-Name-Last: Smith
Author-Name: Richard L. Smith
Author-X-Name-First: Richard L.
Author-X-Name-Last: Smith
Author-Name: Michele Trovero
Author-X-Name-First: Michele
Author-X-Name-Last: Trovero
Author-Name: Zhengyuan Zhu
Author-X-Name-First: Zhengyuan
Author-X-Name-Last: Zhu
Title: Long-range dependence analysis of Internet traffic
Abstract:
Long-range-dependent time series are endemic in the statistical analysis
of Internet traffic. The Hurst parameter provides a good summary of
important self-similar scaling properties. We compare a number of
different Hurst parameter estimation methods and some important
variations. This is done in the context of a wide range of simulated,
laboratory-generated, and real data sets. Important differences between
the methods are highlighted. The analysis reveals how well the
laboratory data mimic the real data. Non-stationarities, which are local
in time, are seen to be central issues and lead to both conceptual and
practical recommendations.
Journal: Journal of Applied Statistics
Pages: 1407-1433
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505949
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505949
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1407-1433
Template-Type: ReDIF-Article 1.0
Author-Name: Giovana O. Silva
Author-X-Name-First: Giovana O.
Author-X-Name-Last: Silva
Author-Name: Edwin M.M. Ortega
Author-X-Name-First: Edwin M.M.
Author-X-Name-Last: Ortega
Author-Name: Gilberto A. Paula
Author-X-Name-First: Gilberto A.
Author-X-Name-Last: Paula
Title: Residuals for log-Burr XII regression models in survival analysis
Abstract:
In this paper, we compare three residuals to assess departures from the
error assumptions as well as to detect outlying observations in log-Burr
XII regression models with censored observations. These residuals can also
be used for the log-logistic regression model, which is a special case of
the log-Burr XII regression model. For different parameter settings,
sample sizes and censoring percentages, various simulation studies are
performed and the empirical distribution of each residual is displayed and
compared with the standard normal distribution. These studies suggest that
the residual analysis usually performed in normal linear regression models
can be straightforwardly extended to the modified martingale-type residual
in log-Burr XII regression models with censored data.
Journal: Journal of Applied Statistics
Pages: 1435-1445
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505950
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505950
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1435-1445
Template-Type: ReDIF-Article 1.0
Author-Name: Yalian Li
Author-X-Name-First: Yalian
Author-X-Name-Last: Li
Author-Name: Hu Yang
Author-X-Name-First: Hu
Author-X-Name-Last: Yang
Title: Two kinds of restricted modified estimators in linear regression model
Abstract:
In this paper, we introduce two kinds of new restricted estimators, called the
restricted modified Liu estimator and the restricted modified ridge estimator,
based on prior information for the vector of parameters in a linear
regression model with linear restrictions. Furthermore, the performance of
the proposed estimators in the mean squared error matrix sense is derived and
compared. Finally, a numerical example and a Monte Carlo simulation are
given to illustrate some of the theoretical results.
Journal: Journal of Applied Statistics
Pages: 1447-1454
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505951
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1447-1454
Template-Type: ReDIF-Article 1.0
Author-Name: Tommaso Proietti
Author-X-Name-First: Tommaso
Author-X-Name-Last: Proietti
Title: Multivariate temporal disaggregation with cross-sectional constraints
Abstract:
Multivariate temporal disaggregation deals with the historical
reconstruction and nowcasting of economic variables subject to temporal
and contemporaneous aggregation constraints. The problem involves a system
of time series that are related not only by a dynamic model but also by
accounting constraints. The paper introduces two fundamental (and
realistic) models that implement the multivariate best linear unbiased
estimation approach that has potential application to the temporal
disaggregation of the national accounts series. The multivariate
regression model with random walk disturbances is most suitable to deal
with chain-linked volumes (as the nature of the national accounts
time series suggests); however, in this case the accounting constraints
are not binding and the discrepancy has to be modeled by either a
trend-stationary or an integrated process. The size of the discrepancy,
tiny compared with the other driving disturbances, prevents maximum-likelihood
estimation from being carried out, and the parameters have to be estimated
separately. The multivariate disaggregation with integrated random walk
disturbances is suitable for the national accounts aggregates expressed at
current prices, in which case the accounting constraints are binding.
Journal: Journal of Applied Statistics
Pages: 1455-1466
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505952
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505952
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1455-1466
Template-Type: ReDIF-Article 1.0
Author-Name: Shih-Chia Liu
Author-X-Name-First: Shih-Chia
Author-X-Name-Last: Liu
Author-Name: Kuo-Szu Chiang
Author-X-Name-First: Kuo-Szu
Author-X-Name-Last: Chiang
Author-Name: Cheng-Hsiang Lin
Author-X-Name-First: Cheng-Hsiang
Author-X-Name-Last: Lin
Author-Name: Ting-Chin Deng
Author-X-Name-First: Ting-Chin
Author-X-Name-Last: Deng
Title: Confidence interval procedures for proportions estimated by group testing with groups of unequal size adjusted for overdispersion
Abstract:
Group testing is a method of pooling a number of units together and
performing a single test on the resulting group. Group testing is an
appealing option when few individual units are thought to be infected and
the cost of the testing is non-negligible. Overdispersion is the
phenomenon of having greater variability than predicted by the random
component of the model; this is common in modeling the binomial
distribution for group testing. The purpose of this paper is to provide a
comparison of several established methods of constructing confidence
intervals after adjusting for overdispersion. We evaluate
each method in six different cases of group testing. A method based on the
score statistic with correction for skewness is recommended. We illustrate
the methods using two data sets, one from the detection of seed
transmission and the other from serological testing for malaria.
Journal: Journal of Applied Statistics
Pages: 1467-1482
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505953
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505953
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1467-1482
Template-Type: ReDIF-Article 1.0
Author-Name: D. Stogiannis
Author-X-Name-First: D.
Author-X-Name-Last: Stogiannis
Author-Name: C. Caroni
Author-X-Name-First: C.
Author-X-Name-Last: Caroni
Author-Name: C. E. Anagnostopoulos
Author-X-Name-First: C. E.
Author-X-Name-Last: Anagnostopoulos
Author-Name: I. K. Toumpoulis
Author-X-Name-First: I. K.
Author-X-Name-Last: Toumpoulis
Title: Comparing first hitting time and proportional hazards regression models
Abstract:
Cox's widely used semi-parametric proportional hazards (PH) regression
model places restrictions on the possible shapes of the hazard function.
Models based on the first hitting time (FHT) of a stochastic process are
among the alternatives and have the attractive feature of being based on a
model of the underlying process. We review and compare the PH model and an
FHT model based on a Wiener process which leads to an inverse Gaussian
(IG) regression model. This particular model can also represent a
“cured fraction” or long-term survivors. A case study of
survival after coronary artery bypass grafting is used to examine the
interpretation of the IG model, especially in relation to covariates that
affect both of its parameters.
Journal: Journal of Applied Statistics
Pages: 1483-1492
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505954
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1483-1492
Template-Type: ReDIF-Article 1.0
Author-Name: Zheng Su
Author-X-Name-First: Zheng
Author-X-Name-Last: Su
Title: A class of designs for Phase I cancer clinical trials combining Bayesian and likelihood approaches
Abstract:
The Bayesian continual reassessment method (CRM) and its likelihood
version (CRML) provide important tools for the design of Phase I cancer
clinical trials. However, a poorly chosen prior distribution in CRM may
lead to inferior performance of the method in the early stage of a trial,
whereas the maximum-likelihood estimate used in CRML may result in initial
high variability. These features of CRM and CRML served as the motivations
for the development of this new class of designs, which combines the
Bayesian and the likelihood approaches and has CRM and CRML as special
cases. Simulation studies on a leukaemia trial show that the proposed
class of designs significantly outperforms the traditional up-and-down
design.
Journal: Journal of Applied Statistics
Pages: 1493-1498
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505955
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505955
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1493-1498
Template-Type: ReDIF-Article 1.0
Author-Name: José Dias Curto
Author-X-Name-First: José Dias
Author-X-Name-Last: Curto
Author-Name: José Castro Pinto
Author-X-Name-First: José Castro
Author-X-Name-Last: Pinto
Title: The corrected VIF (CVIF)
Abstract:
In this paper, we propose a new corrected variance inflation factor (VIF)
measure to evaluate the impact of the correlation among the explanatory
variables in the variance of the ordinary least squares estimators. We
show that the real impact on variance can be overestimated by the
traditional VIF when the explanatory variables contain no redundant
information about the dependent variable and a corrected version of this
multicollinearity indicator becomes necessary.
Journal: Journal of Applied Statistics
Pages: 1499-1507
Issue: 7
Volume: 38
Year: 2011
Month: 6
X-DOI: 10.1080/02664763.2010.505956
File-URL: http://hdl.handle.net/10.1080/02664763.2010.505956
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1499-1507
Template-Type: ReDIF-Article 1.0
Author-Name: Jin-Guan Lin
Author-X-Name-First: Jin-Guan
Author-X-Name-Last: Lin
Author-Name: Li-Xing Zhu
Author-X-Name-First: Li-Xing
Author-X-Name-Last: Zhu
Author-Name: Chun-Zheng Cao
Author-X-Name-First: Chun-Zheng
Author-X-Name-Last: Cao
Author-Name: Yong Li
Author-X-Name-First: Yong
Author-X-Name-Last: Li
Title: Tests of heteroscedasticity and correlation in multivariate t regression models with AR and ARMA errors
Abstract:
Heteroscedasticity checking in regression analysis plays an important
role in modelling. It is of great interest when random errors are
correlated, including autocorrelated and partial autocorrelated errors. In
this paper, we consider multivariate t linear regression
models, and construct the score test for the case of AR(1) errors, and
ARMA(s,d) errors. The asymptotic properties, including
asymptotic chi-square and approximate powers under local alternatives of
the score tests, are studied. Based on modified profile likelihood, the
adjusted score test is also developed. The finite sample performance of
the tests is investigated through Monte Carlo simulations, and also the
tests are illustrated with two real data sets.
Journal: Journal of Applied Statistics
Pages: 1509-1531
Issue: 7
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.515301
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515301
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:7:p:1509-1531
Template-Type: ReDIF-Article 1.0
Author-Name: An Creemers
Author-X-Name-First: An
Author-X-Name-Last: Creemers
Author-Name: Marc Aerts
Author-X-Name-First: Marc
Author-X-Name-Last: Aerts
Author-Name: Niel Hens
Author-X-Name-First: Niel
Author-X-Name-Last: Hens
Author-Name: Ziv Shkedy
Author-X-Name-First: Ziv
Author-X-Name-Last: Shkedy
Author-Name: Frank De Smet
Author-X-Name-First: Frank
Author-X-Name-Last: De Smet
Author-Name: Philippe Beutels
Author-X-Name-First: Philippe
Author-X-Name-Last: Beutels
Title: Revealing age-specific past and future unrelated costs of pneumococcal infections by flexible generalized estimating equations
Abstract:
We aimed to study the excess health-care expenditures for persons with a
known positive isolate of Streptococcus pneumoniae. The
data set was compiled by linking the database of the largest Belgian
Sickness Fund with data obtained from laboratories reporting pneumococcal
isolates. We analyzed the age-specific per-patient cumulative costs over
time, using generalized estimating equations (GEEs). The mean structure
was described by fractional polynomials. The quasi-likelihood under the
independence model criterion was used to compare different correlation
structures. We show for all age groups that the health-care costs incurred
by diagnosed pneumococcal patients are significantly larger than those
incurred by undiagnosed matched persons. This is not only the case at the
time of diagnosis but also long before and after the time of diagnosis.
These findings can be informative for the current debate on unrelated
costs in health economic evaluation, and GEEs could be used to estimate
these costs for other diseases. Finally, these results can be used to
inform policy on the expected budget impact of preventing pneumococcal
infections.
Journal: Journal of Applied Statistics
Pages: 1533-1547
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.515302
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515302
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1533-1547
Template-Type: ReDIF-Article 1.0
Author-Name: Vijay Verma
Author-X-Name-First: Vijay
Author-X-Name-Last: Verma
Author-Name: Gianni Betti
Author-X-Name-First: Gianni
Author-X-Name-Last: Betti
Title: Taylor linearization sampling errors and design effects for poverty measures and other complex statistics
Abstract:
A systematic procedure for the derivation of linearized variables for the
estimation of sampling errors of complex nonlinear statistics involved in
the analysis of poverty and income inequality is developed. The linearized
variable extends the use of standard variance estimation formulae,
developed for linear statistics such as sample aggregates, to nonlinear
statistics. The context is that of cross-sectional samples of complex
design and reasonably large size, as typically used in population-based
surveys. Results of application of the procedure to a wide range of
poverty and inequality measures are presented. Standardized software for
this purpose has been developed and can be provided to interested users on
request. Procedures are provided for the estimation of the design effect
and its decomposition into the contribution of unequal sample weights and
of other design complexities such as clustering and stratification. The
consequence of treating a complex statistic as a simple ratio in
estimating its sampling error is also quantified. The second theme of the
paper is to compare the linearization approach with an alternative
approach based on the concept of replication, namely the Jackknife
repeated replication (JRR) method. The basis and application of the JRR
method are described, the exposition paralleling that of the linearization
method but in somewhat less detail. Based on data from an actual national
survey, estimates of standard errors and design effects from the two
methods are analysed and compared. The numerical results confirm that the
two alternative approaches generally give very similar results, though
notable differences can exist for certain statistics. Relative advantages
and limitations of the approaches are identified.
Journal: Journal of Applied Statistics
Pages: 1549-1576
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.515674
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1549-1576
Template-Type: ReDIF-Article 1.0
Author-Name: Keunbaik Lee
Author-X-Name-First: Keunbaik
Author-X-Name-Last: Lee
Author-Name: Sanggil Kang
Author-X-Name-First: Sanggil
Author-X-Name-Last: Kang
Author-Name: Xuefeng Liu
Author-X-Name-First: Xuefeng
Author-X-Name-Last: Liu
Author-Name: Daekwan Seo
Author-X-Name-First: Daekwan
Author-X-Name-Last: Seo
Title: Likelihood-based approach for analysis of longitudinal nominal data using marginalized random effects models
Abstract:
Likelihood-based marginalized models using random effects have become
popular for analyzing longitudinal categorical data. These models permit
direct interpretation of marginal mean parameters and characterize the
serial dependence of longitudinal outcomes using random effects [12,22].
In this paper, we propose a model that expands the use of previous models to
accommodate longitudinal nominal data. Random effects using a new
covariance matrix with a Kronecker product composition are used to explain
serial and categorical dependence. A quasi-Newton algorithm is developed
for estimation. The proposed methods are illustrated with a real data
set and compared with other standard methods.
Journal: Journal of Applied Statistics
Pages: 1577-1590
Issue: 8
Volume: 38
Year: 2011
Month: 7
X-DOI: 10.1080/02664763.2010.515675
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1577-1590
Template-Type: ReDIF-Article 1.0
Author-Name: Ming-Yuan Leon Li
Author-X-Name-First: Ming-Yuan Leon
Author-X-Name-Last: Li
Author-Name: Shang-En Shine Yu
Author-X-Name-First: Shang-En Shine
Author-X-Name-Last: Yu
Title: Do large firms overly use stock-based incentive compensation?
Abstract:
This study employs the panel threshold model to reexamine the
non-monotonic relationship between CEO stock-based compensation and firm
earnings across various firm-size conditions. The feasibility of the model
is tested using data for US non-financial firms from 1993 to 2005. Our
empirical results indicate that while a positive relationship between
CEO stock-based pay and earnings holds for small firms, CEO stock-based
compensation has a negative impact on earnings for large firms.
Further, our empirical results satisfactorily explain the longstanding
puzzle in earlier studies of whether CEO stock-based pay enhances
earnings.
Journal: Journal of Applied Statistics
Pages: 1591-1606
Issue: 8
Volume: 38
Year: 2011
Month: 7
X-DOI: 10.1080/02664763.2010.515676
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1591-1606
Template-Type: ReDIF-Article 1.0
Author-Name: Li-Chu Chien
Author-X-Name-First: Li-Chu
Author-X-Name-Last: Chien
Title: Diagnostic plots in beta-regression models
Abstract:
Two diagnostic plots for selecting explanatory variables are introduced
to assess the accuracy of a generalized beta-linear model. The added
variable plot is developed to examine the need for adding a new
explanatory variable to the model. The constructed variable plot is
developed to identify the nonlinearity of the explanatory variable in the
model. The two diagnostic procedures are also useful for detecting unusual
observations that may strongly affect the regression. Simulation studies and
analyses of two practical examples are conducted to illustrate the
performance of the proposed plots.
Journal: Journal of Applied Statistics
Pages: 1607-1622
Issue: 8
Volume: 38
Year: 2011
Month: 7
X-DOI: 10.1080/02664763.2010.515677
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515677
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1607-1622
Template-Type: ReDIF-Article 1.0
Author-Name: Liansheng Tang
Author-X-Name-First: Liansheng
Author-X-Name-Last: Tang
Author-Name: Ming Tan
Author-X-Name-First: Ming
Author-X-Name-Last: Tan
Author-Name: Xiao-Hua Zhou
Author-X-Name-First: Xiao-Hua
Author-X-Name-Last: Zhou
Title: A sequential conditional probability ratio test procedure for comparing diagnostic tests
Abstract:
In this paper, we derive sequential conditional probability ratio tests
to compare diagnostic tests without distributional assumptions on test
results. The test statistics in our method are nonparametric weighted
areas under the receiver-operating characteristic curves. With the new
method, the decision to stop the diagnostic trial early is unlikely to
be reversed should the trial continue to its planned end. The
conservatism of this approach, which maintains more conservative stopping
boundaries during the course of the trial, is especially appealing for
diagnostic trials, since the end point is not death. In addition, the
maximum sample size of our method is no greater than that of a
fixed-sample test with a similar power function. Simulation studies are performed to evaluate
the properties of the proposed sequential procedure. We illustrate the
method using data from a thoracic aorta imaging study.
Journal: Journal of Applied Statistics
Pages: 1623-1632
Issue: 8
Volume: 38
Year: 2011
Month: 7
X-DOI: 10.1080/02664763.2010.515678
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515678
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1623-1632
Template-Type: ReDIF-Article 1.0
Author-Name: Lucia Santana
Author-X-Name-First: Lucia
Author-X-Name-Last: Santana
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Title: Influence analysis in skew-Birnbaum--Saunders regression models and applications
Abstract:
In this paper, we propose a method to assess influence in
skew-Birnbaum--Saunders regression models, which are an extension based on
the skew-normal distribution of the usual Birnbaum--Saunders (BS)
regression model. An interesting characteristic that the new regression
model has is the capacity of predicting extreme percentiles, which is not
possible with the BS model. In addition, since the observed likelihood
function associated with the new regression model is more complex than
that from the usual model, we facilitate the parameter estimation using a
type-EM algorithm. Moreover, we employ influence diagnostic tools that
consider this algorithm. Finally, a numerical illustration includes a
brief simulation study and an analysis of real data to demonstrate the
proposed methodology.
Journal: Journal of Applied Statistics
Pages: 1633-1649
Issue: 8
Volume: 38
Year: 2011
Month: 7
X-DOI: 10.1080/02664763.2010.515679
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515679
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1633-1649
Template-Type: ReDIF-Article 1.0
Author-Name: Mitra Rahimzadeh
Author-X-Name-First: Mitra
Author-X-Name-Last: Rahimzadeh
Author-Name: Ebrahim Hajizadeh
Author-X-Name-First: Ebrahim
Author-X-Name-Last: Hajizadeh
Author-Name: Farzad Eskandari
Author-X-Name-First: Farzad
Author-X-Name-Last: Eskandari
Title: Non-mixture cure correlated frailty models in Bayesian approach
Abstract:
In this article, we develop a Bayesian approach for the estimation of two
cure correlated frailty models that extend the cure frailty
models introduced by Yin [34]. We used two different types of frailty
with a bivariate log-normal distribution instead of a gamma distribution. A
likelihood function was constructed based on a piecewise exponential
distribution function. The model parameters were estimated by the Markov
chain Monte Carlo method. The comparison of models is based on the Cox
correlated frailty model with log-normal distribution. A real data set of
bilateral corneal graft rejection was used to compare these models. The
results of this data, based on deviance information criteria, showed the
advantage of the proposed models.
Journal: Journal of Applied Statistics
Pages: 1651-1663
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.515966
File-URL: http://hdl.handle.net/10.1080/02664763.2010.515966
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1651-1663
Template-Type: ReDIF-Article 1.0
Author-Name: William J. Reed
Author-X-Name-First: William J.
Author-X-Name-Last: Reed
Title: A flexible parametric survival model which allows a bathtub-shaped hazard rate function
Abstract:
A new parametric (three-parameter) survival distribution, the
lognormal--power function distribution, with flexible
behaviour, is introduced. Its hazard rate function can be unimodal,
monotonically decreasing or bathtub-shaped. Special cases
include the lognormal distribution and the power function distribution,
with finite support. Regions of parameter space where the various forms of
the hazard-rate function prevail are established analytically. The
distribution lends itself readily to accelerated life regression
modelling. Applications to five data sets taken from the literature are
given. It is also shown how the distribution can behave like a Weibull
distribution (with negative aging) for certain parameter values.
Journal: Journal of Applied Statistics
Pages: 1665-1680
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.516388
File-URL: http://hdl.handle.net/10.1080/02664763.2010.516388
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1665-1680
Template-Type: ReDIF-Article 1.0
Author-Name: Yeliz Mert Kantar
Author-X-Name-First: Yeliz Mert
Author-X-Name-Last: Kantar
Author-Name: Ilhan Usta
Author-X-Name-First: Ilhan
Author-X-Name-Last: Usta
Author-Name: Şükrü Acıtaş
Author-X-Name-First: Şükrü
Author-X-Name-Last: Acıtaş
Title: A Monte Carlo simulation study on partially adaptive estimators of linear regression models
Abstract:
This paper presents a comprehensive comparison of well-known partially
adaptive estimators (PAEs) in terms of efficiency in estimating regression
parameters. The aim is to identify the best estimators of regression
parameters when error terms follow normal, Laplace, Student's
t, normal mixture, lognormal and gamma distributions, via
Monte Carlo simulation. The simulation results identify efficient
PAEs for symmetric leptokurtic and skewed
leptokurtic regression error data. Additionally, these estimators are also
compared in terms of regression applications. Regarding these
applications, using certain standard error estimators, it is shown that
PAEs can reduce the standard error of the slope parameter estimate
relative to ordinary least squares.
Journal: Journal of Applied Statistics
Pages: 1681-1699
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.516389
File-URL: http://hdl.handle.net/10.1080/02664763.2010.516389
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1681-1699
Template-Type: ReDIF-Article 1.0
Author-Name: L.H.A. Dal Bello
Author-X-Name-First: L.H.A. Dal
Author-X-Name-Last: Bello
Author-Name: A. F.C. Vieira
Author-X-Name-First: A. F.C.
Author-X-Name-Last: Vieira
Title: Optimization of a product performance using mixture experiments including process variables
Abstract:
This article presents a case study of a chemical compound used in the
delay mechanism to start a rocket engine. The compound consists of a
three-component mixture. Besides the component proportions, two process
variables are considered. The aim of the study is to determine the mixture
component proportions and the levels of the process variables that set the
expected delay time as close as possible to the target value and, at the
same time, minimize the width of the prediction interval for the response. A
linear regression model with normal responses was fitted. Through the
model developed, the optimal components proportions and the levels of the
process variables were determined. For the model selection, the use of the
backward method with an information criterion proved to be efficient in
the case under study.
Journal: Journal of Applied Statistics
Pages: 1701-1715
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.518370
File-URL: http://hdl.handle.net/10.1080/02664763.2010.518370
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1701-1715
Template-Type: ReDIF-Article 1.0
Author-Name: Afrânio M.C. Vieira
Author-X-Name-First: Afrânio M.C.
Author-X-Name-Last: Vieira
Author-Name: Roseli A. Leandro
Author-X-Name-First: Roseli A.
Author-X-Name-Last: Leandro
Author-Name: Clarice G.B. Demétrio
Author-X-Name-First: Clarice G.B.
Author-X-Name-Last: Demétrio
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Title: Double generalized linear model for tissue culture proportion data: a Bayesian perspective
Abstract:
Joint generalized linear models and double generalized linear models
(DGLMs) were designed to model outcomes for which the variability can be
explained using factors and/or covariates. When such factors operate, the
usual normal regression models, which inherently exhibit constant
variance, will under-represent variation in the data and hence may lead to
erroneous inferences. For count and proportion data, such noise factors
can generate a so-called overdispersion effect, and the use of binomial
and Poisson models underestimates the variability and, consequently,
incorrectly indicates significant effects. In this manuscript, we propose a
DGLM from a Bayesian perspective, focusing on the case of proportion data,
where the overdispersion can be modeled using a random effect that depends
on some noise factors. The joint posterior density function was sampled
using Markov chain Monte Carlo algorithms, allowing inferences over the
model parameters. An application to a data set on apple tissue culture is
presented, for which it is shown that the Bayesian approach is quite
feasible, even when limited prior information is available, thereby
generating valuable insight for the researcher about the experimental
results.
Journal: Journal of Applied Statistics
Pages: 1717-1731
Issue: 8
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529875
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529875
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1717-1731
Template-Type: ReDIF-Article 1.0
Author-Name: Chun-Shu Chen
Author-X-Name-First: Chun-Shu
Author-X-Name-Last: Chen
Author-Name: Hong-Ding Yang
Author-X-Name-First: Hong-Ding
Author-X-Name-Last: Yang
Title: A joint modeling approach for spatial earthquake risk variations
Abstract:
Modeling spatial patterns and processes to assess the spatial variations
of data over a study region is an important issue in many fields. In this
paper, we focus on investigating the spatial variations of earthquake
risks after a main shock. Although earthquake risks have been extensively
studied in the literature, to our knowledge there is no
suitable spatial model for assessing this problem. Therefore, we propose a
joint modeling approach based on spatial hierarchical Bayesian models and
spatial conditional autoregressive models to describe the spatial
variations in earthquake risks over the study region during two periods. A
family of stochastic algorithms based on a Markov chain Monte Carlo
technique is then performed for posterior computations. The probabilistic
issue for the changes of earthquake risks after a main shock is also
discussed. Finally, the proposed method is applied to the earthquake
records for Taiwan before and after the Chi-Chi earthquake.
Journal: Journal of Applied Statistics
Pages: 1733-1741
Issue: 8
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529883
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529883
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1733-1741
Template-Type: ReDIF-Article 1.0
Author-Name: Søren Feodor Nielsen
Author-X-Name-First: Søren Feodor
Author-X-Name-Last: Nielsen
Title: SAS for data analysis
Journal: Journal of Applied Statistics
Pages: 1743-1744
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664760903466805
File-URL: http://hdl.handle.net/10.1080/02664760903466805
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1743-1744
Template-Type: ReDIF-Article 1.0
Author-Name: Long Kang
Author-X-Name-First: Long
Author-X-Name-Last: Kang
Title: Time-series data analysis using EViews
Journal: Journal of Applied Statistics
Pages: 1744-1745
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664760903466813
File-URL: http://hdl.handle.net/10.1080/02664760903466813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1744-1745
Template-Type: ReDIF-Article 1.0
Author-Name: Kam Hamidieh
Author-X-Name-First: Kam
Author-X-Name-Last: Hamidieh
Title: Synthetic CDOs modelling, valuation and risk management
Journal: Journal of Applied Statistics
Pages: 1745-1746
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664760903520148
File-URL: http://hdl.handle.net/10.1080/02664760903520148
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1745-1746
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Title: Probability, Markov chains, queues, and simulation
Journal: Journal of Applied Statistics
Pages: 1746-1746
Issue: 8
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.484891
File-URL: http://hdl.handle.net/10.1080/02664763.2010.484891
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1746-1746
Template-Type: ReDIF-Article 1.0
Author-Name: Yinghui Wei
Author-X-Name-First: Yinghui
Author-X-Name-Last: Wei
Author-Name: Peter Neal
Author-X-Name-First: Peter
Author-X-Name-Last: Neal
Title: Statement of Withdrawal: Statistical analysis of an endemic disease from a capture--recapture experiment
Abstract:
There are a number of statistical techniques for analysing epidemic
outbreaks. However, many diseases are endemic within populations and the
analysis of such diseases is complicated by changing population
demography. Motivated by the spread of cowpox among rodent populations, a
combined mathematical model for population and disease dynamics is
introduced. An MCMC algorithm is then constructed to make statistical
inference for the model based on data being obtained from a
capture--recapture experiment. The statistical analysis is used to
identify the key elements in the spread of the cowpox virus.
Journal: Journal of Applied Statistics
Pages: 1747-1747
Issue: 8
Volume: 38
Year: 2011
Month: 1
X-DOI: 10.1080/02664763.2011.590298
File-URL: http://hdl.handle.net/10.1080/02664763.2011.590298
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:8:p:1747-1747
Template-Type: ReDIF-Article 1.0
Author-Name: Ashis SenGupta
Author-X-Name-First: Ashis
Author-X-Name-Last: SenGupta
Author-Name: Hon Keung Tony Ng
Author-X-Name-First: Hon Keung Tony
Author-X-Name-Last: Ng
Title: Nonparametric test for the homogeneity of the overall variability
Abstract:
In this paper, we propose a nonparametric test for homogeneity of overall
variabilities for two multi-dimensional populations. Comparisons between
the proposed nonparametric procedure and the asymptotic parametric
procedure and a permutation test based on standardized generalized
variances are made when the underlying populations are multivariate
normal. We also study the performance of these test procedures when the
underlying populations are non-normal. We observe that the nonparametric
procedure and the permutation test based on standardized generalized
variances are not as powerful as the asymptotic parametric test under
normality. However, they are reliable and powerful tests for comparing
overall variability under other multivariate distributions such as the
multivariate Cauchy, the multivariate Pareto and the multivariate
exponential distributions, even with small sample sizes. A Monte Carlo
simulation study is used to evaluate the performance of the proposed
procedures. An example from an educational study is used to illustrate the
proposed nonparametric test.
Journal: Journal of Applied Statistics
Pages: 1751-1768
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529876
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529876
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1751-1768
Template-Type: ReDIF-Article 1.0
Author-Name: Mahmoud Torabi
Author-X-Name-First: Mahmoud
Author-X-Name-Last: Torabi
Author-Name: Rhonda J. Rosychuk
Author-X-Name-First: Rhonda J.
Author-X-Name-Last: Rosychuk
Title: Spatio-temporal modelling using B-spline for disease mapping: analysis of childhood cancer trends
Abstract:
To examine childhood cancer diagnoses in the province of Alberta, Canada
during 1983--2004, we construct a generalized additive mixed model for the
analysis of geographic and temporal variability of cancer ratios. In this
model, spatially correlated random effects and temporal components are
adopted. The interaction between space and time is also accommodated.
Spatio-temporal models that use conditional autoregressive smoothing
across the spatial dimension and B-spline over the temporal dimension are
considered. We study the patterns of incidence ratios over time and
identify areas with consistently high ratio estimates as areas for
potential further investigation. We apply the method of penalized
quasi-likelihood to estimate the model parameters. We illustrate this
approach using a yearly data set of childhood cancer diagnoses in the
province of Alberta, Canada during 1983--2004.
Journal: Journal of Applied Statistics
Pages: 1769-1781
Issue: 9
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664763.2010.529877
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529877
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1769-1781
Template-Type: ReDIF-Article 1.0
Author-Name: O. Collignon
Author-X-Name-First: O.
Author-X-Name-Last: Collignon
Author-Name: J.-M. Monnez
Author-X-Name-First: J.-M.
Author-X-Name-Last: Monnez
Author-Name: P. Vallois
Author-X-Name-First: P.
Author-X-Name-Last: Vallois
Author-Name: F. Codreanu
Author-X-Name-First: F.
Author-X-Name-Last: Codreanu
Author-Name: J.-M. Renaudin
Author-X-Name-First: J.-M.
Author-X-Name-Last: Renaudin
Author-Name: G. Kanny
Author-X-Name-First: G.
Author-X-Name-Last: Kanny
Author-Name: M. Brulliard
Author-X-Name-First: M.
Author-X-Name-Last: Brulliard
Author-Name: B. E. Bihain
Author-X-Name-First: B. E.
Author-X-Name-Last: Bihain
Author-Name: S. Jacquenet
Author-X-Name-First: S.
Author-X-Name-Last: Jacquenet
Author-Name: D. Moneret-Vautrin
Author-X-Name-First: D.
Author-X-Name-Last: Moneret-Vautrin
Title: Discriminant analyses of peanut allergy severity scores
Abstract:
Peanut allergy is one of the most prevalent food allergies. The
possibility of a lethal accidental exposure and the persistence of the
disease make it a public health problem. Evaluating the intensity of
symptoms is accomplished with a double blind placebo-controlled food
challenge (DBPCFC), which scores the severity of reactions and measures
the dose of peanut that elicits the first reaction. Since DBPCFC can
result in life-threatening responses, we propose an alternate procedure
with the long-term goal of replacing invasive allergy tests. Discriminant
analyses of DBPCFC score, the eliciting dose and the first accidental
exposure score were performed in 76 allergic patients using 6 immunoassays
and 28 skin prick tests. A multiple factorial analysis was performed to
assign equal weights to both groups of variables, and predictive models
were built by cross-validation with linear discriminant analysis,
k-nearest neighbours, classification and regression
trees, penalized support vector machine, stepwise logistic regression and
AdaBoost methods. We developed an algorithm for simultaneously clustering
eliciting dose values and selecting discriminant variables. Our main
conclusion is that antibody measurements offer information on the allergy
severity, especially those directed against rAra-h1 and
rAra-h3. Further independent validation of these results
and the use of new predictors will help extend this study to clinical
practices.
Journal: Journal of Applied Statistics
Pages: 1783-1799
Issue: 9
Volume: 38
Year: 2011
Month: 8
X-DOI: 10.1080/02664763.2010.529878
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529878
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1783-1799
Template-Type: ReDIF-Article 1.0
Author-Name: Hannes Kazianka
Author-X-Name-First: Hannes
Author-X-Name-Last: Kazianka
Author-Name: Michael Mulyk
Author-X-Name-First: Michael
Author-X-Name-Last: Mulyk
Author-Name: Jürgen Pilz
Author-X-Name-First: Jürgen
Author-X-Name-Last: Pilz
Title: A Bayesian approach to estimating linear mixtures with unknown covariance structure
Abstract:
In this paper, we study a new Bayesian approach for the analysis of
linearly mixed structures. In particular, we consider the case of
hyperspectral images, which have to be decomposed into a collection of
distinct spectra, called endmembers, and a set of associated proportions
for every pixel in the scene. This problem, often referred to as spectral
unmixing, is usually considered on the basis of the linear mixing model
(LMM). In unsupervised approaches, the endmember signatures have to be
calculated by an endmember extraction algorithm, which generally relies on
the supposition that there are pure (unmixed) pixels contained in the
image. In practice, this assumption may not hold for highly mixed data and
consequently the extracted endmember spectra differ from the true ones. A
way out of this dilemma is to consider the problem under the normal
compositional model (NCM). Contrary to the LMM, the NCM treats the
endmembers as random Gaussian vectors and not as deterministic quantities.
Existing Bayesian approaches for estimating the proportions under the NCM
are restricted to the case that the covariance matrix of the Gaussian
endmembers is a multiple of the identity matrix. The self-evident
conclusion is that this model is not suitable when the variance differs
from one spectral channel to the other, which is a common phenomenon in
practice. In this paper, we first propose a Bayesian strategy for the
estimation of the mixing proportions under the assumption of varying
variances in the spectral bands. Then we generalize this model to handle
the case of a completely unknown covariance structure. For both
algorithms, we present Gibbs sampling strategies and compare their
performance with other, state of the art, unmixing routines on synthetic
as well as on real hyperspectral fluorescence spectroscopy data.
Journal: Journal of Applied Statistics
Pages: 1801-1817
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529879
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529879
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1801-1817
Template-Type: ReDIF-Article 1.0
Author-Name: Folefac D. Atem
Author-X-Name-First: Folefac D.
Author-X-Name-Last: Atem
Author-Name: Ravi K. Sharma
Author-X-Name-First: Ravi K.
Author-X-Name-Last: Sharma
Author-Name: Stewart J. Anderson
Author-X-Name-First: Stewart J.
Author-X-Name-Last: Anderson
Title: Fitting bivariate multilevel models to assess long-term changes in body mass index and cigarette smoking
Abstract:
Using data from the National Health Interview Survey from 1997 to 2006,
we present a multilevel analysis of change in body mass index (BMI) and
number of cigarettes smoked per day in the USA. Smoking and obesity are
the leading causes of preventable mortality and morbidity in the USA and
most parts of the developed world. A two-stage bivariate model of changes
in obesity and number of cigarettes smoked per day is proposed. At the
within-subject stage, an individual's BMI status and the number of
cigarettes smoked per day are jointly modeled as a function of an
individual growth trajectory plus a random error. At the between-subject
stage, the parameters of the individual growth trajectories are allowed to
vary as a function of differences between subjects with respect to
demographic and behavioral characteristics and with respect to the four
regions of the USA (Northeast, West, South and North central). Our
two-stage modeling techniques are more informative than standard
regression because they characterize both group-level (nomothetic) and
individual-level (idiographic) effects, yielding a more complete
understanding of the phenomena under study.
Journal: Journal of Applied Statistics
Pages: 1819-1831
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529880
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529880
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1819-1831
Template-Type: ReDIF-Article 1.0
Author-Name: Q. Li
Author-X-Name-First: Q.
Author-X-Name-Last: Li
Author-Name: G. Zheng
Author-X-Name-First: G.
Author-X-Name-Last: Zheng
Author-Name: R. Tiwari
Author-X-Name-First: R.
Author-X-Name-Last: Tiwari
Title: Analysis of ordered categorical data with score averaging: with applications to case-control genetic associations
Abstract:
The trend test is often used for the analysis of
2×K ordered categorical data, in which
K pre-specified increasing scores are used. There have
been discussions on how to assign these scores and the impact of the
outcomes on different scores. The scores are often assigned based on the
data-generating model. When this model is unknown, using the trend test is
not robust. We discuss the weighted average of a trend test over all
scientifically plausible choices of scores or models. This approach is
more computationally efficient than the commonly used robust test MAX when
K is large. Our discussion is for any ordered
2×K table, but simulation and applications to real
data are focused on case-control genetic association studies. Although
there is no single test optimal for all choices of scores, our numerical
results show that some score averaging tests can achieve the performance
of MAX.
Journal: Journal of Applied Statistics
Pages: 1833-1843
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529881
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529881
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1833-1843
Template-Type: ReDIF-Article 1.0
Author-Name: Anne C. Black
Author-X-Name-First: Anne C.
Author-X-Name-Last: Black
Author-Name: Ofer Harel
Author-X-Name-First: Ofer
Author-X-Name-Last: Harel
Author-Name: D. Betsy McCoach
Author-X-Name-First: D.
Author-X-Name-Last: Betsy McCoach
Title: Missing data techniques for multilevel data: implications of model misspecification
Abstract:
When modeling multilevel data, it is important to accurately represent
the interdependence of observations within clusters. Ignoring data
clustering may result in parameter misestimation. However, it is not well
established to what degree parameter estimates are affected by model
misspecification when applying missing data techniques (MDTs) to
incomplete multilevel data. We compare the performance of three MDTs with
incomplete hierarchical data. We consider the impact of imputation model
misspecification on the quality of parameter estimates by employing
multiple imputation under assumptions of a normal model (MI/NM) with
two-level cross-sectional data when values are missing at random on the
dependent variable at rates of 10%, 30%, and 50%. Five criteria are used
to compare estimates from MI/NM to estimates from MI assuming a linear
mixed model (MI/LMM) and maximum likelihood estimation to the same
incomplete data sets. With 10% missing data (MD), techniques performed
similarly for fixed-effects estimates, but variance components were biased
with MI/NM. Effects of model misspecification worsened at higher rates of
MD, with the hierarchical structure of the data markedly underrepresented
by biased variance component estimates. MI/LMM and maximum likelihood
provided generally accurate and unbiased parameter estimates but
performance was negatively affected by increased rates of MD.
Journal: Journal of Applied Statistics
Pages: 1845-1865
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529882
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529882
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1845-1865
Template-Type: ReDIF-Article 1.0
Author-Name: John J. Chen
Author-X-Name-First: John J.
Author-X-Name-Last: Chen
Author-Name: Guangxiang Zhang
Author-X-Name-First: Guangxiang
Author-X-Name-Last: Zhang
Author-Name: Chen Ji
Author-X-Name-First: Chen
Author-X-Name-Last: Ji
Author-Name: George F. Steinhardt
Author-X-Name-First: George F.
Author-X-Name-Last: Steinhardt
Title: Simple moment-based inferences of generalized concordance correlation
Abstract:
We proposed two simple moment-based procedures, one with (GCCC1) and one
without (GCCC2) normality assumptions, to generalize the inference of
the concordance correlation coefficient for the evaluation of agreement among
multiple observers for measurements on a continuous scale. A modified
Fisher's Z-transformation was adapted to further improve
the inference. We compared the proposed methods with
U-statistic-based inference approach. Simulation analysis
showed desirable statistical properties of the simplified approach GCCC1,
in terms of coverage probabilities and coverage balance, especially for
small samples. GCCC2, which is distribution-free, behaved comparably with
the U-statistic-based procedure, but had a more intuitive
and explicit variance estimator. The utility of these approaches was
illustrated using two clinical data examples.
Journal: Journal of Applied Statistics
Pages: 1867-1882
Issue: 9
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664763.2010.529884
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529884
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1867-1882
Template-Type: ReDIF-Article 1.0
Author-Name: Kathryn Bartimote-Aufflick
Author-X-Name-First: Kathryn
Author-X-Name-Last: Bartimote-Aufflick
Author-Name: Peter C. Thomson
Author-X-Name-First: Peter C.
Author-X-Name-Last: Thomson
Title: The analysis of ordinal time-series data via a transition (Markov) model
Abstract:
While standard techniques are available for the analysis of time-series
(longitudinal) data, and for ordinal (rating) data, not much is available
for the combination of the two, at least in a readily-usable form.
However, this data type is commonplace in the natural and health sciences
where repeated ratings are recorded on the same subject. To analyse these
data, this paper considers a transition (Markov) model where the rating of
a subject at one time depends explicitly on the observed rating at the
previous point of time by incorporating the previous rating as a predictor
variable. Complications arise in adequately handling the data at the first
observation (t=1), as there is no prior observation to
use as a predictor. To overcome this, the existence of a rating at time
t=0 is postulated; it is treated as
‘missing data’ and the expectation--maximisation algorithm is
used to accommodate it. The particular benefits of this method are shown
for shorter time series.
Journal: Journal of Applied Statistics
Pages: 1883-1897
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529885
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529885
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1883-1897
Template-Type: ReDIF-Article 1.0
Author-Name: Jie Chen
Author-X-Name-First: Jie
Author-X-Name-Last: Chen
Author-Name: Ayten Yiğiter
Author-X-Name-First: Ayten
Author-X-Name-Last: Yiğiter
Author-Name: Kuang-Chao Chang
Author-X-Name-First: Kuang-Chao
Author-X-Name-Last: Chang
Title: A Bayesian approach to inference about a change point model with application to DNA copy number experimental data
Abstract:
In this paper, we study the change-point inference problem motivated by
the genomic data that were collected for the purpose of monitoring DNA
copy number changes. DNA copy number changes or copy number variations
(CNVs) correspond to chromosomal aberrations and signify abnormality of a
cell. Cancer development or other related diseases are usually relevant to
DNA copy number changes on the genome. There is inherent random noise in
such data; therefore, an appropriate statistical model is needed to
identify statistically significant DNA copy number changes. This type of
statistical inference is evidently crucial in cancer research, clinical
diagnostic applications, and other related genomic research. For the
high-throughput genomic data resulting from DNA copy number experiments, a
mean and variance change point model (MVCM) for detecting the CNVs is
appropriate. We propose a Bayesian approach to study the MVCM for the case
of one change and propose to use
a sliding window to search for all CNVs on a given chromosome. We carry
out simulation studies to evaluate the estimate of the locus of the DNA
copy number change using the derived posterior probability. These
simulation results show that the approach is suitable for identifying copy
number changes. The approach is also illustrated on several chromosomes
from nine fibroblast cancer cell line data (array-based comparative
genomic hybridization data). All DNA copy number aberrations that have
been identified and verified by karyotyping are detected by our approach
on these cell lines.
Journal: Journal of Applied Statistics
Pages: 1899-1913
Issue: 9
Volume: 38
Year: 2011
Month: 9
X-DOI: 10.1080/02664763.2010.529886
File-URL: http://hdl.handle.net/10.1080/02664763.2010.529886
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1899-1913
Template-Type: ReDIF-Article 1.0
Author-Name: Paul H. Garthwaite
Author-X-Name-First: Paul H.
Author-X-Name-Last: Garthwaite
Author-Name: John R. Crawford
Author-X-Name-First: John R.
Author-X-Name-Last: Crawford
Title: Inference for a binomial proportion in the presence of ties
Abstract:
We suppose a case is to be compared with controls on the basis of a test
that gives a single discrete score. The score of the case may tie with the
scores of one or more controls. However, scores relate to an underlying
quantity of interest that is continuous and so an observed score can be
treated as the rounded value of an underlying continuous score. This makes
it reasonable to break ties. This paper addresses the problem of forming a
confidence interval for the proportion of controls that have a lower
underlying score than the case. In the absence of ties, this is the
standard task of making inferences about a binomial proportion and many
methods for forming confidence intervals have been proposed. We give a
general procedure to extend these methods to handle ties, under the
assumption that ties may be broken at random. Properties of the procedure
are given and an example examines its performance when it is used to
extend several methods. A real example shows that an estimated confidence
interval can be much too small if the uncertainty associated with ties is
not taken into account. Software implementing the procedure is freely
available.
Journal: Journal of Applied Statistics
Pages: 1915-1934
Issue: 9
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664763.2010.537649
File-URL: http://hdl.handle.net/10.1080/02664763.2010.537649
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1915-1934
Template-Type: ReDIF-Article 1.0
Author-Name: Anandamayee Majumdar
Author-X-Name-First: Anandamayee
Author-X-Name-Last: Majumdar
Author-Name: Corinna Gries
Author-X-Name-First: Corinna
Author-X-Name-Last: Gries
Author-Name: Jason Walker
Author-X-Name-First: Jason
Author-X-Name-Last: Walker
Title: A non-stationary spatial generalized linear mixed model approach for studying plant diversity
Abstract:
We analyze the multivariate spatial distribution of plant species
diversity, distributed across three ecologically distinct land uses, the
urban residential, urban non-residential, and desert. We model these data
using a spatial generalized linear mixed model. Here plant species counts
are assumed to be correlated within and among the spatial locations. We
implement this model across the Phoenix metropolis and surrounding desert.
Using a Bayesian approach, we utilized the Langevin--Hastings hybrid
algorithm. Under a generalization of a spatial log-Gaussian Cox model, the
log-intensities of the species count processes follow Gaussian
distributions. The purely spatial components corresponding to these
log-intensities are jointly modeled using a cross-convolution approach, in
order to depict a valid cross-correlation structure. We observe that this
approach yields non-stationarity of the model ensuing from different land
use types. We obtain predictions of various measures of plant diversity
including plant richness and the Shannon--Weiner diversity at observed
locations. We also obtain a prediction framework for plant preferences in
urban and desert plots.
Journal: Journal of Applied Statistics
Pages: 1935-1950
Issue: 9
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664763.2010.537650
File-URL: http://hdl.handle.net/10.1080/02664763.2010.537650
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1935-1950
Template-Type: ReDIF-Article 1.0
Author-Name: Tatiana B. Bordin
Author-X-Name-First: Tatiana B.
Author-X-Name-Last: Bordin
Author-Name: Hildete P. Pinheiro
Author-X-Name-First: Hildete P.
Author-X-Name-Last: Pinheiro
Author-Name: Aluísio Pinheiro
Author-X-Name-First: Aluísio
Author-X-Name-Last: Pinheiro
Title: Homogeneity tests among groups for microsatellite data
Abstract:
We propose a homogeneity test among groups based on a quadratic distance
measure. The underlying mutation process in the microsatellite loci is
studied using the stepwise mutation model. Asymptotic normality of the
test statistic is proved under very mild regularity conditions. Resampling
methods, such as jackknife, are used in the application to build
confidence intervals for the difference in allelic variation between and
within groups. The method is applied to real data to test whether there
are differences in the distribution of the repeated sequence among groups
defined by ethnicity and alcoholism index (ALDX1).
Journal: Journal of Applied Statistics
Pages: 1951-1962
Issue: 9
Volume: 38
Year: 2011
Month: 10
X-DOI: 10.1080/02664763.2010.537651
File-URL: http://hdl.handle.net/10.1080/02664763.2010.537651
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1951-1962
Template-Type: ReDIF-Article 1.0
Author-Name: Younan Chen
Author-X-Name-First: Younan
Author-X-Name-Last: Chen
Author-Name: Keying Ye
Author-X-Name-First: Keying
Author-X-Name-Last: Ye
Title: A Bayesian hierarchical approach to dual response surface modelling
Abstract:
In modern quality engineering, dual response surface methodology is a
powerful tool to model an industrial process by using both the mean and
the standard deviation of the measurements as the responses. The least
squares method in regression is often used to estimate the coefficients in
the mean and standard deviation models, and various decision criteria are
proposed by researchers to find the optimal conditions. Based on the
inherent hierarchical structure of the dual response problems, we propose
a Bayesian hierarchical approach to model dual response surfaces. Such an
approach is compared with two frequentist least squares methods by using
two real data sets and simulated data.
Journal: Journal of Applied Statistics
Pages: 1963-1975
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545106
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545106
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1963-1975
Template-Type: ReDIF-Article 1.0
Author-Name: Amjad D. Al-Nasser
Author-X-Name-First: Amjad D.
Author-X-Name-Last: Al-Nasser
Author-Name: Mohammad Y. Al-Rawwash
Author-X-Name-First: Mohammad Y.
Author-X-Name-Last: Al-Rawwash
Author-Name: Anas S. Alakhras
Author-X-Name-First: Anas S.
Author-X-Name-Last: Alakhras
Title: An approach to setting up a national customer satisfaction index: the Jordan case study
Abstract:
The aim of this paper was to develop a national customer satisfaction
index (CSI) in Jordan and to derive its theory using generalized maximum
entropy. During the course of this research, we conducted two different
surveys to complete the framework of this CSI. The first one is a pilot
study conducted based on a CSI basket in order to select the main factors
that comprise the Jordanian customer satisfaction index (JCSI). Based on
two different analyses, namely nonlinear principal component analysis and
factor analysis, the explained variances in the first and second
dimensions were 50.32% and 16.99%, respectively. Also, the Cronbach
coefficients α in the first and second dimensions were 0.923 and 0.521,
respectively. The results of this survey suggest the inclusion of
loyalty, complaint, expectation, image and service quality as the main CS
factors of our proposed model. The second study is a practical
implementation conducted on the Vocational Training Corporation in order
to evaluate the proposed JCSI. The results indicated that the suggested
components of the proposed model are significant and form a good fitted
model. We used the comparative fit index and the normed fit index as
goodness-of-fit measures to evaluate the effectiveness of our proposed
model. Both measures indicated that the proposed model is a promising one.
Journal: Journal of Applied Statistics
Pages: 1977-1993
Issue: 9
Volume: 38
Year: 2011
Month: 12
X-DOI: 10.1080/02664763.2010.545107
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545107
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1977-1993
Template-Type: ReDIF-Article 1.0
Author-Name: Firoozeh Rivaz
Author-X-Name-First: Firoozeh
Author-X-Name-Last: Rivaz
Author-Name: Mohsen Mohammadzadeh
Author-X-Name-First: Mohsen
Author-X-Name-Last: Mohammadzadeh
Author-Name: Majid Jafari Khaledi
Author-X-Name-First: Majid Jafari
Author-X-Name-Last: Khaledi
Title: Spatio-temporal modeling and prediction of CO concentrations in Tehran city
Abstract:
One of the most important agents responsible for high pollution in Tehran
is carbon monoxide. Prediction of carbon monoxide concentrations is of
immense help in protecting the inhabitants’ health. In this paper, motivated
by the statistical analysis of carbon monoxide using the empirical Bayes
approach, we deal with the issue of prior specification for the model
parameters. In fact, the hyperparameters (the parameters of the prior law)
are estimated based on a sampling-based method which depends only on the
specification of the marginal spatial and temporal correlation structures.
We compare the predictive performance of this approach with the type II
maximum likelihood method. Results indicate that the proposed procedure
performs better for this data set.
Journal: Journal of Applied Statistics
Pages: 1995-2007
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545108
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545108
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:1995-2007
Template-Type: ReDIF-Article 1.0
Author-Name: Karimollah Hajian-Tilaki
Author-X-Name-First: Karimollah
Author-X-Name-Last: Hajian-Tilaki
Author-Name: James A. Hanley
Author-X-Name-First: James A.
Author-X-Name-Last: Hanley
Author-Name: Vahid Nassiri
Author-X-Name-First: Vahid
Author-X-Name-Last: Nassiri
Title: An extension of parametric ROC analysis for calculating diagnostic accuracy when underlying distributions are mixture of Gaussian
Abstract:
The semiparametric LABROC approach of fitting the binormal model for
estimating AUC as a global index of accuracy has been justified (except
for bimodal forms), while for estimating a local index of accuracy such as
TPF, it may lead to bias when the data depart severely from binormality.
We extended parametric ROC analysis for quantitative data when one or both
pair members are mixture of Gaussian (MG) in particular for bimodal forms.
We analytically showed that AUC and TPF are a mixture of weighting
parameters of different components of AUCs and TPFs of a mixture of
underlying distributions. In a simulation study of six configurations of
MG distributions: {bimodal, normal} and {bimodal, bimodal} pairs, the
parameters of MG distributions were estimated using the EM algorithm. The
results showed that the estimated AUC from our proposed model was
essentially unbiased, and that the bias in the estimated TPF at a
clinically relevant range of FPF was roughly 0.01 for a sample size of
n=100/100. In practice, with severe departures from
binormality, we recommend an extension of the LABROC and software
development for future research to allow for each member of the pair of
distributions to be a mixture of Gaussian that is a more flexible
parametric form.
Journal: Journal of Applied Statistics
Pages: 2009-2022
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545109
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545109
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:2009-2022
Template-Type: ReDIF-Article 1.0
Author-Name: Pär Stockhammar
Author-X-Name-First: Pär
Author-X-Name-Last: Stockhammar
Author-Name: Lars-Erik Öller
Author-X-Name-First: Lars-Erik
Author-X-Name-Last: Öller
Title: On the probability distribution of economic growth
Abstract:
Three important and significantly heteroscedastic gross domestic product
series are studied. Omnipresent heteroscedasticity is removed and the
distributions of the series are then compared to normal, normal mixture
and normal--asymmetric Laplace (NAL) distributions. NAL represents a
skewed and leptokurtic distribution, which is in line with the Aghion and
Howitt [1] model for economic growth, based on Schumpeter's idea of
creative destruction. Statistical properties of the NAL distributions are
provided and it is shown that NAL fits the data better than the
alternatives.
Journal: Journal of Applied Statistics
Pages: 2023-2041
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545110
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545110
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:2023-2041
Template-Type: ReDIF-Article 1.0
Author-Name: Eylem Deniz
Author-X-Name-First: Eylem
Author-X-Name-Last: Deniz
Author-Name: Oguz Akbilgic
Author-X-Name-First: Oguz
Author-X-Name-Last: Akbilgic
Author-Name: J. Andrew Howe
Author-X-Name-First: J. Andrew
Author-X-Name-Last: Howe
Title: Model selection using information criteria under a new estimation method: least squares ratio
Abstract:
In this study, we evaluate several forms of both Akaike-type and
Information Complexity (ICOMP)-type information criteria, in the context
of selecting an optimal subset least squares ratio (LSR) regression model.
Our simulation studies are designed to mimic many characteristics present
in real data -- heavy tails, multicollinearity, redundant variables, and
completely unnecessary variables. Our findings are that LSR in conjunction
with one of the ICOMP criteria is very good at selecting the true model.
Finally, we apply these methods to the familiar body fat data set.
Journal: Journal of Applied Statistics
Pages: 2043-2050
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545111
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545111
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:2043-2050
Template-Type: ReDIF-Article 1.0
Author-Name: Nader Fallah
Author-X-Name-First: Nader
Author-X-Name-Last: Fallah
Author-Name: Arnold Mitnitski
Author-X-Name-First: Arnold
Author-X-Name-Last: Mitnitski
Author-Name: Kenneth Rockwood
Author-X-Name-First: Kenneth
Author-X-Name-Last: Rockwood
Title: Applying neural network Poisson regression to predict cognitive score changes
Abstract:
In this study, we combined a Poisson regression model with neural
networks (neural network Poisson regression) to relax the traditional
Poisson regression assumption of linearity of the Poisson mean as a
function of covariates, while including it as a special case. In four
simulated examples, we found that the neural network Poisson regression
improved the performance of simple Poisson regression if the Poisson mean
was nonlinearly related to covariates. We also illustrated the performance
of the model in predicting five-year changes in cognitive scores, in
association with age and education level; we found that the proposed
approach had superior accuracy to conventional linear Poisson regression.
As the interpretability of the neural networks is often difficult, its
combination with conventional and more readily interpretable approaches
under the generalized linear model can benefit applications in
biomedicine.
Journal: Journal of Applied Statistics
Pages: 2051-2062
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545112
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545112
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:2051-2062
Template-Type: ReDIF-Article 1.0
Author-Name: Bhamidipati Narasimha Murthy
Author-X-Name-First: Bhamidipati Narasimha
Author-X-Name-Last: Murthy
Author-Name: Ngianga-Bakwin Kandala
Author-X-Name-First: Ngianga-Bakwin
Author-X-Name-Last: Kandala
Author-Name: Radhakrishnan Ezhil
Author-X-Name-First: Radhakrishnan
Author-X-Name-Last: Ezhil
Author-Name: Prabhdeep Kaur
Author-X-Name-First: Prabhdeep
Author-X-Name-Last: Kaur
Author-Name: Ramachandra Sudha
Author-X-Name-First: Ramachandra
Author-X-Name-Last: Sudha
Title: Statistical issues in studying the relative importance of body mass index, waist circumference, waist hip ratio and waist stature ratio to predict type 2 diabetes
Abstract:
Systematic and appropriate statistical analysis is needed to examine the
relative performance of anthropometrical indices, viz. body mass index
(BMI), waist circumference (WC), waist hip ratio (WHR) and waist stature
ratio (WSR) for predicting type 2 diabetes. Using information on
socio-demographic, anthropometric and biochemical variables from 2148
males, we examined collinearity and non-linearity among the predictors
before studying the association between anthropometric indices and type 2
diabetes. The variable involved in collinearity was removed from further
analysis, and the relative importance of BMI, WC and WHR was examined by
logistic regression analysis. To avoid non-interpretable odds ratios
(ORs), cut point theory is used. Optimal cut points are derived and tested
for significance. The multivariable fractional polynomial (MFP) algorithm is
applied to address non-linearity. As expected, WSR and WC were collinear
with WHR and BMI. Since WSR was jointly as well as independently
collinear, it was dropped from further analysis. The OR for WHR could not
be interpreted meaningfully. Cut point theory was adopted. Deciles emerged
as the optimal cut point. MFP recognized non-linearity effects on the
outcome. Multicollinearity among the anthropometric indices was examined.
Optimal cut points were identified and used to study the relative ORs. On
the basis of the results of analysis, MFP is recommended to accommodate
non-linearity among the predictors. WHR is relatively more important and
significant than WC and BMI.
Journal: Journal of Applied Statistics
Pages: 2063-2070
Issue: 9
Volume: 38
Year: 2011
Month: 11
X-DOI: 10.1080/02664763.2010.545113
File-URL: http://hdl.handle.net/10.1080/02664763.2010.545113
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:38:y:2011:i:9:p:2063-2070
Template-Type: ReDIF-Article 1.0
Author-Name: Robert G. Aykroyd
Author-X-Name-First: Robert G.
Author-X-Name-Last: Aykroyd
Title: Editorial
Journal: Journal of Applied Statistics
Pages: 1-1
Issue: 1
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.643025
File-URL: http://hdl.handle.net/10.1080/02664763.2012.643025
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:1-1
Template-Type: ReDIF-Article 1.0
Author-Name: Charles G. Minard
Author-X-Name-First: Charles G.
Author-X-Name-Last: Minard
Author-Name: Wenyaw Chan
Author-X-Name-First: Wenyaw
Author-X-Name-Last: Chan
Author-Name: David W. Wetter
Author-X-Name-First: David W.
Author-X-Name-Last: Wetter
Author-Name: Carol J. Etzel
Author-X-Name-First: Carol J.
Author-X-Name-Last: Etzel
Title: Trends in smoking cessation: a Markov model approach
Abstract:
Intervention trials such as studies on smoking cessation may observe
multiple, discrete outcomes over time. When the outcome is binary,
participant observations may alternate between two states over the course
of the study. The generalized estimating equation (GEE) approach is
commonly used to analyze binary, longitudinal data in the context of
independent variables. However, the sequence of observations may be
assumed to follow a Markov chain with stationary transition probabilities
when observations are made at fixed time points. Participants favoring the
transition to one particular state over the other would be evidence of a
trend in the observations. Using a log-transformed trend parameter, the
determinants of a trend in a binary, longitudinal study may be evaluated
by maximizing the likelihood function. A new methodology is presented here
to test for the presence and determinants of a trend in binary,
longitudinal observations. Empirical studies are evaluated and comparisons
are made with the GEE approach. Practical application of the proposed
method is made to the data available from an intervention study on smoking
cessation.
Journal: Journal of Applied Statistics
Pages: 113-127
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578619
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:113-127
Template-Type: ReDIF-Article 1.0
Author-Name: Claudia Angelini
Author-X-Name-First: Claudia
Author-X-Name-Last: Angelini
Author-Name: Daniela De Canditiis
Author-X-Name-First: Daniela
Author-X-Name-Last: De Canditiis
Author-Name: Marianna Pensky
Author-X-Name-First: Marianna
Author-X-Name-Last: Pensky
Title: Clustering time-course microarray data using functional Bayesian infinite mixture model
Abstract:
This paper presents a new clustering approach, based on a Bayesian
infinite mixture model, specifically designed for time-course microarray
data. The problem is to group together genes which have
“similar” expression profiles, given the set of noisy
measurements of their expression levels over a specific time interval. In
order to capture temporal variations of each curve, a non-parametric
regression approach is used. Each expression profile is expanded over a
set of basis functions and the sets of coefficients of each curve are
subsequently modeled through a Bayesian infinite mixture of Gaussian
distributions. The task of finding clusters of genes with
similar expression profiles is thus reduced to the problem of grouping
together genes whose coefficients are sampled from the same distribution
in the mixture. A Dirichlet process prior is naturally employed in such
kinds of models, since it allows one to deal automatically with the
uncertainty about the number of clusters. The posterior inference is
carried out by a split and merge MCMC sampling scheme
which integrates out parameters of the component distributions and updates
only the latent vector of the cluster membership. The final configuration
is obtained via the maximum a posteriori estimator. The
performance of the method is studied using synthetic and real microarray
data and is compared with the performances of competitive techniques.
Journal: Journal of Applied Statistics
Pages: 129-149
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578620
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:129-149
Template-Type: ReDIF-Article 1.0
Author-Name: Guoyi Zhang
Author-X-Name-First: Guoyi
Author-X-Name-Last: Zhang
Author-Name: Yan Lu
Author-X-Name-First: Yan
Author-X-Name-Last: Lu
Title: Bias-corrected random forests in regression
Abstract:
It is well known that random forests reduce the variance of the
regression predictors compared to a single tree, while leaving the bias
unchanged. In many situations, the dominating component in the risk turns
out to be the squared bias, which leads to the necessity of bias
correction. In this paper, random forests are used to estimate the
regression function. Five different methods for estimating bias are
proposed and discussed. Simulated and real data are used to study the
performance of these methods. The proposed methods prove effective in
reducing bias in the regression context.
Journal: Journal of Applied Statistics
Pages: 151-160
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578621
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:151-160
Template-Type: ReDIF-Article 1.0
Author-Name: Faisal M. Zahid
Author-X-Name-First: Faisal M.
Author-X-Name-Last: Zahid
Author-Name: Shahla Ramzan
Author-X-Name-First: Shahla
Author-X-Name-Last: Ramzan
Title: Ordinal ridge regression with categorical predictors
Abstract:
In multi-category response models, categories are often ordered. In the
case of ordinal response models, the usual likelihood approach becomes
unstable with ill-conditioned predictor space or when the number of
parameters to be estimated is large relative to the sample size. The
likelihood estimates do not exist when the number of observations is less
than the number of parameters. The same problem arises if the constraint
on the order of the intercept values is not met during the iterative
procedure.
Proportional odds models (POMs) are most commonly used for ordinal
responses. In this paper, penalized likelihood with quadratic penalty is
used to address these issues with a special focus on POMs. To avoid large
differences between two parameter values corresponding to the consecutive
categories of an ordinal predictor, the differences between the parameters
of two adjacent categories should be penalized. The considered
penalized-likelihood function penalizes the parameter estimates or
differences between the parameter estimates according to the type of
predictors. Mean-squared error for parameter estimates, deviance of fitted
probabilities and prediction error for ridge regression are compared with
usual likelihood estimates in a simulation study and an application.
Journal: Journal of Applied Statistics
Pages: 161-171
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578622
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578622
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:161-171
Template-Type: ReDIF-Article 1.0
Author-Name: Miao-Yu Tsai
Author-X-Name-First: Miao-Yu
Author-X-Name-Last: Tsai
Title: Assessing inter- and intra-agreement for dependent binary data: a Bayesian hierarchical correlation approach
Abstract:
Agreement measures are designed to assess consistency between different
instruments rating measurements of interest. When the individual responses
are correlated with a multilevel structure of nestings and clusters,
traditional approaches are not readily available to estimate the inter-
and intra-agreement for such complex multilevel settings. Our research
stems from conformity evaluation between optometric devices with
measurements on both eyes, equality tests of agreement in high myopic
status between monozygous twins and dizygous twins, and assessment of
reliability for different pathologists in dysplasia. In this paper, we
focus on applying a Bayesian hierarchical correlation model incorporating
adjustment for explanatory variables and nesting correlation structures to
assess the inter- and intra-agreement through correlations of random
effects for various sources. This Bayesian generalized linear
mixed-effects model (GLMM) is further compared with the approximate
intra-class correlation coefficients and kappa measures by the traditional
Cohen’s kappa statistic and the generalized estimating equations
(GEE) approach. The results of comparison studies reveal that the Bayesian
GLMM provides a reliable and stable procedure in estimating inter- and
intra-agreement simultaneously after adjusting for covariates and
correlation structures, in marked contrast to Cohen’s kappa and the
GEE approach.
Journal: Journal of Applied Statistics
Pages: 173-187
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578623
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:173-187
Template-Type: ReDIF-Article 1.0
Author-Name: R. M. Jacques
Author-X-Name-First: R. M.
Author-X-Name-Last: Jacques
Author-Name: N. R. J. Fieller
Author-X-Name-First: N. R. J.
Author-X-Name-Last: Fieller
Author-Name: E. K. Ainscow
Author-X-Name-First: E. K.
Author-X-Name-Last: Ainscow
Title: A classification updating procedure motivated by high-content screening data
Abstract:
The current paradigm for the identification of candidate drugs within the
pharmaceutical industry typically involves the use of high-throughput
screens. High-content screening (HCS) is the term given to the process of
using an imaging platform to screen large numbers of compounds for some
desirable biological activity. Classification methods have important
applications in HCS experiments, where they are used to predict which
compounds have the potential to be developed into new drugs. In this
paper, a new classification method is proposed for batches of compounds
where the rule is updated sequentially using information from the
classification of previous batches. This methodology accounts for the
possibility that the training data are not a representative sample of the
test data and that the underlying group distributions may change as new
compounds are analysed. This technique is illustrated on an example data
set using linear discriminant analysis, k-nearest
neighbour and random forest classifiers. Random forests are shown to be
superior to the other classifiers and are further improved by the
additional updating algorithm in terms of an increase in the number of
true positives as well as a decrease in the number of false positives.
Journal: Journal of Applied Statistics
Pages: 189-198
Issue: 1
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.580335
File-URL: http://hdl.handle.net/10.1080/02664763.2011.580335
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:189-198
Template-Type: ReDIF-Article 1.0
Author-Name: P. Congdon
Author-X-Name-First: P.
Author-X-Name-Last: Congdon
Author-Name: C. D. Lloyd
Author-X-Name-First: C. D.
Author-X-Name-Last: Lloyd
Title: A spatial random-effects model for interzone flows: commuting in Northern Ireland
Abstract:
Government policy on employment, transport, and housing often depends on
reliable information about spatial variation in commuting flows across a
region. Simple commuting rates summarising inter-area flows may not
provide a full perspective on the underlying levels of commuting
attractivity of different areas (as destinations), or the varying
dependence of different areas (as origins) on outside employment. Areas
also vary in the degree of commuting self-containment, as expressed in
intra-area flows. This paper uses a spatial random-effects model to
develop indices of attractivity, extra-dependence, and self-containment
using a latent factor method. The methodology allows consideration of the
degree to which different explanatory influences (e.g. socioeconomic
structure, characteristics of road networks, employment density) affect
these aspects of commuting. The particular application is to commuting
flows in Northern Ireland, using 139 zones that aggregate smaller areas
(wards), so avoiding undue sparsity in the flow matrix. The analysis
involves Bayesian estimation, with the outputs comprising full densities
for extra-dependence, and attractivity scores and scores for intra-area
containment of zones. Spatial patterning in these aspects of commuting is
allowed for in the model used. One key pattern is the difference in latent
effect estimates for urban (in particular, Belfast) and rural areas
reflecting variable job opportunities in these areas.
Journal: Journal of Applied Statistics
Pages: 199-213
Issue: 1
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.580336
File-URL: http://hdl.handle.net/10.1080/02664763.2011.580336
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:199-213
Template-Type: ReDIF-Article 1.0
Author-Name: Wanbo Lu
Author-X-Name-First: Wanbo
Author-X-Name-Last: Lu
Author-Name: Daimin Shi
Author-X-Name-First: Daimin
Author-X-Name-Last: Shi
Title: A new compounding life distribution: the Weibull--Poisson distribution
Abstract:
In this paper, a new compounding distribution, named the Weibull--Poisson
distribution, is introduced. The shape of the failure rate function of the
new compounding distribution is flexible: it can be decreasing, increasing,
upside-down bathtub-shaped or unimodal. A comprehensive mathematical
treatment of the proposed distribution and expressions of its density,
cumulative distribution function, survival function, failure rate
function, the kth raw moment and quantiles are provided.
A maximum likelihood method using the EM algorithm is developed for parameter
estimation. Asymptotic properties of the maximum likelihood estimates are
discussed, and intensive simulation studies are conducted for evaluating
the performance of parameter estimation. The use of the proposed
distribution is illustrated with examples.
Journal: Journal of Applied Statistics
Pages: 21-38
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.575126
File-URL: http://hdl.handle.net/10.1080/02664763.2011.575126
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:21-38
Template-Type: ReDIF-Article 1.0
Author-Name: Jan C. Schuller
Author-X-Name-First: Jan C.
Author-X-Name-Last: Schuller
Title: The malicious host: a minimax solution of the Monty Hall problem
Abstract:
The classic solution of the Monty Hall problem tacitly assumes that,
after the candidate has made his/her first choice, the host always allows
the candidate to switch doors after showing him/her a losing door not
initially chosen by the candidate. In view of actual TV shows, it seems
more credible to assume that the host may or may not allow
switching. Under this assumption, possible strategies for the candidate
are discussed, with respect to a minimax solution of the problem. In
conclusion, the classic solution does not necessarily provide good
guidance for a candidate on a game show. It is argued that the
popularity of the problem is due to its incompleteness.
Journal: Journal of Applied Statistics
Pages: 215-221
Issue: 1
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.580337
File-URL: http://hdl.handle.net/10.1080/02664763.2011.580337
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:215-221
Template-Type: ReDIF-Article 1.0
Author-Name: Paula Brito
Author-X-Name-First: Paula
Author-X-Name-Last: Brito
Author-Name: A. Pedro Duarte Silva
Author-X-Name-First: A. Pedro
Author-X-Name-Last: Duarte Silva
Title: Modelling interval data with Normal and Skew-Normal distributions
Abstract:
A parametric modelling for interval data is proposed, assuming a
multivariate Normal or Skew-Normal distribution for the midpoints and
log-ranges of the interval variables. The intrinsic nature of the interval
variables leads to special structures of the variance--covariance matrix,
which is represented by five different possible configurations. Maximum
likelihood estimation for both models under all considered configurations
is studied. The proposed modelling is then considered in the context of
analysis of variance and multivariate analysis of variance testing. To
assess the behaviour of the proposed methodology, a simulation study is
performed. The results show that, for medium or large sample sizes, tests
have good power and their true significance level approaches nominal
levels when the constraints assumed for the model are respected; however,
for small sample sizes, significance levels close to the nominal ones
cannot be guaranteed. Applications to Chinese meteorological data in three
different regions and to credit card usage variables for different card
designations illustrate
the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 3-20
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.575125
File-URL: http://hdl.handle.net/10.1080/02664763.2011.575125
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:3-20
Template-Type: ReDIF-Article 1.0
Author-Name: Sammy Zahran
Author-X-Name-First: Sammy
Author-X-Name-Last: Zahran
Author-Name: Michael A. Long
Author-X-Name-First: Michael A.
Author-X-Name-Last: Long
Author-Name: Kenneth J. Berry
Author-X-Name-First: Kenneth J.
Author-X-Name-Last: Berry
Title: Measures of predictor sensitivity for order-insensitive partitioning of multiple correlation
Abstract:
Lindeman et al. [12] provide a unique solution to the
relative importance of correlated predictors in multiple regression by
averaging squared semi-partial correlations obtained for each predictor
across all p! orderings. In this paper, we propose a
series of predictor sensitivity statistics that complement the variance
decomposition procedure advanced by Lindeman et al. [12].
First, we detail the logic of averaging over orderings as a technique of
variance partitioning. Second, we assess predictors by conditional
dominance analysis, a qualitative procedure designed to overcome defects
in the Lindeman et al. [12] variance decomposition
solution. Third, we introduce a suite of indices to assess the sensitivity
of a predictor to model specification, advancing a series of
sensitivity-adjusted contribution statistics that allow for more definite
quantification of predictor relevance. Fourth, we describe the analytic
efficiency of our proposed technique against the Budescu conditional
dominance solution to the uneven contribution of predictors across all
p! orderings.
Journal: Journal of Applied Statistics
Pages: 39-51
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578614
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:39-51
Template-Type: ReDIF-Article 1.0
Author-Name: Giancarlo Diana
Author-X-Name-First: Giancarlo
Author-X-Name-Last: Diana
Author-Name: Pier Francesco Perri
Author-X-Name-First: Pier Francesco
Author-X-Name-Last: Perri
Title: A calibration-based approach to sensitive data: a simulation study
Abstract:
In this paper, we discuss the use of auxiliary information to estimate
the population mean of a sensitive variable when data are perturbed by
means of three scrambled response devices, namely the additive, the
multiplicative and the mixed model. Emphasis is given to the calibration
approach, and the behavior of different estimators is investigated through
simulated and real data. It is shown that the use of auxiliary information
can considerably improve the efficiency of the estimates without
jeopardizing respondent privacy.
Journal: Journal of Applied Statistics
Pages: 53-65
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578615
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578615
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:53-65
Template-Type: ReDIF-Article 1.0
Author-Name: Eugene Demidenko
Author-X-Name-First: Eugene
Author-X-Name-Last: Demidenko
Title: Confidence intervals and bands for the binormal ROC curve revisited
Abstract:
Two types of confidence intervals (CIs) and confidence bands (CBs) for
the receiver operating characteristic (ROC) curve are studied: pointwise
CIs and simultaneous CBs. An optimized version of the pointwise CI with
the shortest width is developed. A new ellipse-envelope simultaneous CB
for the ROC curve is suggested as an adaptation of the
Working--Hotelling-type CB implemented in a paper by Ma and Hall (1993).
Statistical simulations show that our ellipse-envelope CB covers the true
ROC curve with a probability close to nominal while the coverage
probability of the Ma and Hall CB is significantly smaller. Simulations
also show that our CI for the area under the ROC curve is close to nominal
while the coverage probability of the CI suggested by Hanley and McNeil
(1982) uniformly overestimates the nominal value. Two examples illustrate
our simultaneous ROC bands: radiation dose estimation from time to
vomiting and discrimination of breast cancer from benign abnormalities
using electrical impedance measurements.
Journal: Journal of Applied Statistics
Pages: 67-79
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578616
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578616
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:67-79
Template-Type: ReDIF-Article 1.0
Author-Name: Giovanni Masala
Author-X-Name-First: Giovanni
Author-X-Name-Last: Masala
Title: Earthquakes occurrences estimation through a parametric semi-Markov approach
Abstract:
The prediction of earthquake occurrences in seismic
areas is a challenging problem in seismology and earthquake engineering.
Indeed, the prevention and the quantification of possible damage caused
by destructive earthquakes are directly linked to this kind of prediction.
In our paper, we adopt a parametric semi-Markov approach. This model
assumes that a sequence of earthquakes is seen as a Markov process and,
in addition, it permits taking into consideration the more realistic
assumption of dependence of events in space and time. The elapsed
time between two consecutive events is modeled by a general Weibull
distribution. We then determine the transition probabilities and the
so-called crossing-state probabilities. We conclude with a Monte
Carlo simulation, and the model is validated on a large database
containing real data.
Journal: Journal of Applied Statistics
Pages: 81-96
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578617
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578617
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:81-96
Template-Type: ReDIF-Article 1.0
Author-Name: Rabindra Nath Das
Author-X-Name-First: Rabindra Nath
Author-X-Name-Last: Das
Author-Name: Jeong-Soo Park
Author-X-Name-First: Jeong-Soo
Author-X-Name-Last: Park
Title: Discrepancy in regression estimates between log-normal and gamma: some case studies
Abstract:
In regression models with multiplicative error, estimation is often based
on either the log-normal or the gamma model. It is well known that the
gamma model with constant coefficient of variation and the log-normal
model with constant variance give almost the same analysis. This article
focuses on the discrepancies of the regression estimates between the two
models based on real examples. It shows that even though the variance
or the coefficient of variation remains constant, the regression estimates
may differ between the two models. It also shows that for the
same positive data set, the variance is constant under
the log-normal model but non-constant under the gamma model. For this data
set, the regression estimates are completely different
between the two models. In the process, it explains the causes of
discrepancies between the two models.
Journal: Journal of Applied Statistics
Pages: 97-111
Issue: 1
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2011.578618
File-URL: http://hdl.handle.net/10.1080/02664763.2011.578618
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:1:p:97-111
Template-Type: ReDIF-Article 1.0
Author-Name: Linus Schiöler
Author-X-Name-First: Linus
Author-X-Name-Last: Schiöler
Author-Name: Marianne Frisén
Author-X-Name-First: Marianne
Author-X-Name-Last: Frisén
Title: Multivariate outbreak detection
Abstract:
Online monitoring is needed to detect outbreaks of diseases such as
influenza. Surveillance is also needed for other kinds of outbreaks, in
the sense of an increasing expected value after a constant period.
Information on spatial location or other variables might be available and
may be utilized. We adapted a robust method for outbreak detection to a
multivariate case. The relation between the times of the onsets of the
outbreaks at different locations (or some other variable) was used to
determine the sufficient statistic for surveillance. The derived
maximum-likelihood estimator of the outbreak regression was
semi-parametric in the sense that the baseline and the slope were
non-parametric while the distribution belonged to the one-parameter
exponential family. The estimator was used in a generalized-likelihood
ratio surveillance method. The method was evaluated with respect to
robustness and efficiency in a simulation study and applied to spatial
data for detection of influenza outbreaks in Sweden.
Journal: Journal of Applied Statistics
Pages: 223-242
Issue: 2
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.584522
File-URL: http://hdl.handle.net/10.1080/02664763.2011.584522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:223-242
Template-Type: ReDIF-Article 1.0
Author-Name: Jörg Drechsler
Author-X-Name-First: Jörg
Author-X-Name-Last: Drechsler
Title: New data dissemination approaches in old Europe -- synthetic datasets for a German establishment survey
Abstract:
Disseminating microdata to the public that provide a high level of data
utility, while at the same time guaranteeing the confidentiality of the
survey respondent is a difficult task. Generating multiply imputed
synthetic datasets is an innovative statistical disclosure limitation
technique with the potential of enabling the data disseminating agency to
achieve this twofold goal. So far, the approach has been successfully
implemented only for a limited number of datasets in the U.S. In this
paper, we present the first successful implementation outside the U.S.:
the generation of partially synthetic datasets for an establishment panel
survey at the German Institute for Employment Research. We describe the
whole evolution of the project: from the early discussions concerning
variables at risk to the final synthesis. We also present our disclosure
risk evaluations and provide some first results on the data utility of the
generated datasets. A variance-inflated imputation model is introduced
that incorporates additional variability in the model for records that are
not sufficiently protected by the standard synthesis.
Journal: Journal of Applied Statistics
Pages: 243-265
Issue: 2
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.584523
File-URL: http://hdl.handle.net/10.1080/02664763.2011.584523
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:243-265
Template-Type: ReDIF-Article 1.0
Author-Name: Yu-Mei Chang
Author-X-Name-First: Yu-Mei
Author-X-Name-Last: Chang
Author-Name: Chun-Shu Chen
Author-X-Name-First: Chun-Shu
Author-X-Name-Last: Chen
Author-Name: Pao-Sheng Shen
Author-X-Name-First: Pao-Sheng
Author-X-Name-Last: Shen
Title: A jackknife-based versatile test for two-sample problems with right-censored data
Abstract:
For testing the equality of two survival functions, the weighted logrank
test and the weighted Kaplan--Meier test are the two most widely used
methods. Each of these tests has advantages and drawbacks against
various alternatives, and we cannot specify in advance the possible
types of survival differences. Hence, how to choose a single test or
combine a number of competitive tests for indicating the diversities of
two survival functions without suffering a substantial loss in power is an
important issue. Instead of directly using a particular test which
generally performs well in some situations and poorly in others, we
further consider a class of tests indexed by a weight parameter for
testing the equality of two survival functions in this paper. A delete-1
jackknife method is implemented for selecting weights such that the
variance of the test is minimized. Some numerical experiments are
performed under various alternatives for illustrating the superiority of
the proposed method. Finally, the proposed testing procedure is applied to
two real-data examples as well.
Journal: Journal of Applied Statistics
Pages: 267-277
Issue: 2
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.584524
File-URL: http://hdl.handle.net/10.1080/02664763.2011.584524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:267-277
Template-Type: ReDIF-Article 1.0
Author-Name: Kiyotaka Iki
Author-X-Name-First: Kiyotaka
Author-X-Name-Last: Iki
Author-Name: Kouji Tahata
Author-X-Name-First: Kouji
Author-X-Name-Last: Tahata
Author-Name: Sadao Tomizawa
Author-X-Name-First: Sadao
Author-X-Name-Last: Tomizawa
Title: Measure of departure from marginal homogeneity using marginal odds for multi-way tables with ordered categories
Abstract:
For square contingency tables with ordered categories, this paper
proposes a measure to represent the degree of departure from the marginal
homogeneity model. It is expressed as the weighted sum of the
power-divergence or Patil--Taillie diversity index, and is a function of
marginal log odds ratios. The measure represents the degree of departure
from the equality of the log odds that the row variable is
i or below instead of i+1 or above and
the log odds that the column variable is i or below
instead of i+1 or above for every i. The
measure is also extended to multi-way tables. Examples are given.
Journal: Journal of Applied Statistics
Pages: 279-295
Issue: 2
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.586682
File-URL: http://hdl.handle.net/10.1080/02664763.2011.586682
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:279-295
Template-Type: ReDIF-Article 1.0
Author-Name: Julie McIntyre
Author-X-Name-First: Julie
Author-X-Name-Last: McIntyre
Author-Name: Ronald P. Barry
Author-X-Name-First: Ronald P.
Author-X-Name-Last: Barry
Title: Bivariate deconvolution with SIMEX: an application to mapping Alaska earthquake density
Abstract:
Constructing spatial density maps of seismic events, such as earthquake
hypocentres, is complicated by the fact that events are not located
precisely. In this paper, we present a method for estimating density maps
from event locations that are measured with error. The estimator is based
on the simulation--extrapolation method of estimation and is appropriate
for location errors that are either homoscedastic or heteroscedastic. A
simulation study shows that the estimator outperforms the standard
estimator of density that ignores location errors in the data, even when
location errors are spatially dependent. We apply our method to construct
an estimated density map of earthquake hypocenters using data from the
Alaska earthquake catalogue.
Journal: Journal of Applied Statistics
Pages: 297-308
Issue: 2
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.586683
File-URL: http://hdl.handle.net/10.1080/02664763.2011.586683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:297-308
Template-Type: ReDIF-Article 1.0
Author-Name: Y. L. Lio
Author-X-Name-First: Y. L.
Author-X-Name-Last: Lio
Author-Name: Tzong-Ru Tsai
Author-X-Name-First: Tzong-Ru
Author-X-Name-Last: Tsai
Title: Estimation of δ=P(X>Y) for Burr XII distribution based on the progressively first failure-censored samples
Abstract:
Let X and Y have two-parameter Burr XII
distributions. The maximum-likelihood estimator of
δ=P(X>Y) is
studied under the progressively first failure-censored samples. Three
confidence intervals of δ are constructed by using an asymptotic
distribution of the maximum-likelihood estimator of δ and two
bootstrapping procedures, respectively. Some computational results from
intensive simulations are presented. An illustrative example is provided
to demonstrate the application of the proposed method.
Journal: Journal of Applied Statistics
Pages: 309-322
Issue: 2
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2011.586684
File-URL: http://hdl.handle.net/10.1080/02664763.2011.586684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:309-322
Template-Type: ReDIF-Article 1.0
Author-Name: Petros E. Maravelakis
Author-X-Name-First: Petros E.
Author-X-Name-Last: Maravelakis
Title: Measurement error effect on the CUSUM control chart
Abstract:
The performance of the cumulative sum (CUSUM) control chart for the mean
when measurement error exists is investigated. It is shown that the CUSUM
chart is greatly affected by the measurement error. A similar result holds
for the case of the CUSUM chart for the mean with linearly increasing
variance. In this paper, we consider multiple measurements to reduce the
effect of measurement error on the chart's performance. Finally, a
comparison of the CUSUM and EWMA charts is presented and certain
recommendations are given.
Journal: Journal of Applied Statistics
Pages: 323-336
Issue: 2
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2011.590188
File-URL: http://hdl.handle.net/10.1080/02664763.2011.590188
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:323-336
Template-Type: ReDIF-Article 1.0
Author-Name: Helle Sørensen
Author-X-Name-First: Helle
Author-X-Name-Last: Sørensen
Author-Name: Anders Tolver
Author-X-Name-First: Anders
Author-X-Name-Last: Tolver
Author-Name: Maj Halling Thomsen
Author-X-Name-First: Maj Halling
Author-X-Name-Last: Thomsen
Author-Name: Pia Haubro Andersen
Author-X-Name-First: Pia Haubro
Author-X-Name-Last: Andersen
Title: Quantification of symmetry for functional data with application to equine lameness classification
Abstract:
This paper presents a study on symmetry of repeated bi-phased data
signals, in particular, on quantification of the deviation between the two
parts of the signal. Three symmetry scores are defined using functional
data techniques such as smoothing and registration. One score is related
to the L2-distance between the two parts of
the signal, whereas the other two are constructed to specifically measure
differences in amplitude and phase. Moreover, symmetry scores based on
functional principal component analysis (PCA) are examined. The scores are
applied to acceleration signals from a study on equine gait. The scores
turn out to be highly associated with lameness, and their applicability
for lameness quantification and detection is investigated. Four
classification approaches turn out to give similar results. The scores
describing amplitude and phase variation turn out to outperform the PCA
scores when it comes to the classification of lameness.
Journal: Journal of Applied Statistics
Pages: 337-360
Issue: 2
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2011.590189
File-URL: http://hdl.handle.net/10.1080/02664763.2011.590189
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:337-360
Template-Type: ReDIF-Article 1.0
Author-Name: Terence C. Mills
Author-X-Name-First: Terence C.
Author-X-Name-Last: Mills
Title: Semi-parametric modelling of temperature records
Abstract:
A range of instrumental and proxy temperature records are examined
semi-parametrically, using empirical densities and quantile
autoregressions containing a unit root, to assess the extent of
non-stationarity and the presence of global warming trends. Only the
instrumental records covering the last century and a half show any
evidence of non-stationarity, but the trend behaviour of these series
remains elusive.
Journal: Journal of Applied Statistics
Pages: 361-383
Issue: 2
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2011.590190
File-URL: http://hdl.handle.net/10.1080/02664763.2011.590190
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:361-383
Template-Type: ReDIF-Article 1.0
Author-Name: Caiya Zhang
Author-X-Name-First: Caiya
Author-X-Name-Last: Zhang
Author-Name: Yanbiao Xiang
Author-X-Name-First: Yanbiao
Author-X-Name-Last: Xiang
Author-Name: Xinmei Shen
Author-X-Name-First: Xinmei
Author-X-Name-Last: Shen
Title: Some multivariate goodness-of-fit tests based on data depth
Abstract:
Based on data depth, three types of nonparametric goodness-of-fit tests
for multivariate distribution are proposed in this paper. They are
Pearson’s chi-square test, tests based on EDF and tests based on
spacings, respectively. The Anderson--Darling (AD) test and the Greenwood
test for bivariate normal distribution and uniform distribution are
simulated. The results of simulation show that these two tests have low
type I error rates and become more efficient with the increase in sample
size. The AD-type test is more powerful than the Greenwood-type test.
Journal: Journal of Applied Statistics
Pages: 385-397
Issue: 2
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2011.594033
File-URL: http://hdl.handle.net/10.1080/02664763.2011.594033
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:385-397
Template-Type: ReDIF-Article 1.0
Author-Name: Michael R. Crager
Author-X-Name-First: Michael R.
Author-X-Name-Last: Crager
Title: Generalizing the standardized hazard ratio to multivariate proportional hazards regression, with an application to clinical genomic studies
Abstract:
The standardized hazard ratio for univariate proportional hazards
regression is generalized as a scalar to multivariate proportional hazards
regression. Estimators of the standardized log hazard ratio are developed,
with corrections for bias and for regression to the mean in
high-dimensional analyses. Tests of point and interval null hypotheses and
confidence intervals are constructed. Cohort sampling study designs,
commonly used in prospective--retrospective clinical genomic studies, are
accommodated.
Journal: Journal of Applied Statistics
Pages: 399-417
Issue: 2
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2011.594034
File-URL: http://hdl.handle.net/10.1080/02664763.2011.594034
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:399-417
Template-Type: ReDIF-Article 1.0
Author-Name: Marcus B. Perry
Author-X-Name-First: Marcus B.
Author-X-Name-Last: Perry
Author-Name: Joseph J. Pignatiello
Author-X-Name-First: Joseph J.
Author-X-Name-Last: Pignatiello
Title: Identifying the time of change in the mean of a two-stage nested process
Abstract:
Statistical process control charts are used to distinguish between common
cause and special cause sources of variability. Once a control chart
signals, a search to find the special cause should be initiated. If
process analysts had knowledge of the change point, the search to find the
special cause could be easily facilitated. Relevant literature contains an
array of solutions to the change-point problem; however, these solutions
are most appropriate when the samples are assumed to be independent.
Unfortunately, the assumption of independence is often violated in
practice. This work considers one such case of non-independence that
frequently occurs in practice as a result of multi-stage sampling. Due to
its commonality in practice, we assume a two-stage nested random model as
the underlying process model and derive and evaluate a maximum-likelihood
estimator for the change point in the fixed-effects component of this
model. The estimator is applied to electron microscopy data obtained
following a genuine control chart signal and from a real machining process
where the important quality characteristic is the size of the surface
grains produced by the machining operation. We conduct a simulation study
to compare relative performances between the proposed change-point
estimator and a commonly used alternative developed under the assumption
of independent observations. The results suggest that both estimators are
approximately unbiased; however, the proposed estimator yields smaller
variance. The implication is that the proposed estimator is more precise,
and thus, the quality of the estimator is improved relative to the
alternative.
Journal: Journal of Applied Statistics
Pages: 419-433
Issue: 2
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2011.594035
File-URL: http://hdl.handle.net/10.1080/02664763.2011.594035
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:419-433
Template-Type: ReDIF-Article 1.0
Author-Name: Stephanie N. Dixon
Author-X-Name-First: Stephanie N.
Author-X-Name-Last: Dixon
Author-Name: Gerarda A. Darlington
Author-X-Name-First: Gerarda A.
Author-X-Name-Last: Darlington
Author-Name: Victoria Edge
Author-X-Name-First: Victoria
Author-X-Name-Last: Edge
Title: Applying a marginalized frailty model to competing risks
Abstract:
The marginalized frailty model is often used for the analysis of
correlated times in survival data. When only two correlated times are
analyzed, this model is often referred to as the Clayton--Oakes model
[7,22]. With time-to-event data, there may exist multiple end points
(competing risks) suggesting that an analysis focusing on all available
outcomes is of interest. The purpose of this work is to extend the single
risk marginalized frailty model to the multiple risk setting via
cause-specific hazards (CSH). The methods herein make use of the
marginalized frailty model described by Pipper and Martinussen [24]. As
such, this work uses the martingale theory to develop a likelihood based
on estimating equations and observed histories. The proposed multivariate
CSH model yields marginal regression parameter estimates while
accommodating the clustering of outcomes. The multivariate CSH model can
be fitted using a data augmentation algorithm described by Lunn and McNeil
[21] or by fitting a series of single risk models for each of the
competing risks. An example of the application of the multivariate CSH
model is provided through the analysis of a family-based follow-up study
of breast cancer with death in absence of breast cancer as a competing
risk.
Journal: Journal of Applied Statistics
Pages: 435-443
Issue: 2
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.595399
File-URL: http://hdl.handle.net/10.1080/02664763.2011.595399
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:435-443
Template-Type: ReDIF-Article 1.0
Author-Name: Inna Chervoneva
Author-X-Name-First: Inna
Author-X-Name-Last: Chervoneva
Author-Name: Tingting Zhan
Author-X-Name-First: Tingting
Author-X-Name-Last: Zhan
Author-Name: Boris Iglewicz
Author-X-Name-First: Boris
Author-X-Name-Last: Iglewicz
Author-Name: Walter W. Hauck
Author-X-Name-First: Walter W.
Author-X-Name-Last: Hauck
Author-Name: David E. Birk
Author-X-Name-First: David E.
Author-X-Name-Last: Birk
Title: Two-stage hierarchical modeling for analysis of subpopulations in conditional distributions
Abstract:
In this work, we develop the modeling and estimation approach for the
analysis of cross-sectional clustered data with multimodal conditional
distributions, where the main interest is in analysis of subpopulations.
It is proposed to model such data in a hierarchical model with conditional
distributions viewed as finite mixtures of normal components. With a large
number of observations in the lowest level clusters, a two-stage
estimation approach is used. In the first stage, the normal mixture
parameters in each lowest level cluster are estimated using robust
methods. Robust alternatives to the maximum-likelihood (ML) estimation are
used to provide stable results even for data with conditional
distributions such that their components may not quite meet normality
assumptions. Then the lowest level cluster-specific means and standard
deviations are modeled in a mixed effects model in the second stage. A
small simulation study was conducted to compare performance of finite
normal mixture population parameter estimates based on robust and ML
estimation in stage 1. The proposed modeling approach is illustrated
through the analysis of mouse tendon fibril diameter data. The results
address genotype differences between corresponding components in the
mixtures and demonstrate the advantages of robust estimation in stage 1.
Journal: Journal of Applied Statistics
Pages: 445-460
Issue: 2
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.596193
File-URL: http://hdl.handle.net/10.1080/02664763.2011.596193
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:2:p:445-460
Template-Type: ReDIF-Article 1.0
Author-Name: M. Tariqul Hasan
Author-X-Name-First: M. Tariqul
Author-X-Name-Last: Hasan
Author-Name: Gary Sneddon
Author-X-Name-First: Gary
Author-X-Name-Last: Sneddon
Author-Name: Renjun Ma
Author-X-Name-First: Renjun
Author-X-Name-Last: Ma
Title: Regression analysis of zero-inflated time-series counts: application to air pollution related emergency room visit data
Abstract:
Time-series count data with excessive zeros frequently occur in
environmental, medical and biological studies. These data have been
traditionally handled by conditional and marginal modeling approaches
separately in the literature. The conditional modeling approaches are
computationally much simpler, whereas marginal modeling approaches can
link the overall mean with covariates directly. In this paper, we propose
new models that can have conditional and marginal modeling interpretations
for zero-inflated time-series counts using compound Poisson distributed
random effects. We also develop a computationally efficient estimation
method for our models using a quasi-likelihood approach. The proposed
method is illustrated with an application to air pollution-related
emergency room visits. We also evaluate the performance of our method
through simulation studies.
Journal: Journal of Applied Statistics
Pages: 467-476
Issue: 3
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.595778
File-URL: http://hdl.handle.net/10.1080/02664763.2011.595778
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:467-476
Template-Type: ReDIF-Article 1.0
Author-Name: Steven B. Caudill
Author-X-Name-First: Steven B.
Author-X-Name-Last: Caudill
Author-Name: James E. Long
Author-X-Name-First: James E.
Author-X-Name-Last: Long
Author-Name: Franklin G. Mixon
Author-X-Name-First: Franklin G.
Author-X-Name-Last: Mixon
Title: Female athletic participation and income: evidence from a latent class model
Abstract:
This paper introduces and applies an EM algorithm for the
maximum-likelihood estimation of a latent class version of the
grouped-data regression model. This new model is applied to examine the
effects of college athletic participation of females on incomes. No
evidence for an “athlete” effect in the case of females has
been found in previous work by Long and Caudill [12], Henderson
et al. [10], and Caudill and Long [5]. Our study is the
first to find evidence of a lower wage for female athletes. This effect is
present in a regime characterizing 42% of the sample. Further analysis
indicates that female athletes in many otherwise low-paying jobs actually
get paid less than non-athletes.
Journal: Journal of Applied Statistics
Pages: 477-488
Issue: 3
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.596194
File-URL: http://hdl.handle.net/10.1080/02664763.2011.596194
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:477-488
Template-Type: ReDIF-Article 1.0
Author-Name: Jie Song
Author-X-Name-First: Jie
Author-X-Name-Last: Song
Author-Name: Herman W. Raadsma
Author-X-Name-First: Herman W.
Author-X-Name-Last: Raadsma
Author-Name: Peter C. Thomson
Author-X-Name-First: Peter C.
Author-X-Name-Last: Thomson
Title: Evaluation of false discovery rate and power via sample size in microarray studies
Abstract:
Microarray studies are now common for human, agricultural plant and
animal studies. False discovery rate (FDR) is widely used in the analysis
of large-scale microarray data to account for problems associated with
multiple testing. A well-designed microarray study should have adequate
statistical power to detect the differentially expressed (DE) genes, while
keeping the FDR acceptably low. In this paper, we used a mixture model of
expression responses involving DE genes and non-DE genes to analyse
theoretical FDR and power for simple scenarios where it is assumed that
each gene has equal error variance and the gene effects are independent. A
simulation study was used to evaluate the empirical FDR and power for more
complex scenarios with unequal error variance and gene dependence. Based
on this approach, we present a general guide for sample size requirement
at the experimental design stage for prospective microarray studies. This
paper presents an approach that explicitly connects the sample size with
FDR and power. While the methods have been developed in the context of
one-sample microarray studies, they are readily applicable to two-sample
studies and could be adapted to multiple-sample studies.
Journal: Journal of Applied Statistics
Pages: 489-500
Issue: 3
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.602054
File-URL: http://hdl.handle.net/10.1080/02664763.2011.602054
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:489-500
Template-Type: ReDIF-Article 1.0
Author-Name: Alessandro Barbiero
Author-X-Name-First: Alessandro
Author-X-Name-Last: Barbiero
Title: Interval estimators for reliability: the bivariate normal case
Abstract:
This paper proposes procedures to provide confidence intervals (CIs) for
reliability in stress--strength models, considering the particular case of
a bivariate normal set-up. The suggested CIs are obtained by employing
either asymptotic variances of maximum-likelihood estimators or a
bootstrap procedure. The coverage and the accuracy of these intervals are
empirically checked through a simulation study and compared with those of
another proposal in the literature. An application to real data is
provided.
Journal: Journal of Applied Statistics
Pages: 501-512
Issue: 3
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.602055
File-URL: http://hdl.handle.net/10.1080/02664763.2011.602055
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:501-512
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph Kang
Author-X-Name-First: Joseph
Author-X-Name-Last: Kang
Author-Name: Xiaogang Su
Author-X-Name-First: Xiaogang
Author-X-Name-Last: Su
Author-Name: Brian Hitsman
Author-X-Name-First: Brian
Author-X-Name-Last: Hitsman
Author-Name: Kiang Liu
Author-X-Name-First: Kiang
Author-X-Name-Last: Liu
Author-Name: Donald Lloyd-Jones
Author-X-Name-First: Donald
Author-X-Name-Last: Lloyd-Jones
Title: Tree-structured analysis of treatment effects with large observational data
Abstract:
Treatment effect in an observational study of relatively large scale can
be described as a mixture of effects among subgroups. In particular,
analysis for estimating the treatment effect at the level of an entire
sample potentially involves not only differential effects across subgroups
of the entire study cohort, but also differential propensities --
probabilities of receiving treatment given study subjects’
pretreatment history. Such complex heterogeneity is of great research
interest because the analysis of treatment effects can substantially
depend on the hidden data structure for effect sizes and propensities. To
uncover the unseen data structure, we propose a likelihood-based
regression tree method which we call marginal tree (MT). The MT method is
aimed at a simultaneous assessment of differential effects and propensity
scores so that both become homogeneous within each terminal node of the
resultant tree structure. We assess simulation performances of the MT
method by comparing it with other existing tree methods and illustrate its
use with a simulated data set, where the objective is to assess the
effects of dieting behavior on subsequent emotional distress among
adolescent girls.
Journal: Journal of Applied Statistics
Pages: 513-529
Issue: 3
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2011.602056
File-URL: http://hdl.handle.net/10.1080/02664763.2011.602056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:513-529
Template-Type: ReDIF-Article 1.0
Author-Name: Victor H. Lachos
Author-X-Name-First: Victor H.
Author-X-Name-Last: Lachos
Author-Name: Celso R.B. Cabral
Author-X-Name-First: Celso R.B.
Author-X-Name-Last: Cabral
Author-Name: Carlos A. Abanto-Valle
Author-X-Name-First: Carlos A.
Author-X-Name-Last: Abanto-Valle
Title: A non-iterative sampling Bayesian method for linear mixed models with normal independent distributions
Abstract:
In this paper, we utilize normal/independent (NI) distributions as a tool
for robust modeling of linear mixed models (LMM) under a Bayesian
paradigm. The purpose is to develop a non-iterative sampling method to
obtain i.i.d. samples approximately from the observed posterior
distribution by combining the inverse Bayes formulae, sampling/importance
resampling and posterior mode estimates from the expectation maximization
algorithm to LMMs with NI distributions, as suggested by Tan et
al. [33]. The proposed algorithm provides a novel alternative to
perfect sampling and eliminates the convergence problems of Markov chain
Monte Carlo methods. In order to examine the robust aspects of the NI
class, against outlying and influential observations, we present a
Bayesian case deletion influence diagnostics based on the
Kullback--Leibler divergence. Further, some discussions on model selection
criteria are given. The new methodologies are illustrated through a real
data set, demonstrating the usefulness of the proposed approach.
Journal: Journal of Applied Statistics
Pages: 531-549
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.603292
File-URL: http://hdl.handle.net/10.1080/02664763.2011.603292
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:531-549
Template-Type: ReDIF-Article 1.0
Author-Name: Célia Nunes
Author-X-Name-First: Célia
Author-X-Name-Last: Nunes
Author-Name: Dário Ferreira
Author-X-Name-First: Dário
Author-X-Name-Last: Ferreira
Author-Name: Sandra S. Ferreira
Author-X-Name-First: Sandra S.
Author-X-Name-Last: Ferreira
Author-Name: João T. Mexia
Author-X-Name-First: João T.
Author-X-Name-Last: Mexia
Title: F-tests with a rare pathology
Abstract:
ANOVA is routinely used to compare pathologies. Nevertheless, in many
situations, the sample dimensions may not be known when planning the
study. This is especially relevant when one of the pathologies is rare.
Thus, the sample size for that pathology or for all pathologies must be
considered as random. Sample selection for the non-rare pathologies may be
carried out to increase the balance of the model. This leads to
F-tests with random non-centrality parameters and random
degrees of freedom for the errors. The distribution of such test
statistics is obtained.
Journal: Journal of Applied Statistics
Pages: 551-561
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.603293
File-URL: http://hdl.handle.net/10.1080/02664763.2011.603293
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:551-561
Template-Type: ReDIF-Article 1.0
Author-Name: Reza Drikvandi
Author-X-Name-First: Reza
Author-X-Name-Last: Drikvandi
Author-Name: Ahmad Khodadadi
Author-X-Name-First: Ahmad
Author-X-Name-Last: Khodadadi
Author-Name: Geert Verbeke
Author-X-Name-First: Geert
Author-X-Name-Last: Verbeke
Title: Testing variance components in balanced linear growth curve models
Abstract:
It is well known that the testing of zero variance components is a
non-standard problem since the null hypothesis is on the boundary of the
parameter space. The usual asymptotic chi-square distribution of the
likelihood ratio and score statistics under the null does not necessarily
hold because of this null hypothesis. To circumvent this difficulty in
balanced linear growth curve models, we introduce an appropriate test
statistic and suggest a permutation procedure to approximate its
finite-sample distribution. The proposed test alleviates the necessity of
any distributional assumptions for the random effects and errors and can
easily be applied for testing multiple variance components. Our simulation
studies show that the proposed test has Type I error rate close to
the nominal level. The power of the proposed test is also compared with
the likelihood ratio test in the simulations. An application on data from
an orthodontic study is presented and discussed.
Journal: Journal of Applied Statistics
Pages: 563-572
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.603294
File-URL: http://hdl.handle.net/10.1080/02664763.2011.603294
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:563-572
Template-Type: ReDIF-Article 1.0
Author-Name: Richard W. Miller
Author-X-Name-First: Richard W.
Author-X-Name-Last: Miller
Title: Quad folding: a simple idea for the subjective property characterization of large sample sets
Abstract:
Quad analysis has proven useful for characterizing subjective properties,
primarily properties such as carpet hand and body in the textile industry.
In essence, it provides an efficient method for conducting paired
comparisons, the preferred method for quantifying such properties. An
extension to quad analysis, quad folding of one quad design into another,
is likewise an efficient method to rank order the subjective properties of
larger data sets. A rank ordering of 62 carpets by their body is provided
as an example of folding six groups of eight carpets (two replicated) into
one another.
Journal: Journal of Applied Statistics
Pages: 573-580
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.604307
File-URL: http://hdl.handle.net/10.1080/02664763.2011.604307
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:573-580
Template-Type: ReDIF-Article 1.0
Author-Name: Christian H. Weiß
Author-X-Name-First: Christian H.
Author-X-Name-Last: Weiß
Title: Fully observed INAR(1) processes
Abstract:
The innovations of an INAR(1) process (integer-valued
autoregressive) are usually assumed to
be unobservable. There are, however, situations in practice where the
innovations can also be uncovered, i.e. where we are concerned with a
fully observed INAR(1)
process. We analyze stochastic properties of such a fully
observed INAR(1) process and explore the relation between the INAR(1)
model and certain metapopulation models. We show how the additional
knowledge about the innovations can be used for parameter estimation, for
model diagnostics, and for forecasting. Our findings are illustrated with
two real-data examples.
Journal: Journal of Applied Statistics
Pages: 581-598
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.604308
File-URL: http://hdl.handle.net/10.1080/02664763.2011.604308
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:581-598
Template-Type: ReDIF-Article 1.0
Author-Name: Hsiu-Wen Chen
Author-X-Name-First: Hsiu-Wen
Author-X-Name-Last: Chen
Author-Name: Weng Kee Wong
Author-X-Name-First: Weng Kee
Author-X-Name-Last: Wong
Author-Name: Hongquan Xu
Author-X-Name-First: Hongquan
Author-X-Name-Last: Xu
Title: An augmented approach to the desirability function
Abstract:
The desirability function is widely used in the engineering field to
tackle the problem of optimizing multiple responses simultaneously. This
approach does not account for the variability in the predicted responses
and minimizing this variability to have narrower prediction intervals is
desirable. We propose to add this capability in the desirability function
and also incorporate the relative importance of optimizing the multiple
responses and minimizing the variances of the predicted responses at the
same time. We show the benefits of our augmented approach using two
real data sets by comparing our solutions with those obtained from the
desirability approach. In particular, it is shown that our approach offers
greater flexibility and the solutions can reduce the variances of all the
predicted responses resulting in narrower prediction intervals.
Journal: Journal of Applied Statistics
Pages: 599-613
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.605437
File-URL: http://hdl.handle.net/10.1080/02664763.2011.605437
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:599-613
Template-Type: ReDIF-Article 1.0
Author-Name: Miguel Angel Uribe-Opazo
Author-X-Name-First: Miguel Angel
Author-X-Name-Last: Uribe-Opazo
Author-Name: Joelmir André Borssoi
Author-X-Name-First: Joelmir André
Author-X-Name-Last: Borssoi
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Title: Influence diagnostics in Gaussian spatial linear models
Abstract:
Spatial linear models have been applied in numerous fields such as
agriculture, geoscience and environmental sciences, among many others.
Spatial dependence structure modelling, using a geostatistical approach,
is an indispensable tool to estimate the parameters that define this
structure. However, this estimation may be greatly affected by the
presence of atypical observations in the sampled data. The purpose of this
paper is to use diagnostic techniques to assess the sensitivity of the
maximum-likelihood estimators, covariance functions and linear predictor
to small perturbations in the data and/or the spatial linear model
assumptions. The methodology is illustrated with two real data sets. The
results allowed us to conclude that the presence of atypical values in the
sample data has a strong influence on thematic maps, changing the spatial
dependence structure.
Journal: Journal of Applied Statistics
Pages: 615-630
Issue: 3
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2011.607802
File-URL: http://hdl.handle.net/10.1080/02664763.2011.607802
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:615-630
Template-Type: ReDIF-Article 1.0
Author-Name: Miran A. Jaffa
Author-X-Name-First: Miran A.
Author-X-Name-Last: Jaffa
Author-Name: Ayad A. Jaffa
Author-X-Name-First: Ayad A.
Author-X-Name-Last: Jaffa
Author-Name: Stuart R. Lipsitz
Author-X-Name-First: Stuart R.
Author-X-Name-Last: Lipsitz
Title: Slope estimation of covariates that influence renal outcome following renal transplant adjusting for informative right censoring
Abstract:
A new statistical model is proposed to estimate population and individual
slopes that are adjusted for covariates and informative right censoring.
Individual slopes are assumed to have a mean that depends on the
population slope for the covariates. The number of observations for each
individual is modeled as a truncated discrete distribution with mean
dependent on the individual subjects’ slopes. Our simulation study
results indicated that the associated bias and mean squared errors for the
proposed model were comparable to those associated with the model that
only adjusts for informative right censoring. The proposed model was
illustrated using a renal transplant dataset to estimate population slopes
for covariates that could impact the outcome of renal function following
renal transplantation.
Journal: Journal of Applied Statistics
Pages: 631-642
Issue: 3
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610441
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610441
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:631-642
Template-Type: ReDIF-Article 1.0
Author-Name: Hsiuying Wang
Author-X-Name-First: Hsiuying
Author-X-Name-Last: Wang
Author-Name: Shan-Lin Hung
Author-X-Name-First: Shan-Lin
Author-X-Name-Last: Hung
Title: Phylogenetic tree selection by the adjusted k-means approach
Abstract:
The reconstruction of phylogenetic trees is one of the most important and
interesting problems in evolutionary studies. Many methods for
constructing phylogenetic trees have been proposed in the literature. Each
approach is based on different criteria and evolutionary models. However,
the topologies of trees constructed from different methods may be quite
different. These topological errors may be due to unsuitable criteria or
evolutionary models. Since there are many tree construction approaches, we
are interested in selecting the tree that best fits the true model. In this
study, we propose an adjusted k-means approach and a
misclassification error score criterion to solve the problem. A
simulation study shows that this method can select better trees among the
potential candidates, providing a useful approach to phylogenetic tree
selection.
Journal: Journal of Applied Statistics
Pages: 643-655
Issue: 3
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610442
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610442
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:643-655
Template-Type: ReDIF-Article 1.0
Author-Name: H. E.T. Holgersson
Author-X-Name-First: H. E.T.
Author-X-Name-Last: Holgersson
Author-Name: Peter S. Karlsson
Author-X-Name-First: Peter S.
Author-X-Name-Last: Karlsson
Author-Name: Rashid Mansoor
Author-X-Name-First: Rashid
Author-X-Name-Last: Mansoor
Title: Estimating mean-standard deviation ratios of financial data
Abstract:
This article treats the problem of modelling the relation between excess
return and risk of financial assets when the returns follow a factor
structure. The authors propose three different estimators and their
consistencies are established in cases when the number of assets in the
cross-section (n) and the number of observations over
time (T) are of comparable size. An empirical
investigation is conducted on the Stockholm stock exchange market where
the mean-standard deviation ratio is calculated for the small-, mid- and
large-cap segments, respectively.
Journal: Journal of Applied Statistics
Pages: 657-671
Issue: 3
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610443
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610443
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:657-671
Template-Type: ReDIF-Article 1.0
Author-Name: Manabu Kuroki
Author-X-Name-First: Manabu
Author-X-Name-Last: Kuroki
Title: Optimizing a control plan using a structural equation model with an application to statistical process analysis
Abstract:
In the case where non-experimental data are available from an industrial
process and a directed graph for how various factors affect a response
variable is known based on a substantive understanding of the process, we
consider a problem in which a control plan involving multiple treatment
variables is conducted in order to bring a response variable close to a
target value with variation reduction. Using statistical causal analysis
with linear (recursive and non-recursive) structural equation models, we
configure an optimal control plan involving multiple treatment variables
through causal parameters. Based on the formulation, we clarify the causal
mechanism for how the variance of a response variable changes when the
control plan is conducted. The results enable us to evaluate the effect of
a control plan on the variance of a response variable from
non-experimental data and provide a new application of linear structural
equation models to engineering science.
Journal: Journal of Applied Statistics
Pages: 673-694
Issue: 3
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610444
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610444
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:3:p:673-694
Template-Type: ReDIF-Article 1.0
Author-Name: J. C.F. de Winter
Author-X-Name-First: J. C.F.
Author-X-Name-Last: de Winter
Author-Name: D. Dodou
Author-X-Name-First: D.
Author-X-Name-Last: Dodou
Title: Factor recovery by principal axis factoring and maximum likelihood factor analysis as a function of factor pattern and sample size
Abstract:
Principal axis factoring (PAF) and maximum likelihood factor analysis
(MLFA) are two of the most popular estimation methods in exploratory
factor analysis. It is known that PAF is better able to recover weak
factors and that the maximum likelihood estimator is asymptotically
efficient. However, there is almost no evidence regarding which method
should be preferred for different types of factor patterns and sample
sizes. Simulations were conducted to investigate factor recovery by PAF
and MLFA for distortions of ideal simple structure and sample sizes
between 25 and 5000. Results showed that PAF is preferred for population
solutions with few indicators per factor and for overextraction. MLFA
outperformed PAF in cases of unequal loadings within factors and for
underextraction. It was further shown that PAF and MLFA do not always
converge with increasing sample size. The simulation findings were
confirmed by an empirical study as well as by a classic plasmode,
Thurstone's box problem. The present results are of practical value for
factor analysts.
Journal: Journal of Applied Statistics
Pages: 695-710
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610445
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610445
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:695-710
Template-Type: ReDIF-Article 1.0
Author-Name: Irene Rocchetti
Author-X-Name-First: Irene
Author-X-Name-Last: Rocchetti
Author-Name: Domenica Taruscio
Author-X-Name-First: Domenica
Author-X-Name-Last: Taruscio
Author-Name: Marco Alfò
Author-X-Name-First: Marco
Author-X-Name-Last: Alfò
Title: Modeling delay in diagnosis of NF: under-reporting, incidence and prevalence estimates
Abstract:
In this paper, we analyze data from the Italian National Register of Rare
Diseases (NRRD) focusing, in particular, on the geo-temporal distribution
of patients affected by neurofibromatosis type 1 (NF1, ICD9CM code
237.71). The aim is to derive a corrected measure of incidence for the
period 2007--2009 using a single source, and to provide NF1 prevalence
estimates for the period 2001--2006 through the use of capture--recapture
methods over two sources. In the first case, a reverse hazard estimator
for the delay in diagnosis of NF1 is used to estimate the probability that
a generic unit belonging to the population of interest has been registered
by the archive of reference. For the second purpose, two-source
capture--recapture methods have been used to estimate the number of NF1
prevalent units in Italy for the period 2001--2006, matching information
provided by the NRRD and the national register of hospital discharge,
Scheda di Dimissione Ospedaliera (in the following SDO), archives.
Journal: Journal of Applied Statistics
Pages: 711-721
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610446
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610446
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:711-721
Template-Type: ReDIF-Article 1.0
Author-Name: Kouji Tahata
Author-X-Name-First: Kouji
Author-X-Name-Last: Tahata
Title: Quasi-asymmetry model for square tables with nominal categories
Abstract:
For an R×R square contingency
table with nominal categories, the present paper proposes a model which
indicates that the absolute values of log odds of the odds ratio for rows
i and j and columns j
and R to the corresponding symmetric odds ratio for rows
j and R and columns i
and j are constant for every
i>j>R. The model is an
extension of the quasi-symmetry model and states a structure of asymmetry
of odds ratios. An example is given.
Journal: Journal of Applied Statistics
Pages: 723-729
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.610447
File-URL: http://hdl.handle.net/10.1080/02664763.2011.610447
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:723-729
Template-Type: ReDIF-Article 1.0
Author-Name: Rosaria Lombardo
Author-X-Name-First: Rosaria
Author-X-Name-Last: Lombardo
Author-Name: Pietro Amenta
Author-X-Name-First: Pietro
Author-X-Name-Last: Amenta
Author-Name: Myrtille Vivien
Author-X-Name-First: Myrtille
Author-X-Name-Last: Vivien
Author-Name: Robert Sabatier
Author-X-Name-First: Robert
Author-X-Name-Last: Sabatier
Title: Sensory analysis via multi-block multivariate additive PLS splines
Abstract:
In the last decade, much effort has been spent on modelling dependence
between sensory variables and chemical--physical ones, especially when
observed at different occasions/spaces/times or if collected from several
groups (blocks) of variables. In this paper, we propose a nonlinear
generalization of multi-block partial least squares with the inclusion of
variable interactions. We show the performance of the method on a known
data set.
Journal: Journal of Applied Statistics
Pages: 731-743
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.611239
File-URL: http://hdl.handle.net/10.1080/02664763.2011.611239
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:731-743
Template-Type: ReDIF-Article 1.0
Author-Name: Amer Ibrahim Al-Omari
Author-X-Name-First: Amer Ibrahim
Author-X-Name-Last: Al-Omari
Author-Name: Abdul Haq
Author-X-Name-First: Abdul
Author-X-Name-Last: Haq
Title: Improved quality control charts for monitoring the process mean, using double-ranked set sampling methods
Abstract:
Statistical control charts are widely used in the manufacturing industry.
Shewhart-type control charts are developed to improve the monitoring of
the process mean by using the double quartile-ranked set sampling, quartile
double-ranked set sampling, and double extreme-ranked set sampling
methods. In terms of the average run length, the performance of the
proposed control charts is compared with that of existing control charts
based on simple random sampling, ranked set sampling and extreme-ranked
set sampling methods. An application of real data is also considered to
investigate the performance of the suggested process mean control charts.
The findings of the study revealed that the newly suggested control charts
are superior to the existing counterparts.
Journal: Journal of Applied Statistics
Pages: 745-763
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.611488
File-URL: http://hdl.handle.net/10.1080/02664763.2011.611488
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:745-763
Template-Type: ReDIF-Article 1.0
Author-Name: Krishna K. Saha
Author-X-Name-First: Krishna K.
Author-X-Name-Last: Saha
Author-Name: Debaraj Sen
Author-X-Name-First: Debaraj
Author-X-Name-Last: Sen
Author-Name: Chun Jin
Author-X-Name-First: Chun
Author-X-Name-Last: Jin
Title: Profile likelihood-based confidence interval for the dispersion parameter in count data
Abstract:
The importance of the dispersion parameter in counts occurring in
toxicology, biology, clinical medicine, epidemiology, and other similar
studies is well known. A couple of procedures for the construction of
confidence intervals (CIs) of the dispersion parameter have been
investigated, but little attention has been paid to the accuracy of its
CIs. In this paper, we introduce the profile likelihood (PL) approach and
the hybrid profile variance (HPV) approach for constructing the CIs of the
dispersion parameter for counts based on the negative binomial model. The
non-parametric bootstrap (NPB) approach based on the maximum likelihood
(ML) estimates of the dispersion parameter is also considered. We then
compare our proposed approaches with an asymptotic approach based on the
ML and the restricted ML (REML) estimates of the dispersion parameter as
well as the parametric bootstrap (PB) approach based on the ML estimates
of the dispersion parameter. As assessed by Monte Carlo simulations, the
PL approach has the best small-sample performance, followed by the REML,
HPV, NPB, and PB approaches. Three examples based on biological count data
are presented.
Journal: Journal of Applied Statistics
Pages: 765-783
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.616581
File-URL: http://hdl.handle.net/10.1080/02664763.2011.616581
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:765-783
Template-Type: ReDIF-Article 1.0
Author-Name: Yao Yu
Author-X-Name-First: Yao
Author-X-Name-Last: Yu
Author-Name: Jun Wang
Author-X-Name-First: Jun
Author-X-Name-Last: Wang
Title: Lattice-oriented percolation system applied to volatility behavior of stock market
Abstract:
In this paper, a discrete time series of the stock price process is
modeled by a two-dimensional lattice-oriented bond percolation system.
Percolation theory, as one of the statistical physics systems, has brought
new understanding and techniques to a broad range of topics in nature and
society. Based on this financial model, we compared the statistical
behavior of the stock prices generated by the model with that of real
stock prices. We also investigated the probability distributions, the long
memory and the long-range correlations of price returns for the actual
data and the simulated data. The empirical research shows that, for
proper parameters, the simulated data of the financial model can fit the
real markets to a certain extent.
Journal: Journal of Applied Statistics
Pages: 785-797
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.620081
File-URL: http://hdl.handle.net/10.1080/02664763.2011.620081
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:785-797
Template-Type: ReDIF-Article 1.0
Author-Name: Rahim Alhamzawi
Author-X-Name-First: Rahim
Author-X-Name-Last: Alhamzawi
Author-Name: Keming Yu
Author-X-Name-First: Keming
Author-X-Name-Last: Yu
Title: Variable selection in quantile regression via Gibbs sampling
Abstract:
Due to computational challenges and non-availability of conjugate prior
distributions, Bayesian variable selection in quantile regression models
is often a difficult task. In this paper, we address these two issues for
quantile regression models. In particular, we develop an informative
stochastic search variable selection (ISSVS) for quantile regression
models that introduces an informative prior distribution. We adopt prior
structures which incorporate historical data into the current data by
quantifying them with a suitable prior distribution on the model
parameters. This allows ISSVS to search more efficiently in the model
space and choose the more likely models. In addition, a Gibbs sampler is
derived to facilitate the computation of the posterior probabilities. A
major advantage of ISSVS is that it avoids instability in the posterior
estimates for the Gibbs sampler as well as convergence problems that may
arise from choosing vague priors. Finally, the proposed methods are
illustrated with both simulation and real data.
Journal: Journal of Applied Statistics
Pages: 799-813
Issue: 4
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2011.620082
File-URL: http://hdl.handle.net/10.1080/02664763.2011.620082
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:799-813
Template-Type: ReDIF-Article 1.0
Author-Name: Darold T. Barnum
Author-X-Name-First: Darold T.
Author-X-Name-Last: Barnum
Author-Name: John M. Gleason
Author-X-Name-First: John M.
Author-X-Name-Last: Gleason
Author-Name: Matthew G. Karlaftis
Author-X-Name-First: Matthew G.
Author-X-Name-Last: Karlaftis
Author-Name: Glen T. Schumock
Author-X-Name-First: Glen T.
Author-X-Name-Last: Schumock
Author-Name: Karen L. Shields
Author-X-Name-First: Karen L.
Author-X-Name-Last: Shields
Author-Name: Sonali Tandon
Author-X-Name-First: Sonali
Author-X-Name-Last: Tandon
Author-Name: Surrey M. Walton
Author-X-Name-First: Surrey M.
Author-X-Name-Last: Walton
Title: Estimating DEA confidence intervals with statistical panel data analysis
Abstract:
This paper describes a statistical method for estimating data envelopment
analysis (DEA) score confidence intervals for individual organizations or
other entities. This method applies statistical panel data analysis, which
provides proven and powerful methodologies for diagnostic testing and for
estimation of confidence intervals. DEA scores are tested for violations
of the standard statistical assumptions including contemporaneous
correlation, serial correlation, heteroskedasticity and the absence of a
normal distribution. Generalized least squares statistical models are used
to adjust for violations that are present and to estimate valid confidence
intervals within which the true efficiency of each individual
decision-making unit occurs. This method is illustrated with two sets of
panel data, one from large US urban transit systems and the other from a
group of US hospital pharmacies.
Journal: Journal of Applied Statistics
Pages: 815-828
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.620948
File-URL: http://hdl.handle.net/10.1080/02664763.2011.620948
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:815-828
Template-Type: ReDIF-Article 1.0
Author-Name: Chun-Xia Zhang
Author-X-Name-First: Chun-Xia
Author-X-Name-Last: Zhang
Author-Name: Guan-Wei Wang
Author-X-Name-First: Guan-Wei
Author-X-Name-Last: Wang
Author-Name: Jiang-She Zhang
Author-X-Name-First: Jiang-She
Author-X-Name-Last: Zhang
Title: An empirical bias--variance analysis of DECORATE ensemble method at different training sample sizes
Abstract:
DECORATE (Diverse Ensemble Creation by Oppositional Relabeling of
Artificial Training Examples) is a classifier combination technique to
construct a set of diverse base classifiers using additional artificially
generated training instances. The predictions from the base classifiers
are then integrated into one by the mean combination rule. In order to
gain more insight about its effectiveness and advantages, this paper
utilizes a large experiment to study the bias--variance analysis of
DECORATE as well as some other widely used ensemble methods (such as
bagging, AdaBoost, random forest) at different training sample sizes. The
experimental results yield the following conclusions. For small training
sets, DECORATE has a dominant advantage over its rivals and its success is
attributed to the larger bias reduction achieved by it than the other
algorithms. As the training data increase, AdaBoost benefits most: the
bias reduction it achieves gradually becomes significant, while its
variance reduction remains moderate. Thus, AdaBoost performs best with
large training samples. Moreover, random forest consistently performs
second best regardless of the training set size, and is seen to mainly
decrease variance while maintaining low bias. Bagging appears to be
intermediate, since it primarily reduces variance.
Journal: Journal of Applied Statistics
Pages: 829-850
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.620949
File-URL: http://hdl.handle.net/10.1080/02664763.2011.620949
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:829-850
Template-Type: ReDIF-Article 1.0
Author-Name: F. Lombard
Author-X-Name-First: F.
Author-X-Name-Last: Lombard
Author-Name: C. J. Potgieter
Author-X-Name-First: C. J.
Author-X-Name-Last: Potgieter
Title: A multivariate rank test for comparing mass size distributions
Abstract:
Particle size analyses of a raw material are commonplace in the mineral
processing industry. Knowledge of particle size distributions is crucial
in planning milling operations to enable an optimum degree of liberation
of valuable mineral phases, to minimize plant losses due to an excess of
oversize or undersize material or to attain a size distribution that fits
a contractual specification. The problem addressed in the present paper is
how to test the equality of two or more underlying size distributions. A
distinguishing feature of these size distributions is that they are not
based on counts of individual particles. Rather, they are mass size
distributions giving the fractions of the total mass of a sampled material
lying in each of a number of size intervals. As such, the data are
compositional in nature, using the terminology of Aitchison [1]; that is,
multivariate vectors whose components add to 100%. In the
literature, various versions of Hotelling's T^2
have been used to compare matched pairs of such compositional data. In
this paper, we propose a robust test procedure based on ranks as a
competitor to Hotelling's T^2. In contrast to the
latter statistic, the power of the rank test is not unduly affected by the
presence of outliers or of zeros among the data.
Journal: Journal of Applied Statistics
Pages: 851-865
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.623155
File-URL: http://hdl.handle.net/10.1080/02664763.2011.623155
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:851-865
Template-Type: ReDIF-Article 1.0
Author-Name: P. Berchialla
Author-X-Name-First: P.
Author-X-Name-Last: Berchialla
Author-Name: S. Snidero
Author-X-Name-First: S.
Author-X-Name-Last: Snidero
Author-Name: A. Stancu
Author-X-Name-First: A.
Author-X-Name-Last: Stancu
Author-Name: C. Scarinzi
Author-X-Name-First: C.
Author-X-Name-Last: Scarinzi
Author-Name: R. Corradetti
Author-X-Name-First: R.
Author-X-Name-Last: Corradetti
Author-Name: D. Gregori
Author-X-Name-First: D.
Author-X-Name-Last: Gregori
Title: Understanding the epidemiology of foreign body injuries in children using a data-driven Bayesian network
Abstract:
Bayesian networks (BNs) are probabilistic expert systems which have
emerged over the last few decades as a powerful data mining technique.
Also, BNs have become especially popular in biomedical applications where
they have been used for diagnosing diseases and studying complex cellular
networks, among many other applications. In this study, we built a BN in a
fully automated way in order to analyse data regarding injuries due to the
inhalation, ingestion and aspiration of foreign bodies (FBs) in children.
Then, a sensitivity analysis was carried out to characterize the
uncertainty associated with the model. While other studies focused on
characteristics such as shape, consistency and dimensions of the FBs which
caused injuries, we propose an integrated environment which makes the
relationships among the factors underlying the problem clear. The
advantage of this approach is that it gives a picture of the influence of
critical factors on the injury severity and allows for the comparison of
the effect of different FB characteristics (volume, FB type, shape and
consistency) and children's features (age and gender) on the risk of
experiencing a hospitalization. The rates it allows us to calculate
provide a more rational basis for educating care-givers about the
most influential risk factors for the adverse outcomes.
Journal: Journal of Applied Statistics
Pages: 867-874
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.623156
File-URL: http://hdl.handle.net/10.1080/02664763.2011.623156
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:867-874
Template-Type: ReDIF-Article 1.0
Author-Name: Sifat Sharmin
Author-X-Name-First: Sifat
Author-X-Name-Last: Sharmin
Author-Name: Md. Israt Rayhan
Author-X-Name-First: Md. Israt
Author-X-Name-Last: Rayhan
Title: Spatio-temporal modeling of infectious disease dynamics
Abstract:
A stochastic model, which is well suited to capture space--time
dependence of an infectious disease, was employed in this study to
describe the underlying spatial and temporal pattern of measles in Barisal
Division, Bangladesh. The model has two components: an endemic component
and an epidemic component; weights are used in the epidemic component for
better accounting of the disease spread into different geographical
regions. We illustrate our findings using a data set of monthly measles
counts in the six districts of Barisal, from January 2000 to August 2009,
collected from the Expanded Program on Immunization, Bangladesh. The
negative binomial model with both the seasonal and autoregressive
components was found to be suitable for capturing space--time dependence
of measles in Barisal. Analyses were done using general optimization
routines, which provided the maximum likelihood estimates with the
corresponding standard errors.
Journal: Journal of Applied Statistics
Pages: 875-882
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.624593
File-URL: http://hdl.handle.net/10.1080/02664763.2011.624593
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:875-882
Template-Type: ReDIF-Article 1.0
Author-Name: Benjamin Neustifter
Author-X-Name-First: Benjamin
Author-X-Name-Last: Neustifter
Author-Name: Stephen L. Rathbun
Author-X-Name-First: Stephen L.
Author-X-Name-Last: Rathbun
Author-Name: Saul Shiffman
Author-X-Name-First: Saul
Author-X-Name-Last: Shiffman
Title: Mixed-Poisson point process with partially observed covariates: ecological momentary assessment of smoking
Abstract:
Ecological momentary assessment is an emerging method of data collection
in behavioral research that may be used to capture the times of repeated
behavioral events on electronic devices and information on
subjects’ psychological states through the electronic
administration of questionnaires at times selected from a
probability-based design as well as the event times. A method for fitting
a mixed-Poisson point-process model is proposed for the impact of
partially observed, time-varying covariates on the timing of repeated
behavioral events. A random frailty is included in the point-process
intensity to describe the variation in baseline rates of event occurrence
among subjects. Covariate coefficients are estimated using estimating
equations constructed by replacing the integrated intensity in the Poisson
score equations with a design-unbiased estimator. An estimator is also
proposed for the variance of the random frailties. Our estimators are
robust in the sense that no model assumptions are made regarding the
distribution of the time-varying covariates or the distribution of the
random effects. However, subject effects are estimated under gamma
frailties using an approximate hierarchical likelihood. The proposed
approach is illustrated using smoking data.
Journal: Journal of Applied Statistics
Pages: 883-899
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.626848
File-URL: http://hdl.handle.net/10.1080/02664763.2011.626848
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:883-899
Template-Type: ReDIF-Article 1.0
Author-Name: Megan Othus
Author-X-Name-First: Megan
Author-X-Name-Last: Othus
Author-Name: Yi Li
Author-X-Name-First: Yi
Author-X-Name-Last: Li
Author-Name: Ram Tiwari
Author-X-Name-First: Ram
Author-X-Name-Last: Tiwari
Title: Change-point cure models with application to estimating the change-point effect of age of diagnosis among prostate cancer patients
Abstract:
Previous research on prostate cancer survival trends in the United States
National Cancer Institute's Surveillance Epidemiology and End Results
database has indicated a potential change-point in the age of diagnosis of
prostate cancer around age 50. Identifying a change-point value in
prostate cancer survival and cure could have important policy and health
care management implications. Statistical analysis of this data has to
address two complicating features: (1) change-point models are not smooth
functions and so present computational and theoretical difficulties; and
(2) models for prostate cancer survival need to account for the fact that
many men diagnosed with prostate cancer can be effectively cured of their
disease with early treatment. We develop a cure survival model that allows
for change-point effects in covariates to investigate a potential
change-point in the age of diagnosis of prostate cancer. Our results do
not indicate that age under 50 is associated with increased hazard of
death from prostate cancer.
Journal: Journal of Applied Statistics
Pages: 901-911
Issue: 4
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.626849
File-URL: http://hdl.handle.net/10.1080/02664763.2011.626849
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:901-911
Template-Type: ReDIF-Article 1.0
Author-Name: Yun-Huan Lee
Author-X-Name-First: Yun-Huan
Author-X-Name-Last: Lee
Author-Name: Chun-Shu Chen
Author-X-Name-First: Chun-Shu
Author-X-Name-Last: Chen
Title: Autoregressive model selection based on a prediction perspective
Abstract:
The autoregressive (AR) model is a popular method for fitting and
prediction in analyzing time-dependent data, where selecting an accurate
model among considered orders is a crucial issue. Two commonly used
selection criteria are the Akaike information criterion and the Bayesian
information criterion. However, the two criteria are known to suffer from
potential problems of overfitting and underfitting, respectively.
Therefore, they perform well in some situations, but poorly in
others. In this paper, we propose a new criterion in terms of the
prediction perspective based on the concept of generalized degrees of
freedom for AR model selection. We derive an approximately unbiased
estimator of mean-squared prediction errors based on a data perturbation
technique for selecting the order parameter, where the estimation
uncertainty involved in a modeling procedure is considered. Some numerical
experiments are performed to illustrate the superiority of the proposed
method over some commonly used order selection criteria. Finally, the
methodology is applied to a real data example to predict the weekly rate
of return on the stock price of Taiwan Semiconductor Manufacturing Company
and the results indicate that the proposed method is satisfactory.
Journal: Journal of Applied Statistics
Pages: 913-922
Issue: 4
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.636418
File-URL: http://hdl.handle.net/10.1080/02664763.2011.636418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:913-922
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Title: Time Series: Modeling, Computation, and Inference
Journal: Journal of Applied Statistics
Pages: 923-923
Issue: 4
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.657378
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657378
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:923-923
Template-Type: ReDIF-Article 1.0
Author-Name: Søren Feodor Nielsen
Author-X-Name-First: Søren Feodor
Author-X-Name-Last: Nielsen
Title: Introduction to general and generalized linear models
Journal: Journal of Applied Statistics
Pages: 923-924
Issue: 4
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.657403
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657403
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:923-924
Template-Type: ReDIF-Article 1.0
Author-Name: Long Kang
Author-X-Name-First: Long
Author-X-Name-Last: Kang
Title: The Oxford handbook of credit derivatives
Journal: Journal of Applied Statistics
Pages: 924-925
Issue: 4
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.657406
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657406
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:924-925
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano Ruiz
Author-X-Name-Last: Espejo
Title: Exercises and solutions in biostatistical theory
Journal: Journal of Applied Statistics
Pages: 925-926
Issue: 4
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.657407
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657407
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:4:p:925-926
Template-Type: ReDIF-Article 1.0
Author-Name: Adrián Quintero-Sarmiento
Author-X-Name-First: Adrián
Author-X-Name-Last: Quintero-Sarmiento
Author-Name: Edilberto Cepeda-Cuervo
Author-X-Name-First: Edilberto
Author-X-Name-Last: Cepeda-Cuervo
Author-Name: Vicente Núñez-Antón
Author-X-Name-First: Vicente
Author-X-Name-Last: Núñez-Antón
Title: Estimating infant mortality in Colombia: some overdispersion modelling approaches
Abstract:
It is common to fit generalized linear models with binomial and Poisson
responses, where the data show a variability that is greater than the
theoretical variability assumed by the model. This phenomenon, known as
overdispersion, may spoil inferences about the model by flagging as
significant the parameters of variables that have no real effect on the
dependent variable. This paper explains some methods to
detect overdispersion and presents and evaluates three well-known
methodologies that have shown their usefulness in correcting this problem,
using random mean models, quasi-likelihood methods and a double
exponential family. In addition, it proposes some new Bayesian model
extensions that have proved their usefulness in correcting the
overdispersion problem. Finally, using the information provided by the
National Demographic and Health Survey 2005, the departmental factors that
have an influence on the mortality of children under 5 years and female
postnatal period screening are determined. Based on the results,
extensions that generalize some of the aforementioned models are also
proposed, and their use is motivated by the data set under study. The
results conclude that the proposed overdispersion models provide a better
statistical fit of the data.
Journal: Journal of Applied Statistics
Pages: 1011-1036
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.634395
File-URL: http://hdl.handle.net/10.1080/02664763.2011.634395
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1011-1036
Template-Type: ReDIF-Article 1.0
Author-Name: Ali Reza Soltanian
Author-X-Name-First: Ali Reza
Author-X-Name-Last: Soltanian
Author-Name: Soghrat Faghihzadeh
Author-X-Name-First: Soghrat
Author-X-Name-Last: Faghihzadeh
Title: A generalization of the Grizzle model to the estimation of treatment effects in crossover trials with non-compliance
Abstract:
Compliance with one specified dosing strategy of assigned treatments is a
common problem in randomized drug clinical trials. Recently, there has
been much interest in methods used for analysing treatment effects in
randomized clinical trials that are subject to non-compliance. In this
paper, we estimate and compare treatment effects based on the Grizzle
model (GM) (ignorable non-compliance) as the custom model and the
generalized Grizzle model (GGM) (non-ignorable non-compliance) as the new
model. A real data set based on the treatment of knee osteoarthritis is
used to compare these models. The results based on the likelihood ratio
statistics and simulation study show the advantage of the proposed model
(GGM) over the custom model (GM).
Journal: Journal of Applied Statistics
Pages: 1037-1048
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.634396
File-URL: http://hdl.handle.net/10.1080/02664763.2011.634396
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1037-1048
Template-Type: ReDIF-Article 1.0
Author-Name: Cibele M. Russo
Author-X-Name-First: Cibele M.
Author-X-Name-Last: Russo
Author-Name: Gilberto A. Paula
Author-X-Name-First: Gilberto A.
Author-X-Name-Last: Paula
Author-Name: Francisco José A. Cysneiros
Author-X-Name-First: Francisco José A.
Author-X-Name-Last: Cysneiros
Author-Name: Reiko Aoki
Author-X-Name-First: Reiko
Author-X-Name-Last: Aoki
Title: Influence diagnostics in heteroscedastic and/or autoregressive nonlinear elliptical models for correlated data
Abstract:
In this paper, we propose nonlinear elliptical models for correlated data
with heteroscedastic and/or autoregressive structures. Our aim is to
extend the models proposed by Russo et al. [22] by
considering a more sophisticated scale structure to deal with variations
in data dispersion and/or a possible autocorrelation among measurements
taken throughout the same experimental unit. Moreover, to avoid the
possible influence of outlying observations or to take into account the
non-normal symmetric tails of the data, we assume elliptical contours for
the joint distribution of random effects and errors, which allows us to
attribute different weights to the observations. We propose an iterative
algorithm to obtain the maximum-likelihood estimates for the parameters
and derive the local influence curvatures for some specific perturbation
schemes. The motivation for this work comes from a pharmacokinetic
indomethacin data set, which was analysed previously by Bocheng and Xuping
[1] under normality.
Journal: Journal of Applied Statistics
Pages: 1049-1067
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.636030
File-URL: http://hdl.handle.net/10.1080/02664763.2011.636030
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1049-1067
Template-Type: ReDIF-Article 1.0
Author-Name: Vinicius P. Israel
Author-X-Name-First: Vinicius P.
Author-X-Name-Last: Israel
Author-Name: Hélio S. Migon
Author-X-Name-First: Hélio S.
Author-X-Name-Last: Migon
Title: Stochastic models for greenhouse gas emission rate estimation from hydroelectric reservoirs: a Bayesian hierarchical approach
Abstract:
Herein, we propose a fully Bayesian approach to the greenhouse gas
emission problem. The goal of this work is to estimate the emission rate
of polluting gases from the area flooded by hydroelectric reservoirs. We
present models for gas concentration evolution in two ways: first, by
proposing them from ordinary differential equation solutions and, second,
by using stochastic differential equations with a discretization scheme.
Finally, we present techniques to estimate the emission rate for the
entire reservoir. In order to carry out the inference, we use the Bayesian
framework with Markov chain Monte Carlo (MCMC) methods. Discretization
schemes over continuous differential equations are used when necessary.
These models applied to greenhouse gas emission, and Bayesian inference
for this purpose, are, as far as we know, completely new in the
statistical literature, and contribute to estimating the amount of polluting gases released
from hydroelectric reservoirs in Brazil. The proposed models are applied
in a real data set and results are presented.
Journal: Journal of Applied Statistics
Pages: 1069-1086
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.636417
File-URL: http://hdl.handle.net/10.1080/02664763.2011.636417
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1069-1086
Template-Type: ReDIF-Article 1.0
Author-Name: Marjan Mansourian
Author-X-Name-First: Marjan
Author-X-Name-Last: Mansourian
Author-Name: Anoshirvan Kazemnejad
Author-X-Name-First: Anoshirvan
Author-X-Name-Last: Kazemnejad
Author-Name: Iraj Kazemi
Author-X-Name-First: Iraj
Author-X-Name-Last: Kazemi
Author-Name: Farid Zayeri
Author-X-Name-First: Farid
Author-X-Name-Last: Zayeri
Author-Name: Masoud Soheilian
Author-X-Name-First: Masoud
Author-X-Name-Last: Soheilian
Title: Bayesian analysis of longitudinal ordered data with flexible random effects using McMC: application to diabetic macular Edema data
Abstract:
In the analysis of correlated ordered data, mixed-effect models are
frequently used to control the subject heterogeneity effects. A common
assumption in fitting these models is the normality of random effects. In
many cases, this is unrealistic, making the estimation results unreliable.
This paper considers several flexible models for random effects and
investigates their properties in the model fitting. We adopt a
proportional odds logistic regression model and incorporate the skewed
version of the normal, Student's t and slash
distributions for the effects. Stochastic representations for various
flexible distributions are proposed afterwards based on the mixing
strategy approach. This reduces the computational burden of the McMC
technique. Furthermore, this paper addresses the
identifiability restrictions and suggests a procedure to handle this
issue. We analyze a real data set taken from an ophthalmic clinical trial.
Model selection is performed by suitable Bayesian model selection
criteria.
Journal: Journal of Applied Statistics
Pages: 1087-1100
Issue: 5
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.638367
File-URL: http://hdl.handle.net/10.1080/02664763.2011.638367
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1087-1100
Template-Type: ReDIF-Article 1.0
Author-Name: Saïd Hanchane
Author-X-Name-First: Saïd
Author-X-Name-Last: Hanchane
Author-Name: Tarek Mostafa
Author-X-Name-First: Tarek
Author-X-Name-Last: Mostafa
Title: Solving endogeneity problems in multilevel estimation: an example using education production functions
Abstract:
This paper explores endogeneity problems in multilevel estimation of
education production functions. The focus is on level 2 endogeneity which
arises from correlations between student characteristics and omitted
school variables. These correlations are mainly the result of student
stratification between schools. From an econometric point of view, the
correlations between student and school characteristics imply that the
omission of some variables may generate endogeneity bias. Therefore, an
estimation approach based on the Mundlak [20] technique is developed in
order to tackle bias and to generate consistent estimates. Note that our
analysis can be extended to any multilevel-structured data (students
nested within schools, employees within firms, firms within regions, etc).
The entire analysis is undertaken in a comparative context between three
countries: Germany, Finland and the UK. Each one of them represents a
particular system. For instance, Finland is known for its extreme
comprehensiveness, Germany for early selection and the UK for its
liberalism. These countries are used to illustrate the theory and to prove
that the level of bias arising from omitted variables varies according to
the characteristics of education systems.
Journal: Journal of Applied Statistics
Pages: 1101-1114
Issue: 5
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.638705
File-URL: http://hdl.handle.net/10.1080/02664763.2011.638705
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1101-1114
Template-Type: ReDIF-Article 1.0
Author-Name: K. Hron
Author-X-Name-First: K.
Author-X-Name-Last: Hron
Author-Name: P. Filzmoser
Author-X-Name-First: P.
Author-X-Name-Last: Filzmoser
Author-Name: K. Thompson
Author-X-Name-First: K.
Author-X-Name-Last: Thompson
Title: Linear regression with compositional explanatory variables
Abstract:
Compositional explanatory variables should not be directly used in a
linear regression model because any inference statistic can become
misleading. While various approaches to this problem have been proposed, here
an approach based on the isometric logratio (ilr) transformation is used.
It turns out that the resulting model is easy to handle, and that
parameter estimation can be done as in the usual linear regression.
Moreover, it is possible to use the ilr variables for inference statistics
in order to obtain an appropriate interpretation of the model.
Journal: Journal of Applied Statistics
Pages: 1115-1128
Issue: 5
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644268
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644268
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1115-1128
Template-Type: ReDIF-Article 1.0
Author-Name: C. Kiruthika
Author-X-Name-First: C.
Author-X-Name-Last: Kiruthika
Author-Name: R. Chandrasekaran
Author-X-Name-First: R.
Author-X-Name-Last: Chandrasekaran
Title: Classification of textile fabrics using statistical multivariate techniques
Abstract:
In this study, an attempt has been made to classify the textile fabrics
based on the physical properties using statistical multivariate techniques
like discriminant analysis and cluster analysis. Initially, the
discriminant functions have been constructed for the classification of the
three known categories of fabrics made up of polyester, lyocell/viscose and
treated polyester. The classification yielded one hundred per cent accuracy.
Each of the three different categories of fabrics has been further
subjected to the K-means clustering algorithm that yielded three clusters.
These clusters were then subjected to discriminant analysis, which again
yielded a 100% correct classification, indicating that the clusters are well
separated. The properties of clusters are also investigated with respect
to the measurements.
Journal: Journal of Applied Statistics
Pages: 1129-1138
Issue: 5
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644521
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644521
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1129-1138
Template-Type: ReDIF-Article 1.0
Author-Name: Rob Deardon
Author-X-Name-First: Rob
Author-X-Name-Last: Deardon
Author-Name: Babak Habibzadeh
Author-X-Name-First: Babak
Author-X-Name-Last: Habibzadeh
Author-Name: Hau Yi Chung
Author-X-Name-First: Hau Yi
Author-X-Name-Last: Chung
Title: Spatial measurement error in infectious disease models
Abstract:
Individual-level models (ILMs) for infectious disease can be used to
model disease spread between individuals while taking into account
important covariates. One important covariate in determining the risk of
infection transfer can be spatial location. At the same time, measurement
error is a concern in many areas of statistical analysis, and infectious
disease modelling is no exception. In this paper, we are concerned with
the issue of measurement error in the recorded location of individuals
when using a simple spatial ILM to model the spread of disease within a
population. An ILM that incorporates spatial location random effects is
introduced within a hierarchical Bayesian framework. This model is tested
upon both simulated data and data from the UK 2001 foot-and-mouth disease
epidemic. The ability of the model to successfully identify both the
spatial infection kernel and the basic reproduction number (R0) of the
disease is tested.
Journal: Journal of Applied Statistics
Pages: 1139-1150
Issue: 5
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644522
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:1139-1150
Template-Type: ReDIF-Article 1.0
Author-Name: Francesco Lagona
Author-X-Name-First: Francesco
Author-X-Name-Last: Lagona
Author-Name: Marco Picone
Author-X-Name-First: Marco
Author-X-Name-Last: Picone
Title: Model-based clustering of multivariate skew data with circular components and missing values
Abstract:
Motivated by classification issues that arise in marine studies, we
propose a latent-class mixture model for the unsupervised classification
of incomplete quadrivariate data with two linear and two circular
components. The model integrates bivariate circular densities and
bivariate skew normal densities to capture the association between
toroidal clusters of bivariate circular observations and planar clusters
of bivariate linear observations. Maximum-likelihood estimation of the
model is facilitated by an expectation maximization (EM) algorithm that
treats unknown class membership and missing values as different sources of
incomplete information. The model is exploited on hourly observations of
wind speed and direction and wave height and direction to identify a
number of sea regimes, which represent specific distributional shapes that
the data take under environmental latent conditions.
Journal: Journal of Applied Statistics
Pages: 927-945
Issue: 5
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.626850
File-URL: http://hdl.handle.net/10.1080/02664763.2011.626850
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:927-945
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Ning
Author-X-Name-First: Wei
Author-X-Name-Last: Ning
Title: Empirical likelihood ratio test for a mean change point model with a linear trend followed by an abrupt change
Abstract:
In this paper, a change point model with the mean being constant up to
some unknown point, and increasing linearly to another unknown point, then
dropping back to the original level is studied. A nonparametric method
based on the empirical likelihood test is proposed to detect and estimate
the locations of change points. Under some mild conditions, the asymptotic
null distribution of an empirical likelihood ratio test statistic is shown
to have the extreme distribution. The consistency of the test is also
proved. Simulations of the powers of the test indicate that it performs
well under different assumptions of the data distribution. The test is
applied to the aircraft arrival time data set and the Stanford heart
transplant data set.
Journal: Journal of Applied Statistics
Pages: 947-961
Issue: 5
Volume: 39
Year: 2012
Month: 9
X-DOI: 10.1080/02664763.2011.628647
File-URL: http://hdl.handle.net/10.1080/02664763.2011.628647
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:947-961
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Liu
Author-X-Name-First: Wei
Author-X-Name-Last: Liu
Author-Name: Lang Wu
Author-X-Name-First: Lang
Author-X-Name-Last: Wu
Title: Two-step and likelihood methods for HIV viral dynamic models with covariate measurement errors and missing data
Abstract:
HIV viral dynamic models have received much attention in the literature.
Long-term viral dynamics may be modelled by semiparametric nonlinear
mixed-effect models, which incorporate large variation between subjects
and autocorrelation within subjects and are flexible in modelling complex
viral load trajectories. Time-dependent covariates may be introduced in
the dynamic models to partially explain the between-individual variations.
In the presence of measurement errors and missing data in time-dependent
covariates, we show that the commonly used two-step method may give
approximately unbiased estimates but may underestimate standard errors.
We propose a two-stage bootstrap method to adjust the standard errors in
the two-step method, as well as a likelihood method.
Journal: Journal of Applied Statistics
Pages: 963-978
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.632404
File-URL: http://hdl.handle.net/10.1080/02664763.2011.632404
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:963-978
Template-Type: ReDIF-Article 1.0
Author-Name: Susana Mendes
Author-X-Name-First: Susana
Author-X-Name-Last: Mendes
Author-Name: M. José Fernández-Gómez
Author-X-Name-First: M. José
Author-X-Name-Last: Fernández-Gómez
Author-Name: Mário Jorge Pereira
Author-X-Name-First: Mário Jorge
Author-X-Name-Last: Pereira
Author-Name: Ulisses Miranda Azeiteiro
Author-X-Name-First: Ulisses Miranda
Author-X-Name-Last: Azeiteiro
Author-Name: M. Purificación Galindo-Villardón
Author-X-Name-First: M. Purificación
Author-X-Name-Last: Galindo-Villardón
Title: An empirical comparison of Canonical Correspondence Analysis and STATICO in the identification of spatio-temporal ecological relationships
Abstract:
The wide-ranging and rapidly evolving nature of ecological studies means
that it is not possible to cover all existing and emerging techniques for
analyzing multivariate data. However, two important methods have attracted
many followers: Canonical Correspondence Analysis (CCA) and the STATICO
analysis. Despite the particular characteristics of each, they have
similarities and differences which, when analyzed properly, can together
provide important complementary results to those usually exploited by
researchers. On the one hand, the use of CCA is completely generalized and
implemented, solving many problems formulated by ecologists; on the other
hand, the method has some weaknesses, mainly caused by the constraint it
imposes on the number of variables relative to the number of samples. The
STATICO method has no such restriction, but requires that the number of
variables (species or environment) be the same at each time or place. The
STATICO method can, however, present more detailed information, since it
allows the variability within groups (either in time or space) to be
visualized. In this study, the data needed for implementing these methods
are sketched, and a comparison is made showing the advantages and
disadvantages of each method. The ecological data treated are a sequence
of pairs of ecological tables, where species abundances and environmental
variables are measured at different, specified locations, over the course
of time.
Journal: Journal of Applied Statistics
Pages: 979-994
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.634393
File-URL: http://hdl.handle.net/10.1080/02664763.2011.634393
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:979-994
Template-Type: ReDIF-Article 1.0
Author-Name: Göran Kauermann
Author-X-Name-First: Göran
Author-X-Name-Last: Kauermann
Author-Name: Nina Westerheide
Author-X-Name-First: Nina
Author-X-Name-Last: Westerheide
Title: To move or not to move to find a new job: spatial duration time model with dynamic covariate effects
Abstract:
The aim of this paper is to show the flexibility and capacity of
penalized spline smoothing as an estimation routine for modelling duration
time data. We analyse the unemployment behaviour in Germany between 2000
and 2004 using a massive database from the German Federal Employment
Agency. To investigate dynamic covariate effects and differences between
competing job markets depending on the distance between former and recent
working place, a functional duration time model with competing risks is
used. It is built upon a competing hazard function where some of the
smooth covariate effects are allowed to vary with unemployment duration.
The focus of our analysis is on contrasting the spatial, economic and
individual covariate effects of the competing job markets and on analysing
their general influence on the unemployed's re-employment probabilities.
As a result of our analyses, we reveal differences concerning gender, age
and education. We also discover an effect between the newly formed and the
old West German states. Moreover, the spatial pattern between the
considered job markets differs.
Journal: Journal of Applied Statistics
Pages: 995-1009
Issue: 5
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2011.634394
File-URL: http://hdl.handle.net/10.1080/02664763.2011.634394
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:5:p:995-1009
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher H. Morrell
Author-X-Name-First: Christopher H.
Author-X-Name-Last: Morrell
Author-Name: Larry J. Brant
Author-X-Name-First: Larry J.
Author-X-Name-Last: Brant
Author-Name: Shan Sheng
Author-X-Name-First: Shan
Author-X-Name-Last: Sheng
Author-Name: E. Jeffrey Metter
Author-X-Name-First: E. Jeffrey
Author-X-Name-Last: Metter
Title: Screening for prostate cancer using multivariate mixed-effects models
Abstract:
Using several variables known to be related to prostate cancer, a
multivariate classification method is developed to predict the onset of
clinical prostate cancer. A multivariate mixed-effects model is used to
describe longitudinal changes in prostate-specific antigen (PSA), a free
testosterone index (FTI), and body mass index (BMI) before any clinical
evidence of prostate cancer. The patterns of change in these three
variables are allowed to vary depending on whether the subject develops
prostate cancer or not and the severity of the prostate cancer at
diagnosis. An application of Bayes’ theorem provides posterior
probabilities that we use to predict whether an individual will develop
prostate cancer and, if so, whether it is a high-risk or a low-risk
cancer. The classification rule is applied sequentially one multivariate
observation at a time until the subject is classified as a cancer case or
until the last observation has been used. We perform the analyses using
each of the three variables individually, combined together in pairs, and
all three variables together in one analysis. We compare the
classification results among the various analyses and a simulation study
demonstrates how the sensitivity of prediction changes with respect to the
number and type of variables used in the prediction process.
Journal: Journal of Applied Statistics
Pages: 1151-1175
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644523
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644523
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1151-1175
Template-Type: ReDIF-Article 1.0
Author-Name: Liuxia Wang
Author-X-Name-First: Liuxia
Author-X-Name-Last: Wang
Title: Bayesian principal component regression with data-driven component selection
Abstract:
Principal component regression (PCR) has two steps: estimating the
principal components and performing the regression using these components.
These steps generally are performed sequentially. In PCR, a crucial issue
is the selection of the principal components to be included in regression.
In this paper, we build a hierarchical probabilistic PCR model with a
dynamic component selection procedure. A latent variable is introduced to
select promising subsets of components based upon the significance of the
relationship between the response variable and principal components in the
regression step. We illustrate this model using real and simulated
examples. The simulations demonstrate that our approach outperforms some
existing methods in terms of root mean squared error of the regression
coefficient.
Journal: Journal of Applied Statistics
Pages: 1177-1189
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644524
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1177-1189
Template-Type: ReDIF-Article 1.0
Author-Name: Edwin M.M. Ortega
Author-X-Name-First: Edwin M.M.
Author-X-Name-Last: Ortega
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Michael W. Kattan
Author-X-Name-First: Michael W.
Author-X-Name-Last: Kattan
Title: The negative binomial--beta Weibull regression model to predict the cure of prostate cancer
Abstract:
In this article, for the first time, we propose the negative
binomial--beta Weibull (BW) regression model for studying the recurrence
of prostate cancer and to predict the cure fraction for patients with
clinically localized prostate cancer treated by open radical
prostatectomy. The cure model considers that a fraction of the survivors
are cured of the disease. The survival function for the population of
patients can be modeled by a cure parametric model using the BW
distribution. We derive an explicit expansion for the moments of the
recurrence time distribution for the uncured individuals. The proposed
distribution can be used to model survival data when the hazard rate
function is increasing, decreasing, unimodal and bathtub-shaped. Another
advantage is that the proposed model includes as special sub-models some
of the well-known cure rate models discussed in the literature. We derive
the appropriate matrices for assessing local influence on the parameter
estimates under different perturbation schemes. We analyze a real data set
for localized prostate cancer patients after open radical prostatectomy.
Journal: Journal of Applied Statistics
Pages: 1191-1210
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644525
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1191-1210
Template-Type: ReDIF-Article 1.0
Author-Name: Piotr Kulczycki
Author-X-Name-First: Piotr
Author-X-Name-Last: Kulczycki
Author-Name: Malgorzata Charytanowicz
Author-X-Name-First: Malgorzata
Author-X-Name-Last: Charytanowicz
Author-Name: Piotr A. Kowalski
Author-X-Name-First: Piotr A.
Author-X-Name-Last: Kowalski
Author-Name: Szymon Lukasik
Author-X-Name-First: Szymon
Author-X-Name-Last: Lukasik
Title: The Complete Gradient Clustering Algorithm: properties in practical applications
Abstract:
The aim of this paper is to present a Complete Gradient Clustering
Algorithm, its applicational aspects and properties, as well as to
illustrate them with specific practical problems from the subject of
bioinformatics (the categorization of grains for seed production),
management (the design of a marketing support strategy for a mobile phone
operator) and engineering (the synthesis of a fuzzy controller). The main
property of the Complete Gradient Clustering Algorithm is that it does not
require strict assumptions regarding the desired number of clusters, which
allows the number obtained to better suit the real structure of the data.
In the basic version, a complete set of procedures is provided for
defining the values of all functions and parameters based on optimization
criteria. It is also possible to indicate parameters whose change
influences the number of clusters (while still not fixing an exact number)
and the proportion between their numbers in dense and sparse areas of the
data.
Moreover, the Complete Gradient Clustering Algorithm can be used to
identify and possibly eliminate atypical elements (outliers). These
properties proved to be very useful in the presented applications and may
also be functional in many other practical problems.
Journal: Journal of Applied Statistics
Pages: 1211-1224
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644526
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1211-1224
Template-Type: ReDIF-Article 1.0
Author-Name: Florence George
Author-X-Name-First: Florence
Author-X-Name-Last: George
Author-Name: B. M. Golam Kibria
Author-X-Name-First: B. M.
Author-X-Name-Last: Golam Kibria
Title: Confidence intervals for estimating the population signal-to-noise ratio: a simulation study
Abstract:
This paper considered several confidence intervals for estimating the
population signal-to-noise ratio based on parametric, non-parametric and
modified methods. A simulation study has been conducted to compare the
performance of the interval estimators under both symmetric and skewed
distributions. We reported coverage probability and average width of the
interval estimators. Based on the simulation study, we observed that some
of the proposed interval estimators perform better, in the sense of
narrower width and better coverage probability, and we recommend them to
researchers.
Journal: Journal of Applied Statistics
Pages: 1225-1240
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644527
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1225-1240
Template-Type: ReDIF-Article 1.0
Author-Name: Zhiyong Zhang
Author-X-Name-First: Zhiyong
Author-X-Name-Last: Zhang
Author-Name: John J. McArdle
Author-X-Name-First: John J.
Author-X-Name-Last: McArdle
Author-Name: John R. Nesselroade
Author-X-Name-First: John R.
Author-X-Name-Last: Nesselroade
Title: Growth rate models: emphasizing growth rate analysis through growth curve modeling
Abstract:
To emphasize growth rate analysis, we develop a general method to
reparametrize growth curve models to analyze rates of growth for a variety
of growth trajectories, such as quadratic and exponential growth. The
resulting growth rate models are shown to be related to rotations of
growth curves. Estimated conveniently through growth curve modeling
techniques, growth rate models have advantages above and beyond
traditional growth curve models. The proposed growth rate models are used
to analyze longitudinal data from the National Longitudinal Study of Youth
(NLSY) on children's mathematics performance scores including covariates
of gender and behavioral problems (BPI). Individual differences are found
in rates of growth from ages 6 to 11. Associations with BPI, gender, and
their interaction to rates of growth are found to vary with age.
Implications of the models and the findings are discussed.
Journal: Journal of Applied Statistics
Pages: 1241-1262
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644528
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1241-1262
Template-Type: ReDIF-Article 1.0
Author-Name: Andrea Cutillo
Author-X-Name-First: Andrea
Author-X-Name-Last: Cutillo
Author-Name: Claudio Ceccarelli
Author-X-Name-First: Claudio
Author-X-Name-Last: Ceccarelli
Title: The internal relocation premium: are migrants positively or negatively selected? Evidence from Italy
Abstract:
This paper analyzes the wage returns from internal migration for recent
graduates in Italy. We employ a switching regression model that accounts
for the endogeneity of the individual's choice to relocate to get a job
after graduation: the omission of this selection decision can lead to
biased estimates, as there is potential correlation between earnings and
unobserved traits that influence the decision to migrate. The
empirical results sustain the appropriateness of the estimation technique
and show that there is a significant pay gap between migrants and
non-migrants; migrants seem to be positively selected and the migration
premium is downward biased in OLS estimates. The endogeneity of
migration shows up both as a negative intercept effect and as a positive
slope effect, the second being larger than the first: poor knowledge of the
local labor market and financial constraints lead migrants to accept a low
basic wage but, owing to substantial returns to their characteristics, they
ultimately obtain a higher wage than non-migrants.
Journal: Journal of Applied Statistics
Pages: 1263-1278
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644529
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644529
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1263-1278
Template-Type: ReDIF-Article 1.0
Author-Name: Vlasios Voudouris
Author-X-Name-First: Vlasios
Author-X-Name-Last: Voudouris
Author-Name: Robert Gilchrist
Author-X-Name-First: Robert
Author-X-Name-Last: Gilchrist
Author-Name: Robert Rigby
Author-X-Name-First: Robert
Author-X-Name-Last: Rigby
Author-Name: John Sedgwick
Author-X-Name-First: John
Author-X-Name-Last: Sedgwick
Author-Name: Dimitrios Stasinopoulos
Author-X-Name-First: Dimitrios
Author-X-Name-Last: Stasinopoulos
Title: Modelling skewness and kurtosis with the BCPE density in GAMLSS
Abstract:
This paper illustrates the power of modern statistical modelling in
understanding processes characterised by data that are skewed and have
heavy tails. Our particular substantive problem concerns film box-office
revenues. We are able to show that traditional modelling techniques based
on the Pareto--Lévy--Mandelbrot distribution led to what is actually a
poorly supported conclusion that these data have infinite variance. This
in turn led to the dominant paradigm of the movie business that
‘nobody knows anything’ and hence that box-office revenues
cannot be predicted. Using the Box--Cox power exponential distribution
within the generalized additive models for location, scale and shape
framework, we are able to model box-office revenues and develop
probabilistic statements about revenues.
Journal: Journal of Applied Statistics
Pages: 1279-1293
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644530
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644530
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1279-1293
Template-Type: ReDIF-Article 1.0
Author-Name: Ioannis Vrontos
Author-X-Name-First: Ioannis
Author-X-Name-Last: Vrontos
Title: Evidence for hedge fund predictability from a multivariate Student's t full-factor GARCH model
Abstract:
Extending previous work on hedge fund return predictability, this paper
introduces the idea of modelling the conditional distribution of hedge
fund returns using Student's t full-factor multivariate
GARCH models. This class of models takes into account the stylized facts
of hedge fund return series, that is, heteroskedasticity, fat tails and
deviations from normality. For the proposed class of multivariate
predictive regression models, we derive analytic expressions for the score
and the Hessian matrix, which can be used within classical and Bayesian
inferential procedures to estimate the model parameters, as well as to
compare different predictive regression models. We propose a Bayesian
approach to model comparison which provides posterior probabilities for
various predictive models that can be used for model averaging. Our
empirical application indicates that accounting for fat tails and
time-varying covariances/correlations provides a more appropriate
modelling approach of the underlying dynamics of financial series and
improves our ability to predict hedge fund returns.
Journal: Journal of Applied Statistics
Pages: 1295-1321
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.644771
File-URL: http://hdl.handle.net/10.1080/02664763.2011.644771
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1295-1321
Template-Type: ReDIF-Article 1.0
Author-Name: S. Lalitha
Author-X-Name-First: S.
Author-X-Name-Last: Lalitha
Author-Name: Nirpeksh Kumar
Author-X-Name-First: Nirpeksh
Author-X-Name-Last: Kumar
Title: Multiple outlier test for upper outliers in an exponential sample
Abstract:
In this paper, a test statistic is proposed for testing upper outliers,
under a slippage alternative, in an exponential sample. No tables of
critical values are required as they can be calculated easily for any
sample size. A simulation study is also carried out to compare the
performance of the test with the maximum likelihood ratio test and other
existing tests.
Journal: Journal of Applied Statistics
Pages: 1323-1330
Issue: 6
Volume: 39
Year: 2012
Month: 11
X-DOI: 10.1080/02664763.2011.645158
File-URL: http://hdl.handle.net/10.1080/02664763.2011.645158
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1323-1330
Template-Type: ReDIF-Article 1.0
Author-Name: Jean-Marc Bardet
Author-X-Name-First: Jean-Marc
Author-X-Name-Last: Bardet
Author-Name: Imen Kammoun
Author-X-Name-First: Imen
Author-X-Name-Last: Kammoun
Author-Name: Veronique Billat
Author-X-Name-First: Veronique
Author-X-Name-Last: Billat
Title: A new process for modeling heartbeat signals during exhaustive run with an adaptive estimator of its fractal parameters
Abstract:
This paper is devoted to a new study of the fractal behavior of
heartbeats during a marathon. Such a case is interesting since it allows
the examination of heart behavior during a very long exercise in order to
reach reliable conclusions on the long-term properties of heartbeats.
Three points of this study can be highlighted. First, the whole race
heartbeats of each runner are automatically divided into several stages
where the signal is nearly stationary and these stages are detected with
an adaptive change points detection method. Secondly, a new process called
the locally fractional Gaussian noise (LFGN) is proposed to fit such data.
Finally, a wavelet-based method using a specific mother wavelet provides
an adaptive procedure for estimating low frequency and high frequency
fractal parameters as well as the corresponding frequency bandwidths. Such
an estimator is theoretically proved to converge in the case of LFGNs, and
simulations confirm this consistency. Moreover, an adaptive chi-squared
goodness-of-fit test is also built, using this wavelet-based estimator.
The application of this method to marathon heartbeat series indicates that
the LFGN fits the data well at each stage and that the low-frequency
fractal parameter increases during the race. Detecting an excessively
large low-frequency fractal parameter during the race could help prevent
the heart failures that occur all too frequently during marathons.
Journal: Journal of Applied Statistics
Pages: 1331-1351
Issue: 6
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.646962
File-URL: http://hdl.handle.net/10.1080/02664763.2011.646962
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1331-1351
Template-Type: ReDIF-Article 1.0
Author-Name: Xu-Qing Liu
Author-X-Name-First: Xu-Qing
Author-X-Name-Last: Liu
Author-Name: Bo Li
Author-X-Name-First: Bo
Author-X-Name-Last: Li
Title: General linear estimators under the prediction error sum of squares criterion in a linear regression model
Abstract:
In this paper, the notion of the general linear estimator and its
modified version are introduced using the singular value decomposition
theorem in the linear regression model y = Xβ + e, to improve some
classical linear
estimators. The optimal selections of the biasing parameters involved are
theoretically given under the prediction error sum of squares criterion. A
numerical example and a simulation study are finally conducted to
illustrate the superiority of the proposed estimators.
Journal: Journal of Applied Statistics
Pages: 1353-1361
Issue: 6
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.646963
File-URL: http://hdl.handle.net/10.1080/02664763.2011.646963
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1353-1361
Template-Type: ReDIF-Article 1.0
Author-Name: Christian M. Hafner
Author-X-Name-First: Christian M.
Author-X-Name-Last: Hafner
Title: Cross-correlating wavelet coefficients with applications to high-frequency financial time series
Abstract:
This paper uses a new concept in wavelet analysis to explore financial
transaction data, including returns, durations between transactions, and
trading volume. The concept is based on a decomposition of the Allan
covariance of two series into cross-covariances of wavelet coefficients,
which allows a natural interpretation of cross-correlations in terms of
frequencies. At high frequencies, we find significant spillover from
durations to volume and a strong contemporaneous relation between
durations and returns, whereas a strong causality between volume and
volatility exists at various frequencies.
Journal: Journal of Applied Statistics
Pages: 1363-1379
Issue: 6
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.649716
File-URL: http://hdl.handle.net/10.1080/02664763.2011.649716
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1363-1379
Template-Type: ReDIF-Article 1.0
Author-Name: Eric J. Beh
Author-X-Name-First: Eric J.
Author-X-Name-Last: Beh
Title: Exploratory multivariate analysis by example using R
Journal: Journal of Applied Statistics
Pages: 1381-1382
Issue: 6
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.657409
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657409
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1381-1382
Template-Type: ReDIF-Article 1.0
Author-Name: Georgi N. Boshnakov
Author-X-Name-First: Georgi N.
Author-X-Name-Last: Boshnakov
Title: Using R for data management, statistical analysis, and graphics
Journal: Journal of Applied Statistics
Pages: 1382-1383
Issue: 6
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.657412
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657412
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1382-1383
Template-Type: ReDIF-Article 1.0
Author-Name: Eugenia Stoimenova
Author-X-Name-First: Eugenia
Author-X-Name-Last: Stoimenova
Title: Robust nonparametric statistical methods
Journal: Journal of Applied Statistics
Pages: 1383-1384
Issue: 6
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.657414
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657414
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1383-1384
Template-Type: ReDIF-Article 1.0
Author-Name: Eugenia Stoimenova
Author-X-Name-First: Eugenia
Author-X-Name-Last: Stoimenova
Title: Nonparametric statistical inference
Journal: Journal of Applied Statistics
Pages: 1384-1385
Issue: 6
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.657415
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657415
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:6:p:1384-1385
Template-Type: ReDIF-Article 1.0
Author-Name: Shu-Fu Kuo
Author-X-Name-First: Shu-Fu
Author-X-Name-Last: Kuo
Author-Name: Yu-Shan Shih
Author-X-Name-First: Yu-Shan
Author-X-Name-Last: Shih
Title: Variable selection for functional density trees
Abstract:
In this paper, the exhaustive search principle used in functional trees
for classifying densities is shown to favour variables with more split
points. A new variable selection scheme is proposed to correct this bias.
The Pearson chi-squared tests for associated two-way contingency tables
are used to select the variables. Through simulation, we show that the new
method can control bias and is more powerful in selecting split variables.
Journal: Journal of Applied Statistics
Pages: 1387-1395
Issue: 7
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.649717
File-URL: http://hdl.handle.net/10.1080/02664763.2011.649717
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1387-1395
Template-Type: ReDIF-Article 1.0
Author-Name: Jin-Hong Park
Author-X-Name-First: Jin-Hong
Author-X-Name-Last: Park
Title: Nonparametric approach to intervention time series modeling
Abstract:
Time series are often affected by interventions such as strikes,
earthquakes, or policy changes. In the current paper, we build a practical
nonparametric intervention model using the central mean subspace in time
series. We estimate the central mean subspace for time series taking into
account known interventions by using the Nadaraya--Watson kernel
estimator. We use the modified Bayesian information criterion to estimate
the unknown lag and dimension. Finally, we demonstrate that this
nonparametric approach for intervened time series performs well in
simulations and in a real data analysis of monthly average oxidant levels.
Journal: Journal of Applied Statistics
Pages: 1397-1408
Issue: 7
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.650684
File-URL: http://hdl.handle.net/10.1080/02664763.2011.650684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1397-1408
Template-Type: ReDIF-Article 1.0
Author-Name: I. Ardoino
Author-X-Name-First: I.
Author-X-Name-Last: Ardoino
Author-Name: E. M. Biganzoli
Author-X-Name-First: E. M.
Author-X-Name-Last: Biganzoli
Author-Name: C. Bajdik
Author-X-Name-First: C.
Author-X-Name-Last: Bajdik
Author-Name: P. J. Lisboa
Author-X-Name-First: P. J.
Author-X-Name-Last: Lisboa
Author-Name: P. Boracchi
Author-X-Name-First: P.
Author-X-Name-Last: Boracchi
Author-Name: F. Ambrogi
Author-X-Name-First: F.
Author-X-Name-Last: Ambrogi
Title: Flexible parametric modelling of the hazard function in breast cancer studies
Abstract:
In cancer research, study of the hazard function provides useful insights
into disease dynamics, as it describes the way in which the (conditional)
probability of death changes with time. The widely utilized Cox
proportional hazard model uses a stepwise nonparametric estimator for the
baseline hazard function, and therefore has limited utility. The use of
parametric models and/or other approaches that enable direct estimation
of the hazard function is often invoked. A recent work by Cox et
al. [6] has stimulated the use of the flexible parametric model
based on the Generalized Gamma (GG) distribution, supported by the
development of optimization software. The GG distribution allows
estimation of different hazard shapes in a single framework. We use the GG
model to investigate the shape of the hazard function in early breast
cancer patients. The flexible approach based on a piecewise exponential
model and the nonparametric additive hazards model are also considered.
Journal: Journal of Applied Statistics
Pages: 1409-1421
Issue: 7
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.650685
File-URL: http://hdl.handle.net/10.1080/02664763.2011.650685
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1409-1421
Template-Type: ReDIF-Article 1.0
Author-Name: A. Martín Andrés
Author-X-Name-First: A. Martín
Author-X-Name-Last: Andrés
Author-Name: M. Álvarez Hernández
Author-X-Name-First: M. Álvarez
Author-X-Name-Last: Hernández
Author-Name: I. Herranz Tejedor
Author-X-Name-First: I. Herranz
Author-X-Name-Last: Tejedor
Title: Asymptotic two-tailed confidence intervals for the difference of proportions
Abstract:
In order to obtain a two-tailed confidence interval for the difference
between two proportions (independent samples), the current literature on
the subject has proposed a great number of asymptotic methods. This paper
assesses 80 classical asymptotic methods (including the best proposals
made in the literature) and concludes that (1) the best solution consists
of adding 0.5 to all of the data and inverting the test based on the
arcsine transformation; (2) a solution which is a little worse than the
previous one (but much easier and even better when both samples are
balanced) is a modification of the adjusted Wald method proposed by
Agresti and Caffo (usually adding to all of the
data and then applying the classical Wald CI); (3) surprisingly, the
classical score method is among the worst solutions, since it provides
excessively liberal results.
Journal: Journal of Applied Statistics
Pages: 1423-1435
Issue: 7
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.650686
File-URL: http://hdl.handle.net/10.1080/02664763.2011.650686
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1423-1435
Template-Type: ReDIF-Article 1.0
Author-Name: Carol Y. Lin
Author-X-Name-First: Carol Y.
Author-X-Name-Last: Lin
Author-Name: Lance A. Waller
Author-X-Name-First: Lance A.
Author-X-Name-Last: Waller
Author-Name: Robert H. Lyles
Author-X-Name-First: Robert H.
Author-X-Name-Last: Lyles
Title: The likelihood approach for the comparison of medical diagnostic system with multiple binary tests
Abstract:
Detection (diagnosis) techniques play an important role in clinical
medicine. Early detection of diseases could be life-saving, and the
consequences of false positives and false negatives could be costly. Using
a multiple-measurement strategy is a popular way to increase diagnostic
accuracy. In addition to the new diagnostic technology, recent advances in
genomics, proteomics, and other areas have allowed some of these newly
developed individual biomarkers measured by non-invasive and inexpensive
procedures (e.g. samples from serum, urine or stool) to progress from
basic discovery research to assay development. As more tests become
commercially available, there is an increasing interest for clinicians to
request combinations of various non-invasive and inexpensive tests to
increase diagnostic accuracy. Using information regarding individual test
sensitivities and specificities, we propose a likelihood approach to
combine individual test results and to approximate or estimate the
combined sensitivities and specificities of various tests taking into
account the conditional correlations to quantify system performance. To
illustrate this approach, we consider an example using various
combinations of diagnostic tests to detect bladder cancer.
Journal: Journal of Applied Statistics
Pages: 1437-1454
Issue: 7
Volume: 39
Year: 2012
Month: 12
X-DOI: 10.1080/02664763.2011.650688
File-URL: http://hdl.handle.net/10.1080/02664763.2011.650688
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1437-1454
Template-Type: ReDIF-Article 1.0
Author-Name: Margaret R. Donald
Author-X-Name-First: Margaret R.
Author-X-Name-Last: Donald
Author-Name: Chris Strickland
Author-X-Name-First: Chris
Author-X-Name-Last: Strickland
Author-Name: Clair L. Alston
Author-X-Name-First: Clair L.
Author-X-Name-Last: Alston
Author-Name: Rick Young
Author-X-Name-First: Rick
Author-X-Name-Last: Young
Author-Name: Kerrie L. Mengersen
Author-X-Name-First: Kerrie L.
Author-X-Name-Last: Mengersen
Title: Comparison of three-dimensional profiles over time
Abstract:
In this paper, we describe an analysis for data collected on a
three-dimensional spatial lattice with treatments applied at the
horizontal lattice points. Spatial correlation is accounted for using a
conditional autoregressive model. Observations are defined as neighbours
only if they are at the same depth. This allows the corresponding variance
components to vary by depth. We use the Markov chain Monte Carlo method
with block updating, together with Krylov subspace methods, for efficient
estimation of the model. The method is applicable to both regular and
irregular horizontal lattices and hence to data collected at any set of
horizontal sites for a set of depths or heights, for example, water column
or soil profile data. The model for the three-dimensional data is applied
to agricultural trial data for five separate days taken roughly six months
apart in order to determine possible relationships over time. The purpose
of the trial is to determine a form of cropping that leads to less moist
soils in the root zone and beyond. We estimate moisture for each date,
depth and treatment accounting for spatial correlation and determine
relationships of these and other parameters over time.
Journal: Journal of Applied Statistics
Pages: 1455-1474
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.654771
File-URL: http://hdl.handle.net/10.1080/02664763.2012.654771
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1455-1474
Template-Type: ReDIF-Article 1.0
Author-Name: Jianbo Li
Author-X-Name-First: Jianbo
Author-X-Name-Last: Li
Author-Name: Minggao Gu
Author-X-Name-First: Minggao
Author-X-Name-Last: Gu
Author-Name: Tao Hu
Author-X-Name-First: Tao
Author-X-Name-Last: Hu
Title: General partially linear varying-coefficient transformation models for ranking data
Abstract:
In this paper, we propose a class of general partially linear
varying-coefficient transformation models for ranking data. In these
models, the functional coefficients are viewed as nuisance parameters and
approximated by a B-spline smoothing technique. The B-spline
coefficients and regression parameters are estimated by a rank-based
maximum marginal likelihood method. A three-stage Markov chain Monte Carlo
stochastic approximation algorithm based on ranking data is used to
compute estimates and the corresponding variances for all the B-spline
coefficients and regression parameters. Through three simulation studies
and an application to Hong Kong horse racing data, the proposed procedure
is shown to be accurate, stable and practical.
Journal: Journal of Applied Statistics
Pages: 1475-1488
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.658357
File-URL: http://hdl.handle.net/10.1080/02664763.2012.658357
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1475-1488
Template-Type: ReDIF-Article 1.0
Author-Name: Sébastien Li-Thiao-Té
Author-X-Name-First: Sébastien
Author-X-Name-Last: Li-Thiao-Té
Author-Name: Jean-Jacques Daudin
Author-X-Name-First: Jean-Jacques
Author-X-Name-Last: Daudin
Author-Name: Stéphane Robin
Author-X-Name-First: Stéphane
Author-X-Name-Last: Robin
Title: Bayesian model averaging for estimating the number of classes: applications to the total number of species in metagenomics
Abstract:
The species abundance distribution and the total number of species are
fundamental descriptors of the biodiversity of an ecological community.
This paper focuses on situations where large numbers of rare species are
not observed in the data set due to insufficient sampling of the
community, as is the case in metagenomics for the study of microbial
diversity. We use a truncated mixture model for the observations to
explicitly tackle the missing data and propose methods to estimate the
total number of species and, in particular, a Bayesian credibility
interval for this number. We focus on computationally efficient procedures
with variational methods and importance sampling as opposed to Markov
Chain Monte Carlo sampling, and we use Bayesian model averaging as the
number of components of the mixture model is unknown.
Journal: Journal of Applied Statistics
Pages: 1489-1504
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.658358
File-URL: http://hdl.handle.net/10.1080/02664763.2012.658358
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1489-1504
Template-Type: ReDIF-Article 1.0
Author-Name: Shuyi Jiang
Author-X-Name-First: Shuyi
Author-X-Name-Last: Jiang
Title: Survival in the U.S. petroleum refining industry
Abstract:
Of the 324 petroleum refineries operating in the U.S. in 1982, only 149
were still in the hands of their original owners in 2007. Using duration
analysis, this paper explores why refineries change ownership or shut
down. Plants are more likely to ‘survive’ with their
original owners if they are older or larger, but less likely if the owner
is a major integrated firm, or the refinery is a more technologically
complex one. This latter result differs from existing research on the
issue. This paper also presents a split population model to relax the
general assumption of the duration model that all refineries will eventually
close down; the empirical results show that the split population model
converges on a standard hazard model; the log-logistic version fits best.
Finally, a multinomial logit model is estimated to analyze the factors
that influence the refinery plant's choices of staying open, closing, or
changing ownership. Plant size, age and technology usage have positive
impacts on the likelihood that a refinery will stay open, or change
ownership (rather than close down).
Journal: Journal of Applied Statistics
Pages: 1505-1530
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.658359
File-URL: http://hdl.handle.net/10.1080/02664763.2012.658359
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1505-1530
Template-Type: ReDIF-Article 1.0
Author-Name: Jinook Jeong
Author-X-Name-First: Jinook
Author-X-Name-Last: Jeong
Author-Name: Byunguk Kang
Author-X-Name-First: Byunguk
Author-X-Name-Last: Kang
Title: Wild-bootstrapped variance-ratio test for autocorrelation in the presence of heteroskedasticity
Abstract:
The Breusch--Godfrey LM test is one of the most popular tests for
autocorrelation. However, it has been shown that the LM test may be
erroneous when there exist heteroskedastic errors in a regression model.
Recently, remedies have been proposed by Godfrey and Tremayne [9] and Shim
et al. [21]. This paper suggests three wild-bootstrapped
variance-ratio (WB-VR) tests for autocorrelation in the presence of
heteroskedasticity. We show through a Monte Carlo simulation that our
WB-VR tests have better small sample properties and are robust to the
structure of heteroskedasticity.
Journal: Journal of Applied Statistics
Pages: 1531-1542
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.658360
File-URL: http://hdl.handle.net/10.1080/02664763.2012.658360
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1531-1542
Template-Type: ReDIF-Article 1.0
Author-Name: Federico Palacios-González
Author-X-Name-First: Federico
Author-X-Name-Last: Palacios-González
Author-Name: Rosa María García-Fernández
Author-X-Name-First: Rosa María
Author-X-Name-Last: García-Fernández
Title: Interpretation of the coefficient of determination of an ANOVA model as a measure of polarization
Abstract:
In this paper, it is demonstrated that the coefficient of determination of
an ANOVA linear model provides a measure of polarization. Taking as the
starting point the link between polarization and dispersion, we
reformulate the measure of polarization of Zhang and Kanbur using the
decomposition of the variance instead of the decomposition of the Theil
index. We show that the proposed measure is equivalent to the coefficient
of determination of an ANOVA linear model that explains, for example, the
income of the households as a function of any population characteristic
such as education, gender, occupation, etc. This result provides an
alternative way to analyse polarization by sub-population characteristics
and at the same time allows us to compare sub-populations via the
estimated coefficients of the ANOVA model.
Journal: Journal of Applied Statistics
Pages: 1543-1555
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.658361
File-URL: http://hdl.handle.net/10.1080/02664763.2012.658361
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1543-1555
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas T. Longford
Author-X-Name-First: Nicholas T.
Author-X-Name-Last: Longford
Author-Name: Maria Grazia Pittau
Author-X-Name-First: Maria Grazia
Author-X-Name-Last: Pittau
Author-Name: Roberto Zelli
Author-X-Name-First: Roberto
Author-X-Name-Last: Zelli
Author-Name: Riccardo Massari
Author-X-Name-First: Riccardo
Author-X-Name-Last: Massari
Title: Poverty and inequality in European regions
Abstract:
The European Union Statistics on Income and Living Conditions (EU-SILC)
is the main source of information about poverty and economic inequality in
the member states of the European Union. The sample sizes of its annual
national surveys are sufficient for reliable estimation at the national
level but not for inferences at the sub-national level, failing to respond
to a rising demand from policy-makers and local authorities. We provide a
comprehensive map of median income, inequality (Gini coefficient and
Lorenz curve) and poverty (poverty rates) based on the equivalised
household income in the countries in which the EU-SILC is conducted. We
study the distribution of income of households (pro-rated to its members),
not merely its median (or mean), because we regard its dispersion and
frequency of lower extremes (relative poverty) as important
characteristics. The estimation for the regions with small sample sizes is
improved by the small-area methods. The uncertainty of complex nonlinear
statistics is assessed by bootstrap. Household-level sampling weights are
taken into account in both the estimates and the associated bootstrap
standard errors.
Journal: Journal of Applied Statistics
Pages: 1557-1576
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.661705
File-URL: http://hdl.handle.net/10.1080/02664763.2012.661705
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1557-1576
Template-Type: ReDIF-Article 1.0
Author-Name: Alexander A. Correa
Author-X-Name-First: Alexander A.
Author-X-Name-Last: Correa
Author-Name: Pere Grima
Author-X-Name-First: Pere
Author-X-Name-Last: Grima
Author-Name: Xavier Tort-Martorell
Author-X-Name-First: Xavier
Author-X-Name-Last: Tort-Martorell
Title: Experimentation order in factorial designs: new findings
Abstract:
Under some very reasonable hypotheses, it becomes evident that
randomizing the run order of a factorial experiment does not always
neutralize the effect of undesirable factors. Yet, these factors do have
an influence on the response, depending on the order in which the
experiments are conducted. On the other hand, changing the factor levels
is often costly; therefore it is not reasonable to leave the number of
necessary changes to chance. For this reason, run orders that offer
the minimum number of factor level changes and at the same time minimize
the possible influence of undesirable factors on the experimentation have
been sought. Sequences which are known to produce the desired properties
in designs with 8 and 16 experiments can be found in the literature. In
this paper, we provide the best possible sequences for designs with 32
experiments, as well as sequences that offer excellent properties for
designs with 64 and 128 experiments. The method used to find them is based
on a mixture of algorithmic searches and an augmentation of smaller
designs.
Journal: Journal of Applied Statistics
Pages: 1577-1591
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.661706
File-URL: http://hdl.handle.net/10.1080/02664763.2012.661706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1577-1591
Template-Type: ReDIF-Article 1.0
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Author-Name: Gregory E. Wilding
Author-X-Name-First: Gregory E.
Author-X-Name-Last: Wilding
Title: Maintaining the exchangeability assumption for a two-group permutation test in the non-randomized setting
Abstract:
In this note, we develop a new two-group bootstrap-permutation test that
utilizes the tail-extrapolated quantile function estimator for the
bootstrap component. This test is an extension of the standard two-group
permutation test, that through its construction is defined to meet the
exchangeability assumption, and thus it guarantees that the type I error
is appropriately bounded by definition. This methodology is particularly
useful in the non-randomized two-group setting for which the
exchangeability assumption for the traditional two-group permutation test
is untestable. We develop some theoretical results for the new test,
followed by a simulation study and an example.
Journal: Journal of Applied Statistics
Pages: 1593-1603
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.661707
File-URL: http://hdl.handle.net/10.1080/02664763.2012.661707
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1593-1603
Template-Type: ReDIF-Article 1.0
Author-Name: Young Joo Yoon
Author-X-Name-First: Young Joo
Author-X-Name-Last: Yoon
Author-Name: Cheolwoo Park
Author-X-Name-First: Cheolwoo
Author-X-Name-Last: Park
Author-Name: Erik Hofmeister
Author-X-Name-First: Erik
Author-X-Name-Last: Hofmeister
Author-Name: Sangwook Kang
Author-X-Name-First: Sangwook
Author-X-Name-Last: Kang
Title: Group variable selection in cardiopulmonary cerebral resuscitation data for veterinary patients
Abstract:
Cardiopulmonary cerebral resuscitation (CPCR) is a procedure to restore
spontaneous circulation in patients with cardiopulmonary arrest (CPA).
While animals with CPA generally have a lower success rate of CPCR than
people do, CPCR studies in veterinary patients have been limited. In this
paper, we construct a model for predicting success or failure of CPCR, and
identifying and evaluating factors that affect the success of CPCR in
veterinary patients. Due to reparametrization using multiple dummy
variables or close proximity in nature, many variables in the data form
groups, and thus a desirable method should take this grouping feature into
account in variable selection. To accomplish these goals, we propose an
adaptive group bridge method for a logistic regression model. The
performance of the proposed method is evaluated under different simulated
setups and compared with several other regression methods. Using the
logistic group bridge model, we analyze data from a CPCR study for
veterinary patients and discuss their implications on the practice of
veterinary medicine.
Journal: Journal of Applied Statistics
Pages: 1605-1621
Issue: 7
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.661929
File-URL: http://hdl.handle.net/10.1080/02664763.2012.661929
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:7:p:1605-1621
Template-Type: ReDIF-Article 1.0
Author-Name: Thaung Lwin
Author-X-Name-First: Thaung
Author-X-Name-Last: Lwin
Title: Modelling of connected processes
Abstract:
The problem of comparing, contrasting and combining information from
different sets of data is an enduring one in many practical applications
of statistics. A specific problem of combining information from different
sources arose in integrating information from three different sets of data
generated by three different sampling campaigns at the input stage as well
as at the output stage of a grey-water treatment process. For each stage,
a common process trend function needs to be estimated to describe the
input and output material process behaviours. Once the common input and
output process models are established, it is required to estimate the
efficiency of the grey-water treatment method. A synthesized tool for
modelling different sets of process data is created by assembling and
organizing a number of existing techniques: (i) a mixed model of fixed and
random effects, extended to allow for a nonlinear fixed effect, (ii)
variogram modelling, a geostatistical technique, (iii) a weighted least
squares regression embedded in an iterative maximum-likelihood technique
to handle linear/nonlinear fixed and random effects and (iv) a formulation
of a transfer-function model for the input and output processes together
with a corresponding nonlinear maximum-likelihood method for estimation of
a transfer function. The synthesized tool is demonstrated, in a new case
study, to contrast and combine information from connected process models
and to determine the change in one quality characteristic, namely pH, of
the input and output materials of a grey-water filtering process.
Journal: Journal of Applied Statistics
Pages: 1623-1641
Issue: 8
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.663345
File-URL: http://hdl.handle.net/10.1080/02664763.2012.663345
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1623-1641
Template-Type: ReDIF-Article 1.0
Author-Name: Tatjana Pavlenko
Author-X-Name-First: Tatjana
Author-X-Name-Last: Pavlenko
Author-Name: Anders Björkström
Author-X-Name-First: Anders
Author-X-Name-Last: Björkström
Author-Name: Annika Tillander
Author-X-Name-First: Annika
Author-X-Name-Last: Tillander
Title: Covariance structure approximation via gLasso in high-dimensional supervised classification
Abstract:
Recent work has shown that the Lasso-based regularization is very useful
for estimating the high-dimensional inverse covariance matrix. A
particularly useful scheme is based on penalizing the ℓ1
norm of the off-diagonal elements to encourage sparsity. We embed this
type of regularization into high-dimensional classification. A two-stage
estimation procedure is proposed which first recovers structural zeros of
the inverse covariance matrix and then enforces block sparsity by moving
non-zeros closer to the main diagonal. We show that the block-diagonal
approximation of the inverse covariance matrix leads to an additive
classifier, and demonstrate that accounting for the structure can yield
better performance accuracy. Effect of the block size on classification is
explored, and a class of asymptotically equivalent structure
approximations in a high-dimensional setting is specified. We suggest a
variable selection at the block level and investigate properties of this
procedure in growing dimension asymptotics. We present a consistency
result on the feature selection procedure, establish asymptotic lower and
upper bounds for the fraction of separative blocks and specify constraints
under which the reliable classification with block-wise feature selection
can be performed. The relevance and benefits of the proposed approach are
illustrated on both simulated and real data.
Journal: Journal of Applied Statistics
Pages: 1643-1666
Issue: 8
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.663346
File-URL: http://hdl.handle.net/10.1080/02664763.2012.663346
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1643-1666
Template-Type: ReDIF-Article 1.0
Author-Name: Yueqin Zhao
Author-X-Name-First: Yueqin
Author-X-Name-Last: Zhao
Author-Name: Dayanand N. Naik
Author-X-Name-First: Dayanand N.
Author-X-Name-Last: Naik
Title: Hypothesis testing with Rao's quadratic entropy and its application to Dinosaur biodiversity
Abstract:
Entropy indices, such as Shannon entropy and Gini-Simpson index, have
been used for analysing biological diversities. However, these entropy
indices are based on abundance of the species only and they do not take
differences between the species into consideration. Rao's quadratic
entropy has found many applications in different fields including ecology.
Further, the quadratic entropy (QE) index is the only ecological diversity
index that reflects both the differences and abundances of the species.
The problem of testing the hypothesis of equality of QEs is formulated
as a problem of comparing practical equivalence intervals. Simulation
experiments are used to compare various equivalence intervals. Previously
analyzed dinosaur data are used to illustrate the methods for determining
biodiversity.
Journal: Journal of Applied Statistics
Pages: 1667-1680
Issue: 8
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.663347
File-URL: http://hdl.handle.net/10.1080/02664763.2012.663347
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1667-1680
Template-Type: ReDIF-Article 1.0
Author-Name: Anna Klimova
Author-X-Name-First: Anna
Author-X-Name-Last: Klimova
Author-Name: Tamás Rudas
Author-X-Name-First: Tamás
Author-X-Name-Last: Rudas
Title: Coordinate-free analysis of trends in British social mobility
Abstract:
This paper is intended to make a contribution to the ongoing debate about
declining social mobility in Great Britain by analyzing mobility tables
based on data from the 1991 British Household Panel Survey and the 2005
General Household Survey. The models proposed here generalize Hauser's
levels models and allow for a semi-parametric analysis of change in social
mobility. The cell frequencies are assumed to be equal to the product of
three effects: the effect of the father's position for the given year, the
effect of the son's position for the given year, and the mobility effect
related to the difference between the father's and the son's positions. A
generalization of the iterative proportional fitting procedure is proposed
and applied to computing the maximum likelihood estimates of the cell
frequencies. The standard errors of the estimated parameters are computed
under the product-multinomial sampling assumption. The results indicate
opposing trends of mobility between the two timepoints. Moves of fewer
steps up or down in society became less likely, while moves of more steps
became somewhat more likely.
Journal: Journal of Applied Statistics
Pages: 1681-1691
Issue: 8
Volume: 39
Year: 2012
Month: 1
X-DOI: 10.1080/02664763.2012.663348
File-URL: http://hdl.handle.net/10.1080/02664763.2012.663348
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1681-1691
Template-Type: ReDIF-Article 1.0
Author-Name: Valentina Mameli
Author-X-Name-First: Valentina
Author-X-Name-Last: Mameli
Author-Name: Monica Musio
Author-X-Name-First: Monica
Author-X-Name-Last: Musio
Author-Name: Erik Sauleau
Author-X-Name-First: Erik
Author-X-Name-Last: Sauleau
Author-Name: Annibale Biggeri
Author-X-Name-First: Annibale
Author-X-Name-Last: Biggeri
Title: Large sample confidence intervals for the skewness parameter of the skew-normal distribution based on Fisher's transformation
Abstract:
The skew-normal model is a class of distributions that extends the
Gaussian family by including a skewness parameter. This model presents
some inferential problems linked to the estimation of the skewness
parameter. In particular, its maximum likelihood estimator can be infinite,
especially for moderate sample sizes, and it is not clear how to calculate
confidence intervals for this parameter. In this work, we show how these
inferential problems can be solved if we are interested in the
distribution of extreme statistics of two random variables with joint
normal distribution. Such situations are not uncommon in applications,
especially in medical and environmental contexts, where it can be relevant
to estimate the distribution of extreme statistics. A theoretical result,
found by Loperfido [7], proves that such extreme statistics have a
skew-normal distribution with skewness parameter that can be expressed as
a function of the correlation coefficient between the two initial
variables. It is then possible, using some theoretical results involving
the correlation coefficient, to find approximate confidence intervals for
the parameter of skewness. These theoretical intervals are then compared
with parametric bootstrap intervals by means of a simulation study. Two
applications are given using real data.
Journal: Journal of Applied Statistics
Pages: 1693-1702
Issue: 8
Volume: 39
Year: 2012
Month: 2
X-DOI: 10.1080/02664763.2012.668177
File-URL: http://hdl.handle.net/10.1080/02664763.2012.668177
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1693-1702
Template-Type: ReDIF-Article 1.0
Author-Name: Wafaa Benyelles
Author-X-Name-First: Wafaa
Author-X-Name-Last: Benyelles
Author-Name: Tahar Mourid
Author-X-Name-First: Tahar
Author-X-Name-Last: Mourid
Title: On a minimum distance estimate of the period in functional autoregressive processes
Abstract:
We consider a continuous time random process with functional
autoregressive representation. We state statistical results on a mean
functional estimator that determines a minimum distance estimator of the
period, giving consistency and a limit law established in Mourid and
Benyelles [13]. We then discuss their performance in numerical simulations
and on real data, analyzing the cycle of a climatic phenomenon.
Journal: Journal of Applied Statistics
Pages: 1703-1718
Issue: 8
Volume: 39
Year: 2012
Month: 2
X-DOI: 10.1080/02664763.2012.668178
File-URL: http://hdl.handle.net/10.1080/02664763.2012.668178
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1703-1718
Template-Type: ReDIF-Article 1.0
Author-Name: Lan Zhu
Author-X-Name-First: Lan
Author-X-Name-Last: Zhu
Author-Name: Su Chen
Author-X-Name-First: Su
Author-X-Name-Last: Chen
Author-Name: Zhuoxin Jiang
Author-X-Name-First: Zhuoxin
Author-X-Name-Last: Jiang
Author-Name: Zhiwu Zhang
Author-X-Name-First: Zhiwu
Author-X-Name-Last: Zhang
Author-Name: Hung-Chih Ku
Author-X-Name-First: Hung-Chih
Author-X-Name-Last: Ku
Author-Name: Xuesong Li
Author-X-Name-First: Xuesong
Author-X-Name-Last: Li
Author-Name: Melinda McCann
Author-X-Name-First: Melinda
Author-X-Name-Last: McCann
Author-Name: Steve Harris
Author-X-Name-First: Steve
Author-X-Name-Last: Harris
Author-Name: George Lust
Author-X-Name-First: George
Author-X-Name-Last: Lust
Author-Name: Paul Jones
Author-X-Name-First: Paul
Author-X-Name-Last: Jones
Author-Name: Rory Todhunter
Author-X-Name-First: Rory
Author-X-Name-Last: Todhunter
Title: Identification of quantitative trait loci for canine hip dysplasia by two sequential multipoint linkage analyses
Abstract:
Canine hip dysplasia (CHD) is characterized by hip laxity and subluxation
that can lead to hip osteoarthritis. Studies have shown the involvement of
multiple genetic regions in the expression of CHD. Although we have
associated some variants in the region of fibrillin 2 with CHD in a subset
of dogs, no major disease-associated gene has been identified. The focus
of this study is to identify quantitative trait loci (QTL) associated with
CHD. Two sequential multipoint linkage analyses based on a reversible jump
Markov chain Monte Carlo approach were applied on a cross-breed pedigree
of 366 dogs. Hip radiographic trait (Norberg Angle, NA) on both hips of
each dog was tested for linkage to 21,455 single nucleotide polymorphisms
across 39 chromosomes. Putative QTL for the NA was found on 11 chromosomes
(1, 2, 3, 4, 7, 14, 19, 21, 32, 36, and 39). Identification of genes in
the QTL region(s) can assist in identification of the aberrant genes and
biochemical pathways involving hip dysplasia in both dogs and humans.
Journal: Journal of Applied Statistics
Pages: 1719-1731
Issue: 8
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2012.673121
File-URL: http://hdl.handle.net/10.1080/02664763.2012.673121
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1719-1731
Template-Type: ReDIF-Article 1.0
Author-Name: Huiyun Wu
Author-X-Name-First: Huiyun
Author-X-Name-Last: Wu
Author-Name: Qingxia Chen
Author-X-Name-First: Qingxia
Author-X-Name-Last: Chen
Author-Name: Lorraine B. Ware
Author-X-Name-First: Lorraine B.
Author-X-Name-Last: Ware
Author-Name: Tatsuki Koyama
Author-X-Name-First: Tatsuki
Author-X-Name-Last: Koyama
Title: A Bayesian approach for generalized linear models with explanatory biomarker measurement variables subject to detection limit: an application to acute lung injury
Abstract:
Biomarkers have the potential to improve our understanding of disease
diagnosis and prognosis. Biomarker levels that fall below the assay
detection limits (DLs), however, compromise the application of biomarkers
in research and practice. Most existing methods to handle non-detects
focus on a scenario in which the response variable is subject to the DL;
only a few methods consider explanatory variables when dealing with DLs.
We propose a Bayesian approach for generalized linear models with
explanatory variables subject to lower, upper, or interval DLs. In
simulation studies, we compared the proposed Bayesian approach to four
commonly used methods in a logistic regression model with explanatory
variable measurements subject to the DL. We also applied the Bayesian
approach and other four methods in a real study, in which a panel of
cytokine biomarkers was studied for their association with acute lung
injury (ALI). We found that IL8 was associated with a moderate increase in
risk for ALI in the model based on the proposed Bayesian approach.
Journal: Journal of Applied Statistics
Pages: 1733-1747
Issue: 8
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2012.681362
File-URL: http://hdl.handle.net/10.1080/02664763.2012.681362
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1733-1747
Template-Type: ReDIF-Article 1.0
Author-Name: Júlia Teles
Author-X-Name-First: Júlia
Author-X-Name-Last: Teles
Title: Concordance coefficients to measure the agreement among several sets of ranks
Abstract:
In this paper, two measures of agreement among several sets of ranks,
Kendall's concordance coefficient and top-down concordance coefficient,
are reviewed. In order to illustrate the utility of these measures, two
examples, in the fields of health and sports, are presented. A Monte Carlo
simulation study was carried out to compare the performance of Kendall's
and top-down concordance coefficients in detecting several types and
magnitudes of agreements. The data generation scheme was developed in
order to induce an agreement with different intensities among
m (m>2) sets of ranks in non-directional
and directional rank agreement scenarios. The performance of each
coefficient was estimated by the proportion of rejected null hypotheses,
assessed at the 5% significance level, when testing whether the underlying
population concordance coefficient is sufficiently greater than zero. For
the directional rank agreement scenario, the top-down concordance
coefficient achieved a higher percentage of significant concordances than
Kendall's concordance coefficient. Especially when the degree of agreement
was small, the results
of the simulation study pointed to the advantage of using a weighted rank
concordance, namely the top-down concordance coefficient, simultaneously
with Kendall's concordance coefficient, enabling the detection of
agreement (in a top-down sense) in situations not detected by Kendall's
concordance coefficient.
Journal: Journal of Applied Statistics
Pages: 1749-1764
Issue: 8
Volume: 39
Year: 2012
Month: 3
X-DOI: 10.1080/02664763.2012.681460
File-URL: http://hdl.handle.net/10.1080/02664763.2012.681460
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1749-1764
Template-Type: ReDIF-Article 1.0
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Author-Name: Zangin Zeebari
Author-X-Name-First: Zangin
Author-X-Name-Last: Zeebari
Title: Median regression for SUR models with the same explanatory variables in each equation
Abstract:
In this paper we introduce an interesting feature of the generalized
least absolute deviations method for seemingly unrelated regression
equations (SURE) models. Unlike generalized least-squares parameter
estimates of SURE models, which collapse to the ordinary least-squares
estimates of the individual equations when the same regressors are common
to all equations, the estimates of the
proposed methodology are not identical to the least absolute deviations
estimations of the individual equations. This is important since contrary
to the least-squares methods, one can take advantage of efficiency gain
due to cross-equation correlations even if the system includes the same
regressors in each equation.
Journal: Journal of Applied Statistics
Pages: 1765-1779
Issue: 8
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.682566
File-URL: http://hdl.handle.net/10.1080/02664763.2012.682566
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1765-1779
Template-Type: ReDIF-Article 1.0
Author-Name: Nairanjana Dasgupta
Author-X-Name-First: Nairanjana
Author-X-Name-Last: Dasgupta
Author-Name: Monte J. Shaffer
Author-X-Name-First: Monte J.
Author-X-Name-Last: Shaffer
Title: Many-to-one comparison of nonlinear growth curves for Washington's Red Delicious apple
Abstract:
In this article, we are interested in comparing growth curves for the Red
Delicious apple in several locations to that of a reference site. Although
such multiple comparisons are common for linear models, statistical
techniques for nonlinear models are not prolific. We theoretically derive
a test statistic, considering the issues of sample size and design points.
Under equal sample sizes and same design points, our test statistic is
based on the maximum of an equi-correlated multivariate chi-square
distribution. Under unequal sample sizes and design points, we derive a
general correlation structure, and then utilize the multivariate normal
distribution to numerically compute critical points for the maximum of the
multivariate chi-square. We apply this statistical technique to compare
the growth of Red Delicious apples at six locations to a reference site in
the state of Washington in 2009. Finally, we perform simulations to verify
the performance of our proposed procedure for Type I error and marginal
power. Our proposed method performs well in regard to both.
Journal: Journal of Applied Statistics
Pages: 1781-1795
Issue: 8
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.683168
File-URL: http://hdl.handle.net/10.1080/02664763.2012.683168
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1781-1795
Template-Type: ReDIF-Article 1.0
Author-Name: Wenceslao Gonzalez-Manteiga
Author-X-Name-First: Wenceslao
Author-X-Name-Last: Gonzalez-Manteiga
Author-Name: Guillermo Henry
Author-X-Name-First: Guillermo
Author-X-Name-Last: Henry
Author-Name: Daniela Rodriguez
Author-X-Name-First: Daniela
Author-X-Name-Last: Rodriguez
Title: Partly linear models on Riemannian manifolds
Abstract:
In partly linear models, the dependence of the response
y on (x^T, t) is modeled through the relationship
y = x^T β + g(t) + ϵ, where
ϵ is independent of (x^T, t).
We are interested in developing an estimation procedure that allows us to
combine the flexibility of the partly linear models, studied by several
authors, but including some variables that belong to a non-Euclidean
space. The motivating application of this paper deals with the explanation
of the atmospheric SO2 pollution incidents using these models
when some of the predictive variables belong in a cylinder. In this paper,
the estimators of β and g are
constructed when the explanatory variables t take values
on a Riemannian manifold and the asymptotic properties of the proposed
estimators are obtained under suitable conditions. We illustrate the use
of this estimation approach using an environmental data set and we explore
the performance of the estimators through a simulation study.
Journal: Journal of Applied Statistics
Pages: 1797-1809
Issue: 8
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.683169
File-URL: http://hdl.handle.net/10.1080/02664763.2012.683169
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1797-1809
Template-Type: ReDIF-Article 1.0
Author-Name: Yu-Jau Lin
Author-X-Name-First: Yu-Jau
Author-X-Name-Last: Lin
Author-Name: Y. L. Lio
Author-X-Name-First: Y. L.
Author-X-Name-Last: Lio
Title: Bayesian inference under progressive type-I interval censoring
Abstract:
Bayesian estimation of population parameters under progressive type-I
interval censoring is studied via Markov chain Monte Carlo (MCMC)
simulation. Two competing statistical models, the generalized exponential
and Weibull distributions, are studied for illustration by modeling a real
data set of 112 patients with plasma cell myeloma. For model selection, a
novel Bayesian procedure involving a mixture model is proposed; the mixing
proportion is then estimated through MCMC and used as the model selection
criterion.
Journal: Journal of Applied Statistics
Pages: 1811-1824
Issue: 8
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.683170
File-URL: http://hdl.handle.net/10.1080/02664763.2012.683170
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1811-1824
Template-Type: ReDIF-Article 1.0
Author-Name: Vasileios Maroulas
Author-X-Name-First: Vasileios
Author-X-Name-Last: Maroulas
Title: Error analysis of stochastic flight trajectory prediction models
Abstract:
This paper focuses on the analysis of errors between a flight trajectory
prediction model and flight data. A novel stochastic prediction flight
model is compared with the popular fly-by and fly-over turn models. The
propagated error is measured using either spatial coordinates or angles.
Depending on the case, the distribution of error is estimated and
confidence bounds for the linear and directional mean are provided for all
three stochastic flight models.
Journal: Journal of Applied Statistics
Pages: 1825-1841
Issue: 8
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.683171
File-URL: http://hdl.handle.net/10.1080/02664763.2012.683171
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1825-1841
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Graphics for statistics and data analysis with R
Journal: Journal of Applied Statistics
Pages: 1843-1844
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.679355
File-URL: http://hdl.handle.net/10.1080/02664763.2012.679355
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1843-1844
Template-Type: ReDIF-Article 1.0
Author-Name: Ivana Holloway
Author-X-Name-First: Ivana
Author-X-Name-Last: Holloway
Title: Design and analysis of quality of life studies in clinical trials
Journal: Journal of Applied Statistics
Pages: 1844-1845
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.657416
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657416
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1844-1845
Template-Type: ReDIF-Article 1.0
Author-Name: Chris Beeley
Author-X-Name-First: Chris
Author-X-Name-Last: Beeley
Title: Applied Bayesian hierarchical methods
Journal: Journal of Applied Statistics
Pages: 1845-1845
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.657788
File-URL: http://hdl.handle.net/10.1080/02664763.2012.657788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1845-1845
Template-Type: ReDIF-Article 1.0
Author-Name: John Paul Gosling
Author-X-Name-First: John Paul
Author-X-Name-Last: Gosling
Title: Bayesian analysis made simple: An excel GUI for WinBUGS
Journal: Journal of Applied Statistics
Pages: 1845-1846
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.679822
File-URL: http://hdl.handle.net/10.1080/02664763.2012.679822
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1845-1846
Template-Type: ReDIF-Article 1.0
Author-Name: Zhengfeng Guo
Author-X-Name-First: Zhengfeng
Author-X-Name-Last: Guo
Title: Linear model methodology
Journal: Journal of Applied Statistics
Pages: 1846-1847
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.679823
File-URL: http://hdl.handle.net/10.1080/02664763.2012.679823
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1846-1847
Template-Type: ReDIF-Article 1.0
Author-Name: Søren Feodor Nielsen
Author-X-Name-First: Søren Feodor
Author-X-Name-Last: Nielsen
Title: An elementary introduction to mathematical finance
Journal: Journal of Applied Statistics
Pages: 1847-1848
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.680688
File-URL: http://hdl.handle.net/10.1080/02664763.2012.680688
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1847-1848
Template-Type: ReDIF-Article 1.0
Author-Name: Derek S. Young
Author-X-Name-First: Derek S.
Author-X-Name-Last: Young
Title: Optimal experimental design with R
Journal: Journal of Applied Statistics
Pages: 1848-1849
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.680689
File-URL: http://hdl.handle.net/10.1080/02664763.2012.680689
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1848-1849
Template-Type: ReDIF-Article 1.0
Author-Name: A. M. Mosammam
Author-X-Name-First: A. M.
Author-X-Name-Last: Mosammam
Title: The Oxford handbook of nonlinear filtering
Journal: Journal of Applied Statistics
Pages: 1849-1850
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.680690
File-URL: http://hdl.handle.net/10.1080/02664763.2012.680690
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1849-1850
Template-Type: ReDIF-Article 1.0
Author-Name: Şebnem Er
Author-X-Name-First: Şebnem
Author-X-Name-Last: Er
Title: Multivariate generalized linear models using R
Journal: Journal of Applied Statistics
Pages: 1851-1851
Issue: 8
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.681563
File-URL: http://hdl.handle.net/10.1080/02664763.2012.681563
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:8:p:1851-1851
Template-Type: ReDIF-Article 1.0
Author-Name: Francesca Martella
Author-X-Name-First: Francesca
Author-X-Name-Last: Martella
Author-Name: Maurizio Vichi
Author-X-Name-First: Maurizio
Author-X-Name-Last: Vichi
Title: Clustering microarray data using model-based double K-means
Abstract:
Microarray technology allows the measurement of expression levels of
thousands of genes simultaneously. The dimension and complexity of gene
expression data obtained by microarrays create challenging data analysis
and management problems, ranging from the analysis of images produced by
microarray experiments to the biological interpretation of results. Therefore,
statistical and computational approaches are beginning to assume a
substantial role within molecular biology. We consider the
problem of simultaneously clustering genes and tissue samples (in general
conditions) of a microarray data set. This can be useful for revealing
groups of genes involved in the same molecular process as well as groups
of conditions where this process takes place. The need to find a subset
of genes and tissue samples defining a homogeneous block has led to the
application of double clustering techniques to gene
expression data. Here, we focus on an extension of standard
K-means to simultaneously cluster observations and
features of a data matrix, namely double
K-means introduced by Vichi (2000). We introduce
this model in a probabilistic framework and discuss the advantages of
using this approach. We also develop a coordinate ascent algorithm and
test its performance via simulation studies and a real data set. Finally, we
validate the results obtained on the real data set by building resampling
confidence intervals for block centroids.
Journal: Journal of Applied Statistics
Pages: 1853-1869
Issue: 9
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.683172
File-URL: http://hdl.handle.net/10.1080/02664763.2012.683172
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1853-1869
Template-Type: ReDIF-Article 1.0
Author-Name: F. Lombard
Author-X-Name-First: F.
Author-X-Name-Last: Lombard
Author-Name: R. K. Maxwell
Author-X-Name-First: R. K.
Author-X-Name-Last: Maxwell
Title: A cusum procedure to detect deviations from uniformity in angular data
Abstract:
We propose a sequential cumulative sum procedure to detect deviations
from uniformity in angular data. The method is motivated by a problem in
high-energy astrophysics and is illustrated by an application to data.
Journal: Journal of Applied Statistics
Pages: 1871-1880
Issue: 9
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.683857
File-URL: http://hdl.handle.net/10.1080/02664763.2012.683857
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1871-1880
Template-Type: ReDIF-Article 1.0
Author-Name: Pao-Sheng Shen
Author-X-Name-First: Pao-Sheng
Author-X-Name-Last: Shen
Title: Semiparametric mixed-effects models for clustered doubly censored data
Abstract:
The Cox proportional frailty model with a random effect has been proposed
for the analysis of right-censored data which consist of a large number of
small clusters of correlated failure time observations. For right-censored
data, Cai et al. [3] proposed a class of semiparametric
mixed-effects models which provides useful alternatives to the Cox model.
We demonstrate that the approach of Cai et al. [3] can be
used to analyze clustered doubly censored data when both left- and
right-censoring variables are always observed. The asymptotic properties
of the proposed estimator are derived. A simulation study is conducted to
investigate the performance of the proposed estimator.
Journal: Journal of Applied Statistics
Pages: 1881-1892
Issue: 9
Volume: 39
Year: 2012
Month: 4
X-DOI: 10.1080/02664763.2012.684874
File-URL: http://hdl.handle.net/10.1080/02664763.2012.684874
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1881-1892
Template-Type: ReDIF-Article 1.0
Author-Name: Gikuang Jeff Chen
Author-X-Name-First: Gikuang Jeff
Author-X-Name-Last: Chen
Title: A simple way to deal with multicollinearity
Abstract:
Despite the long and frustrating history of struggling with the wrong
signs or other types of implausible estimates under multicollinearity, it
turns out that the problem can be solved in a surprisingly easy way. This
paper presents a simple approach that ensures both statistically sound and
theoretically consistent estimates under multicollinearity. The approach
is simple in the sense that it requires nothing but basic statistical
methods plus a piece of a priori knowledge. In addition,
the approach is robust even to the extreme case when the a
priori knowledge is wrong. A simulation test shows the markedly
superior performance of the method in repeated samples compared with
OLS, ridge regression and the dropping-variable approach.
Journal: Journal of Applied Statistics
Pages: 1893-1909
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.690857
File-URL: http://hdl.handle.net/10.1080/02664763.2012.690857
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1893-1909
Template-Type: ReDIF-Article 1.0
Author-Name: Chung-I Li
Author-X-Name-First: Chung-I
Author-X-Name-Last: Li
Author-Name: Jeh-Nan Pan
Author-X-Name-First: Jeh-Nan
Author-X-Name-Last: Pan
Title: Sample size determination for estimating multivariate process capability indices based on lower confidence limits
Abstract:
With the advent of modern technology, manufacturing processes have become
very sophisticated; a single quality characteristic can no longer reflect
a product's quality. In order to establish performance measures for
evaluating the capability of a multivariate manufacturing process, several
new multivariate capability (NMC) indices, such as NMC p and
NMC pm, have
been developed over the past few years. However, the sample size
determination for multivariate process capability indices has not been
thoroughly considered in previous studies. Generally, the larger the
sample size, the more accurate an estimation will be. However, too large a
sample size may result in excessive costs. Hence, the trade-off between
sample size and precision in estimation is a critical issue. In this
paper, the lower confidence limits of NMC p
and NMC pm indices are used to determine the
appropriate sample size. Moreover, a procedure for conducting the
multivariate process capability study is provided. Finally, two numerical
examples are given to demonstrate that the proper determination of sample
size for multivariate process indices can achieve a good balance between
sampling costs and estimation precision.
Journal: Journal of Applied Statistics
Pages: 1911-1920
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.690858
File-URL: http://hdl.handle.net/10.1080/02664763.2012.690858
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1911-1920
Template-Type: ReDIF-Article 1.0
Author-Name: Ainhoa Oguiza Tovar
Author-X-Name-First: Ainhoa Oguiza
Author-X-Name-Last: Tovar
Author-Name: Inmaculada Gallastegui Zulaica
Author-X-Name-First: Inmaculada Gallastegui
Author-X-Name-Last: Zulaica
Author-Name: Vicente Núñez-Antón
Author-X-Name-First: Vicente
Author-X-Name-Last: Núñez-Antón
Title: Analysis of pseudo-panel data with dependent samples
Abstract:
In this paper, we discuss a model for pseudo-panel data when some but not
all of the individuals stay in the sample for more than one period. We use
data on the labor market of the Basque Country from 1993 to 1999 treated
through FORTRAN 77 programming. We construct economically reasonable age
cohorts for active population and use gender, qualification and social
status as explanatory variables in our model. Given the class of data we
use, we analyze the properties of the random error and estimate the model
through maximum likelihood, finding significant results from an applied
point of view.
Journal: Journal of Applied Statistics
Pages: 1921-1937
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.696593
File-URL: http://hdl.handle.net/10.1080/02664763.2012.696593
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1921-1937
Template-Type: ReDIF-Article 1.0
Author-Name: R. L. J. Coetzer
Author-X-Name-First: R. L. J.
Author-X-Name-Last: Coetzer
Author-Name: R. F. Rossouw
Author-X-Name-First: R. F.
Author-X-Name-Last: Rossouw
Author-Name: N. J. Le Roux
Author-X-Name-First: N. J.
Author-X-Name-Last: Le Roux
Title: Efficient maximin distance designs for experiments in mixtures
Abstract:
In this paper, different dissimilarity measures are investigated to
construct maximin designs for compositional data. Specifically, the effect
of different dissimilarity measures on the maximin design criterion for
two case studies is presented. Design evaluation criteria are proposed to
distinguish between the maximin designs generated. An optimization
algorithm is also presented. Divergence is found to be the best
dissimilarity measure to use in combination with the maximin design
criterion for creating space-filling designs for mixture variables.
Journal: Journal of Applied Statistics
Pages: 1939-1951
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.697131
File-URL: http://hdl.handle.net/10.1080/02664763.2012.697131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1939-1951
Template-Type: ReDIF-Article 1.0
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Author-Name: Edilberto Cepeda-Cuervo
Author-X-Name-First: Edilberto
Author-X-Name-Last: Cepeda-Cuervo
Author-Name: Eliane R. Rodrigues
Author-X-Name-First: Eliane R.
Author-X-Name-Last: Rodrigues
Title: Weibull and generalised exponential overdispersion models with an application to ozone air pollution
Abstract:
We consider the problem of estimating the mean and variance of the time
between occurrences of an event of interest (inter-occurrence times),
where some form of dependence between two consecutive time intervals is
allowed. Two basic density functions are taken into account: the
Weibull and the generalised exponential density functions. In order to
capture the dependence between two consecutive inter-occurrence times, we
assume that the shape and/or the scale parameters of the two
density functions are given by auto-regressive models. The expressions for
the mean and variance of the inter-occurrence times are presented. The
models are applied to the ozone data from two regions of Mexico City. The
estimation of the parameters is performed using a Bayesian point of view
via Markov chain Monte Carlo (MCMC) methods.
Journal: Journal of Applied Statistics
Pages: 1953-1963
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.697132
File-URL: http://hdl.handle.net/10.1080/02664763.2012.697132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1953-1963
Template-Type: ReDIF-Article 1.0
Author-Name: E. Bahrami Samani
Author-X-Name-First: E. Bahrami
Author-X-Name-Last: Samani
Author-Name: Y. Amirian
Author-X-Name-First: Y.
Author-X-Name-Last: Amirian
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Title: Likelihood estimation for longitudinal zero-inflated power series regression models
Abstract:
In this paper, a zero-inflated power series regression model for
longitudinal count data with excess zeros is presented. We demonstrate how
to calculate the likelihood for such data when it is assumed that the
increment in the cumulative total follows a discrete distribution with a
location parameter that depends on a linear function of explanatory
variables. Simulation studies indicate that this method can provide
improvements in obtaining standard errors of the estimates. We also
calculate the dispersion index for this model. The influence of a small
perturbation of the dispersion index of the zero-inflated model on
likelihood displacement is also studied. The zero-inflated negative
binomial regression model is illustrated on data regarding joint damage in
psoriatic arthritis.
Journal: Journal of Applied Statistics
Pages: 1965-1974
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.699951
File-URL: http://hdl.handle.net/10.1080/02664763.2012.699951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1965-1974
Template-Type: ReDIF-Article 1.0
Author-Name: Loukia Meligkotsidou
Author-X-Name-First: Loukia
Author-X-Name-Last: Meligkotsidou
Author-Name: Elias Tzavalis
Author-X-Name-First: Elias
Author-X-Name-Last: Tzavalis
Author-Name: Ioannis D. Vrontos
Author-X-Name-First: Ioannis D.
Author-X-Name-Last: Vrontos
Title: A Bayesian panel data framework for examining the economic growth convergence hypothesis: do the G7 countries converge?
Abstract:
In this paper, we suggest a Bayesian panel (longitudinal) data approach
to test for the economic growth convergence hypothesis. This approach can
control for possible effects of initial income conditions, observed
covariates and cross-sectional correlation of unobserved common error
terms on inference procedures about the unit root hypothesis based on
panel data dynamic models. Ignoring these effects can lead to spurious
evidence supporting economic growth divergence. The application of our
suggested approach to real gross domestic product panel data of the G7
countries indicates that the economic growth convergence hypothesis is
supported by the data. Our empirical analysis shows that evidence of
economic growth divergence for the G7 countries can be attributed to not
accounting for the presence of exogenous covariates in the model.
Journal: Journal of Applied Statistics
Pages: 1975-1990
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.699952
File-URL: http://hdl.handle.net/10.1080/02664763.2012.699952
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1975-1990
Template-Type: ReDIF-Article 1.0
Author-Name: Hongxia Yang
Author-X-Name-First: Hongxia
Author-X-Name-Last: Yang
Author-Name: Jun Wang
Author-X-Name-First: Jun
Author-X-Name-Last: Wang
Author-Name: Alexsandra Mojslovic
Author-X-Name-First: Alexsandra
Author-X-Name-Last: Mojslovic
Title: Cascade model with Dirichlet process for analyzing multiple dyadic matrices
Abstract:
Dyadic matrices are natural data representations in a wide range of
domains. A dyadic matrix often involves two types of abstract objects and
is based on observations of pairs of elements with one element from each
object. Owing to growing needs from practical applications, dyadic
data analysis has recently attracted increasing attention and many
techniques have been developed. However, most existing approaches, such as
co-clustering and relational reasoning, only handle a single dyadic table
and lack flexibility to perform prediction using multiple dyadic matrices.
In this article, we propose a general nonparametric Bayesian framework
with a cascaded structure to model multiple dyadic matrices and then
describe an efficient hybrid Gibbs sampling algorithm for posterior
inference and analysis. Empirical evaluations using both synthetic data
and real data show that the proposed model captures the hidden structure
of data and generalizes the predictive inference in a unique way.
Journal: Journal of Applied Statistics
Pages: 1991-2003
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.699953
File-URL: http://hdl.handle.net/10.1080/02664763.2012.699953
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:1991-2003
Template-Type: ReDIF-Article 1.0
Author-Name: Enrico A. Colosimo
Author-X-Name-First: Enrico A.
Author-X-Name-Last: Colosimo
Author-Name: Maria Arlene Fausto
Author-X-Name-First: Maria Arlene
Author-X-Name-Last: Fausto
Author-Name: Marta Afonso Freitas
Author-X-Name-First: Marta Afonso
Author-X-Name-Last: Freitas
Author-Name: Jorge Andrade Pinto
Author-X-Name-First: Jorge Andrade
Author-X-Name-Last: Pinto
Title: Practical modeling strategies for unbalanced longitudinal data analysis
Abstract:
In practice, data are often measured repeatedly on the same individual at
several points in time. Main interest often lies in characterizing
the way the response changes over time, and in the predictors of that change.
Marginal, mixed and transition models are frequently considered to be the
main approaches for continuous longitudinal data analysis. These approaches
were proposed primarily for balanced longitudinal designs. However, in
clinical studies, data are usually not balanced and some restrictions are necessary
in order to use these models. This paper was motivated by a data set
related to longitudinal height measurements in children of HIV-infected
mothers that was recorded at the university hospital of the Federal
University in Minas Gerais, Brazil. This data set is severely unbalanced.
The goal of this paper is to assess the application of continuous
longitudinal models for the analysis of an unbalanced data set.
Journal: Journal of Applied Statistics
Pages: 2005-2013
Issue: 9
Volume: 39
Year: 2012
Month: 5
X-DOI: 10.1080/02664763.2012.699954
File-URL: http://hdl.handle.net/10.1080/02664763.2012.699954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:2005-2013
Template-Type: ReDIF-Article 1.0
Author-Name: Bhim Singh
Author-X-Name-First: Bhim
Author-X-Name-Last: Singh
Author-Name: B. V.S. Sisodia
Author-X-Name-First: B. V.S.
Author-X-Name-Last: Sisodia
Author-Name: Anupam Singh
Author-X-Name-First: Anupam
Author-X-Name-Last: Singh
Author-Name: R. P. Kaushal
Author-X-Name-First: R. P.
Author-X-Name-Last: Kaushal
Title: A note on the estimation methods of crop production at the block level
Abstract:
This paper presents a method of estimation of crop-production statistics
at smaller geographical levels like a community development block
(generally referred to as a block) to make area-specific plans for
agricultural development programmes in India. Using available
district-level data on crop yield from crop-cutting experiments and data
on auxiliary variables from various administrative sources, a suitable
regression model is fitted. The fitted model is then used to predict the
crop production at the block level. Some scaled estimators are also
developed using predicted estimates. An empirical study is also carried
out to judge the merits of the proposed estimators.
Journal: Journal of Applied Statistics
Pages: 2015-2027
Issue: 9
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.700449
File-URL: http://hdl.handle.net/10.1080/02664763.2012.700449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:2015-2027
Template-Type: ReDIF-Article 1.0
Author-Name: N. A. Samat
Author-X-Name-First: N. A.
Author-X-Name-Last: Samat
Author-Name: D. F. Percy
Author-X-Name-First: D. F.
Author-X-Name-Last: Percy
Title: Vector-borne infectious disease mapping with stochastic difference equations: an analysis of dengue disease in Malaysia
Abstract:
Few publications consider the estimation of relative risk for
vector-borne infectious diseases. Most of these articles involve
exploratory analysis that includes the study of covariates and their
effects on disease distribution and the study of geographic information
systems to integrate patient-related information. The aim of this paper is
to introduce an alternative method of relative risk estimation based on
discrete time--space stochastic SIR-SI models
(susceptible--infective--recovered for human populations;
susceptible--infective for vector populations) for the transmission of
vector-borne infectious diseases, particularly dengue disease. First, we
describe deterministic compartmental SIR-SI models that are suitable for
dengue disease transmission. We then adapt these to develop corresponding
discrete time--space stochastic SIR-SI models. Finally, we develop an
alternative method of estimating the relative risk for dengue disease
mapping based on these models and apply them to analyse dengue data from
Malaysia. This new approach offers a better model for estimating the
relative risk for dengue disease mapping compared with the other common
approaches, because it takes into account the transmission process of the
disease while allowing for covariates and spatial correlation between
risks in adjacent regions.
Journal: Journal of Applied Statistics
Pages: 2029-2046
Issue: 9
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.700450
File-URL: http://hdl.handle.net/10.1080/02664763.2012.700450
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:2029-2046
Template-Type: ReDIF-Article 1.0
Author-Name: Silvia Bacci
Author-X-Name-First: Silvia
Author-X-Name-Last: Bacci
Title: Longitudinal data: different approaches in the context of item-response theory models
Abstract:
In this paper, some extended Rasch models are analyzed in the presence of
longitudinal measurements of a latent variable. Two main approaches,
multidimensional and multilevel, are compared: we investigate the
different information that can be obtained from the latent variable, and
we give advice on the use of the different kinds of models. The
multidimensional and multilevel approaches are illustrated with a
simulation study and with a longitudinal study on the health-related
quality of life in terminal cancer patients.
Journal: Journal of Applied Statistics
Pages: 2047-2065
Issue: 9
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.700451
File-URL: http://hdl.handle.net/10.1080/02664763.2012.700451
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:2047-2065
Template-Type: ReDIF-Article 1.0
Author-Name: H. Zhang
Author-X-Name-First: H.
Author-X-Name-Last: Zhang
Author-Name: Q. Yu
Author-X-Name-First: Q.
Author-X-Name-Last: Yu
Author-Name: C. Feng
Author-X-Name-First: C.
Author-X-Name-Last: Feng
Author-Name: D. Gunzler
Author-X-Name-First: D.
Author-X-Name-Last: Gunzler
Author-Name: P. Wu
Author-X-Name-First: P.
Author-X-Name-Last: Wu
Author-Name: X. M. Tu
Author-X-Name-First: X. M.
Author-X-Name-Last: Tu
Title: A new look at the difference between the GEE and the GLMM when modeling longitudinal count responses
Abstract:
Poisson log-linear regression is a popular model for count responses. We
examine two popular extensions of this model -- the generalized estimating
equations (GEE) and the generalized linear mixed-effects model (GLMM) --
to longitudinal data analysis and complement the existing literature on
characterizing the relationship between the two dueling paradigms in this
setting. Unlike linear regression, the GEE and the GLMM carry significant
conceptual and practical implications when applied to modeling count data.
Our findings shed additional light on the differences between the two
classes of models when used for count data. Our considerations are
demonstrated by both a real study and simulated data.
Journal: Journal of Applied Statistics
Pages: 2067-2079
Issue: 9
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.700452
File-URL: http://hdl.handle.net/10.1080/02664763.2012.700452
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:9:p:2067-2079
Template-Type: ReDIF-Article 1.0
Author-Name: Meral Ebegil
Author-X-Name-First: Meral
Author-X-Name-Last: Ebegil
Author-Name: Fikri Gökpınar
Author-X-Name-First: Fikri
Author-X-Name-Last: Gökpınar
Title: A test statistic to choose between Liu-type and least-squares estimator based on mean square error criteria
Abstract:
In this study, the necessary and sufficient conditions for the Liu-type
(LT) biased estimator are determined. A test for choosing between the LT
estimator and least-squares estimator is obtained by using these necessary
and sufficient conditions. Also, a simulation study is carried out to
compare this estimator against the ridge estimator. Furthermore, a
numerical example is given for the defined test statistic.
Journal: Journal of Applied Statistics
Pages: 2081-2096
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.700453
File-URL: http://hdl.handle.net/10.1080/02664763.2012.700453
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2081-2096
Template-Type: ReDIF-Article 1.0
Author-Name: Jiun-Yi Wang
Author-X-Name-First: Jiun-Yi
Author-X-Name-Last: Wang
Author-Name: Li-Ching Chen
Author-X-Name-First: Li-Ching
Author-X-Name-Last: Chen
Author-Name: Hui-Min Lin
Author-X-Name-First: Hui-Min
Author-X-Name-Last: Lin
Title: Robust methods for detecting familial aggregation of a quantitative trait in matched case--control family studies
Abstract:
Assessing familial aggregation of a disease or its underlying
quantitative traits is often undertaken as the first step in the
investigation of possible genetic causes. When some major confounding
variables are known but difficult to quantify, the matched
case--control family design provides an opportunity to eliminate biased
results. In such a design, cases and matched controls are ascertained
first, with subsequent recruitment of other members in their families. For
the study of complex diseases, many continuously distributed quantitative
traits or biomedical evaluations are of primary clinical and health
significance, and distributions of these continuous outcomes are
frequently skewed or non-normal. A non-normally distributed outcome may
cause some standard statistical methods to suffer a substantial loss of
power. To deal with this problem, we propose a
rank-based test for detecting familial aggregation of a quantitative trait
with the use of a within-cluster resampling process. According to our
simulation studies, the proposed test exhibits satisfactory and robust
power performance. Specifically, the proposed test is slightly less powerful
than the generalized estimating equations approach if the trait is
normally distributed, and it is apparently more powerful if the trait
distribution is essentially skewed or heavy-tailed. A user-friendly
R script and an executable file to perform the proposed test are available
online to allow its implementation in routine research.
Journal: Journal of Applied Statistics
Pages: 2097-2111
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702204
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702204
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2097-2111
Template-Type: ReDIF-Article 1.0
Author-Name: Sterling McPherson
Author-X-Name-First: Sterling
Author-X-Name-Last: McPherson
Author-Name: Celestina Barbosa-Leiker
Author-X-Name-First: Celestina
Author-X-Name-Last: Barbosa-Leiker
Title: An example of a two-part latent growth curve model for semicontinuous outcomes in the health sciences
Abstract:
A new method of modeling coronary artery calcium (CAC) is needed in order
to properly understand the probability of onset and growth of CAC. CAC
remains a controversial indicator of cardiovascular disease (CVD) risk,
but this may be due to ill-equipped methods of specifying CAC during the
analysis phase of studies reporting an analysis where CAC is the primary
outcome. The modern method of two-part latent growth modeling may
represent a strong alternative to the myriad of existing methods for
modeling CAC. We provide a brief overview of existing methods of analysis
used for CAC before introducing the general latent growth curve model, how
it extends into a two-part (semicontinuous) growth model, and how the
ubiquitous problem of missing data can be effectively handled. We then
present an example of how to model CAC using this framework. We
demonstrate that utilizing this type of modeling strategy can result in
traditional predictors of CAC (e.g. age, gender, and high-density
lipoprotein cholesterol), exerting a different impact on the two
different, yet simultaneous, operationalizations of CAC. This method of
analyzing CAC could inform future analyses of CAC and inform subsequent
discussions about the nature of its potential to inform long-term CVD risk
and heart events.
Journal: Journal of Applied Statistics
Pages: 2113-2128
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702205
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702205
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2113-2128
Template-Type: ReDIF-Article 1.0
Author-Name: Nazif Çalış
Author-X-Name-First: Nazif
Author-X-Name-Last: Çalış
Author-Name: Hamza Erol
Author-X-Name-First: Hamza
Author-X-Name-Last: Erol
Title: A new per-field classification method using mixture discriminant analysis
Abstract:
In this study, a new per-field classification method is proposed for
supervised classification of remotely sensed multispectral image data of
an agricultural area using Gaussian mixture discriminant analysis (MDA).
For the proposed per-field classification method, the multivariate Gaussian
mixture models constructed for control and test fields can have a fixed or
a varying number of components, and each component can have a different or
a common covariance matrix structure. The discrimination function and the
decision rule of this method are established according to the average
Bhattacharyya distance and the minimum values of the average Bhattacharyya
distances, respectively. The proposed per-field classification method is
analyzed for different covariance matrix structures with a fixed and a
varying number of components. We also classify the remotely sensed
multispectral image data using the per-pixel classification method based
on Gaussian MDA.
Journal: Journal of Applied Statistics
Pages: 2129-2140
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702263
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702263
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2129-2140
Template-Type: ReDIF-Article 1.0
Author-Name: Youhei Kawasaki
Author-X-Name-First: Youhei
Author-X-Name-Last: Kawasaki
Author-Name: Etsuo Miyaoka
Author-X-Name-First: Etsuo
Author-X-Name-Last: Miyaoka
Title: A Bayesian inference of P(λ1>λ2) for two Poisson parameters
Abstract:
The statistical inference drawn from the difference between two
independent Poisson parameters is often discussed in the medical
literature. However, such discussions are usually based on the frequentist
viewpoint rather than the Bayesian viewpoint. Here, we propose an index
θ = P(λ1,post > λ2,post), where λ1,post and λ2,post denote Poisson
parameters following the posterior density. We provide an exact and an
approximate expression for calculating θ using the conjugate gamma
prior and compare the probabilities obtained from the approximate and the
exact expressions. Moreover, we show a relation between θ and
the p-value. We also highlight the significance of
θ by applying it to the result of actual clinical trials. Our
findings suggest that θ may provide useful information in a clinical
trial.
Journal: Journal of Applied Statistics
Pages: 2141-2152
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702264
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2141-2152
Template-Type: ReDIF-Article 1.0
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Title: Optimal sample size planning for the Wilcoxon--Mann--Whitney and van Elteren tests under cost constraints
Abstract:
Sampling cost is a crucial factor in sample size planning, particularly
when the treatment group is more expensive than the control group. To
either minimize the total cost or maximize the statistical power of the
test, we used the distribution-free Wilcoxon--Mann--Whitney test for two
independent samples and the van Elteren test for randomized block design,
respectively. We then developed approximate sample size formulas for cases
where the distribution of the data is non-normal and/or unknown. This study
derived the
optimal sample size allocation ratio for a given statistical power by
considering the cost constraints, so that the resulting sample sizes could
minimize either the total cost or the total sample size. Moreover, for a
given total cost, the optimal sample size allocation is recommended to
maximize the statistical power of the test. The proposed formulas are not
only innovative, but also quick and easy to apply. We also use real data
from a clinical trial to illustrate how to choose the sample size for a
randomized two-block design. For nonparametric methods, no existing
commercial software for sample size planning has considered the cost
factor, and therefore the proposed methods can provide important insights
related to the impact of cost constraints.
Journal: Journal of Applied Statistics
Pages: 2153-2164
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702265
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702265
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2153-2164
Template-Type: ReDIF-Article 1.0
Author-Name: Luigi D'Ambra
Author-X-Name-First: Luigi
Author-X-Name-Last: D'Ambra
Author-Name: Antonello D'Ambra
Author-X-Name-First: Antonello
Author-X-Name-Last: D'Ambra
Author-Name: Pasquale Sarnacchiaro
Author-X-Name-First: Pasquale
Author-X-Name-Last: Sarnacchiaro
Title: Visualizing main effects and interaction in multiple non-symmetric correspondence analysis
Abstract:
Non-symmetric correspondence analysis (NSCA) is a useful technique for
analysing a two-way contingency table. Frequently, the predictor variables
are more than one; in this paper, we consider two categorical variables as
predictor variables and one response variable. Interaction represents the
joint effects of predictor variables on the response variable. When
interaction is present, the interpretation of the main effects is
incomplete or misleading. To separate the main effects and the interaction
term, we introduce a method that, starting from the coordinates of
multiple NSCA and using a two-way analysis of variance without
interaction, allows a better interpretation of the impact of the predictor
variables on the response variable. The proposed method is applied to
a well-known three-way contingency table proposed by Bockenholt and
Bockenholt in which they cross-classify subjects by person's
attitude towards abortion, number of years of education and
religion. We analyse the case where the variables
education and religion influence a
person's attitude towards abortion.
Journal: Journal of Applied Statistics
Pages: 2165-2175
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702266
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702266
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2165-2175
Template-Type: ReDIF-Article 1.0
Author-Name: Yulei He
Author-X-Name-First: Yulei
Author-X-Name-Last: He
Author-Name: Trivellore E. Raghunathan
Author-X-Name-First: Trivellore E.
Author-X-Name-Last: Raghunathan
Title: Multiple imputation using multivariate gh transformations
Abstract:
Multiple imputation has emerged as a popular approach to handling data
sets with missing values. For incomplete continuous variables, imputations
are usually produced using multivariate normal models. However, this
approach might be problematic for variables with a strongly non-normal
shape, as it would generate imputations inconsistent with the actual
distributions and thus lead to incorrect inferences. For non-normal data,
we consider a multivariate extension of Tukey's gh
distribution/transformation [38] to accommodate skewness and/or kurtosis
and capture the correlation among the variables. We propose an algorithm
to fit the incomplete data with the model and generate imputations. We
apply the method to a national data set for hospital performance on
several standard quality measures, which are highly skewed to the left and
substantially correlated with each other. We use Monte Carlo studies to
assess the performance of the proposed approach. We discuss possible
generalizations and give some advice to practitioners on how to handle
non-normal incomplete data.
Journal: Journal of Applied Statistics
Pages: 2177-2198
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702268
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702268
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2177-2198
Template-Type: ReDIF-Article 1.0
Author-Name: H. Fotouhi
Author-X-Name-First: H.
Author-X-Name-Last: Fotouhi
Author-Name: M. Golalizadeh
Author-X-Name-First: M.
Author-X-Name-Last: Golalizadeh
Title: Exploring the variability of DNA molecules via principal geodesic analysis on the shape space
Abstract:
Most linear statistical methods deal with data lying in a Euclidean space.
However, there are many examples, such as DNA molecule topological
structures, in which the initial or the transformed data lie in a
non-Euclidean space. To get a measure of variability in these situations,
the principal component analysis (PCA) is usually performed on a Euclidean
tangent space as it cannot be directly implemented on a non-Euclidean
space. Alternatively, principal geodesic analysis (PGA) is a newer tool that
provides a measure of variability for nonlinear statistics. In this paper,
the performance of this new tool is compared with that of the PCA using a
real data set representing a DNA molecular structure. It is shown that, due
to the nonlinearity of the space, the PGA explains more variability of the
data than the PCA.
Journal: Journal of Applied Statistics
Pages: 2199-2207
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.704353
File-URL: http://hdl.handle.net/10.1080/02664763.2012.704353
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2199-2207
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Vicente G. Cancho
Author-X-Name-First: Vicente G.
Author-X-Name-Last: Cancho
Author-Name: Mari Roman
Author-X-Name-First: Mari
Author-X-Name-Last: Roman
Author-Name: José G. Leite
Author-X-Name-First: José G.
Author-X-Name-Last: Leite
Title: A new long-term lifetime distribution induced by a latent complementary risk framework
Abstract:
In this paper, we propose a new three-parameter long-term lifetime
distribution induced by a latent complementary risk framework, with
decreasing, increasing and unimodal hazard functions: the long-term
complementary exponential geometric distribution. The new distribution
arises from latent complementary risk scenarios in which the lifetime
associated with a particular risk is not observable; rather, we observe
only the maximum lifetime value among all risks, together with the
presence of long-term survival. The properties of the proposed distribution are
discussed, including its probability density function and explicit
algebraic formulas for its reliability, hazard and quantile functions and
order statistics. The parameter estimation is based on the usual
maximum-likelihood approach. A simulation study assesses the performance
of the estimation procedure. We compare the new distribution with its
particular cases, as well as with the long-term Weibull distribution on
three real data sets, observing its potential and competitiveness in
comparison with some usual long-term lifetime distributions.
Journal: Journal of Applied Statistics
Pages: 2209-2222
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.706264
File-URL: http://hdl.handle.net/10.1080/02664763.2012.706264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2209-2222
Template-Type: ReDIF-Article 1.0
Author-Name: Jérôme Kasparian
Author-X-Name-First: Jérôme
Author-X-Name-Last: Kasparian
Author-Name: Antoine Rolland
Author-X-Name-First: Antoine
Author-X-Name-Last: Rolland
Title: OECD's ‘Better Life Index’: can any country be well ranked?
Abstract:
We critically review the Better Life Index (BLI) recently introduced by
the Organization for Economic Co-operation and Development (OECD). We
discuss methodological issues in the definition of the criteria used to
rank the countries, as well as in their aggregation method. Moreover, we
explore the unique option offered by the BLI to apply one's own weight set
to 11 criteria. Although 16 countries can be ranked first by choosing
ad hoc weightings, only Canada, Australia and Sweden do
so over a substantial fraction of the parameter space defined by all
possible weight sets. Furthermore, most pairwise comparisons between
countries are insensitive to the choice of the weights. Therefore, the BLI
establishes a hierarchy among the evaluated countries, independent of the
chosen set of weights.
Journal: Journal of Applied Statistics
Pages: 2223-2230
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.706265
File-URL: http://hdl.handle.net/10.1080/02664763.2012.706265
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2223-2230
Template-Type: ReDIF-Article 1.0
Author-Name: Jonathan Gillard
Author-X-Name-First: Jonathan
Author-X-Name-Last: Gillard
Title: A generalised Box--Cox transformation for the parametric estimation of clinical reference intervals
Abstract:
Parametric methods for the calculation of reference intervals in clinical
studies often rely on the identification of a suitable transformation so
that the transformed data can be assumed to be drawn from a Gaussian
distribution. In this paper, the two-stage transformation recommended by
the International Federation for Clinical Chemistry is compared with a
novel generalised Box--Cox family of transformations. Investigation is
also made of sample sizes needed to achieve certain criteria of
reliability in the calculated reference interval. Simulations are used to
show that the generalised Box--Cox family achieves a lower bias than the
two-stage transformation. It was also found that the two-stage
transformation may result in percentile estimates that cannot be
back-transformed to obtain the required reference intervals, a
difficulty not observed when using the generalised Box--Cox family
introduced in this paper.
Journal: Journal of Applied Statistics
Pages: 2231-2245
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.706266
File-URL: http://hdl.handle.net/10.1080/02664763.2012.706266
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2231-2245
Template-Type: ReDIF-Article 1.0
Author-Name: Yungtai Lo
Author-X-Name-First: Yungtai
Author-X-Name-Last: Lo
Title: Estimating the prevalence of low-lumbar spine bone mineral density in older men with or at risk for HIV infection using normal mixture models
Abstract:
Bone mineral density decreases naturally as we age because existing bone
tissue is reabsorbed by the body faster than new bone tissue is
synthesized. When this occurs, bones lose calcium and other minerals. What
is normal bone mineral density for men 50 years and older? Suitable
diagnostic cutoff values for men are less well defined than for women. In
this paper, we propose using normal mixture models to estimate the
prevalence of low-lumbar spine bone mineral density in men 50 years and
older with or at risk for human immunodeficiency virus infection when
normal values of bone mineral density are not generally known. The
Box--Cox power transformation is used to determine which transformation
best suits normal mixture distributions. Parametric bootstrap tests are
used to determine the number of mixture components and to determine
whether the mixture components are homoscedastic or heteroscedastic.
Journal: Journal of Applied Statistics
Pages: 2247-2258
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.706267
File-URL: http://hdl.handle.net/10.1080/02664763.2012.706267
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2247-2258
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: Shou-Jen Chang-Chien
Author-X-Name-First: Shou-Jen
Author-X-Name-Last: Chang-Chien
Author-Name: Miin-Shen Yang
Author-X-Name-First: Miin-Shen
Author-X-Name-Last: Yang
Title: Self-updating clustering algorithm for estimating the parameters in mixtures of von Mises distributions
Abstract:
The EM algorithm is the standard method for estimating the parameters in
finite mixture models. Yang and Pan [25] proposed a generalized
classification maximum likelihood procedure, called the fuzzy
c-directions (FCD) clustering algorithm, for estimating
the parameters in mixtures of von Mises distributions. Two main drawbacks
of the EM algorithm are its slow convergence and the dependence of the
solution on the initial value used. The choice of initial values is of
great importance in the algorithm-based literature as it can heavily
influence the speed of convergence of the algorithm and its ability to
locate the global maximum. On the other hand, the algorithmic frameworks
of EM and FCD are closely related. Therefore, the drawbacks of FCD are the
same as those of the EM algorithm. To resolve these problems, this paper
proposes another clustering algorithm, which can self-organize local
optimal cluster numbers without using cluster validity functions. The
numerical results clearly indicate that the proposed algorithm is superior
in performance to the EM and FCD algorithms. Finally, we apply the proposed
algorithm to two real data sets.
Journal: Journal of Applied Statistics
Pages: 2259-2274
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.706268
File-URL: http://hdl.handle.net/10.1080/02664763.2012.706268
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2259-2274
Template-Type: ReDIF-Article 1.0
Author-Name: Hong Li
Author-X-Name-First: Hong
Author-X-Name-Last: Li
Author-Name: Wei Ning
Author-X-Name-First: Wei
Author-X-Name-Last: Ning
Title: Multiple comparisons with a control under heteroscedasticity
Abstract:
This article investigates three procedures on the multiple comparisons
with a control in the presence of unequal error variances. The advantages
of the proposed methods are illustrated through two examples. The
performance of the proposed methods and other alternative methods is
compared via simulation studies. The results show that typical methods
assuming equal variances will have an inflated error rate and may lead to
erroneous inference when the equal variance assumption fails. In addition,
the simulation study shows that the proposed approaches always control the
family-wise error rate at a specified nominal level α, while some
established methods are liberal and have an inflated error rate in some
scenarios.
Journal: Journal of Applied Statistics
Pages: 2275-2283
Issue: 10
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.706269
File-URL: http://hdl.handle.net/10.1080/02664763.2012.706269
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2275-2283
Template-Type: ReDIF-Article 1.0
Author-Name: Taeyoung Park
Author-X-Name-First: Taeyoung
Author-X-Name-Last: Park
Author-Name: Robert T. Krafty
Author-X-Name-First: Robert T.
Author-X-Name-Last: Krafty
Author-Name: Alvaro I. Sánchez
Author-X-Name-First: Alvaro I.
Author-X-Name-Last: Sánchez
Title: Bayesian semi-parametric analysis of Poisson change-point regression models: application to policy-making in Cali, Colombia
Abstract:
A Poisson regression model with an offset assumes a constant baseline
rate after accounting for measured covariates, which may lead to biased
estimates of coefficients in an inhomogeneous Poisson process. To
correctly estimate the effect of time-dependent covariates, we propose a
Poisson change-point regression model with an offset that allows a
time-varying baseline rate. When the non-constant pattern of a log
baseline rate is modeled with a non-parametric step function, the
resulting semi-parametric model involves a model component of varying
dimensions and thus requires a sophisticated varying-dimensional inference
to obtain the correct estimates of model parameters of a fixed dimension.
To fit the proposed varying-dimensional model, we devise a
state-of-the-art Markov chain Monte Carlo-type algorithm based on partial
collapse. The proposed model and methods are used to investigate the
association between the daily homicide rates in Cali, Colombia, and the
policies that restrict the hours during which the legal sale of alcoholic
beverages is permitted. While simultaneously identifying the latent
changes in the baseline homicide rate which correspond to the incidence of
sociopolitical events, we explore the effect of policies governing the
sale of alcohol on homicide rates and seek a policy that balances the
economic and cultural dependence on alcohol sales against the health of
the public.
Journal: Journal of Applied Statistics
Pages: 2285-2298
Issue: 10
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.709227
File-URL: http://hdl.handle.net/10.1080/02664763.2012.709227
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2285-2298
Template-Type: ReDIF-Article 1.0
Author-Name: Alex Karagrigoriou
Author-X-Name-First: Alex
Author-X-Name-Last: Karagrigoriou
Title: Statistical inference: The minimum distance approach
Journal: Journal of Applied Statistics
Pages: 2299-2300
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.681565
File-URL: http://hdl.handle.net/10.1080/02664763.2012.681565
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2299-2300
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Title: Applied time series analysis
Journal: Journal of Applied Statistics
Pages: 2300-2301
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.682445
File-URL: http://hdl.handle.net/10.1080/02664763.2012.682445
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2300-2301
Template-Type: ReDIF-Article 1.0
Author-Name: Eliane R. Rodrigues
Author-X-Name-First: Eliane R.
Author-X-Name-Last: Rodrigues
Title: Reversibility and stochastic networks
Journal: Journal of Applied Statistics
Pages: 2301-2302
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.682448
File-URL: http://hdl.handle.net/10.1080/02664763.2012.682448
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2301-2302
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim S. Mwitondi
Author-X-Name-First: Kassim S.
Author-X-Name-Last: Mwitondi
Title: Statistical data mining using SAS applications
Journal: Journal of Applied Statistics
Pages: 2302-2302
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.682451
File-URL: http://hdl.handle.net/10.1080/02664763.2012.682451
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2302-2302
Template-Type: ReDIF-Article 1.0
Author-Name: Claire Keeble
Author-X-Name-First: Claire
Author-X-Name-Last: Keeble
Title: The R primer
Journal: Journal of Applied Statistics
Pages: 2303-2303
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.682453
File-URL: http://hdl.handle.net/10.1080/02664763.2012.682453
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2303-2303
Template-Type: ReDIF-Article 1.0
Author-Name: Yiannis Kamarianakis
Author-X-Name-First: Yiannis
Author-X-Name-Last: Kamarianakis
Title: The Oxford handbook of economic forecasting
Journal: Journal of Applied Statistics
Pages: 2303-2304
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.682455
File-URL: http://hdl.handle.net/10.1080/02664763.2012.682455
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2303-2304
Template-Type: ReDIF-Article 1.0
Author-Name: Isaac Dialsingh
Author-X-Name-First: Isaac
Author-X-Name-Last: Dialsingh
Title: Expansions and asymptotics for statistics
Journal: Journal of Applied Statistics
Pages: 2304-2305
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.692541
File-URL: http://hdl.handle.net/10.1080/02664763.2012.692541
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2304-2305
Template-Type: ReDIF-Article 1.0
Author-Name: Isaac Dialsingh
Author-X-Name-First: Isaac
Author-X-Name-Last: Dialsingh
Title: Large-scale inference: empirical Bayes methods for estimation, testing, and prediction
Journal: Journal of Applied Statistics
Pages: 2305-2305
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.692543
File-URL: http://hdl.handle.net/10.1080/02664763.2012.692543
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2305-2305
Template-Type: ReDIF-Article 1.0
Author-Name: P. William Hughes
Author-X-Name-First: P. William
Author-X-Name-Last: Hughes
Title: Biostatistics: a computing approach
Journal: Journal of Applied Statistics
Pages: 2306-2306
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.692545
File-URL: http://hdl.handle.net/10.1080/02664763.2012.692545
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2306-2306
Template-Type: ReDIF-Article 1.0
Author-Name: Rolando de la Cruz
Author-X-Name-First: Rolando
Author-X-Name-Last: de la Cruz
Title: Bayesian ideas and data analysis: An introduction for scientists and statisticians
Journal: Journal of Applied Statistics
Pages: 2306-2307
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/17429145.2012.693248
File-URL: http://hdl.handle.net/10.1080/17429145.2012.693248
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2306-2307
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Laberge
Author-X-Name-First: Yves
Author-X-Name-Last: Laberge
Title: The Oxford handbook of quantitative asset management
Journal: Journal of Applied Statistics
Pages: 2307-2308
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.694257
File-URL: http://hdl.handle.net/10.1080/02664763.2012.694257
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2307-2308
Template-Type: ReDIF-Article 1.0
Author-Name: Theofanis Sapatinas
Author-X-Name-First: Theofanis
Author-X-Name-Last: Sapatinas
Title: Statistics for high-dimensional data
Journal: Journal of Applied Statistics
Pages: 2308-2309
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.694258
File-URL: http://hdl.handle.net/10.1080/02664763.2012.694258
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2308-2309
Template-Type: ReDIF-Article 1.0
Author-Name: David J. Hand
Author-X-Name-First: David J.
Author-X-Name-Last: Hand
Title: Who's #1? The science of rating and ranking
Journal: Journal of Applied Statistics
Pages: 2309-2310
Issue: 10
Volume: 39
Year: 2012
Month: 10
X-DOI: 10.1080/02664763.2012.701375
File-URL: http://hdl.handle.net/10.1080/02664763.2012.701375
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:10:p:2309-2310
Template-Type: ReDIF-Article 1.0
Author-Name: G. E. Salcedo
Author-X-Name-First: G. E.
Author-X-Name-Last: Salcedo
Author-Name: R. F. Porto
Author-X-Name-First: R. F.
Author-X-Name-Last: Porto
Author-Name: S. Y. Roa
Author-X-Name-First: S. Y.
Author-X-Name-Last: Roa
Author-Name: F. R. Momo
Author-X-Name-First: F. R.
Author-X-Name-Last: Momo
Title: A wavelet-based time-varying autoregressive model for non-stationary and irregular time series
Abstract:
In this work we propose an autoregressive model with parameters varying
in time applied to irregularly spaced non-stationary time series. We
expand all the functional parameters in a wavelet basis and estimate the
coefficients by least squares after truncation at a suitable resolution
level. We also present some simulations in order to evaluate both the
estimation method and the model behavior on finite samples. Applications
to irregularly observed silicate and nitrite data are provided as well.
Journal: Journal of Applied Statistics
Pages: 2313-2325
Issue: 11
Volume: 39
Year: 2012
Month: 6
X-DOI: 10.1080/02664763.2012.702267
File-URL: http://hdl.handle.net/10.1080/02664763.2012.702267
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2313-2325
Template-Type: ReDIF-Article 1.0
Author-Name: S. Eftekhari Mahabadi
Author-X-Name-First: S. Eftekhari
Author-X-Name-Last: Mahabadi
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Title: An index of local sensitivity to non-ignorability for parametric survival models with potential non-random missing covariate: an application to the SEER cancer registry data
Abstract:
Several survival regression models have been developed to assess the
effects of covariates on failure times. In various settings, including
surveys, clinical trials and epidemiological studies, missing data may
often occur due to incomplete covariate data. Most existing methods for
lifetime data are based on the assumption of missing at random (MAR)
covariates. However, in many substantive applications, it is important to
assess the sensitivity of key model inferences to the MAR assumption. The
index of sensitivity to non-ignorability (ISNI) is a local sensitivity
tool to measure the potential sensitivity of key model parameters to small
departures from the ignorability assumption, without the need to estimate
a complicated non-ignorable model. We extend this sensitivity index to
evaluate the impact of a covariate that is potentially missing not at
random in survival analysis, using parametric survival models. The
approach will be applied to investigate the impact of missing tumor grade
on post-surgical mortality outcomes in individuals with pancreas-head
cancer in the Surveillance, Epidemiology, and End Results data set. For
patients suffering from cancer, tumor grade is an important risk factor.
Many individuals in these data with pancreas-head cancer have missing
tumor grade information. Our ISNI analysis shows that the magnitude of
effect for most covariates (with significant effect on the survival time
distribution), specifically surgery and tumor grade as some important risk
factors in cancer studies, highly depends on the missing mechanism
assumption of the tumor grade. A simulation study is also conducted to
evaluate the performance of the proposed index in detecting sensitivity of
key model parameters.
Journal: Journal of Applied Statistics
Pages: 2327-2348
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.710196
File-URL: http://hdl.handle.net/10.1080/02664763.2012.710196
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2327-2348
Template-Type: ReDIF-Article 1.0
Author-Name: M. C. Pardo
Author-X-Name-First: M. C.
Author-X-Name-Last: Pardo
Author-Name: R. Alonso
Author-X-Name-First: R.
Author-X-Name-Last: Alonso
Title: A generalized Q--Q plot for longitudinal data
Abstract:
Most biomedical research is carried out using longitudinal studies. The
method of generalized estimating equations (GEEs) introduced by Liang and
Zeger [Longitudinal data analysis using generalized linear
models, Biometrika 73 (1986), pp. 13--22] and Zeger and Liang
[Longitudinal data analysis for discrete and continuous
outcomes, Biometrics 42 (1986), pp. 121--130] has become a
standard method for analyzing non-normal longitudinal data. Since then, a
large variety of GEEs have been proposed. However, the model diagnostic
problem has not been explored intensively. Oh et al.
[Model diagnostic plots for repeated measures data using the
generalized estimating equations approach, Comput. Statist. Data
Anal. 53 (2008), pp. 222--232] proposed residual plots based on the
quantile--quantile (Q--Q) plots of the chi-square distribution for
repeated-measures data using the GEE methodology. They considered the
Pearson, Anscombe and deviance residuals. In this work, we propose to
extend this graphical diagnostic using a generalized residual. A
simulation study is presented as well as two examples illustrating the
proposed generalized Q--Q plots.
Journal: Journal of Applied Statistics
Pages: 2349-2362
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.710896
File-URL: http://hdl.handle.net/10.1080/02664763.2012.710896
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2349-2362
Template-Type: ReDIF-Article 1.0
Author-Name: Nicole White
Author-X-Name-First: Nicole
Author-X-Name-Last: White
Author-Name: Helen Johnson
Author-X-Name-First: Helen
Author-X-Name-Last: Johnson
Author-Name: Peter Silburn
Author-X-Name-First: Peter
Author-X-Name-Last: Silburn
Author-Name: Kerrie Mengersen
Author-X-Name-First: Kerrie
Author-X-Name-Last: Mengersen
Title: Dirichlet process mixture models for unsupervised clustering of symptoms in Parkinson's disease
Abstract:
In this paper, the goal of identifying disease subgroups based on
differences in observed symptom profile is considered. Commonly referred
to as phenotype identification, solutions to this task often involve the
application of unsupervised clustering techniques. In this paper, we
investigate the application of a Dirichlet process mixture model for this
task. This model is defined by the placement of the Dirichlet process on
the unknown components of a mixture model, allowing for the expression of
uncertainty about the partitioning of observed data into homogeneous
subgroups. To exemplify this approach, an application to phenotype
identification in Parkinson's disease is considered, with symptom profiles
collected using the Unified Parkinson's Disease Rating Scale.
Journal: Journal of Applied Statistics
Pages: 2363-2377
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.710897
File-URL: http://hdl.handle.net/10.1080/02664763.2012.710897
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2363-2377
Template-Type: ReDIF-Article 1.0
Author-Name: Charles C. Taylor
Author-X-Name-First: Charles C.
Author-X-Name-Last: Taylor
Author-Name: Kanti V. Mardia
Author-X-Name-First: Kanti V.
Author-X-Name-Last: Mardia
Author-Name: Marco Di Marzio
Author-X-Name-First: Marco
Author-X-Name-Last: Di Marzio
Author-Name: Agnese Panzera
Author-X-Name-First: Agnese
Author-X-Name-Last: Panzera
Title: Validating protein structure using kernel density estimates
Abstract:
Measuring the quality of determined protein structures is a very
important problem in bioinformatics. Kernel density estimation is a
well-known nonparametric method which is often used for exploratory data
analysis. Recent advances, which have extended previous linear methods to
multi-dimensional circular data, give a sound basis for the analysis of
conformational angles of protein backbones, which lie on the torus. By
using an energy test, which is based on interpoint distances, we initially
investigate the dependence of the angles on the amino acid type. Then, by
computing tail probabilities which are based on amino-acid conditional
density estimates, a method is proposed which permits inference on a test
set of data. This can be used, for example, to validate protein
structures, choose between possible protein predictions and highlight
unusual residue angles.
Journal: Journal of Applied Statistics
Pages: 2379-2388
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.710898
File-URL: http://hdl.handle.net/10.1080/02664763.2012.710898
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2379-2388
Template-Type: ReDIF-Article 1.0
Author-Name: Manoj Kumar Rastogi
Author-X-Name-First: Manoj Kumar
Author-X-Name-Last: Rastogi
Author-Name: Yogesh Mani Tripathi
Author-X-Name-First: Yogesh Mani
Author-X-Name-Last: Tripathi
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Title: Estimating the parameters of a bathtub-shaped distribution under progressive type-II censoring
Abstract:
We consider the problem of estimating unknown parameters, reliability
function and hazard function of a two-parameter bathtub-shaped
distribution on the basis of progressive type-II censored sample. The
maximum likelihood estimators and Bayes estimators are derived for two
unknown parameters, reliability function and hazard function. The Bayes
estimators are obtained against squared error, LINEX and entropy loss
functions. Also, using the Lindley approximation method we have obtained
approximate Bayes estimators against these loss functions. Some numerical
comparisons are made among various proposed estimators in terms of their
mean square error values and some specific recommendations are given.
Finally, two data sets are analyzed to illustrate the proposed methods.
Journal: Journal of Applied Statistics
Pages: 2389-2411
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.710899
File-URL: http://hdl.handle.net/10.1080/02664763.2012.710899
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2389-2411
Template-Type: ReDIF-Article 1.0
Author-Name: Samuel Iddi
Author-X-Name-First: Samuel
Author-X-Name-Last: Iddi
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Title: A joint marginalized multilevel model for longitudinal outcomes
Abstract:
The shared-parameter model and its so-called hierarchical or
random-effects extension are widely used joint modeling approaches for a
combination of longitudinal continuous, binary, count, missing, and
survival outcomes that naturally occurs in many clinical and other
studies. A random effect is introduced and shared or allowed to differ
between two or more repeated measures or longitudinal outcomes, thereby
acting as a vehicle to capture association between the outcomes in these
joint models. It is generally known that parameter estimates in a linear
mixed model (LMM) for continuous repeated measures or longitudinal
outcomes allow for a marginal interpretation, even though a hierarchical
formulation is employed. This is not the case for the generalized linear
mixed model (GLMM), that is, for non-Gaussian outcomes. The aforementioned
joint models formulated for continuous and binary or two longitudinal
binomial outcomes, using the LMM and GLMM, will naturally have marginal
interpretation for parameters associated with the continuous outcome but a
subject-specific interpretation for the fixed effects parameters relating
covariates to binary outcomes. To derive marginally meaningful parameters
for the binary models in a joint model, we adopt the marginal multilevel
model (MMM) due to Heagerty [13] and Heagerty and Zeger [14] and formulate
a joint MMM for two longitudinal responses. This enables us to (1) capture
association between the two responses and (2) obtain parameter estimates
that have a population-averaged interpretation for both outcomes. The
model is applied to two sets of data. The results are compared with those
obtained from the existing approaches such as generalized estimating
equations, GLMM, and the model of Heagerty [13]. Estimates were found to
be very close to those from single analysis per outcome but the joint
model yields higher precision and allows for quantifying the association
between outcomes. Parameters were estimated using maximum likelihood. The
model is easy to fit using available tools such as the SAS NLMIXED
procedure.
Journal: Journal of Applied Statistics
Pages: 2413-2430
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.711302
File-URL: http://hdl.handle.net/10.1080/02664763.2012.711302
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2413-2430
Template-Type: ReDIF-Article 1.0
Author-Name: Mahmoud Torabi
Author-X-Name-First: Mahmoud
Author-X-Name-Last: Torabi
Title: Spatial modeling using frequentist approach for disease mapping
Abstract:
In this article, a generalized linear mixed model (GLMM) based on a
frequentist approach is employed to examine spatial trend of asthma data.
However, the frequentist analysis of GLMM is computationally difficult. On
the other hand, the Bayesian analysis of GLMM has been computationally
convenient due to the advent of Markov chain Monte Carlo algorithms.
The recently developed data cloning (DC) method, which yields maximum
likelihood estimates, provides an equally computationally convenient
frequentist approach to complex mixed models. We use DC to conduct
frequentist analysis of spatial models. The advantages of the DC approach
are that the answers are independent of the choice of the priors,
non-estimable parameters are flagged automatically, and the possibility of
improper posterior distributions is completely avoided. We illustrate this
approach using a real dataset of asthma visits to hospital in the province
of Manitoba, Canada, during 2000--2010. The performance of the DC approach
in our application is also studied through a simulation study.
Journal: Journal of Applied Statistics
Pages: 2431-2439
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.711814
File-URL: http://hdl.handle.net/10.1080/02664763.2012.711814
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2431-2439
Template-Type: ReDIF-Article 1.0
Author-Name: Yan Ma
Author-X-Name-First: Yan
Author-X-Name-Last: Ma
Title: On inference for Kendall's τ within a longitudinal data setting
Abstract:
Kendall's τ is a non-parametric measure of correlation based on
ranks and is used in a wide range of research disciplines. Although
methods are available for making inference about Kendall's τ, none
has been extended to modeling multiple Kendall's τs arising in
longitudinal data analysis. Compounding this problem is the pervasive
issue of missing data in such study designs. In this article, we develop a
novel approach to provide inference about Kendall's τ within a
longitudinal study setting under both complete and missing data. The
proposed approach is illustrated with simulated data and applied to an HIV
prevention study.
Journal: Journal of Applied Statistics
Pages: 2441-2452
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.712954
File-URL: http://hdl.handle.net/10.1080/02664763.2012.712954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2441-2452
Template-Type: ReDIF-Article 1.0
Author-Name: Luzia Gonçalves
Author-X-Name-First: Luzia
Author-X-Name-Last: Gonçalves
Author-Name: M. Rosário de Oliveira
Author-X-Name-First: M. Rosário
Author-X-Name-Last: de Oliveira
Author-Name: Cláudia Pascoal
Author-X-Name-First: Cláudia
Author-X-Name-Last: Pascoal
Author-Name: Ana Pires
Author-X-Name-First: Ana
Author-X-Name-Last: Pires
Title: Sample size for estimating a binomial proportion: comparison of different methods
Abstract:
The poor performance of the Wald method for constructing confidence
intervals (CIs) for a binomial proportion has been demonstrated in a vast
literature. The related problem of sample size determination needs to be
updated and comparative studies are essential to understanding the
performance of alternative methods. In this paper, the sample size is
obtained for the Clopper--Pearson, Bayesian (Uniform and Jeffreys priors),
Wilson, Agresti--Coull, Anscombe, and Wald methods. Two two-step
procedures are used: one based on the expected length (EL) of the CI and
another one on its first-order approximation. In the first step, all
possible solutions that satisfy the optimal criterion are obtained. In the
second step, a single solution is proposed according to a new criterion
(e.g. highest coverage probability (CP)). In practice, a sample size
reduction is expected; therefore, we explore the behavior of the methods
admitting 30% and 50% losses. For all the methods, the ELs are
inflated, as expected, but the coverage probabilities remain close to the
original target (with few exceptions). It is not easy to suggest a method
that is optimal throughout the range (0, 1) for p.
Depending on whether the goal is to achieve CP approximately or above the
nominal level, different recommendations are made.
Journal: Journal of Applied Statistics
Pages: 2453-2473
Issue: 11
Volume: 39
Year: 2012
Month: 7
X-DOI: 10.1080/02664763.2012.713919
File-URL: http://hdl.handle.net/10.1080/02664763.2012.713919
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2453-2473
Template-Type: ReDIF-Article 1.0
Author-Name: Kanti V. Mardia
Author-X-Name-First: Kanti V.
Author-X-Name-Last: Mardia
Author-Name: John T. Kent
Author-X-Name-First: John T.
Author-X-Name-Last: Kent
Author-Name: Zhengzheng Zhang
Author-X-Name-First: Zhengzheng
Author-X-Name-Last: Zhang
Author-Name: Charles C. Taylor
Author-X-Name-First: Charles C.
Author-X-Name-Last: Taylor
Author-Name: Thomas Hamelryck
Author-X-Name-First: Thomas
Author-X-Name-Last: Hamelryck
Title: Mixtures of concentrated multivariate sine distributions with applications to bioinformatics
Abstract:
Motivated by examples in protein bioinformatics, we study a mixture model
of multivariate angular distributions. The distribution treated here
(multivariate sine distribution) is a multivariate extension of the
well-known von Mises distribution on the circle. The density of the sine
distribution has an intractable normalizing constant and here we propose
to replace it in the concentrated case by a simple approximation. We study
the EM algorithm for this distribution and apply it to a practical example
from protein bioinformatics.
Journal: Journal of Applied Statistics
Pages: 2475-2492
Issue: 11
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.719221
File-URL: http://hdl.handle.net/10.1080/02664763.2012.719221
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2475-2492
Template-Type: ReDIF-Article 1.0
Author-Name: Matthew Benigni
Author-X-Name-First: Matthew
Author-X-Name-Last: Benigni
Author-Name: Reinhard Furrer
Author-X-Name-First: Reinhard
Author-X-Name-Last: Furrer
Title: Spatio-temporal improvised explosive device monitoring: improving detection to minimise attacks
Abstract:
The improvised explosive device (IED) is a weapon of strategic influence
on today's battlefield. IED detonations occur predominantly on roads,
footpaths, or trails. Therefore, locations are best described when
constrained to the road network, and some spaces on the network are more
dangerous at specific times of the day. We propose a statistical model
that reduces the spatial location to one dimension and uses a cyclic time
as a second dimension. Based on the Poisson process methodology, we
develop normalised, inhomogeneous, bivariate intensity functions measuring
the threat of attack to support resourcing decisions. A simulation and an
analysis of attacks on a main supply route in Baghdad are given to
illustrate the proposed methods. Additionally, we provide an overview of
the growing demand for the analysis efforts in support of operations in
Afghanistan and Iraq, and provide an extensive literature review of
developments in counter-IED analysis.
Journal: Journal of Applied Statistics
Pages: 2493-2508
Issue: 11
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.719222
File-URL: http://hdl.handle.net/10.1080/02664763.2012.719222
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2493-2508
Template-Type: ReDIF-Article 1.0
Author-Name: Alan Kimber
Author-X-Name-First: Alan
Author-X-Name-Last: Kimber
Author-Name: Shah-Jalal Sarker
Author-X-Name-First: Shah-Jalal
Author-X-Name-Last: Sarker
Title: A covariance-based test for shared frailty in multivariate lifetime data
Abstract:
We decompose the score statistic for testing for shared finite variance
frailty in multivariate lifetime data into marginal and covariance-based
terms. The null properties of the covariance-based statistic are derived
in the context of parametric lifetime models. Its non-null properties are
estimated using simulation and compared with those of the score test and
two likelihood ratio tests when the underlying lifetime distribution is
Weibull. Some examples are used to illustrate the covariance-based test. A
case is made for using the covariance-based statistic as a simple
diagnostic procedure for shared frailty in a parametric exploratory
analysis of multivariate lifetime data and a link to the bivariate
Clayton--Oakes copula model is shown.
Journal: Journal of Applied Statistics
Pages: 2509-2522
Issue: 11
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.720966
File-URL: http://hdl.handle.net/10.1080/02664763.2012.720966
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2509-2522
Template-Type: ReDIF-Article 1.0
Author-Name: Irina Chis Ster
Author-X-Name-First: Irina Chis
Author-X-Name-Last: Ster
Title: Inference for serological surveys investigating past exposures to infections resulting in long-lasting immunity -- an approach using finite mixture models with concomitant information
Abstract:
This paper is concerned with developing a latent class mixture modelling
technique which efficiently exploits data from serological surveys aiming
to investigate past exposures to infections resulting in long-term or
life-lasting immunity. Mixture components of the antibody assay
distributions are associated with the serological groups in
the population, whilst the mixture probability that an individual belongs
to the positive serological group is regarded as an age-dependent
prevalence. The latter embeds a mechanistic model which explains the
infection process, accounting for heterogeneities, contact patterns in the
population and incorporating elements of study design. A Bayesian
framework for statistical inference using Markov chain Monte Carlo
estimation methods naturally accommodates missing responses in the data
and allows straightforward assessment of uncertainties in nonlinear
models. The applicability of the method is illustrated by investigating
past exposure to varicella zoster virus infection in pre-school children,
using data from a large scale UK cohort study which included a
cross-sectional serological survey based on oral fluid samples.
Journal: Journal of Applied Statistics
Pages: 2523-2542
Issue: 11
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.722608
File-URL: http://hdl.handle.net/10.1080/02664763.2012.722608
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:11:p:2523-2542
Template-Type: ReDIF-Article 1.0
Author-Name: Liu-Cang Wu
Author-X-Name-First: Liu-Cang
Author-X-Name-Last: Wu
Author-Name: Zhong-Zhan Zhang
Author-X-Name-First: Zhong-Zhan
Author-X-Name-Last: Zhang
Author-Name: Deng-Ke Xu
Author-X-Name-First: Deng-Ke
Author-X-Name-Last: Xu
Title: Variable selection in joint mean and variance models of Box--Cox transformation
Abstract:
In many applications, a single Box--Cox transformation cannot necessarily
produce the normality, constancy of variance and linearity of systematic
effects. In this paper, by establishing a heterogeneous linear regression
model for the Box--Cox transformed response, we propose a hybrid strategy,
in which variable selection is employed to reduce the dimension of the
explanatory variables in joint mean and variance models, and Box--Cox
transformation is made to remedy the response. We propose a unified
procedure which can simultaneously select significant variables in the
joint mean and variance models of Box--Cox transformation which provide a
useful extension of the ordinary normal linear regression models. With
appropriate choice of the tuning parameters, we establish the consistency
of this procedure and the oracle property of the obtained estimators.
Moreover, we also consider the maximum profile likelihood estimator of the
Box--Cox transformation parameter. Simulation studies and a real example
are used to illustrate the application of the proposed methods.
Journal: Journal of Applied Statistics
Pages: 2543-2555
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.722609
File-URL: http://hdl.handle.net/10.1080/02664763.2012.722609
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2543-2555
Template-Type: ReDIF-Article 1.0
Author-Name: Chang-Shuai Li
Author-X-Name-First: Chang-Shuai
Author-X-Name-Last: Li
Title: Analysis of hedging based on co-persistence theory
Abstract:
This article analyzes the relationship between co-persistence and
hedging, which indicates that the co-persistence ratio is just the
long-term hedging ratio. A new exhaustive search algorithm for deriving
the co-persistence ratio is presented in the article. We also develop a
new hedging strategy combining co-persistence with dynamic
hedging which can enhance the hedging effectiveness and reduce the
persistence of the hedged portfolio. Finally, our strategy is illustrated
by studying the hedging of the JIASHI300 index with the HS300 stock index
future.
Journal: Journal of Applied Statistics
Pages: 2557-2567
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.722610
File-URL: http://hdl.handle.net/10.1080/02664763.2012.722610
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2557-2567
Template-Type: ReDIF-Article 1.0
Author-Name: Elias Zintzaras
Author-X-Name-First: Elias
Author-X-Name-Last: Zintzaras
Title: The power of generalized odds ratio in assessing association in genetic studies with known mode of inheritance
Abstract:
The generalized odds ratio (ORG) is a novel model-free
approach to test the association in genetic studies by estimating the
overall risk effect based on the complete genotype distribution. However,
the power of ORG has not been explored, particularly in a
setting where the mode of inheritance is known. A population genetics
model was simulated in order to define the mode of inheritance of a
pertinent gene--disease association in advance. Then, the power of
ORG was explored based on this model and compared with the
chi-square test for trend. The model considered bi- and tri-allelic
gene--disease associations, and deviations from the Hardy--Weinberg
equilibrium (HWE). The simulations showed that bi- and tri-allelic
variants have the same pattern of power results. The power of
ORG increases with the frequency of the mutant allele
and the coefficient of selection and, of course, the degree of dominance
of the mutant allele. The deviation from HWE has a considerable impact on
power only for small values of the above parameters. The ORG
showed superiority in power compared with the chi-square test for trend
when there is deviation from HWE; otherwise, the pattern of results was
similar in both approaches.
Journal: Journal of Applied Statistics
Pages: 2569-2581
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.722611
File-URL: http://hdl.handle.net/10.1080/02664763.2012.722611
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2569-2581
Template-Type: ReDIF-Article 1.0
Author-Name: Kei Hirose
Author-X-Name-First: Kei
Author-X-Name-Last: Hirose
Author-Name: Tomoyuki Higuchi
Author-X-Name-First: Tomoyuki
Author-X-Name-Last: Higuchi
Title: Creating facial animation of characters via MoCap data
Abstract:
We consider the problem of generating 3D facial animation of characters.
An efficient procedure is realized by using the motion capture data (MoCap
data), which is obtained by tracking the facial markers from an
actor/actress. In some cases of artistic animation, the MoCap
actor/actress and the 3D character facial animation show different
expressions. For example, from the original facial MoCap data of speaking,
a user would like to create the character facial animation of speaking
with a smirk. In this paper, we propose a new easy-to-use system for
making character facial animation via MoCap data. Our system is based on
the interpolation: once the character facial expressions of the starting
and the ending frames are given, the intermediate frames are automatically
generated by information from the MoCap data. The interpolation procedure
consists of three stages. First, the time axis of animation is divided
into several intervals by the fused lasso signal approximator. In the
second stage, we use the kernel k-means clustering to
obtain control points. Finally, the interpolation is realized by using the
control points. The user can easily create a wide variety of 3D character
facial expressions by changing the control points.
Journal: Journal of Applied Statistics
Pages: 2583-2597
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.724391
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724391
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2583-2597
Template-Type: ReDIF-Article 1.0
Author-Name: Vera Hofer
Author-X-Name-First: Vera
Author-X-Name-Last: Hofer
Author-Name: Johannes Leitner
Author-X-Name-First: Johannes
Author-X-Name-Last: Leitner
Title: A bivariate Sarmanov regression model for count data with generalised Poisson marginals
Abstract:
We present a bivariate regression model for count data that allows for
positive as well as negative correlation of the response variables. The
covariance structure is based on the Sarmanov distribution and consists of
a product of generalised Poisson marginals and a factor that depends on
particular functions of the response variables. The closed form of the
probability function is derived by means of the moment-generating
function. The model is applied to a large real dataset on health care
demand. Its performance is compared with alternative models presented in
the literature. We find that our model is significantly better than or at
least equivalent to the benchmark models. It gives insights into
influences on the variance of the response variables.
Journal: Journal of Applied Statistics
Pages: 2599-2617
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.724661
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724661
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2599-2617
Template-Type: ReDIF-Article 1.0
Author-Name: Suzanne V. Landram
Author-X-Name-First: Suzanne V.
Author-X-Name-Last: Landram
Author-Name: Frank G. Landram
Author-X-Name-First: Frank G.
Author-X-Name-Last: Landram
Title: A computational understanding of partial and part determination coefficients
Abstract:
A computational understanding of partial and part determination
coefficients brings additional insight concerning their interpretations in
regression. It also enables one to easily identify synergistic
combinations. Benefits from the practical interpretation of synergism have
yet to be fully explored and exploited. Hence, this study provides a new
dimension in the analysis of data.
Journal: Journal of Applied Statistics
Pages: 2619-2626
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.724662
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724662
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2619-2626
Template-Type: ReDIF-Article 1.0
Author-Name: Zangin Zeebari
Author-X-Name-First: Zangin
Author-X-Name-Last: Zeebari
Title: Developing ridge estimation method for median regression
Abstract:
In this paper, the ridge estimation method is generalized to the median
regression. Though the least absolute deviation (LAD) estimation method is
robust in the presence of non-Gaussian or asymmetric error terms, it can
still suffer from severe multicollinearity when non-orthogonal
explanatory variables are involved. The proposed method
increases the efficiency of the LAD estimators by reducing their variance
inflation, trading a small amount of bias for a smaller mean squared
error. This paper includes an application of the new
methodology and a simulation study as well.
Journal: Journal of Applied Statistics
Pages: 2627-2638
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.724663
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724663
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2627-2638
Template-Type: ReDIF-Article 1.0
Author-Name: Byron J. Gajewski
Author-X-Name-First: Byron J.
Author-X-Name-Last: Gajewski
Author-Name: Robert Lee
Author-X-Name-First: Robert
Author-X-Name-Last: Lee
Author-Name: Nancy Dunton
Author-X-Name-First: Nancy
Author-X-Name-Last: Dunton
Title: Data envelopment analysis in the presence of measurement error: case study from the National Database of Nursing Quality Indicators® (NDNQI®)
Abstract:
Data envelopment analysis (DEA) is the most commonly used approach for
evaluating healthcare efficiency [B. Hollingsworth, The measurement of
efficiency and productivity of health care delivery. Health
Economics 17(10) (2008), pp. 1107--1128], but a long-standing
concern is that DEA assumes that data are measured without error. This is
quite unlikely, and DEA and other efficiency analysis techniques may yield
biased efficiency estimates if measurement error is ignored [B.J. Gajewski, R. Lee,
M. Bott, U. Piamjariyakul, and R.L. Taunton, On estimating the
distribution of data envelopment analysis efficiency scores: an
application to nursing homes’ care planning process.
Journal of Applied Statistics 36(9) (2009), pp. 933--944;
J. Ruggiero, Data envelopment analysis with stochastic data.
Journal of the Operational Research Society 55 (2004),
pp. 1008--1012]. We propose to address measurement error systematically
using a Bayesian method (Bayesian DEA). We will apply Bayesian DEA to data
from the National Database of Nursing Quality Indicators® to
estimate nursing units’ efficiency. Several external reliability
studies inform the posterior distribution of the measurement error on the
DEA variables. We will discuss the case of generalizing the approach to
situations where an external reliability study is not feasible.
Journal: Journal of Applied Statistics
Pages: 2639-2653
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.724664
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2639-2653
Template-Type: ReDIF-Article 1.0
Author-Name: Rand R. Wilcox
Author-X-Name-First: Rand R.
Author-X-Name-Last: Wilcox
Author-Name: David M. Erceg-Hurn
Author-X-Name-First: David M.
Author-X-Name-Last: Erceg-Hurn
Title: Comparing two dependent groups via quantiles
Abstract:
This paper considers two general ways dependent groups might be compared
based on quantiles. The first compares the quantiles of the marginal
distributions. The second focuses on the lower and upper quantiles of the
usual difference scores. Methods for comparing quantiles have been derived
that typically assume that sampling is from a continuous distribution.
There are exceptions, but generally, when sampling from a discrete
distribution where tied values are likely, extant methods can perform
poorly, even with a large sample size. One reason is that extant methods
for estimating the standard error can perform poorly. Another is that
quantile estimators based on a single order statistic, or a weighted
average of two order statistics, are not necessarily asymptotically
normal. Our main result is that when using the Harrell--Davis estimator,
good control over the Type I error probability can be achieved in
simulations via a standard percentile bootstrap method, even when there
are tied values, provided the sample sizes are not too small. In addition,
the two methods considered here can have substantially higher power than
alternative procedures. Using real data, we illustrate how quantile
comparisons can be used to gain a deeper understanding of how groups
differ.
Journal: Journal of Applied Statistics
Pages: 2655-2664
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.724665
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724665
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2655-2664
Template-Type: ReDIF-Article 1.0
Author-Name: Mariagiulia Matteucci
Author-X-Name-First: Mariagiulia
Author-X-Name-Last: Matteucci
Author-Name: Stefania Mignani
Author-X-Name-First: Stefania
Author-X-Name-Last: Mignani
Author-Name: Bernard P. Veldkamp
Author-X-Name-First: Bernard P.
Author-X-Name-Last: Veldkamp
Title: The use of predicted values for item parameters in item response theory models: an application in intelligence tests
Abstract:
In testing, item response theory models are widely used in order to
estimate item parameters and individual abilities. However, even
unidimensional models require a considerable sample size so that all
parameters can be estimated precisely. The introduction of empirical prior
information about candidates and items might reduce the number of
candidates needed for parameter estimation. Using data for IQ measurement,
this work shows how empirical information about items can be used
effectively for item calibration and in adaptive testing. First, we
propose multivariate regression trees to predict the item parameters based
on a set of covariates related to the item-solving process. Afterwards, we
compare the item parameter estimation when tree-fitted values are included
in the estimation or when they are ignored. Model estimation is fully
Bayesian, and is conducted via Markov chain Monte Carlo methods. The
results are two-fold: (a) in item calibration, it is shown that the
introduction of prior information is effective with short test lengths and
small sample sizes and (b) in adaptive testing, it is demonstrated that
the use of the tree-fitted values instead of the estimated parameters
leads to a moderate increase in the test length, but provides a
considerable saving of resources.
Journal: Journal of Applied Statistics
Pages: 2665-2683
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725034
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725034
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2665-2683
Template-Type: ReDIF-Article 1.0
Author-Name: Osvaldo Venegas
Author-X-Name-First: Osvaldo
Author-X-Name-Last: Venegas
Author-Name: Francisco Rodríguez
Author-X-Name-First: Francisco
Author-X-Name-Last: Rodríguez
Author-Name: Héctor W. Gómez
Author-X-Name-First: Héctor W.
Author-X-Name-Last: Gómez
Author-Name: Juan F. Olivares-Pacheco
Author-X-Name-First: Juan F.
Author-X-Name-Last: Olivares-Pacheco
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: Robust modeling using the generalized epsilon-skew-t distribution
Abstract:
In this paper, an alternative skew Student-t family of
distributions is studied. It is obtained as an extension of the
generalized Student-t (GS-t) family
introduced by McDonald and Newey [10]. The extension that is obtained can
be seen as a reparametrization of the skewed GS-t
distribution considered by Theodossiou [14]. A key element in the
construction of such an extension is that it can be stochastically
represented as a mixture of an epsilon-skew-power-exponential distribution
[1] and a generalized-gamma distribution. From this representation, we can
readily derive theoretical properties and easy-to-implement simulation
schemes. Furthermore, we study some of its main properties including
stochastic representation, moments and asymmetry and kurtosis
coefficients. We also derive the Fisher information matrix, which is shown
to be nonsingular for some special cases such as when the asymmetry
parameter is null, that is, in the vicinity of symmetry, and discuss
maximum-likelihood estimation. Simulation studies for some particular
cases and real data analysis are also reported, illustrating the
usefulness of the extension considered.
Journal: Journal of Applied Statistics
Pages: 2685-2698
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725462
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725462
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2685-2698
Template-Type: ReDIF-Article 1.0
Author-Name: S. Mukhopadhyay
Author-X-Name-First: S.
Author-X-Name-Last: Mukhopadhyay
Author-Name: I. Das
Author-X-Name-First: I.
Author-X-Name-Last: Das
Author-Name: K. Das
Author-X-Name-First: K.
Author-X-Name-Last: Das
Title: Selection of a stroke risk model based on transcranial Doppler ultrasound velocity
Abstract:
Increased transcranial Doppler ultrasound (TCD) velocity is an indicator
of cerebral infarction in children with sickle cell disease (SCD). In this
article, the parallel genetic algorithm (PGA) is used to select a stroke
risk model with TCD velocity as the response variable. Development of such
a stroke risk model leads to the identification of children with SCD who
are at a higher risk of stroke and their treatment in the early stages.
Using blood velocity data from SCD patients, it is shown that the PGA is
an easy-to-use computational variable selection tool. The results of the
PGA are also compared with those obtained from the stochastic search
variable selection method, the Dantzig selector and conventional
techniques such as stepwise selection and best subset selection.
Journal: Journal of Applied Statistics
Pages: 2699-2712
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725463
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725463
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2699-2712
Template-Type: ReDIF-Article 1.0
Author-Name: H. E. T. Holgersson
Author-X-Name-First: H. E. T.
Author-X-Name-Last: Holgersson
Author-Name: Peter S. Karlsson
Author-X-Name-First: Peter S.
Author-X-Name-Last: Karlsson
Title: Three estimators of the Mahalanobis distance in high-dimensional data
Abstract:
This paper treats the problem of estimating the Mahalanobis distance for
the purpose of detecting outliers in high-dimensional data. Three
ridge-type estimators are proposed and risk functions for deciding an
appropriate value of the ridge coefficient are developed. It is argued
that one of the ridge estimators has particularly tractable properties,
which is demonstrated through outlier analysis of real and simulated data.
Journal: Journal of Applied Statistics
Pages: 2713-2720
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725464
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2713-2720
Template-Type: ReDIF-Article 1.0
Author-Name: Julio C. Ferreira
Author-X-Name-First: Julio C.
Author-X-Name-Last: Ferreira
Author-Name: Marta A. Freitas
Author-X-Name-First: Marta A.
Author-X-Name-Last: Freitas
Author-Name: Enrico A. Colosimo
Author-X-Name-First: Enrico A.
Author-X-Name-Last: Colosimo
Title: Degradation data analysis for samples under unequal operating conditions: a case study on train wheels
Abstract:
Traditionally, reliability assessment of devices has been based on life
tests (LTs) or accelerated life tests (ALTs). However, these approaches
are not practical for high-reliability devices which are not likely to
fail in experiments of reasonable length. For these devices, LTs or ALTs
will end up with a high censoring rate compromising the traditional
estimation methods. An alternative approach is to monitor the devices for
a period of time and assess their reliability from the changes in
performance (degradation) observed during the experiment. In this paper,
we present a model for train wheel degradation, a failure mode
related to train derailments. We first
identify the most significant working conditions affecting the wheel wear
using a nonlinear mixed-effects (NLME) model where the log-rate of wear is
a linear function of some working conditions such as side, truck and axle
positions. Next, we estimate the failure time distribution by working
condition analytically. Point and interval estimates of reliability
figures by working condition are also obtained. We compare the results of
the analysis via an NLME to the ones obtained by an approximate
degradation analysis.
Journal: Journal of Applied Statistics
Pages: 2721-2739
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725465
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725465
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2721-2739
Template-Type: ReDIF-Article 1.0
Author-Name: Ahmed A. Soliman
Author-X-Name-First: Ahmed A.
Author-X-Name-Last: Soliman
Author-Name: A. H. Abd Ellah
Author-X-Name-First: A. H.
Author-X-Name-Last: Abd Ellah
Author-Name: N. A. Abou-Elheggag
Author-X-Name-First: N. A.
Author-X-Name-Last: Abou-Elheggag
Author-Name: A. A. Modhesh
Author-X-Name-First: A. A.
Author-X-Name-Last: Modhesh
Title: Estimation of the coefficient of variation for non-normal model using progressive first-failure-censoring data
Abstract:
The coefficient of variation (CV) is extensively used in many areas of
applied statistics including quality control and sampling. It is regarded
as a measure of stability or uncertainty, and indicates the dispersion
of data in the population relative to the population mean. In this
article, based on progressive first-failure-censored data, we study the
behavior of the CV of a random variable that follows a Burr-XII
distribution. Specifically, we compute the maximum likelihood estimates
and the confidence intervals of the CV based on the observed Fisher
information matrix, using the asymptotic distribution of the maximum
likelihood estimator and also the bootstrapping technique. In addition, we
propose to apply Markov Chain Monte Carlo techniques to tackle this
problem, which allows us to construct the credible intervals. A numerical
example based on real data is presented to illustrate the implementation
of the proposed procedure. Finally, Monte Carlo simulations are performed
to observe the behavior of the proposed methods.
Journal: Journal of Applied Statistics
Pages: 2741-2758
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725466
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2741-2758
Template-Type: ReDIF-Article 1.0
Author-Name: Yinghui Wei
Author-X-Name-First: Yinghui
Author-X-Name-Last: Wei
Author-Name: Peter Neal
Author-X-Name-First: Peter
Author-X-Name-Last: Neal
Author-Name: Sandra Telfer
Author-X-Name-First: Sandra
Author-X-Name-Last: Telfer
Author-Name: Mike Begon
Author-X-Name-First: Mike
Author-X-Name-Last: Begon
Title: Statistical analysis of an endemic disease from a capture--recapture experiment
Abstract:
There are a number of statistical techniques for analysing epidemic
outbreaks. However, many diseases are endemic within populations and the
analysis of such diseases is complicated by changing population
demography. Motivated by the spread of cowpox amongst rodent populations,
a combined mathematical model for population and disease dynamics is
introduced. A Markov chain Monte Carlo algorithm is then constructed to
make statistical inference for the model based on data obtained from
a capture--recapture experiment. The statistical analysis is used to
identify the key elements in the spread of the cowpox virus.
Journal: Journal of Applied Statistics
Pages: 2759-2773
Issue: 12
Volume: 39
Year: 2012
Month: 8
X-DOI: 10.1080/02664763.2012.725467
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:39:y:2012:i:12:p:2759-2773
Template-Type: ReDIF-Article 1.0
Author-Name: Robert G. Aykroyd
Author-X-Name-First: Robert G.
Author-X-Name-Last: Aykroyd
Title: Editorial
Journal: Journal of Applied Statistics
Pages: 1-1
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2013.745767
File-URL: http://hdl.handle.net/10.1080/02664763.2013.745767
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:1-1
Template-Type: ReDIF-Article 1.0
Author-Name: Russell J. Bowater
Author-X-Name-First: Russell J.
Author-X-Name-Last: Bowater
Author-Name: Gabriel Escarela
Author-X-Name-First: Gabriel
Author-X-Name-Last: Escarela
Title: Heterogeneity and study size in random-effects meta-analysis
Abstract:
It is well known that heterogeneity between studies in a meta-analysis
can be either caused by diversity, for example, variations in populations
and interventions, or caused by bias, that is, variations in design
quality and conduct of the studies. Heterogeneity that is due to bias is
difficult to deal with. On the other hand, heterogeneity that is due to
diversity is taken into account by a standard random-effects model.
However, such a model generally assumes that heterogeneity does not vary
according to study-level variables such as the size of the studies in the
meta-analysis and the type of study design used. This paper develops
models that allow for this type of variation in heterogeneity and
discusses the properties of the resulting methods. The models are fitted
using the maximum-likelihood method and by modifying the Paule--Mandel
method. Furthermore, a real-world argument is given to support the
assumption that the inter-study variance is inversely proportional to
study size. Under this assumption, the corresponding random-effects method
is shown to be connected with standard fixed-effect meta-analysis in a way
that may well appeal to many clinicians. The models and methods that are
proposed are applied to data from two large systematic reviews.
Journal: Journal of Applied Statistics
Pages: 2-16
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.700448
File-URL: http://hdl.handle.net/10.1080/02664763.2012.700448
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:2-16
Template-Type: ReDIF-Article 1.0
Author-Name: Francis Pike
Author-X-Name-First: Francis
Author-X-Name-Last: Pike
Author-Name: Lisa Weissfeld
Author-X-Name-First: Lisa
Author-X-Name-Last: Weissfeld
Title: Joint modeling of censored longitudinal and event time data
Abstract:
Censoring of a longitudinal outcome often occurs when data are collected
in a biomedical study where the interest is in the survival and/or
longitudinal experiences of a study population. In the setting considered
herein, we encountered upper and lower censored data as the result of
restrictions imposed on measurements from a kinetic model producing
“biologically implausible” kidney clearances. The goal of
this paper is to outline the use of a joint model to determine the
association between a censored longitudinal outcome and a time to event
endpoint. This paper extends Guo and Carlin's [6] paper to accommodate
censored longitudinal data, in a commercially available software platform,
by linking a mixed effects Tobit model to a suitable parametric survival
distribution. Our simulation results showed that our joint Tobit model
outperforms a joint model made up of the more naïve or
“fill-in” method for the longitudinal component. In this
case, the upper and/or lower limits of censoring are replaced by the limit
of detection. We illustrated the use of this approach with example data
from the hemodialysis (HEMO) study [3] and examined the association
between doubly censored kidney clearance values and survival.
Journal: Journal of Applied Statistics
Pages: 17-27
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.725468
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:17-27
Template-Type: ReDIF-Article 1.0
Author-Name: J. M. Fernández-Ponce
Author-X-Name-First: J. M.
Author-X-Name-Last: Fernández-Ponce
Author-Name: F. Palacios-Rodríguez
Author-X-Name-First: F.
Author-X-Name-Last: Palacios-Rodríguez
Author-Name: M. R. Rodríguez-Griñolo
Author-X-Name-First: M. R.
Author-X-Name-Last: Rodríguez-Griñolo
Title: Bayesian influence diagnostics in radiocarbon dating
Abstract:
Linear models constitute the primary statistical technique for any
experimental science. A major topic in this area is the detection of
influential subsets of data, that is, of observations that are influential
in terms of their effect on the estimation of parameters in linear
regression or of the total population parameters. Numerous studies on
radiocarbon dating propose a consensus value and remove possible
outliers after the corresponding testing. An influence analysis for the
consensus value from a Bayesian perspective is developed in this article.
Journal: Journal of Applied Statistics
Pages: 28-39
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.725531
File-URL: http://hdl.handle.net/10.1080/02664763.2012.725531
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:28-39
Template-Type: ReDIF-Article 1.0
Author-Name: Antonio S.M. Arroyo
Author-X-Name-First: Antonio S.M.
Author-X-Name-Last: Arroyo
Author-Name: Antonio Garc�a-Ferrer
Author-X-Name-First: Antonio
Author-X-Name-Last: Garc�a-Ferrer
Author-Name: Aránzazu de Juan Fernández
Author-X-Name-First: Aránzazu
Author-X-Name-Last: de Juan Fernández
Author-Name: Roc�o Sánchez-Mangas
Author-X-Name-First: Roc�o
Author-X-Name-Last: Sánchez-Mangas
Title: Lower posterior death probabilities from a quick medical response in road traffic accidents
Abstract:
Introduction: We use data from Spain on road and
motorway traffic accidents in May 2004 to quantify the statistical
association between quick medical response time and mortality rate.
Method: Probit and logit parameters are estimated by a
Bayesian method in which samples from the posterior densities are obtained
through an MCMC simulation scheme. We provide posterior credible intervals
and posterior partial effects of a quick medical response at several time
levels over which we express our prior beliefs. Results:
A reduction of 5 min, from a 25-min response-time level, is
associated with lower posterior probabilities of death in road and
motorway accidents of 24% and 30%, respectively.
Journal: Journal of Applied Statistics
Pages: 40-58
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.727788
File-URL: http://hdl.handle.net/10.1080/02664763.2012.727788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:40-58
Template-Type: ReDIF-Article 1.0
Author-Name: Paul H. Garthwaite
Author-X-Name-First: Paul H.
Author-X-Name-Last: Garthwaite
Author-Name: Shafeeqah A. Al-Awadhi
Author-X-Name-First: Shafeeqah A.
Author-X-Name-Last: Al-Awadhi
Author-Name: Fadlalla G. Elfadaly
Author-X-Name-First: Fadlalla G.
Author-X-Name-Last: Elfadaly
Author-Name: David J. Jenkinson
Author-X-Name-First: David J.
Author-X-Name-Last: Jenkinson
Title: Prior distribution elicitation for generalized linear and piecewise-linear models
Abstract:
An elicitation method is proposed for quantifying subjective opinion
about the regression coefficients of a generalized linear model. Opinion
about the relationship between a continuous predictor variable and the
dependent variable is modelled by a piecewise-linear function, giving a
flexible model that can
represent a wide variety of opinion. To quantify his or her opinions, the
expert uses an interactive computer program, performing assessment tasks
that involve drawing graphs and bar-charts to specify medians and other
quantiles. Opinion about the regression coefficients is represented by a
multivariate normal distribution whose parameters are determined from the
assessments. It is practical to use the procedure with models containing a
large number of parameters. This is illustrated through practical examples
and the benefit from using prior knowledge is examined through
cross-validation.
Journal: Journal of Applied Statistics
Pages: 59-75
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.734794
File-URL: http://hdl.handle.net/10.1080/02664763.2012.734794
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:59-75
Template-Type: ReDIF-Article 1.0
Author-Name: Pablo Martínez-Camblor
Author-X-Name-First: Pablo
Author-X-Name-Last: Martínez-Camblor
Author-Name: Norberto Corral
Author-X-Name-First: Norberto
Author-X-Name-Last: Corral
Author-Name: Jesus María de la Hera
Author-X-Name-First: Jesus
Author-X-Name-Last: María de la Hera
Title: Hypothesis test for paired samples in the presence of missing data
Abstract:
Missing data are present in almost all statistical analyses. In simple
paired-design tests, when a subject is missing one of the involved
variables in the so-called partially overlapping samples
scheme, it is usually discarded from the analysis. The lack of consistency
between the information reported in the univariate and multivariate
analyses is, perhaps, the main consequence. Although randomness of the
missing mechanism (missingness completely at random) is a usual and
necessary assumption in this particular situation, the presence of missing
data could lead to serious inconsistencies in the reported conclusions. In this
paper, the authors develop a simple and direct procedure which allows
using the whole available information in order to perform paired tests. In
particular, the proposed methodology is applied to check the equality
among the means from two paired samples. In addition, the use of two
different resampling techniques is also explored. Finally, real-world data
are analysed.
Journal: Journal of Applied Statistics
Pages: 76-87
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.734795
File-URL: http://hdl.handle.net/10.1080/02664763.2012.734795
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:76-87
Template-Type: ReDIF-Article 1.0
Author-Name: Isabella Sulis
Author-X-Name-First: Isabella
Author-X-Name-Last: Sulis
Author-Name: Vincenza Capursi
Author-X-Name-First: Vincenza
Author-X-Name-Last: Capursi
Title: Building up adjusted indicators of students’ evaluation of university courses using generalized item response models
Abstract:
This article advances a proposal for building up adjusted composite
indicators of the quality of university courses from students’
assessments. The flexible framework of Generalized Item Response Models is
adopted here for controlling the sources of heterogeneity in the data
structure that make evaluations across courses not directly comparable.
Specifically, it allows us to: jointly model students’ ratings on
the set of items which define the quality of university courses;
explicitly consider the dimensionality of the items composing the
evaluation form; evaluate and remove the effect of potential confounding
factors which may affect students’ evaluation; model the
intra-cluster variability at course level. The approach simultaneously
deals with: (i) multilevel data structure; (ii) multidimensional latent
trait; (iii) personal explanatory latent regression models. The paper pays
attention to the potential of such a flexible approach in the analysis of
students’ evaluations of university courses in order to explore both how the
quality of the different aspects (teaching, management, etc.) is perceived
by students and how to make meaningful comparisons across them on the
basis of adjusted indicators.
Journal: Journal of Applied Statistics
Pages: 88-102
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.734796
File-URL: http://hdl.handle.net/10.1080/02664763.2012.734796
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:88-102
Template-Type: ReDIF-Article 1.0
Author-Name: K. J. Kachiashvili
Author-X-Name-First: K. J.
Author-X-Name-Last: Kachiashvili
Author-Name: M. A. Hashmi
Author-X-Name-First: M. A.
Author-X-Name-Last: Hashmi
Author-Name: A. Mueed
Author-X-Name-First: A.
Author-X-Name-Last: Mueed
Title: Quasi-optimal Bayesian procedures of many hypotheses testing
Abstract:
Quasi-optimal procedures for testing many hypotheses are described in this
paper. They significantly simplify the Bayesian algorithms of hypothesis
testing and the computation of the risk function. Relations are given that
allow the average risks in the optimal problems to be estimated. The
general solutions obtained are reduced to concrete formulae for a
multivariate normal probability distribution. Methods for the approximate
computation of the risk functions in Bayesian problems of testing many
hypotheses are offered. The properties and interrelations of the developed
methods and algorithms are investigated. The validity of the obtained
results and the conclusions drawn is demonstrated on the basis of a
simulation study.
Journal: Journal of Applied Statistics
Pages: 103-122
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.734797
File-URL: http://hdl.handle.net/10.1080/02664763.2012.734797
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:103-122
Template-Type: ReDIF-Article 1.0
Author-Name: Yanchun Bao
Author-X-Name-First: Yanchun
Author-X-Name-Last: Bao
Author-Name: Hongsheng Dai
Author-X-Name-First: Hongsheng
Author-X-Name-Last: Dai
Author-Name: Tao Wang
Author-X-Name-First: Tao
Author-X-Name-Last: Wang
Author-Name: Sung-Kiang Chuang
Author-X-Name-First: Sung-Kiang
Author-X-Name-Last: Chuang
Title: A joint modelling approach for clustered recurrent events and death events
Abstract:
In dental implant research studies, events such as implant complications
including pain or infection may be observed recurrently before failure
events, i.e. the death of implants. It is natural to assume that recurrent
events and failure events are correlated to each other, since they happen
on the same implant (subject) and complication times have strong effects
on the implant survival time. On the other hand, each patient may have
more than one implant. Therefore these recurrent events or failure events
are clustered since implant complication times or failure times within the
same patient (cluster) are likely to be correlated. The overall implant
survival times and recurrent complication times are both interesting to
us. In this paper, a joint modelling approach is proposed for modelling
complication events and dental implant survival times simultaneously. The
proposed method uses a frailty process to model the correlation within
cluster and the correlation within subjects. We use Bayesian methods to
obtain estimates of the parameters. The performance of the joint models is
shown via simulation studies and data analysis.
Journal: Journal of Applied Statistics
Pages: 123-140
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.735225
File-URL: http://hdl.handle.net/10.1080/02664763.2012.735225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:123-140
Template-Type: ReDIF-Article 1.0
Author-Name: Eugenia Nissi
Author-X-Name-First: Eugenia
Author-X-Name-Last: Nissi
Author-Name: Annalina Sarra
Author-X-Name-First: Annalina
Author-X-Name-Last: Sarra
Title: A simulation study on the hybrid nature of Tango's index
Abstract:
Since the early 1990s, there has been an increasing interest in
statistical methods for detecting global spatial clustering in data sets.
Tango's index is one of the most widely used spatial statistics for
assessing whether spatially distributed disease rates are independent or
clustered. Interestingly, this statistic can be partitioned into the sum
of two terms: one term is similar to the usual chi-square statistic, being
based on deviation patterns between the observed and expected values, and
the other term, similar to Moran's I, is able to detect the proximity of
similar values. In this paper, we examine this hybrid nature of Tango's
index. The goal is to evaluate the possibility of distinguishing the
spatial sources of clustering: lack of fit or spatial autocorrelation. To
comply with the aims of the work, a simulation study is performed, by
which examples of patterns driving the goodness-of-fit and spatial
autocorrelation components of the statistic are provided. As for the
latter aspect, it is worth noting that inducing spatial association among
count data without adding lack of fit is not an easy task. In this
respect, the overlapping sums method is adopted. The main findings of the
simulation experiment are illustrated and a comparison with previous
research on this topic is also provided.
Journal: Journal of Applied Statistics
Pages: 141-151
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.738189
File-URL: http://hdl.handle.net/10.1080/02664763.2012.738189
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:141-151
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Functional time series approach for forecasting very short-term electricity demand
Abstract:
This empirical paper presents a number of functional modelling and
forecasting methods for predicting very short-term (such as
minute-by-minute) electricity demand. The proposed functional methods
slice a seasonal univariate time series (TS) into a TS of curves; reduce
the dimensionality of curves by applying functional principal component
analysis before using a univariate TS forecasting method and regression
techniques. As data points in the daily electricity demand are
sequentially observed, a forecast updating method can greatly improve the
accuracy of point forecasts. Moreover, we present a non-parametric
bootstrap approach to construct and update prediction intervals, and
compare the point and interval forecast accuracy with some naive benchmark
methods. The proposed methods are illustrated by the half-hourly
electricity demand from Monday to Sunday in South Australia.
Journal: Journal of Applied Statistics
Pages: 152-168
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.740619
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:152-168
Template-Type: ReDIF-Article 1.0
Author-Name: Donald W. Zimmerman
Author-X-Name-First: Donald W.
Author-X-Name-Last: Zimmerman
Title: Heterogeneity of variance and biased hypothesis tests
Abstract:
This study examined the influence of heterogeneity of variance on Type I
error rates and power of the independent-samples Student's
t-test of equality of means on samples of scores from
normal and 10 non-normal distributions. The same test of equality of means
was performed on corresponding rank-transformed scores. For many
non-normal distributions, both versions produced anomalous power
functions, resulting partly from the fact that the hypothesis test was
biased, so that under some conditions, the probability of
rejecting H0 decreased as the difference
between means increased. In all cases where bias occurred, the
t-test on ranks exhibited substantially greater bias than
the t-test on scores. This anomalous result was
independent of the more familiar changes in Type I error rates and power
attributable to unequal sample sizes combined with unequal variances.
Journal: Journal of Applied Statistics
Pages: 169-193
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.740620
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:169-193
Template-Type: ReDIF-Article 1.0
Author-Name: S. H. Lin
Author-X-Name-First: S. H.
Author-X-Name-Last: Lin
Author-Name: R. S. Wang
Author-X-Name-First: R. S.
Author-X-Name-Last: Wang
Title: Modified method on the means for several log-normal distributions
Abstract:
Among statistical inferences, one of the main interests is drawing the
inferences about the log-normal means since the log-normal distribution is
a well-known candidate model for analyzing positive and right-skewed data.
In the past, the researchers only focused on one or two log-normal
populations or used the large sample theory or quadratic procedure to deal
with several log-normal distributions. In this article, we focus on making
inferences on several log-normal means based on the modification of the
quadratic method, in which the researchers often used the vector of the
generalized variables to deal with the means of the symmetric
distributions. Simulation studies show that the quadratic method performs
well only for symmetric distributions. However, the modified procedure
fits both symmetric and skewed distributions. The numerical results show
that the proposed modified procedure can provide confidence intervals with
coverage probabilities close to the nominal level and that the hypothesis
tests perform with satisfactory results.
Journal: Journal of Applied Statistics
Pages: 194-208
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.740622
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740622
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:194-208
Template-Type: ReDIF-Article 1.0
Author-Name: Xu-Qing Liu
Author-X-Name-First: Xu-Qing
Author-X-Name-Last: Liu
Author-Name: Feng Gao
Author-X-Name-First: Feng
Author-X-Name-Last: Gao
Author-Name: Zhen-Feng Yu
Author-X-Name-First: Zhen-Feng
Author-X-Name-Last: Yu
Title: Improved ridge estimators in a linear regression model
Abstract:
In this paper, the notion of the improved ridge estimator (IRE) is put
forward in the linear regression model y = Xβ + e. The problem arises from
augmenting the equation 0 = c′α + ε to the model instead of 0 = Cα + ε.
Three special IREs
are considered and studied under the mean-squared error criterion and the
prediction error sum of squares criterion. The simulations demonstrate
that the proposed estimators are effective and recommendable, especially
when multicollinearity is severe.
Journal: Journal of Applied Statistics
Pages: 209-220
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.740623
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:209-220
Template-Type: ReDIF-Article 1.0
Author-Name: Cesaltina Pires
Author-X-Name-First: Cesaltina
Author-X-Name-Last: Pires
Author-Name: Andreia Dionísio
Author-X-Name-First: Andreia
Author-X-Name-Last: Dionísio
Author-Name: Luís Coelho
Author-X-Name-First: Luís
Author-X-Name-Last: Coelho
Title: Estimating utility functions using generalized maximum entropy
Abstract:
This paper estimates von Neumann and Morgenstern utility functions using
the generalized maximum entropy (GME), applied to data obtained by utility
elicitation methods. Given the statistical advantages of this approach, we
provide a comparison of the performance of the GME estimator with ordinary
least square (OLS) in a real data small sample setup. The results confirm
the ones obtained for small samples through Monte Carlo simulations. The
difference between the two estimators is small and it decreases as the
width of the parameter support vector increases. Moreover, the GME
estimator is more precise than the OLS one. Overall, the results suggest
that GME is an interesting alternative to OLS in the estimation of utility
functions when data are generated by utility elicitation methods.
Journal: Journal of Applied Statistics
Pages: 221-234
Issue: 1
Volume: 40
Year: 2013
Month: 1
X-DOI: 10.1080/02664763.2012.740625
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:1:p:221-234
Template-Type: ReDIF-Article 1.0
Author-Name: Faping Duan
Author-X-Name-First: Faping
Author-X-Name-Last: Duan
Author-Name: Daniel Ogden
Author-X-Name-First: Daniel
Author-X-Name-Last: Ogden
Author-Name: Ling Xu
Author-X-Name-First: Ling
Author-X-Name-Last: Xu
Author-Name: Kang Liu
Author-X-Name-First: Kang
Author-X-Name-Last: Liu
Author-Name: George Lust
Author-X-Name-First: George
Author-X-Name-Last: Lust
Author-Name: Jody Sandler
Author-X-Name-First: Jody
Author-X-Name-Last: Sandler
Author-Name: Nathan L. Dykes
Author-X-Name-First: Nathan L.
Author-X-Name-Last: Dykes
Author-Name: Lan Zhu
Author-X-Name-First: Lan
Author-X-Name-Last: Zhu
Author-Name: Steven Harris
Author-X-Name-First: Steven
Author-X-Name-Last: Harris
Author-Name: Paul Jones
Author-X-Name-First: Paul
Author-X-Name-Last: Jones
Author-Name: Rory J. Todhunter
Author-X-Name-First: Rory J.
Author-X-Name-Last: Todhunter
Author-Name: Zhiwu Zhang
Author-X-Name-First: Zhiwu
Author-X-Name-Last: Zhang
Title: Principal component analysis of canine hip dysplasia phenotypes and their statistical power for genome-wide association mapping
Abstract:
The aims of this study were to undertake principal component analysis
(PCA) of hip dysplasia (HD) and to examine the power of the principal
components (PCs) in genome-wide association studies. A cohort of 278 dogs
was used for the PCA and a cohort of 369 dogs for genotyping. The distraction
index (DI), the dorsolateral subluxation (DLS) score, the Norberg angle
(NA), and the extended-hip radiographic (EHR) score were used for the PCA.
One thousand single-nucleotide polymorphisms (SNPs) (of 23,500) were used
to simulate genetic locus sharing between the HD phenotypes and 1000 SNPs
were used to calculate the genetic mapping power of the PCs. The DI and
the DLS score (first group) reflected hip laxity and the NA and the EHR
score (second group) reflected the congruency between the femoral head and
acetabulum. The average hip measurements of the two groups reflected in
the first PC captured 55% of total radiographic variation. The first four
PCs captured 90% of the total variation. The PCs had higher statistical
mapping power to detect pleiotropic quantitative trait loci (QTL) than the
raw phenotypes. The PCA demonstrated for the first time that HD can be
reduced mathematically into simpler components essential for its genetic
dissection. Genes that contribute jointly to all four radiographic hip
phenotypes can be detected by mapping their first four PCs, while those
contributing to individual phenotypes can be mapped by association with
the individual raw phenotype.
Journal: Journal of Applied Statistics
Pages: 235-251
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740617
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740617
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:235-251
Template-Type: ReDIF-Article 1.0
Author-Name: Liyong Fu
Author-X-Name-First: Liyong
Author-X-Name-Last: Fu
Author-Name: Yuancai Lei
Author-X-Name-First: Yuancai
Author-X-Name-Last: Lei
Author-Name: Ram P. Sharma
Author-X-Name-First: Ram P.
Author-X-Name-Last: Sharma
Author-Name: Shouzheng Tang
Author-X-Name-First: Shouzheng
Author-X-Name-Last: Tang
Title: Parameter estimation of nonlinear mixed-effects models using first-order conditional linearization and the EM algorithm
Abstract:
Nonlinear mixed-effects (NLME) models are flexible enough to handle
repeated-measures data from various disciplines. In this article, we
propose both maximum-likelihood and restricted maximum-likelihood
estimations of NLME models using first-order conditional expansion (FOCE)
and the expectation--maximization (EM) algorithm. The FOCE-EM algorithm
implemented in the ForStat procedure SNLME is compared
with the Lindstrom and Bates (LB) algorithm implemented in both the SAS
macro NLINMIX and the S-Plus/R function nlme in terms of
computational efficiency and statistical properties. Two real-world data
sets, an orange tree data set and a Chinese fir (Cunninghamia
lanceolata) data set, and a simulated data set were used for
evaluation. FOCE-EM converged for all mixed models derived from the base
model in the two real-world cases, while LB did not, especially for the
models in which random effects are simultaneously considered in several
parameters to account for between-subject variation. However, both
algorithms had identical estimated parameters and fit statistics for the
converged models. We therefore recommend using FOCE-EM in NLME models,
particularly when convergence is a concern in model selection.
Journal: Journal of Applied Statistics
Pages: 252-265
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740621
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:252-265
Template-Type: ReDIF-Article 1.0
Author-Name: Yuanhua Feng
Author-X-Name-First: Yuanhua
Author-X-Name-Last: Feng
Title: An iterative plug-in algorithm for decomposing seasonal time series using the Berlin Method
Abstract:
We propose a fast data-driven procedure for decomposing seasonal time
series using the Berlin Method, the procedure used, e.g. by the German
Federal Statistical Office in this context. The formula of the asymptotic
optimal bandwidth h_A is obtained. Methods for
estimating the unknowns in h_A are proposed.
The algorithm is developed by adapting the well-known iterative plug-in
idea to time series decomposition. Asymptotic behaviour of the proposal is
investigated. Some computational aspects are discussed in detail. Data
examples show that the proposal works very well in practice and that
data-driven bandwidth selection offers new possibilities to improve the
Berlin Method. Deep insights into the iterative plug-in rule are also
provided.
Journal: Journal of Applied Statistics
Pages: 266-281
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740626
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:266-281
Template-Type: ReDIF-Article 1.0
Author-Name: Francesca Greselin
Author-X-Name-First: Francesca
Author-X-Name-Last: Greselin
Author-Name: Leo Pasquazzi
Author-X-Name-First: Leo
Author-X-Name-Last: Pasquazzi
Author-Name: Ričardas Zitikis
Author-X-Name-First: Ričardas
Author-X-Name-Last: Zitikis
Title: Contrasting the Gini and Zenga indices of economic inequality
Abstract:
The current financial turbulence in Europe inspires and perhaps requires
researchers to rethink how to measure incomes, wealth, and other
parameters of interest to policy-makers and others. The noticeable
increase in disparities between less and more fortunate individuals
suggests that measures based upon comparing the incomes of less fortunate
with the mean of the entire population may not be adequate. The classical
Gini and related indices of economic inequality, however, are based
exactly on such comparisons. It is because of this reason that in this
paper we explore and contrast the classical Gini index with a new Zenga
index, the latter being based on comparisons of the means of less and more
fortunate sub-populations, irrespectively of the threshold that might be
used to delineate the two sub-populations. The empirical part of the paper
is based on the 2001 wave of the European Community Household Panel data
set provided by EuroStat. Even though sample sizes appear to be large, we
supplement the estimated Gini and Zenga indices with measures of
variability in the form of normal, t-bootstrap, and
bootstrap bias-corrected and accelerated confidence intervals.
Journal: Journal of Applied Statistics
Pages: 282-297
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740627
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740627
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:282-297
Template-Type: ReDIF-Article 1.0
Author-Name: Ileana Baldi
Author-X-Name-First: Ileana
Author-X-Name-Last: Baldi
Author-Name: Eva Pagano
Author-X-Name-First: Eva
Author-X-Name-Last: Pagano
Author-Name: Paola Berchialla
Author-X-Name-First: Paola
Author-X-Name-Last: Berchialla
Author-Name: Alessandro Desideri
Author-X-Name-First: Alessandro
Author-X-Name-Last: Desideri
Author-Name: Alberto Ferrando
Author-X-Name-First: Alberto
Author-X-Name-Last: Ferrando
Author-Name: Franco Merletti
Author-X-Name-First: Franco
Author-X-Name-Last: Merletti
Author-Name: Dario Gregori
Author-X-Name-First: Dario
Author-X-Name-Last: Gregori
Title: Modeling healthcare costs in simultaneous presence of asymmetry, heteroscedasticity and correlation
Abstract:
Highly skewed outcome distributions observed across clusters are common
in medical research. The aim of this paper is to understand how regression
models widely used for accommodating asymmetry fit clustered data under
heteroscedasticity. In a simulation study, we provide evidence on the
performance of the Gamma Generalized Linear Mixed Model (GLMM) and
log-Linear Mixed-Effect (LME) model under a variety of data-generating
mechanisms. Two case studies from the health expenditures literature, the
randomized clinical trial on the cost of strategies after myocardial
infarction and the European Pressure
Ulcer Advisory Panel hospital prevalence survey of pressure ulcers, are
analyzed and discussed. According to simulation results, the log-LME model
for a Gamma response can lead to estimations that are biased by as much as
10% of the true value, depending on the error variance. In the Gamma GLMM,
the bias never exceeds 1%, regardless of the extent of heteroscedasticity,
and the confidence intervals perform as nominally stated under most
conditions. The Gamma GLMM with a log link seems to be more robust to both
Gamma and log-normal generating mechanisms than the log-LME model.
Journal: Journal of Applied Statistics
Pages: 298-310
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740628
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740628
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:298-310
Template-Type: ReDIF-Article 1.0
Author-Name: Lai Wei
Author-X-Name-First: Lai
Author-X-Name-Last: Wei
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Title: A comment on sample size calculations for binomial confidence intervals
Abstract:
In this article we examine sample size calculations for a binomial
proportion based on the confidence interval width of the Agresti--Coull,
Wald and Wilson score intervals. We point out that the commonly used
methods based on known and fixed standard errors cannot guarantee the
desired confidence interval width given a hypothesized proportion.
Therefore, a new adjusted sample size calculation method is introduced,
which is based on the conditional expectation of the width of the
confidence interval given the hypothesized proportion. With the reduced
sample size, the coverage probability can still be maintained at the
nominal level and is very competitive with the coverage probability for
the original sample size.
Journal: Journal of Applied Statistics
Pages: 311-319
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740629
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740629
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:311-319
Template-Type: ReDIF-Article 1.0
Author-Name: J. Andrew Howe
Author-X-Name-First: J. Andrew
Author-X-Name-Last: Howe
Author-Name: Hamparsum Bozdogan
Author-X-Name-First: Hamparsum
Author-X-Name-Last: Bozdogan
Title: Robust mixture model cluster analysis using adaptive kernels
Abstract:
The traditional mixture model assumes that a dataset is composed of
several populations of Gaussian distributions. In real life, however, data
often do not fit the restrictions of normality very well. It is likely
that data from a single population exhibiting either asymmetrical or
heavy-tail behavior could be erroneously modeled as two populations,
resulting in suboptimal decisions. To avoid these pitfalls, we generalize
the mixture model using adaptive kernel density estimators. Because kernel
density estimators enforce no functional form, we can adapt to non-normal
asymmetric, kurtotic, and tail characteristics in each population
independently. This, in effect, robustifies mixture modeling. We adapt two
computational algorithms, genetic algorithm with regularized Mahalanobis
distance and genetic expectation maximization algorithm, to optimize the
kernel mixture model (KMM) and use results from robust estimation theory
in order to data-adaptively regularize both. Finally, we likewise extend
the information criterion ICOMP to score the KMM. We use these tools to
simultaneously select the best mixture model and classify all observations
without making any subjective decisions. The performance
of the KMM is demonstrated on two medical datasets; in both cases, we
recover the clinically determined group structure and substantially
improve patient classification rates over the Gaussian mixture model.
Journal: Journal of Applied Statistics
Pages: 320-336
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.740630
File-URL: http://hdl.handle.net/10.1080/02664763.2012.740630
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:320-336
Template-Type: ReDIF-Article 1.0
Author-Name: Andreas Quatember
Author-X-Name-First: Andreas
Author-X-Name-Last: Quatember
Author-Name: Monika Cornelia Hausner
Author-X-Name-First: Monika Cornelia
Author-X-Name-Last: Hausner
Title: A family of methods for statistical disclosure control
Abstract:
Statistical disclosure control (SDC) is a balancing act between mandatory
data protection and the comprehensible demand from researchers for access
to original data. In this paper, a family of methods is defined to
‘mask’ sensitive variables before data files can be
released. In the first step, the variable to be masked is
‘cloned’ (C). Then, the duplicated variable as a whole or
just a part of it is ‘suppressed’ (S). The masking
procedure's third step ‘imputes’ (I) data for these
artificial missings. Then, the original variable can be deleted and its
masked substitute has to serve as the basis for the analysis of data. The
idea of this general ‘CSI framework’ is to open the wide
field of imputation methods for SDC. The method applied in the I-step can
make use of available auxiliary variables including the original variable.
Different members of this family of methods delivering variance estimators
are discussed in some detail. Furthermore, a simulation study analyzes
various methods belonging to the family with respect to both the quality
of parameter estimation and privacy protection. Based on the results
obtained, recommendations are formulated for different estimation tasks.
Journal: Journal of Applied Statistics
Pages: 337-346
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.743975
File-URL: http://hdl.handle.net/10.1080/02664763.2012.743975
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:337-346
Template-Type: ReDIF-Article 1.0
Author-Name: Xiting Cao
Author-X-Name-First: Xiting
Author-X-Name-Last: Cao
Author-Name: Baolin Wu
Author-X-Name-First: Baolin
Author-X-Name-Last: Wu
Author-Name: Marshall I. Hertz
Author-X-Name-First: Marshall I.
Author-X-Name-Last: Hertz
Title: Empirical null distribution-based modeling of multi-class differential gene expression detection
Abstract:
In this paper, we study the multi-class differential gene expression
detection for microarray data. We propose a likelihood-based approach to
estimating an empirical null distribution to incorporate gene interactions
and provide a more accurate false-positive control than the commonly used
permutation or theoretical null distribution-based approach. We propose to
rank important genes by p-values or local false discovery
rate based on the estimated empirical null distribution. Through
simulations and application to lung transplant microarray data, we
illustrate the competitive performance of the proposed method.
Journal: Journal of Applied Statistics
Pages: 347-357
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.743976
File-URL: http://hdl.handle.net/10.1080/02664763.2012.743976
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:347-357
Template-Type: ReDIF-Article 1.0
Author-Name: Baolin Wu
Author-X-Name-First: Baolin
Author-X-Name-Last: Wu
Title: Sparse cluster analysis of large-scale discrete variables with application to single nucleotide polymorphism data
Abstract:
Currently, extremely large-scale genetic data present significant
challenges for cluster analysis. Most of the existing clustering methods
are typically built on the Euclidean distance and geared toward analyzing
continuous response. They work well for clustering, e.g. microarray gene
expression data, but often perform poorly for clustering, e.g. large-scale
single nucleotide polymorphism (SNP) data. In this paper, we study the
penalized latent class model for clustering extremely large-scale discrete
data. The penalized latent class model takes into account the discrete
nature of the response using appropriate generalized linear models and
adopts the lasso penalized likelihood approach for simultaneous model
estimation and selection of important covariates. We develop very
efficient numerical algorithms for model estimation based on the iterative
coordinate descent approach and further develop the
expectation--maximization algorithm to incorporate and model missing
values. We use simulation studies and applications to the international
HapMap SNP data to illustrate the competitive performance of the penalized
latent class model.
Journal: Journal of Applied Statistics
Pages: 358-367
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.743977
File-URL: http://hdl.handle.net/10.1080/02664763.2012.743977
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:358-367
Template-Type: ReDIF-Article 1.0
Author-Name: Dongliang Wang
Author-X-Name-First: Dongliang
Author-X-Name-Last: Wang
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Title: Joint confidence region estimation of L-moment ratios with an extension to right censored data
Abstract:
L-moments, defined as specific linear combinations of expectations of
order statistics, have been advocated by Hosking [7] and others in the
literature as meaningful replacements for classic moments in a wide
variety of applications. One particular use of L-moments is to classify
distributions based on the so-called L-skewness and L-kurtosis measures
and given by an L-moment ratio diagram. This method parallels the classic
moment-based plot of skewness and kurtosis corresponding to the Pearson
system of distributions. In general, these methods have been more
descriptive in nature and failed to consider the corresponding variation
and covariance of the point estimators. In this note, we propose two
procedures to estimate the 100(1−α)% joint confidence region
of L-skewness and L-kurtosis, given both complete and censored data. The
procedures are derived based on asymptotic normality of L-moment
estimators or through a novel empirical characteristic function (c.f.)
approach. Simulation results are provided for comparing the performance of
these procedures in terms of their respective coverage probabilities. The
new and novel c.f.-based confidence region provided superior coverage
probability as compared to the standard bootstrap procedure across all
parameter settings. The proposed methods are illustrated via an
application to a complete Buffalo snowfall data set and to a censored
breast cancer data set, respectively.
Journal: Journal of Applied Statistics
Pages: 368-379
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.744386
File-URL: http://hdl.handle.net/10.1080/02664763.2012.744386
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:368-379
Template-Type: ReDIF-Article 1.0
Author-Name: Peixin Zhao
Author-X-Name-First: Peixin
Author-X-Name-Last: Zhao
Author-Name: Liugen Xue
Author-X-Name-First: Liugen
Author-X-Name-Last: Xue
Title: Instrumental variable-based empirical likelihood inferences for varying-coefficient models with error-prone covariates
Abstract:
This paper presents the empirical likelihood inferences for a class of
varying-coefficient models with error-prone covariates. We focus on the
case that the covariance matrix of the measurement errors is unknown and
neither repeated measurements nor validation data are available. We
propose an instrumental variable-based empirical likelihood inference
method and show that the proposed empirical log-likelihood ratio is
asymptotically chi-squared. Then, the confidence intervals for the
varying-coefficient functions are constructed. Some simulation studies and
a real data application are used to assess the finite sample performance
of the proposed empirical likelihood procedure.
Journal: Journal of Applied Statistics
Pages: 380-396
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.744810
File-URL: http://hdl.handle.net/10.1080/02664763.2012.744810
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:380-396
Template-Type: ReDIF-Article 1.0
Author-Name: Ilhan Usta
Author-X-Name-First: Ilhan
Author-X-Name-Last: Usta
Title: Different estimation methods for the parameters of the extended Burr XII distribution
Abstract:
The extended three-parameter Burr XII (EBXII) distribution has recently
attracted considerable attention for modeling data from various scientific
fields since it yields a wide range of skewness and kurtosis values.
However, it is well known that the parameter estimates have significant
effects on the success of a distribution in real-life applications. In
this study, modified moment estimators (MMEs) and modified
probability-weighted moments estimators (MPWMEs) are used to estimate the
parameters of the EBXII distribution. These two considered estimators are
also compared with the commonly used maximum-likelihood, percentiles,
least-squares and weighted least-squares estimators in terms of bias and
efficiency via an extensive numerical simulation. The MMEs and MPWMEs are
observed to perform well in varying sample cases, and the simulation
results are supported with application through a real-life data set.
Journal: Journal of Applied Statistics
Pages: 397-414
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.743974
File-URL: http://hdl.handle.net/10.1080/02664763.2012.743974
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:397-414
Template-Type: ReDIF-Article 1.0
Author-Name: Elena Abascal
Author-X-Name-First: Elena
Author-X-Name-Last: Abascal
Author-Name: Vidal Díaz de Rada
Author-X-Name-First: Vidal Díaz
Author-X-Name-Last: de Rada
Author-Name: Ignacio García Lautre
Author-X-Name-First: Ignacio García
Author-X-Name-Last: Lautre
Author-Name: M. Isabel Landaluce
Author-X-Name-First: M. Isabel
Author-X-Name-Last: Landaluce
Title: Extending dual multiple factor analysis to categorical tables
Abstract:
This paper describes a proposal for the extension of the dual multiple
factor analysis (DMFA) method developed by Lê and Pagès [15] to
the analysis of categorical tables in which the same set of variables is
measured on different sets of individuals. The extension of DMFA is based
on the transformation of categorical variables into properly weighted
indicator variables, in a way analogous to that used in the multiple
factor analysis of categorical variables. The DMFA of categorical
variables enables visual comparison of the association structures between
categories over the sample as a whole and in the various subsamples (sets
of individuals). For each category, DMFA allows us to obtain its global
(considering all the individuals) and partial (considering each set of
individuals) coordinates in a factor space. This visual analysis allows us
to compare the set of individuals to identify their similarities and
differences. The suitability of the technique is illustrated through two
applications: one using simulated data for two groups of individuals with
very different association structures and the other using real data from a
voting intention survey in which some respondents were interviewed by
telephone and others face to face. The results indicate that the two data
collection methods, while similar, are not entirely equivalent.
Journal: Journal of Applied Statistics
Pages: 415-428
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.745836
File-URL: http://hdl.handle.net/10.1080/02664763.2012.745836
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:415-428
Template-Type: ReDIF-Article 1.0
Author-Name: P. Angelopoulos
Author-X-Name-First: P.
Author-X-Name-Last: Angelopoulos
Author-Name: K. Drosou
Author-X-Name-First: K.
Author-X-Name-Last: Drosou
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Title: An orthogonal arrays approach to robust parameter designs methodology
Abstract:
Robust parameter design methodology was originally introduced by Taguchi
[14] as an engineering methodology for quality improvement of products and
processes. A robust design of a system is one in which two different types
of factors are varied; control factors and noise factors. Control factors
are variables with levels that are adjustable, whereas noise factors are
variables with levels that are hard or impossible to control during normal
conditions, such as environmental conditions and raw-material properties.
Robust parameter design aims at the reduction of process variation by
properly selecting the levels of control factors so that the process
becomes insensitive to changes in noise factors. Taguchi [14,15] proposed
the use of crossed arrays (inner--outer arrays) for robust parameter
design. A crossed array is the cross-product of an orthogonal array (OA)
involving control factors (inner array) and an OA involving noise factors
(outer array). Objecting to the run size and the flexibility of crossed
arrays, several authors combined control and noise factors in a single
design matrix, which is called a combined array, instead of crossed
arrays. In this framework, we present the use of OAs in Taguchi's
methodology as a useful tool for designing robust parameter designs with
economical run size.
Journal: Journal of Applied Statistics
Pages: 429-437
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.745838
File-URL: http://hdl.handle.net/10.1080/02664763.2012.745838
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:429-437
Template-Type: ReDIF-Article 1.0
Author-Name: Vasyl Golosnoy
Author-X-Name-First: Vasyl
Author-X-Name-Last: Golosnoy
Author-Name: Jens Hogrefe
Author-X-Name-First: Jens
Author-X-Name-Last: Hogrefe
Title: Signaling NBER turning points: a sequential approach
Abstract:
The dates of the U.S. business cycle are reported by the National Bureau
of Economic Research with a considerable delay, so an early notion of
turning points is of particular interest. This paper proposes a novel
sequential classification approach designed for timely signaling of these
turning points, using the time series of coincident economic indicators.
The approach exhibits a range of theoretical optimality properties for
early signaling; moreover, it is transparent and easy to implement. The
empirical study evaluates the signaling ability of the proposed
methodology.
Journal: Journal of Applied Statistics
Pages: 438-448
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.748017
File-URL: http://hdl.handle.net/10.1080/02664763.2012.748017
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:438-448
Template-Type: ReDIF-Article 1.0
Author-Name: Hea-Jung Kim
Author-X-Name-First: Hea-Jung
Author-X-Name-Last: Kim
Title: Optimal asymmetric classification procedures for interval-screened normal data
Abstract:
Statistical methods for an asymmetric normal classification do not adapt
well to the situations where the population distributions are perturbed by
an interval-screening scheme. This paper explores methods for providing an
optimal classification of future samples in this situation. The properties
of the screened population distributions are considered and two optimal
regions for classifying the future samples are obtained. These
developments yield yet other rules for the interval-screened asymmetric
normal classification. The rules are studied from several aspects such as
the probability of misclassification, robustness, and estimation of the
rules. The performance of the rules is investigated, and the screened
classification idea is illustrated, using two numerical examples.
Journal: Journal of Applied Statistics
Pages: 449-462
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.748018
File-URL: http://hdl.handle.net/10.1080/02664763.2012.748018
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:449-462
Template-Type: ReDIF-Article 1.0
Author-Name: Shuangzhe Liu
Author-X-Name-First: Shuangzhe
Author-X-Name-Last: Liu
Title: Econometric methods for labour economics
Journal: Journal of Applied Statistics
Pages: 463-464
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.749026
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749026
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:463-464
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim S. Mwitondi
Author-X-Name-First: Kassim S.
Author-X-Name-Last: Mwitondi
Title: Data mining with Rattle and R
Journal: Journal of Applied Statistics
Pages: 464-464
Issue: 2
Volume: 40
Year: 2013
Month: 2
X-DOI: 10.1080/02664763.2012.749050
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749050
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:2:p:464-464
Template-Type: ReDIF-Article 1.0
Author-Name: Stephan Stahlschmidt
Author-X-Name-First: Stephan
Author-X-Name-Last: Stahlschmidt
Author-Name: Helmut Tausendteufel
Author-X-Name-First: Helmut
Author-X-Name-Last: Tausendteufel
Author-Name: Wolfgang K. Härdle
Author-X-Name-First: Wolfgang K.
Author-X-Name-Last: Härdle
Title: Bayesian networks for sex-related homicides: structure learning and prediction
Abstract:
Sex-related homicides tend to arouse wide media coverage and
thus raise the urgency to find the responsible offender. However, due to
the low frequency of such crimes, domain knowledge lacks completeness. We
have therefore accumulated a large data set and apply several structural
learning algorithms to the data in order to combine their results into a
single general graphical model. The graphical model broadly presents a
distinction between an offender and a situation-driven crime. A
situation-driven crime may be characterised by, amongst others, an
offender lacking preparation and typically attacking a known victim in
familiar surroundings. On the other hand, offender-driven crimes may be
identified by the high level of forensic awareness demonstrated by the
offender and the sophisticated measures applied to control the victim. The
prediction performance of the graphical model is evaluated via a model
averaging approach on the outcome variable offender's age. The combined
graph undercuts the error rate of the single algorithms and an appropriate
threshold results in an error rate of less than 10%, which describes a
promising level for an actual implementation by the police.
Journal: Journal of Applied Statistics
Pages: 1155-1171
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.780235
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780235
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1155-1171
Template-Type: ReDIF-Article 1.0
Author-Name: Raffaella Calabrese
Author-X-Name-First: Raffaella
Author-X-Name-Last: Calabrese
Author-Name: Silvia Angela Osmetti
Author-X-Name-First: Silvia Angela
Author-X-Name-Last: Osmetti
Title: Modelling small and medium enterprise loan defaults as rare events: the generalized extreme value regression model
Abstract:
A pivotal characteristic of credit defaults that is ignored
by most credit scoring models is the rarity of the event. The most widely
used model to estimate the probability of default is the logistic
regression model. Since the dependent variable represents a rare event,
the logistic regression model shows relevant drawbacks, for example,
underestimation of the default probability, which could be very risky for
banks. In order to overcome these drawbacks, we propose the generalized
extreme value regression model. In particular, in a generalized linear
model (GLM) with a binary dependent variable, we suggest the quantile
function of the GEV distribution as the link function, so our attention is
focused on the tail of the response curve for values close to one. The
estimation procedure used is the maximum-likelihood method. This model
accommodates skewness and it presents a generalisation of GLMs with
complementary log--log link function. We analyse its performance by
simulation studies. Finally, we apply the proposed model to empirical data
on Italian small and medium enterprises.
Journal: Journal of Applied Statistics
Pages: 1172-1188
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.784894
File-URL: http://hdl.handle.net/10.1080/02664763.2013.784894
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1172-1188
Template-Type: ReDIF-Article 1.0
Author-Name: Wan-Min Tsai
Author-X-Name-First: Wan-Min
Author-X-Name-Last: Tsai
Author-Name: Albert Vexler
Author-X-Name-First: Albert
Author-X-Name-Last: Vexler
Author-Name: Gregory Gurevich
Author-X-Name-First: Gregory
Author-X-Name-Last: Gurevich
Title: An extensive power evaluation of a novel two-sample density-based empirical likelihood ratio test for paired data with an application to a treatment study of attention-deficit/hyperactivity disorder and severe mood dysregulation
Abstract:
In many case-control studies, it is common to utilize paired
data when treatments are being evaluated. In this article, we propose and
examine an efficient distribution-free test to compare two independent
samples, where each is based on paired observations. We extend and modify
the density-based empirical likelihood ratio test presented by Gurevich
and Vexler [7] to formulate an appropriate parametric likelihood ratio
test statistic corresponding to the hypothesis of our interest and then to
approximate the test statistic nonparametrically. We conduct an extensive
Monte Carlo study to evaluate the proposed test. The results of the
performed simulation study demonstrate the robustness of the proposed test
with respect to values of test parameters. Furthermore, an extensive power
analysis via Monte Carlo simulations confirms that the proposed method
outperforms the classical and general procedures in most cases related to
a wide class of alternatives. An application to a real paired data study
illustrates that the proposed test can be efficiently implemented in
practice.
Journal: Journal of Applied Statistics
Pages: 1189-1208
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.784895
File-URL: http://hdl.handle.net/10.1080/02664763.2013.784895
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1189-1208
Template-Type: ReDIF-Article 1.0
Author-Name: Donatella Vicari
Author-X-Name-First: Donatella
Author-X-Name-Last: Vicari
Author-Name: Maurizio Vichi
Author-X-Name-First: Maurizio
Author-X-Name-Last: Vichi
Title: Multivariate linear regression for heterogeneous data
Abstract:
The problem of multivariate regression modelling in the
presence of heterogeneous data is addressed, with the aim of assessing
the influence of such heterogeneity on the linear relations
between responses and explanatory variables. In spite of its popularity,
clusterwise regression is not designed to identify the linear
relationships within ‘homogeneous’ clusters exhibiting
internal cohesion and external separation. A within-clusterwise regression
is introduced to achieve this aim and, since the possible presence of a
linear relation ‘between’ clusters should be also taken into
account, a general regression model is introduced to account for both the
between-cluster and the within-cluster regression variation. Some
decompositions of the variance of the responses accounted for are also
given, the least-squares estimation of the parameters is derived, together
with an appropriate coordinate descent algorithm, and the performance of
the proposed methodology is evaluated in different datasets.
Journal: Journal of Applied Statistics
Pages: 1209-1230
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.784896
File-URL: http://hdl.handle.net/10.1080/02664763.2013.784896
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1209-1230
Template-Type: ReDIF-Article 1.0
Author-Name: Hyokyoung Grace Hong
Author-X-Name-First: Hyokyoung Grace
Author-X-Name-Last: Hong
Author-Name: Jianhui Zhou
Author-X-Name-First: Jianhui
Author-X-Name-Last: Zhou
Title: A multi-index model for quantile regression with ordinal data
Abstract:
In this paper, we propose a quantile approach to the
multi-index semiparametric model for an ordinal response variable.
Permitting non-parametric transformation of the response, the proposed
method achieves a root-n rate of convergence and has
attractive robustness properties. Further, the proposed model allows
additional indices to model the remaining correlations between covariates
and the residuals from the single-index, considerably reducing the error
variance and thus leading to more efficient prediction intervals (PIs).
The utility of the model is demonstrated by estimating PIs for functional
status of the elderly based on data from the second longitudinal study of
aging. It is shown that the proposed multi-index model provides
significantly narrower PIs than competing models. Our approach can be
applied to other areas in which the distribution of future observations
must be predicted from ordinal response data.
Journal: Journal of Applied Statistics
Pages: 1231-1245
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785489
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785489
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1231-1245
Template-Type: ReDIF-Article 1.0
Author-Name: Hua Jin
Author-X-Name-First: Hua
Author-X-Name-Last: Jin
Author-Name: Qi Mo
Author-X-Name-First: Qi
Author-X-Name-Last: Mo
Title: Hip fracture prediction from a new classification algorithm based on recursive partitioning methods
Abstract:
Classification and regression tree has been useful in medical
research to construct algorithms for disease diagnosis or prognostic
prediction. Jin et al. [7] developed a robust and
cost-saving tree (RACT) algorithm with application in classification of
hip fracture risk after 5-year follow-up based on the data from the Study
of Osteoporotic Fractures (SOF). Although conventional recursive
partitioning algorithms have been well developed, they still have some
limitations. Binary splits may generate a big tree with many layers, but
trinary splits may produce too many nodes. In this paper, we propose a
classification approach combining trinary splits and binary splits to
generate a trinary--binary tree. A new non-inferiority test of entropy is
used to select the binary or trinary splits. We apply the modified method
in SOF to construct a trinary--binary classification rule for predicting
risk of osteoporotic hip fracture. Our new classification tree has good
statistical utility: it is statistically non-inferior to the optimum
binary tree and the RACT based on the testing sample and is also
cost-saving. It may be useful in clinical applications: femoral neck bone
mineral density, age, height loss and weight gain since age 25 can
identify subjects with elevated 5-year hip fracture risk without loss of
statistical efficiency.
Journal: Journal of Applied Statistics
Pages: 1246-1253
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785490
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785490
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1246-1253
Template-Type: ReDIF-Article 1.0
Author-Name: Ferra Yanuar
Author-X-Name-First: Ferra
Author-X-Name-Last: Yanuar
Author-Name: Kamarulzaman Ibrahim
Author-X-Name-First: Kamarulzaman
Author-X-Name-Last: Ibrahim
Author-Name: Abdul Aziz Jemain
Author-X-Name-First: Abdul Aziz
Author-X-Name-Last: Jemain
Title: Bayesian structural equation modeling for the health index
Abstract:
There are many factors which could influence the level of
health of an individual. These factors are interactive and their overall
effects on health are usually measured by an index called the health
index. The health index can also be used as an indicator to
describe the health level of a community. Since the health index is
important, much research has been done to study its determinants. The main
purpose of this study is to model the health index of an individual based
on classical structural equation modeling (SEM) and Bayesian SEM. For
estimation of the parameters in the measurement and structural equation
models, the classical SEM applies the robust-weighted least-square
approach, while the Bayesian SEM implements the Gibbs sampler algorithm.
The Bayesian SEM approach allows the user to use the prior information for
updating the current information on the parameter. Both methods are
applied to the data gathered from a survey conducted in Hulu Langat, a
district in Malaysia. Based on the classical and the Bayesian SEM, it is
found that demographic status and lifestyle are significantly related to
the health index. However, mental health has no significant relation to
the health index.
Journal: Journal of Applied Statistics
Pages: 1254-1269
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785491
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785491
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1254-1269
Template-Type: ReDIF-Article 1.0
Author-Name: Sebastian Kurtek
Author-X-Name-First: Sebastian
Author-X-Name-Last: Kurtek
Author-Name: Wei Wu
Author-X-Name-First: Wei
Author-X-Name-Last: Wu
Author-Name: Gary E. Christensen
Author-X-Name-First: Gary E.
Author-X-Name-Last: Christensen
Author-Name: Anuj Srivastava
Author-X-Name-First: Anuj
Author-X-Name-Last: Srivastava
Title: Segmentation, alignment and statistical analysis of biosignals with application to disease classification
Abstract:
We present a novel methodology for a comprehensive
statistical analysis of approximately periodic biosignal data. There are
two main challenges in such analysis: (1) the automatic extraction
(segmentation) of cycles from long, cyclostationary biosignals and (2) the
subsequent statistical analysis, which in many cases involves the
separation of temporal and amplitude variabilities. The proposed framework
provides a principled approach for statistical analysis of such signals,
which in turn allows for an efficient cycle segmentation algorithm. This
is achieved using a convenient representation of functions called the
square-root velocity function (SRVF). The segmented cycles, represented by
SRVFs, are temporally aligned using the notion of the Karcher mean, which
in turn allows for more efficient statistical summaries of signals. We
show the strengths of this method through various disease classification
experiments. In the case of myocardial infarction detection and
localization, we show that our method compares favorably to methods
described in the current literature.
Journal: Journal of Applied Statistics
Pages: 1270-1288
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785492
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785492
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1270-1288
Template-Type: ReDIF-Article 1.0
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Title: Bayesian analysis for partially complete time and type of failure data
Abstract:
In this paper, we consider the Bayesian analysis of competing
risks data, when the data are partially complete in both time and type of
failures. It is assumed that the latent causes of failure have independent
Weibull distributions with a common shape parameter, but different scale
parameters. When the shape parameter is known, it is assumed that the
scale parameters have Beta--Gamma priors. In this case, the Bayes
estimates and the associated credible intervals can be obtained in
explicit forms. When the shape parameter is also unknown, it is assumed
that it has a very flexible log-concave prior density function. When the
common shape parameter is unknown, the Bayes estimates of the unknown
parameters and the associated credible intervals cannot be obtained in
explicit forms. We propose to use Markov Chain Monte Carlo sampling
technique to compute Bayes estimates and also to compute associated
credible intervals. We further consider the case when the covariates are
also present. The analysis of two competing risks data sets, one with
covariates and the other without covariates, has been performed for
illustrative purposes. It is observed that the proposed model is very
flexible, and the method is very easy to implement in practice.
Journal: Journal of Applied Statistics
Pages: 1289-1300
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785493
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785493
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1289-1300
Template-Type: ReDIF-Article 1.0
Author-Name: Seyed Taghi Akhavan Niaki
Author-X-Name-First: Seyed Taghi Akhavan
Author-X-Name-Last: Niaki
Author-Name: Paravaneh Jahani
Author-X-Name-First: Paravaneh
Author-X-Name-Last: Jahani
Title: The economic design of multivariate binomial EWMA VSSI control charts
Abstract:
Since multi-attribute control charts have received little
attention compared with multivariate variable control charts, this
research is concerned with developing a new methodology to employ the
multivariate exponentially weighted moving average (MEWMA) charts for
m-attribute binomial processes, the attributes being the
number of nonconforming items. Moreover, since the variable sample size
and sampling interval (VSSI) MEWMA charts detect small process mean shifts
faster than the traditional MEWMA, an economic design of the VSSI MEWMA
chart is proposed to obtain the optimum design parameters of the chart.
The sample size, the sampling interval, and the warning/action limit
coefficients are obtained using a genetic algorithm such that the expected
total cost per hour is minimized. At the end, a sensitivity analysis has
been carried out to investigate the effects of the cost and the model
parameters on the solution of the economic design of the VSSI MEWMA chart.
Journal: Journal of Applied Statistics
Pages: 1301-1318
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785494
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785494
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1301-1318
Template-Type: ReDIF-Article 1.0
Author-Name: Miler Jerkovic Vera
Author-X-Name-First: Miler Jerkovic
Author-X-Name-Last: Vera
Author-Name: Bojanic Dubravka
Author-X-Name-First: Bojanic
Author-X-Name-Last: Dubravka
Author-Name: Jorgovanovic Nikola
Author-X-Name-First: Jorgovanovic
Author-X-Name-Last: Nikola
Author-Name: Ilic Vojin
Author-X-Name-First: Ilic
Author-X-Name-Last: Vojin
Author-Name: Petrovacki-Balj Bojana
Author-X-Name-First: Petrovacki-Balj
Author-X-Name-Last: Bojana
Title: Detecting and removing outlier(s) in electromyographic gait-related patterns
Abstract:
In this paper, we propose a method for outlier detection and
removal in electromyographic gait-related patterns (EMG-GRPs). The goal
was to detect and remove EMG-GRPs that reduce the quality of gait data
while preserving natural biological variations in EMG-GRPs. The proposed
procedure consists of general statistical tests and is simple to use. The
Friedman test with multiple comparisons was used to find particular
EMG-GRPs that are extremely different from others. Next, outlying
observations were calculated for each suspected stride waveform by
applying the generalized extreme studentized deviate test. To complete the
analysis, we applied different outlier criteria. The results suggest that
an EMG-GRP is an outlier if it differs from at least 50% of the other
stride waveforms and contains at least 20% of the outlying observations.
The EMG signal remains a realistic representation of muscle activity and
demonstrates step-by-step variability once the outliers, as defined here,
are removed.
Journal: Journal of Applied Statistics
Pages: 1319-1332
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785495
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785495
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1319-1332
Template-Type: ReDIF-Article 1.0
Author-Name: Sophie Bercu
Author-X-Name-First: Sophie
Author-X-Name-Last: Bercu
Author-Name: Frédéric Proïa
Author-X-Name-First: Frédéric
Author-X-Name-Last: Proïa
Title: A SARIMAX coupled modelling applied to individual load curves intraday forecasting
Abstract:
A dynamic coupled modelling is investigated to take
temperature into account in the individual energy consumption forecasting.
The objective is both to avoid the inherent complexity of exhaustive
SARIMAX models and to take advantage of the usual linear relation between
energy consumption and temperature for thermosensitive customers. We first
recall some issues related to individual load curves forecasting. Then, we
propose and study the properties of a dynamic coupled modelling taking
temperature into account as an exogenous contribution and its application
to the intraday prediction of energy consumption. Finally, these
theoretical results are illustrated on a real individual load curve. The
authors discuss the relevance of such an approach and anticipate that it
could form a substantial alternative to the commonly used methods for
energy consumption forecasting of individual customers.
Journal: Journal of Applied Statistics
Pages: 1333-1348
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785496
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785496
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1333-1348
Template-Type: ReDIF-Article 1.0
Author-Name: Erhard Reschenhofer
Author-X-Name-First: Erhard
Author-X-Name-Last: Reschenhofer
Title: Robust testing for stationarity of global surface temperature
Abstract:
Surface temperature is a major indicator of climate change.
To test for the presence of an upward trend in surface temperature (global
warming), sophisticated statistical methods are typically used which
depend on implausible and/or unverifiable assumptions, in particular on
the availability of a very large number of measurements. In this paper,
the validity of these methods is challenged. It is argued that the
available series are simply not long enough to justify the use of methods
which are based on asymptotic arguments, because only a small fraction of
the information contained in the data is usable to distinguish between
a trend and natural variability. Thus, a simple frequency-domain test is
proposed for the case when all but a very small number of frequencies may
be corrupted by transitory fluctuations. Simulations confirm its
robustness against short-term autocorrelation. When applied to a global
surface-temperature series, significance can be achieved with far fewer
frequencies than required by conventional tests.
Journal: Journal of Applied Statistics
Pages: 1349-1361
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785497
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785497
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1349-1361
Template-Type: ReDIF-Article 1.0
Author-Name: K. Suresh Chandra
Author-X-Name-First: K. Suresh
Author-X-Name-Last: Chandra
Author-Name: G. Gopal
Author-X-Name-First: G.
Author-X-Name-Last: Gopal
Author-Name: M. Ramadurai
Author-X-Name-First: M.
Author-X-Name-Last: Ramadurai
Title: A stochastic frontier approach to survival analysis
Abstract:
In spite of the best set of covariates and statistical tools
for the survival analysis, there are instances when experts do not rule
out the existence of many non-observable factors that
could influence the survival probability of an individual. The fact that
every human body, sick or otherwise, strives to maximize time to death,
renders the stochastic frontier analysis (vide 2) as a meaningful tool to
measure the unobservable individual-specific deficiency factor that
accounts for the difference between the optimal and observed survival
times. In this paper, given the survival data, an attempt is made to
measure the deficiency factor for each individual in the data on adopting
the stochastic frontier analysis. Such an attempt to quantify the effect
of these unobservable factors can provide ample scope for further research
in bio-medical studies. The utility of these estimates in the survival
analysis is also highlighted using a real-life data set.
Journal: Journal of Applied Statistics
Pages: 1362-1371
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785498
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785498
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1362-1371
Template-Type: ReDIF-Article 1.0
Author-Name: Jiaqi Yang
Author-X-Name-First: Jiaqi
Author-X-Name-Last: Yang
Author-Name: Wei Zhang
Author-X-Name-First: Wei
Author-X-Name-Last: Zhang
Author-Name: Baolin Wu
Author-X-Name-First: Baolin
Author-X-Name-Last: Wu
Title: A note on statistical method for genotype calling of high-throughput single-nucleotide polymorphism arrays
Abstract:
We study the genotype calling algorithms for the
high-throughput single-nucleotide polymorphism (SNP) arrays. Building upon
the novel SNP-robust multi-chip average preprocessing approach and the
state-of-the-art corrected robust linear model with Mahalanobis distance
(CRLMM) approach for genotype calling, we propose a simple modification to
better model and combine the information across multiple SNPs with
empirical Bayes modeling, which could often significantly improve the
genotype calling of CRLMM. Through applications to the HapMap Trio data
set and a non-HapMap test set of high quality SNP chips, we illustrate the
competitive performance of the proposed method.
Journal: Journal of Applied Statistics
Pages: 1372-1381
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.785499
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785499
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1372-1381
Template-Type: ReDIF-Article 1.0
Author-Name: Jing Xu
Author-X-Name-First: Jing
Author-X-Name-Last: Xu
Title: Expect the unexpected: a first course in biostatistics
Journal: Journal of Applied Statistics
Pages: 1382-1383
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2012.760781
File-URL: http://hdl.handle.net/10.1080/02664763.2012.760781
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1382-1383
Template-Type: ReDIF-Article 1.0
Author-Name: Jing Xu
Author-X-Name-First: Jing
Author-X-Name-Last: Xu
Title: Introduction to probability with Texas Hold'em examples
Journal: Journal of Applied Statistics
Pages: 1383-1384
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2012.760782
File-URL: http://hdl.handle.net/10.1080/02664763.2012.760782
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1383-1384
Template-Type: ReDIF-Article 1.0
Author-Name: Abhay Kumar Tiwari
Author-X-Name-First: Abhay Kumar
Author-X-Name-Last: Tiwari
Title: Economic time series
Journal: Journal of Applied Statistics
Pages: 1384-1385
Issue: 6
Volume: 40
Year: 2013
Month: 6
X-DOI: 10.1080/02664763.2013.767416
File-URL: http://hdl.handle.net/10.1080/02664763.2013.767416
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:6:p:1384-1385
Template-Type: ReDIF-Article 1.0
Author-Name: Ying Chen
Author-X-Name-First: Ying
Author-X-Name-Last: Chen
Title: A powerful test method for analyzing unreplicated factorials
Abstract:
In this paper, a new test method for analyzing unreplicated
factorial designs is proposed. The proposed method is illustrated by some
examples. An extensive simulation with the standard 16-run designs was
carried out to compare the proposed method with three other existing
methods. Besides the usual power criterion, three further versions of
power, Power I--III, were also used to evaluate the performance of the
compared methods. The simulation study shows that the proposed method has
higher ability than the remaining three compared methods to identify all
active effects without misidentifying any inactive effects as active.
Journal: Journal of Applied Statistics
Pages: 1387-1401
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.785500
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785500
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1387-1401
Template-Type: ReDIF-Article 1.0
Author-Name: Saheli Datta
Author-X-Name-First: Saheli
Author-X-Name-Last: Datta
Author-Name: Raquel Prado
Author-X-Name-First: Raquel
Author-X-Name-Last: Prado
Author-Name: Abel Rodríguez
Author-X-Name-First: Abel
Author-X-Name-Last: Rodríguez
Title: Bayesian factor models in characterizing molecular adaptation
Abstract:
Assessing the selective influence of amino acid properties is
important in understanding evolution at the molecular level. A collection
of methods and models has been developed in recent years to determine if
amino acid sites in a given DNA sequence alignment display substitutions
that are altering or conserving a prespecified set of amino acid
properties. Residues showing an elevated number of substitutions that
favorably alter a physicochemical property are considered targets of
positive natural selection. Such approaches usually perform independent
analyses for each amino acid property under consideration, without taking
into account the fact that some of the properties may be highly
correlated. We propose a Bayesian hierarchical regression model with
latent factor structure that allows us to determine which sites display
substitutions that conserve or radically change a set of amino acid
properties, while accounting for the correlation structure that may be
present across such properties. We illustrate our approach by analyzing
simulated data sets and an alignment of lysin sperm DNA.
Journal: Journal of Applied Statistics
Pages: 1402-1424
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.785652
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785652
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1402-1424
Template-Type: ReDIF-Article 1.0
Author-Name: Z. Rezaei Ghahroodi
Author-X-Name-First: Z. Rezaei
Author-X-Name-Last: Ghahroodi
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Title: A Bayesian approach for analysing longitudinal nominal outcomes using random coefficients transitional generalized logit model: an application to the labour force survey data
Abstract:
A random-effects transition model is proposed to model the
economic activity status of household members. This model is introduced to
take into account two kinds of correlations; one due to the longitudinal
nature of the study, which will be considered using a transition
parameter, and the other due to the existing correlation between responses
of members of the same household which is taken into account by
introducing random coefficients into the model. The results are presented
based on the homogeneous (all parameters constant over time) and
non-homogeneous Markov models with random coefficients. A Bayesian
approach via the Gibbs sampling is used to perform parameter estimation.
Results of using random-effects transition model are compared, using
deviance information criterion, with those of three other models which
exclude random effects and/or transition effects. It is shown that the
full model gains more precision due to the consideration of all aspects of
the process which generated the data. To illustrate the utility of the
proposed model, a longitudinal data set which is extracted from the
Iranian Labour Force Survey is analysed to explore the simultaneous effect
of some covariates on the current economic activity as a nominal response.
Also, some sensitivity analyses are performed to assess the robustness of
the posterior estimation of the transition parameters to the perturbations
of the prior parameters.
Journal: Journal of Applied Statistics
Pages: 1425-1445
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.785653
File-URL: http://hdl.handle.net/10.1080/02664763.2013.785653
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1425-1445
Template-Type: ReDIF-Article 1.0
Author-Name: Kouji Yamamoto
Author-X-Name-First: Kouji
Author-X-Name-Last: Yamamoto
Author-Name: Shota Murakami
Author-X-Name-First: Shota
Author-X-Name-Last: Murakami
Author-Name: Sadao Tomizawa
Author-X-Name-First: Sadao
Author-X-Name-Last: Tomizawa
Title: Point-symmetry models and decomposition for collapsed square contingency tables
Abstract:
For square contingency tables with ordered categories, there
may be some cases that one wants to analyze them by considering collapsed
tables with some adjacent categories combined in the original table. This
paper proposes three kinds of new models which have the structure of
point-symmetry (PS), quasi point-symmetry and marginal point-symmetry for
collapsed square tables. This paper also gives a decomposition of the PS
model for collapsed square tables. The father's and his daughter's
occupational mobility data are analyzed using the new models.
Journal: Journal of Applied Statistics
Pages: 1446-1452
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.786028
File-URL: http://hdl.handle.net/10.1080/02664763.2013.786028
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1446-1452
Template-Type: ReDIF-Article 1.0
Author-Name: Anouar BenMabrouk
Author-X-Name-First: Anouar
Author-X-Name-Last: BenMabrouk
Author-Name: Olfa Zaafrane
Author-X-Name-First: Olfa
Author-X-Name-Last: Zaafrane
Title: Wavelet fuzzy hybrid model for physico-financial signals
Abstract:
In the present paper, a fuzzy logic-based method is combined
with wavelet decomposition to develop a step-by-step dynamic hybrid model
to analyze and approximate one-dimensional physico-financial signals
characterized by fuzzy values. Computational tests based on a well-known
signal, conducted with the pure fuzzy model, the pure wavelet model and
the new hybrid model, demonstrate the efficiency of the hybrid approach.
Journal: Journal of Applied Statistics
Pages: 1453-1463
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.786690
File-URL: http://hdl.handle.net/10.1080/02664763.2013.786690
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1453-1463
Template-Type: ReDIF-Article 1.0
Author-Name: Conghua Cheng
Author-X-Name-First: Conghua
Author-X-Name-Last: Cheng
Author-Name: Jinyuan Chen
Author-X-Name-First: Jinyuan
Author-X-Name-Last: Chen
Author-Name: Jianming Bai
Author-X-Name-First: Jianming
Author-X-Name-Last: Bai
Title: Exact inferences of the two-parameter exponential distribution and Pareto distribution with censored data
Abstract:
We develop an exact inference for the location and the scale
parameters of the two-parameter exponential distribution and the Pareto distribution
based on their maximum-likelihood estimators from the doubly Type-II and
the progressive Type-II censored sample. Based on some pivotal quantities,
exact confidence intervals and tests of hypotheses are constructed. Exact
distributions of the pivotal quantities are expressed as mixtures of
linear combinations and of ratios of linear combinations of standard
exponential random variables, which facilitates the computation of
quantiles of these pivotal quantities. We also provide a bootstrap method
for constructing a confidence interval. Some simulation studies are
carried out to assess their performances. Using the exact distribution of
the scale parameter, we establish an acceptance sampling procedure based
on the lifetime of the unit. Some numerical results are tabulated for
illustration. A biometrical example is also given to illustrate the
proposed methods.
Journal: Journal of Applied Statistics
Pages: 1464-1479
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788613
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1464-1479
Template-Type: ReDIF-Article 1.0
Author-Name: Mohammad Z. Raqab
Author-X-Name-First: Mohammad Z.
Author-X-Name-Last: Raqab
Title: Discriminating between the generalized Rayleigh and Weibull distributions
Abstract:
Generalized Rayleigh (GR) and Weibull (WE) distributions are
used quite effectively for analysing skewed lifetime data. In this paper,
we consider the problem of selecting either GR or WE distribution as a
more appropriate fitting model for a given data set. We use the ratio of
maximized likelihoods (RML) for discriminating between the two
distributions. The asymptotic and simulated distributions of the logarithm
of the RML are applied to determine the probability of correctly selecting
between these two families of distributions. It is examined numerically
that the asymptotic results work quite well even for small sample sizes. A
real data set involving the annual rainfall recorded at Los Angeles Civic
Center during 25 years is analysed to illustrate the procedures developed
here.
Journal: Journal of Applied Statistics
Pages: 1480-1493
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788614
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1480-1493
Template-Type: ReDIF-Article 1.0
Author-Name: K. H. Makambi
Author-X-Name-First: K. H.
Author-X-Name-Last: Makambi
Title: Extended tests for non-zero between-study variance
Abstract:
The ANOVA F-test, James tests and
generalized F-test are extended to test hypotheses on the
between-study variance for values greater than zero. Using simulations, we
compare the performance of extended test procedures with respect to the
actual attained type I error rate. Examples are provided to demonstrate
the application of the procedures in ANOVA models and meta-analysis.
Journal: Journal of Applied Statistics
Pages: 1494-1505
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788616
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788616
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1494-1505
Template-Type: ReDIF-Article 1.0
Author-Name: Y. Sertdemir
Author-X-Name-First: Y.
Author-X-Name-Last: Sertdemir
Author-Name: H. R. Burgut
Author-X-Name-First: H. R.
Author-X-Name-Last: Burgut
Author-Name: Z. N. Alparslan
Author-X-Name-First: Z. N.
Author-X-Name-Last: Alparslan
Author-Name: I. Unal
Author-X-Name-First: I.
Author-X-Name-Last: Unal
Author-Name: S. Gunasti
Author-X-Name-First: S.
Author-X-Name-Last: Gunasti
Title: Comparing the methods of measuring multi-rater agreement on an ordinal rating scale: a simulation study with an application to real data
Abstract:
Agreement among raters is an important issue in medicine, as
well as in education and psychology. The agreement among two raters on a
nominal or ordinal rating scale has been investigated in many articles.
The multi-rater case with normally distributed ratings has also been
explored at length. However, there is a lack of research on multiple
raters using an ordinal rating scale. In this simulation study, several
methods for analyzing rater agreement were compared. The focus was on the
special case of multiple raters using a bounded ordinal rating
scale. The proposed methods for agreement were compared within different
settings. Three main ordinal data simulation settings were used (normal,
skewed and shifted data). In addition, the proposed methods were applied
to a real data set from dermatology. The simulation results showed that
Kendall's W and the mean gamma greatly overestimated the
agreement in data sets with shifts in data. ICC4 for bounded
data should be avoided in agreement studies with rating scales > 5, where
this method greatly overestimated the simulated agreement. The difference
in bias for all methods under study, except the mean gamma and Kendall's
W, decreased as the rating scale increased. The bias of
ICC3 was consistent and small for nearly all simulation
settings except the low agreement setting in the shifted data set.
Researchers should be careful in selecting agreement methods, especially
if shifts in ratings between raters exist, and may wish to apply more
than one method before drawing conclusions.
Journal: Journal of Applied Statistics
Pages: 1506-1519
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788617
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788617
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1506-1519
Template-Type: ReDIF-Article 1.0
Author-Name: Pao-Sheng Shen
Author-X-Name-First: Pao-Sheng
Author-X-Name-Last: Shen
Title: Additive hazards model with truncated and doubly censored data
Abstract:
In longitudinal studies, the additive hazard model is often
used to analyze covariate effects on the duration time, defined as the
elapsed time between the first and the second event. In this article, we
consider the situation in which the first event is subject to partial
interval censoring and the second event is subject to left truncation and
right-censoring. We propose a two-step estimation procedure for
estimating the regression coefficients of the additive hazards model. A
simulation study is conducted to investigate the performance of the
proposed estimator. The proposed method is applied to the Centers for
Disease Control acquired immune deficiency syndrome blood transfusion
data.
Journal: Journal of Applied Statistics
Pages: 1520-1532
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788618
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788618
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1520-1532
Template-Type: ReDIF-Article 1.0
Author-Name: Leonardo Soares Bastos
Author-X-Name-First: Leonardo Soares
Author-X-Name-Last: Bastos
Author-Name: Joel Mauricio Correa da Rosa
Author-X-Name-First: Joel Mauricio Correa
Author-X-Name-Last: da Rosa
Title: Predicting probabilities for the 2010 FIFA World Cup games using a Poisson-Gamma model
Abstract:
In this paper, we provide probabilistic predictions for
soccer games of the 2010 FIFA World Cup modelling the number of goals
scored in a game by each team. We use a Poisson distribution for the
number of goals for each team in a game, where the scoring rate is
considered unknown. We use a Gamma distribution for the scoring rate, with
the Gamma parameters chosen using historical data and differences among
teams defined by a strength factor for each team. The strength factor is a
measure of discrimination among the national teams obtained from their
memberships to fuzzy clusters. The clusters are obtained with the use of
the Fuzzy C-means algorithm applied to a vector of variables, most of them
available on the official FIFA website. Static and dynamic models were
used to predict the World Cup outcomes and the performance of our
predictions was evaluated using two comparison methods.
Journal: Journal of Applied Statistics
Pages: 1533-1544
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788619
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1533-1544
Template-Type: ReDIF-Article 1.0
Author-Name: Sukru Acitas
Author-X-Name-First: Sukru
Author-X-Name-Last: Acitas
Author-Name: Pelin Kasap
Author-X-Name-First: Pelin
Author-X-Name-Last: Kasap
Author-Name: Birdal Senoglu
Author-X-Name-First: Birdal
Author-X-Name-Last: Senoglu
Author-Name: Olcay Arslan
Author-X-Name-First: Olcay
Author-X-Name-Last: Arslan
Title: One-step M-estimators: Jones and Faddy's skewed t-distribution
Abstract:
The one-step M (OSM) estimator needs some
initial/preliminary estimates at the beginning of the calculation process.
In this study, we propose to use new initial estimates for the calculation
of the OSM-estimator. We consider simple location and simple linear
regression models when the distribution of the error terms is Jones and
Faddy's skewed t. Monte-Carlo simulation study shows that
the OSM estimator(s) based on the proposed initial estimates is/are more
efficient than the OSM estimator(s) based on the traditional initial
estimates especially for the skewed cases. We also analyze some real data
sets taken from the literature at the end of the paper.
Journal: Journal of Applied Statistics
Pages: 1545-1560
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.788620
File-URL: http://hdl.handle.net/10.1080/02664763.2013.788620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1545-1560
Template-Type: ReDIF-Article 1.0
Author-Name: L. I. Pettit
Author-X-Name-First: L. I.
Author-X-Name-Last: Pettit
Author-Name: N. Sothinathan
Author-X-Name-First: N.
Author-X-Name-Last: Sothinathan
Title: Effect of individual observations on the Box--Cox transformation
Abstract:
In this paper, we consider the influence of individual
observations on inferences about the Box--Cox power transformation
parameter from a Bayesian point of view. We compare Bayesian diagnostic
measures with the 'forward' method of analysis due to Riani and Atkinson.
In particular, we look at the effect of omitting observations on the
inference by comparing particular choices of transformation using the
conditional predictive ordinate and the k_d measure of Pettit and Young.
We illustrate the methods using a designed
experiment. We show that a group of masked outliers can be detected using
these single deletion diagnostics. Also, we show that Bayesian diagnostic
measures are simpler to use to investigate the effect of observations on
transformations than the forward search method.
Journal: Journal of Applied Statistics
Pages: 1561-1571
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.789007
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789007
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1561-1571
Template-Type: ReDIF-Article 1.0
Author-Name: Jorge González Chapela
Author-X-Name-First: Jorge
Author-X-Name-Last: González Chapela
Title: Things that make us different: analysis of deviance with time-use data
Abstract:
The constrained, non-normal nature of time-use data poses a
challenge to ordinary analysis of variance. This paper investigates a
computationally simple variance decomposition technique suitable for those
data. As a by-product of the analysis, a measure of fit for systems of
time-demand equations is proposed that possesses several useful
properties.
Journal: Journal of Applied Statistics
Pages: 1572-1585
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.789097
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789097
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1572-1585
Template-Type: ReDIF-Article 1.0
Author-Name: Mariana Rodrigues-Motta
Author-X-Name-First: Mariana
Author-X-Name-Last: Rodrigues-Motta
Author-Name: Hildete P. Pinheiro
Author-X-Name-First: Hildete P.
Author-X-Name-Last: Pinheiro
Author-Name: Eduardo G. Martins
Author-X-Name-First: Eduardo G.
Author-X-Name-Last: Martins
Author-Name: Márcio S. Araújo
Author-X-Name-First: Márcio S.
Author-X-Name-Last: Araújo
Author-Name: Sérgio F. dos Reis
Author-X-Name-First: Sérgio F.
Author-X-Name-Last: dos Reis
Title: Multivariate models for correlated count data
Abstract:
In this study, we deal with the problem of overdispersion
beyond extra zeros for a collection of counts that can be correlated.
Poisson, negative binomial, zero-inflated Poisson and zero-inflated
negative binomial distributions have been considered. First, we propose a
multivariate count model in which all counts follow the same distribution
and are correlated. Then we extend this model in a sense that correlated
counts may follow different distributions. To accommodate correlation
among counts, we have considered correlated random effects for each
individual in the mean structure, thus inducing dependency among common
observations to an individual. The method is applied to real data to
investigate variation in food resources use in a species of marsupial in a
locality of the Brazilian Cerrado biome.
Journal: Journal of Applied Statistics
Pages: 1586-1596
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.789098
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789098
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1586-1596
Template-Type: ReDIF-Article 1.0
Author-Name: Sanjay Kumar Singh
Author-X-Name-First: Sanjay Kumar
Author-X-Name-Last: Singh
Author-Name: Umesh Singh
Author-X-Name-First: Umesh
Author-X-Name-Last: Singh
Author-Name: Dinesh Kumar
Author-X-Name-First: Dinesh
Author-X-Name-Last: Kumar
Title: Bayesian estimation of parameters of inverse Weibull distribution
Abstract:
The present paper describes the Bayes estimators of
parameters of inverse Weibull distribution for complete, type I and type
II censored samples under general entropy and squared error loss
functions. The proposed estimators have been compared on the basis of
their simulated risks (average loss over sample space). A real-life data
set is used to illustrate the results.
Journal: Journal of Applied Statistics
Pages: 1597-1607
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.789492
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789492
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1597-1607
Template-Type: ReDIF-Article 1.0
Author-Name: Saad T. Bakir
Author-X-Name-First: Saad T.
Author-X-Name-Last: Bakir
Title: A subset selection procedure for multinomial distributions
Abstract:
A subset selection procedure is developed for selecting a
subset containing the multinomial population that has the highest value of
a certain linear combination of the multinomial cell probabilities; such a
population is called the 'best'. The multivariate normal large sample
approximation to the multinomial distribution is used to derive
expressions for the probability of a correct selection, and for the
threshold constant involved in the procedure. The procedure guarantees
that the probability of a correct selection is at least at a pre-assigned
level. The proposed procedure is an extension of Gupta and Sobel's [14]
selection procedure for binomials and of Bakir's [2] restrictive selection
procedure for multinomials. One illustration of the procedure concerns
population income mobility in four countries: Peru, Russia, South Africa
and the USA. Analysis indicates that Russia and Peru fall in the selected
subset containing the best population with respect to income mobility from
poverty to a higher-income status. The procedure is also applied to data
concerning grade distribution for students in a certain freshman class.
Journal: Journal of Applied Statistics
Pages: 1608-1618
Issue: 7
Volume: 40
Year: 2013
Month: 7
X-DOI: 10.1080/02664763.2013.789493
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789493
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:7:p:1608-1618
Template-Type: ReDIF-Article 1.0
Author-Name: Claude Manté
Author-X-Name-First: Claude
Author-X-Name-Last: Manté
Author-Name: Guillaume Bernard
Author-X-Name-First: Guillaume
Author-X-Name-Last: Bernard
Author-Name: Patrick Bonhomme
Author-X-Name-First: Patrick
Author-X-Name-Last: Bonhomme
Author-Name: David Nerini
Author-X-Name-First: David
Author-X-Name-Last: Nerini
Title: Application of ordinal correspondence analysis for submerged aquatic vegetation monitoring
Abstract:
The European Water Framework Directive states
that macrophyte communities (seaweeds and seagrass) are key
indicators of the ecological health of lagoons. Furthermore, the
restoration of these communities, especially the Zostera
meadows, is one of the main objectives of the Berre lagoon restoration
plan. Consequently, a monitoring programme of the main
macrophyte species still present in the lagoon was
initiated in 1996. This monitoring resulted in a sequence of 11 spatially
structured annual tables consisting of the observed density of these
species. These tables are processed in this study. First, we specify the
principles of Beh's ordinal correspondence analysis (OCA), designed for
ordered row/column categories, and compare this method to classical
correspondence analysis (CA). Then, we show that OCA is straightforwardly
adaptable for processing a sequence of ordered contingency tables like
ours. Both OCA and CA are afterwards used to reveal and test the main
patterns of spatio-temporal changes of two macrophyte
species in the Berre lagoon: Ulva and
Zostera. The results we obtained are compared and
discussed.
Journal: Journal of Applied Statistics
Pages: 1619-1638
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.789494
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789494
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1619-1638
Template-Type: ReDIF-Article 1.0
Author-Name: Syed Mohsin Ali Kazmi
Author-X-Name-First: Syed Mohsin Ali
Author-X-Name-Last: Kazmi
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Sajid Ali
Author-X-Name-First: Sajid
Author-X-Name-Last: Ali
Author-Name: Nasir Abbas
Author-X-Name-First: Nasir
Author-X-Name-Last: Abbas
Title: Selection of suitable prior for the Bayesian mixture of a class of lifetime distributions under type-I censored datasets
Abstract:
This paper studies a mixture of a class of
probability density functions under a type-I censoring scheme. We
model a heterogeneous population by means of a two-component
mixture of this class of probability density functions. The parameters of
the mixture density functions are estimated and compared using
Bayes estimates under the squared-error and precautionary loss
functions. A censored mixture dataset is simulated by probabilistic mixing
for computational purposes, considering the particular case of the Maxwell
distribution. Closed-form expressions for the Bayes estimators along with
their posterior risks are derived for censored as well as complete
samples. Some stimulating comparisons and properties of the estimates are
presented here. A factual dataset has also been for illustration.
Journal: Journal of Applied Statistics
Pages: 1639-1658
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.789831
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789831
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1639-1658
Template-Type: ReDIF-Article 1.0
Author-Name: Akiko Kada
Author-X-Name-First: Akiko
Author-X-Name-Last: Kada
Author-Name: Zhihong Cai
Author-X-Name-First: Zhihong
Author-X-Name-Last: Cai
Author-Name: Manabu Kuroki
Author-X-Name-First: Manabu
Author-X-Name-Last: Kuroki
Title: Medical diagnostic test based on the potential test result approach: bounds and identification
Abstract:
Evaluating the performance of a medical diagnostic test is an
important issue in disease diagnosis. Youden [Index for rating
diagnostic tests, Cancer 3 (1950), pp. 32--35] stated that the
ideal measure of performance is to ensure that the control group resembles
the diseased group as closely as possible in all respects except for the
presence of the disease. To achieve this aim, this paper introduces the
potential test result approach and proposes a new measure to evaluate the
performance of medical diagnostic tests. This proposed measure, denoted as
, can be interpreted as a probability that a test result
T would respond to a disease status D
(d is an element of {d0, d1}) for a given
threshold t, and therefore evaluates both the sufficiency
and necessity of the performance of a medical diagnostic test. This new
measure provides a totally different interpretation of the Youden index and
thus helps us to better understand the essence of the Youden index and its
properties. We further propose non-parametric bounds on the proposed
measure based on a variety of assumptions and illustrate our results with
an example from the neonatal audiology study.
Journal: Journal of Applied Statistics
Pages: 1659-1672
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.789832
File-URL: http://hdl.handle.net/10.1080/02664763.2013.789832
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1659-1672
Template-Type: ReDIF-Article 1.0
Author-Name: Ting-Ting Gang
Author-X-Name-First: Ting-Ting
Author-X-Name-Last: Gang
Author-Name: Jun Yang
Author-X-Name-First: Jun
Author-X-Name-Last: Yang
Author-Name: Yu Zhao
Author-X-Name-First: Yu
Author-X-Name-Last: Zhao
Title: Multivariate control chart based on the highest possibility region
Abstract:
The T-super-2 control chart
is widely adopted in multivariate statistical process control. However,
when dealing with asymmetrical or multimodal distributions using the
traditional T-super-2 control chart, some
points with relatively high occurrence possibility might be excluded,
while some points with relatively low occurrence possibility might be
accepted. Motivated by the idea of the highest posterior density credible
region, we develop a control chart based on the highest possibility region
to solve this problem. It is shown that the proposed multivariate control
chart will not only meet the false alarm requirement, but also ensure that
all the in-control points are with relatively high occurrence possibility.
The advantages and effectiveness of the proposed control chart are
demonstrated by some numerical examples in the end.
Journal: Journal of Applied Statistics
Pages: 1673-1681
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.790007
File-URL: http://hdl.handle.net/10.1080/02664763.2013.790007
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1673-1681
Template-Type: ReDIF-Article 1.0
Author-Name: Luca De Angelis
Author-X-Name-First: Luca
Author-X-Name-Last: De Angelis
Author-Name: Leonard J. Paas
Author-X-Name-First: Leonard J.
Author-X-Name-Last: Paas
Title: A dynamic analysis of stock markets using a hidden Markov model
Abstract:
This paper proposes a framework to detect financial crises,
pinpoint the end of a crisis in stock markets and support investment
decision-making processes. This proposal is based on a hidden Markov model
(HMM) and allows for a specific focus on conditional mean returns. By
analysing weekly changes in the US stock market indexes over a period of
20 years, this study obtains an accurate detection of stable and turmoil
periods and a probabilistic measure of switching between different stock
market conditions. The results contribute to the discussion of the
capabilities of Markov-switching models in analysing stock market
behaviour. In particular, we find evidence that the HMM outperforms a
threshold GARCH model with Student-t innovations both in-sample and
out-of-sample, giving financial operators some appealing investment
strategies.
Journal: Journal of Applied Statistics
Pages: 1682-1700
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.793302
File-URL: http://hdl.handle.net/10.1080/02664763.2013.793302
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1682-1700
Template-Type: ReDIF-Article 1.0
Author-Name: Laura Barbieri
Author-X-Name-First: Laura
Author-X-Name-Last: Barbieri
Title: Causality and interdependence analysis in linear econometric models with an application to fertility
Abstract:
This paper is an applied analysis of the causal structure of
linear multi-equational econometric models. Its aim is to identify the
kind of relationships linking the endogenous variables of the model,
distinguishing between causal links and feedback loops. The investigation
is first carried out within a deterministic framework and then moves on to
show how the results may change inside a more realistic stochastic
context. The causal analysis is then specifically applied to a linear
simultaneous equation model explaining fertility rates. The analysis is
carried out by means of a dedicated RATS program designed to show
the specific nature of the relationships within the model.
Journal: Journal of Applied Statistics
Pages: 1701-1716
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.793660
File-URL: http://hdl.handle.net/10.1080/02664763.2013.793660
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1701-1716
Template-Type: ReDIF-Article 1.0
Author-Name: A. Asgharzadeh
Author-X-Name-First: A.
Author-X-Name-Last: Asgharzadeh
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Author-Name: L. Esmaeili
Author-X-Name-First: L.
Author-X-Name-Last: Esmaeili
Title: Pareto Poisson--Lindley distribution with applications
Abstract:
A new lifetime distribution is introduced based on
compounding Pareto and Poisson--Lindley distributions. Several statistical
properties of the distribution are established, including behavior of the
probability density function and the failure rate function, heavy- and
long-right tailedness, moments, the Laplace transform, quantiles, order
statistics, moments of residual lifetime, conditional moments, conditional
moment generating function, stress--strength parameter, Rényi entropy and
Song's measure. We obtain maximum-likelihood estimators of the distribution
parameters and investigate the asymptotic distribution of the estimators
via Fisher's information matrix. Applications of the distribution using
three real data sets are presented, and it is shown that the distribution
fits better than other related distributions in practical use.
Journal: Journal of Applied Statistics
Pages: 1717-1734
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.793886
File-URL: http://hdl.handle.net/10.1080/02664763.2013.793886
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1717-1734
Template-Type: ReDIF-Article 1.0
Author-Name: Hasan Ertas
Author-X-Name-First: Hasan
Author-X-Name-Last: Ertas
Author-Name: Murat Erisoglu
Author-X-Name-First: Murat
Author-X-Name-Last: Erisoglu
Author-Name: Selahattin Kaciranlar
Author-X-Name-First: Selahattin
Author-X-Name-Last: Kaciranlar
Title: Detecting influential observations in Liu and modified Liu estimators
Abstract:
In regression, detecting anomalous observations is a
significant step in the model-building process. Various influence measures
based on different motivational arguments are designed to measure the
influence of observations through different aspects of various regression
models. The presence of influential observations in the data is
complicated by the existence of multicollinearity. The purpose of this
paper is to assess the influence of observations in the Liu [9] and
modified Liu [15] estimators by using the method of approximate case
deletion formulas suggested by Walker and Birch [14]. A numerical example
using a real data set used by Longley [10] and a Monte Carlo simulation
are given to illustrate the theoretical results.
Journal: Journal of Applied Statistics
Pages: 1735-1745
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.794203
File-URL: http://hdl.handle.net/10.1080/02664763.2013.794203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1735-1745
Template-Type: ReDIF-Article 1.0
Author-Name: Z. I. Kalaylioglu
Author-X-Name-First: Z. I.
Author-X-Name-Last: Kalaylioglu
Author-Name: O. Ozturk
Author-X-Name-First: O.
Author-X-Name-Last: Ozturk
Title: Bayesian semiparametric models for nonignorable missing mechanisms in generalized linear models
Abstract:
Semiparametric models provide a more flexible form for
modeling the relationship between the response and the explanatory
variables. On the other hand, in the literature on modeling missing
variables, the canonical form of the probability of a variable being missing
(p) is modeled taking a fully parametric approach. Here
we consider a regression spline based semiparametric approach to model the
missingness mechanism of nonignorably missing covariates. In this model
the relationship between the suitable canonical form of p
(e.g. probit p) and the missing covariate is modeled
through several splines. A Bayesian procedure is developed to efficiently
estimate the parameters. A computationally advantageous prior construction
is proposed for the parameters of the semiparametric part. WinBUGS code
is constructed to apply Gibbs sampling to obtain the posterior
distributions. We show through an extensive Monte Carlo simulation
experiment that response model coefficient estimators maintain better (when
the true missingness mechanism is nonlinear) or equivalent (when the true
missingness mechanism is linear) bias and efficiency properties with the
use of the proposed semiparametric missingness model compared to the
conventional model.
Journal: Journal of Applied Statistics
Pages: 1746-1763
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.794329
File-URL: http://hdl.handle.net/10.1080/02664763.2013.794329
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1746-1763
Template-Type: ReDIF-Article 1.0
Author-Name: Mark Trede
Author-X-Name-First: Mark
Author-X-Name-Last: Trede
Author-Name: Cornelia Savu
Author-X-Name-First: Cornelia
Author-X-Name-Last: Savu
Title: Do stock returns have an Archimedean copula?
Abstract:
The flexible class of Archimedean copulas plays an important
role in multivariate statistics. While there is a large number of
goodness-of-fit tests for copulas and parametric families of copulas, the
question of whether a given data set belongs to an arbitrary Archimedean
copula has not yet received much attention in the literature. This paper
suggests a new, straightforward method to test whether a copula is an
Archimedean copula without the need to specify its parametric family. We
conduct Monte Carlo simulations to assess the power of the test. The
approach is applied to (bivariate) joint distributions of stock asset
returns. We find that, in general, stock returns may have Archimedean
copulas.
Journal: Journal of Applied Statistics
Pages: 1764-1778
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.794330
File-URL: http://hdl.handle.net/10.1080/02664763.2013.794330
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1764-1778
Template-Type: ReDIF-Article 1.0
Author-Name: Zhiyong Zhang
Author-X-Name-First: Zhiyong
Author-X-Name-Last: Zhang
Title: Bayesian growth curve models with the generalized error distribution
Abstract:
To deal with the longitudinal data with both leptokurtic and
platykurtic errors, we extend growth curve models using the generalized
error distribution (GED) model. The Metropolis--Hastings algorithm is used
to estimate the GED model parameters in the Bayesian framework. The
application of the GED model is illustrated through the analysis of
mathematical development data. Results show that the GED model can
correctly identify the deviation from normal of the error distributions.
Journal: Journal of Applied Statistics
Pages: 1779-1795
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.796348
File-URL: http://hdl.handle.net/10.1080/02664763.2013.796348
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1779-1795
Template-Type: ReDIF-Article 1.0
Author-Name: Guillermo Ferreira
Author-X-Name-First: Guillermo
Author-X-Name-Last: Ferreira
Author-Name: Luis M. Castro
Author-X-Name-First: Luis M.
Author-X-Name-Last: Castro
Author-Name: Victor H. Lachos
Author-X-Name-First: Victor H.
Author-X-Name-Last: Lachos
Author-Name: Ronaldo Dias
Author-X-Name-First: Ronaldo
Author-X-Name-Last: Dias
Title: Bayesian modeling of autoregressive partial linear models with scale mixture of normal errors
Abstract:
Normality and independence of error terms are typical
assumptions for partial linear models. However, these assumptions may be
unrealistic in many fields, such as economics, finance and biostatistics.
In this paper, a Bayesian analysis for partial linear model with
first-order autoregressive errors belonging to the class of the scale
mixtures of normal distributions is studied in detail. The proposed model
provides a useful generalization of the symmetrical linear regression
model with independent errors, since the distribution of the error term
covers both correlated and thick-tailed distributions, and has a
convenient hierarchical representation allowing easy implementation of a
Markov chain Monte Carlo scheme. In order to examine the robustness of the
model against outlying and influential observations, a Bayesian case
deletion influence diagnostics based on the Kullback--Leibler (K--L)
divergence is presented. The proposed method is applied to monthly and
daily returns of two Chilean companies.
Journal: Journal of Applied Statistics
Pages: 1796-1816
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.796349
File-URL: http://hdl.handle.net/10.1080/02664763.2013.796349
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1796-1816
Template-Type: ReDIF-Article 1.0
Author-Name: Vishal Maurya
Author-X-Name-First: Vishal
Author-X-Name-Last: Maurya
Author-Name: Amar Nath Gill
Author-X-Name-First: Amar Nath
Author-X-Name-Last: Gill
Author-Name: Parminder Singh
Author-X-Name-First: Parminder
Author-X-Name-Last: Singh
Title: Multiple comparisons with a control for exponential location parameters under heteroscedasticity
Abstract:
In this paper, new design-oriented two-stage two-sided
simultaneous confidence intervals, for comparing several exponential
populations with a control population in terms of location parameters under
heteroscedasticity, are proposed. If there is prior information that the
location parameters of k exponential populations are not
less than the location parameter of the control population, one-sided
simultaneous confidence intervals provide more inferential sensitivity
than two-sided simultaneous confidence intervals. But the two-sided
simultaneous confidence intervals have advantages over the one-sided
simultaneous confidence intervals as they provide both lower and upper
bounds for the parameters of interest. The proposed design-oriented
two-stage two-sided simultaneous confidence intervals provide the benefits
of both the two-stage one-sided and two-sided simultaneous confidence
intervals. When the additional sample at the second stage may not be
available due to experimental budget shortage or other factors in an
experiment, one-stage two-sided confidence intervals are proposed, which
combine the advantages of one-stage one-sided and two-sided simultaneous
confidence intervals. The critical constants are obtained using the
techniques given in Lam [9,10]. These critical constants are compared with
the critical constants obtained by Bonferroni inequality techniques, and it
is found that the critical constants obtained by Lam [9,10] are less
conservative than those computed from the Bonferroni inequality technique.
Implementation of the proposed simultaneous confidence intervals is
demonstrated by a numerical example.
Journal: Journal of Applied Statistics
Pages: 1817-1830
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.796350
File-URL: http://hdl.handle.net/10.1080/02664763.2013.796350
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1817-1830
Template-Type: ReDIF-Article 1.0
Author-Name: Paola Annoni
Author-X-Name-First: Paola
Author-X-Name-Last: Annoni
Author-Name: Dorota Weziak-Bialowolska
Author-X-Name-First: Dorota
Author-X-Name-Last: Weziak-Bialowolska
Author-Name: Hania Farhan
Author-X-Name-First: Hania
Author-X-Name-Last: Farhan
Title: Measuring the impact of the Web: Rasch modelling for survey evaluation
Abstract:
In 2012, the World Wide Web Foundation launched the Web Index (WI) for the
first time; it combines existing secondary data with
new primary data to rank countries according to their progress in and use
of the Web. Primary data are gathered via a specifically designed
multi-country questionnaire. The aim of our analysis is (1) to evaluate the
measurement properties of the expert assessment survey and to provide
survey designers with some insights into possible problematic questions
and/or unexpectedly behaving countries and (2) to assess the experts'
perception of the state and the value of the Web. To do so the Rating
Scale Rasch model is employed. Results show that about 10% of survey
questions are detected as misfitting and need to be reworded. Possible
reasons are: counter-orientation with respect to the WI polarity,
difficulty in understanding the question's wording, or a binary instead of
a multiple response scale. Country analysis shows that no country can be
considered an outlier due to a notably unexpected pattern of answers.
Since the survey is to be expanded in future editions of the WI, the
results of our analysis are very important in pin-pointing the questions
most in need of refinement for the next edition of the Index.
Journal: Journal of Applied Statistics
Pages: 1831-1851
Issue: 8
Volume: 40
Year: 2013
Month: 8
X-DOI: 10.1080/02664763.2013.796351
File-URL: http://hdl.handle.net/10.1080/02664763.2013.796351
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:8:p:1831-1851
Template-Type: ReDIF-Article 1.0
Author-Name: David Canning
Author-X-Name-First: David
Author-X-Name-Last: Canning
Author-Name: Declan French
Author-X-Name-First: Declan
Author-X-Name-Last: French
Author-Name: Michael Moore
Author-X-Name-First: Michael
Author-X-Name-Last: Moore
Title: Non-parametric estimation of data dimensionality prior to data compression: the case of the human development index
Abstract:
In many applications in applied statistics, researchers
reduce the complexity of a data set by combining a group of variables into
a single measure using a factor analysis or an index number. We argue that
such compression loses information if the data actually have high
dimensionality. We advocate the use of a non-parametric estimator,
commonly used in physics (the Takens estimator), to
estimate the correlation dimension of the data prior to compression. The
advantage of this approach over traditional linear data compression
approaches is that the data do not have to be linearised. Applying our
ideas to the United Nations Human Development Index, we find that the four
variables that are used in its construction have dimension 3 and the index
loses information.
Journal: Journal of Applied Statistics
Pages: 1853-1863
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.798629
File-URL: http://hdl.handle.net/10.1080/02664763.2013.798629
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1853-1863
Template-Type: ReDIF-Article 1.0
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Author-Name: Emílio Augusto Coelho-Barros
Author-X-Name-First: Emílio Augusto
Author-X-Name-Last: Coelho-Barros
Author-Name: Josmar Mazucheli
Author-X-Name-First: Josmar
Author-X-Name-Last: Mazucheli
Title: Block and Basu bivariate lifetime distribution in the presence of cure fraction
Abstract:
This paper presents estimates for the parameters included in
the Block and Basu bivariate lifetime distributions in the presence of
covariates and cure fraction, applied to analyze survival data when some
individuals may never experience the event of interest and two lifetimes
are associated with each unit. A Bayesian procedure is used to obtain point
estimates and confidence intervals for the unknown parameters. Posterior
summaries of interest are obtained using standard Markov chain Monte Carlo
methods in the rjags package for the R software. An
illustration of the proposed methodology is given for a Diabetic
Retinopathy Study data set.
Journal: Journal of Applied Statistics
Pages: 1864-1874
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.798630
File-URL: http://hdl.handle.net/10.1080/02664763.2013.798630
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1864-1874
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: De-Hua Chen
Author-X-Name-First: De-Hua
Author-X-Name-Last: Chen
Title: Clustering algorithm for proximity-relation matrix and its applications
Abstract:
In this paper, we present a new algorithm for clustering
proximity-relation matrix that does not require the transitivity property.
The proposed algorithm is first inspired by the idea of Yang and Wu [16]
and then turned into a self-organizing process that is built upon the
intuition behind clustering. At the end of the process, subjects belonging
to the same cluster should converge to the same point, which represents
the cluster center. However, the performance of Yang and Wu's algorithm
depends on parameter selection. In this paper, we use the partition
entropy (PE) index to choose it. Numerical results illustrate that the
proposed method not only solves the parameter selection problem but
also obtains an optimal clustering result. Finally, we apply the proposed
algorithm to three applications. One is to evaluate the performance of
higher education in Taiwan, another is machine--parts grouping in cellular
manufacturing systems, and the other is to cluster probability density
functions.
Journal: Journal of Applied Statistics
Pages: 1875-1892
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.799126
File-URL: http://hdl.handle.net/10.1080/02664763.2013.799126
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1875-1892
Template-Type: ReDIF-Article 1.0
Author-Name: Rui Fragoso
Author-X-Name-First: Rui
Author-X-Name-Last: Fragoso
Author-Name: Maria Leonor da Silva Carvalho
Author-X-Name-First: Maria Leonor da Silva
Author-X-Name-Last: Carvalho
Title: Estimation of cost allocation coefficients at the farm level using an entropy approach
Abstract:
This paper aims to estimate the farm cost allocation
coefficients from whole-farm input costs. An entropy approach was
developed under a Tobit formulation and was applied to a sample of farms
from the 2004 Farm Accountancy Data Network data base for Alentejo region,
Southern Portugal. A Generalized Maximum Entropy model and Cross
Generalized Entropy model were developed to the sample conditions and were
tested. Model results were assessed in terms of their precision and
estimation power and were compared with the observed data. The entropy
approach proved to be a flexible and valid tool for estimating incomplete
information, namely regarding farm costs.
Journal: Journal of Applied Statistics
Pages: 1893-1906
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.799127
File-URL: http://hdl.handle.net/10.1080/02664763.2013.799127
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1893-1906
Template-Type: ReDIF-Article 1.0
Author-Name: R. M. Green
Author-X-Name-First: R. M.
Author-X-Name-Last: Green
Author-Name: M. S. Bebbington
Author-X-Name-First: M. S.
Author-X-Name-Last: Bebbington
Title: A longitudinal analysis of infant and senescent mortality using mixture models
Abstract:
We construct a mixture distribution including infant,
exogenous and Gompertzian/non-Gompertzian senescent mortality. Using
mortality data from Swedish females from 1751 onwards, we show that this outperforms
models without these features, and compare its trends in cohort and period
mortality over time. We find an almost complete disappearance of exogenous
mortality within the last century of period mortality, with cohort
mortality approaching the same limits. Both Gompertzian and
non-Gompertzian senescent mortality are consistently present, with the
estimated balance between them oscillating constantly. While the
parameters of the latter appear to be trending over time, the parameters
of the former do not.
Journal: Journal of Applied Statistics
Pages: 1907-1920
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800032
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800032
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1907-1920
Template-Type: ReDIF-Article 1.0
Author-Name: Arindam Gupta
Author-X-Name-First: Arindam
Author-X-Name-Last: Gupta
Author-Name: Samba Siva Rao Pasupuleti
Author-X-Name-First: Samba Siva Rao
Author-X-Name-Last: Pasupuleti
Title: A new behavioural model for fertility schedules
Abstract:
Modelling age-specific fertility rates is of great importance
in demography because of their influence on population growth. Although we
have a variety of fertility models in the demographic literature, most of
them do not have any demographic interpretation for their parameters. It
is generally expected that models with behavioural interpretation are more
universal than those without any interpretation. Even though the famous
Gompertz model has some behavioural interpretation it suffers from other
drawbacks. In the present work, we propose a new fertility model, which
has its genesis in a generalization of the logistic law. The proposed model
has good behavioural interpretation, alongside having nice parameter
interpretations.
Journal: Journal of Applied Statistics
Pages: 1921-1930
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800033
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800033
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1921-1930
Template-Type: ReDIF-Article 1.0
Author-Name: Christian H. Weiß
Author-X-Name-First: Christian H.
Author-X-Name-Last: Weiß
Title: Integer-valued autoregressive models for counts showing underdispersion
Abstract:
The Poisson distribution is a simple and popular model for
count-data random variables, but it suffers from the equidispersion
requirement, which is often not met in practice. While models for
overdispersed counts have been discussed intensively in the literature,
the opposite phenomenon, underdispersion, has received only little
attention, especially in a time series context. We start with a detailed
survey of distribution models allowing for underdispersion, discuss their
properties and highlight possible disadvantages. After having identified
two model families with attractive properties as well as only two model
parameters, we combine these models with the INAR(1) model
(integer-valued
autoregressive), which is particularly
well suited to obtain autocorrelated counts with underdispersion.
Properties of the resulting stationary INAR(1) models and approaches for
parameter estimation are considered, as well as possible extensions to
higher order autoregressions. Three real-data examples illustrate the
application of the models in practice.
Journal: Journal of Applied Statistics
Pages: 1931-1948
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800034
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800034
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1931-1948
Template-Type: ReDIF-Article 1.0
Author-Name: Tiejun Tong
Author-X-Name-First: Tiejun
Author-X-Name-Last: Tong
Author-Name: Zeny Feng
Author-X-Name-First: Zeny
Author-X-Name-Last: Feng
Author-Name: Julia S. Hilton
Author-X-Name-First: Julia S.
Author-X-Name-Last: Hilton
Author-Name: Hongyu Zhao
Author-X-Name-First: Hongyu
Author-X-Name-Last: Zhao
Title: Estimating the proportion of true null hypotheses using the pattern of observed p-values
Abstract:
Estimating the proportion of true null hypotheses,
π0, has attracted much attention in the recent statistical
literature. Besides its apparent relevance for a set of specific
scientific hypotheses, an accurate estimate of this parameter is key for
many multiple testing procedures. Most existing methods for estimating
π0 in the literature are motivated by the independence
assumption of test statistics, which often does not hold in reality.
Simulations indicate that most existing estimators can perform poorly in
the presence of dependence among test statistics, mainly due to the
increased variation in these estimators. In this paper, we propose several
data-driven methods for estimating π0 by incorporating the
distribution pattern of the observed p-values as a
practical approach to address potential dependence among test statistics.
Specifically, we use a linear fit to give a data-driven estimate for the
proportion of true-null p-values in (λ, 1] over
the whole range [0, 1] instead of using the expected proportion at 1 -
λ. We find that the proposed estimators may substantially decrease
the variance of the estimated true null proportion and thus improve the
overall performance.
Journal: Journal of Applied Statistics
Pages: 1949-1964
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800035
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800035
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1949-1964
Template-Type: ReDIF-Article 1.0
Author-Name: Donatella Vicari
Author-X-Name-First: Donatella
Author-X-Name-Last: Vicari
Author-Name: Johan René van Dorp
Author-X-Name-First: Johan René
Author-X-Name-Last: van Dorp
Title: On a bounded bimodal two-sided distribution fitted to the Old-Faithful geyser data
Abstract:
In this paper, we shall develop a novel family of bimodal
univariate distributions (also allowing for unimodal shapes) and
demonstrate its use with the well-known and almost classical data set
involving durations and waiting times of eruptions of the Old-Faithful
geyser in Yellowstone Park. Specifically, we shall analyze the
Old-Faithful data set with 272 data points provided in Dekking et
al. [3]. In the process, we develop a bivariate distribution
using a copula technique and compare its fit to a mixture of bivariate
normal distributions also fitted to the same bivariate data set. We
believe the fit-analysis and comparison is primarily illustrative from an
educational perspective for distribution theory modelers, since in the
process a variety of statistical techniques are demonstrated. We do not
claim one model as preferred over the other.
Journal: Journal of Applied Statistics
Pages: 1965-1978
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800036
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800036
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1965-1978
Template-Type: ReDIF-Article 1.0
Author-Name: Edgard M. Maboudou-Tchao
Author-X-Name-First: Edgard M.
Author-X-Name-Last: Maboudou-Tchao
Author-Name: Douglas M. Hawkins
Author-X-Name-First: Douglas M.
Author-X-Name-Last: Hawkins
Title: Detection of multiple change-points in multivariate data
Abstract:
The statistical analysis of change-point detection and
estimation has received much attention recently. A time point such that
observations follow a certain statistical distribution up to that point
and a different distribution -- commonly of the same functional form but
with different parameters -- after that point is called a change-point.
Multiple change-point problems arise when we have more than one
change-point. This paper develops a method for multivariate normally
distributed data to detect change-points and estimate within-segment
parameters using maximum likelihood estimation.
Journal: Journal of Applied Statistics
Pages: 1979-1995
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800471
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800471
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1979-1995
Template-Type: ReDIF-Article 1.0
Author-Name: Yanqing Yi
Author-X-Name-First: Yanqing
Author-X-Name-Last: Yi
Author-Name: Yuan Yuan
Author-X-Name-First: Yuan
Author-X-Name-Last: Yuan
Title: An optimal allocation for response-adaptive designs
Abstract:
A new allocation proportion is derived by using differential
equation methods for response-adaptive designs. This new allocation is
compared with the balanced and the Neyman allocations and the optimal
allocation proposed by Rosenberger, Stallard, Ivanova, Harper and Ricks
(RSIHR) from an ethical point of view and statistical power performance.
The new allocation has the ethical advantage of allocating more than 50%
of patients to the better treatment. It also allocates a higher proportion
of patients to the better treatment than the RSIHR optimal allocation for
success probabilities larger than 0.5. The statistical power under the
proposed allocation is compared with those under the balanced, the Neyman
and Rosenberger's optimal allocations through simulation. The simulation
results indicate that the statistical power under the proposed allocation
proportion is similar to those under the balanced, the Neyman and the
RSIHR allocations.
Journal: Journal of Applied Statistics
Pages: 1996-2008
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.800846
File-URL: http://hdl.handle.net/10.1080/02664763.2013.800846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:1996-2008
Template-Type: ReDIF-Article 1.0
Author-Name: Laura Barbieri
Author-X-Name-First: Laura
Author-X-Name-Last: Barbieri
Author-Name: Mario Faliva
Author-X-Name-First: Mario
Author-X-Name-Last: Faliva
Author-Name: Maria Grazia Zoia
Author-X-Name-First: Maria Grazia
Author-X-Name-Last: Zoia
Title: Band-limited component estimation in time-limited economic series
Abstract:
This paper tackles the issue of economic time-series modeling
from a joint time and frequency-domain standpoint, with the objective of
estimating the latent trend-cycle component. Since time-series records are
data strings over a finite time span, they read as samples of contiguous
data drawn from realizations of stochastic processes aligned with the time
arrow. This accounts for the interpretation of time series as time-limited
signals. Economic time series (up to a disturbance term) result from
latent components known as trend, cycle, and seasonality, whose generating
stochastic processes are harmonizable on a finite average-power argument.
In addition, since trend is associated with long-run regular movements,
and cycle with medium-term economic fluctuation, both of these turn out to
be band-limited components. Recognizing such a frequency-domain location
permits a filter-based approach to component estimation. This is
accomplished through a Toeplitz matrix operator with sinc functions as
entries, mirroring the ideal low-pass filter impulse response. The notion
of virtual transfer function is developed and its closed-form expression
derived in order to evaluate the filter features. The paper is completed
by applying this filter to quarterly data from Italian industrial
production, thus shedding light on the performance of the estimation
procedure.
Journal: Journal of Applied Statistics
Pages: 2009-2023
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.801408
File-URL: http://hdl.handle.net/10.1080/02664763.2013.801408
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:2009-2023
Template-Type: ReDIF-Article 1.0
Author-Name: Weihua Zhao
Author-X-Name-First: Weihua
Author-X-Name-Last: Zhao
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Author-Name: Jicai Liu
Author-X-Name-First: Jicai
Author-X-Name-Last: Liu
Title: Robust variable selection for the varying coefficient model based on composite L1--L2 regression
Abstract:
The varying coefficient model (VCM) is an important
generalization of the linear regression model, and many existing
estimation procedures for the VCM are built on the L2 loss, which is
popular for its mathematical beauty but is not robust to non-normal
errors and outliers. In this paper, we address both the robustness and
the efficiency of the estimation and variable selection procedure, based
on the convex combined loss of L1 and L2 instead of only the quadratic
loss for the VCM. Using the local linear modeling method, the asymptotic
normality of the estimation is derived, and a useful selection method is
proposed for the weight of the composite L1 and L2 loss. The variable
selection procedure is then given by combining local kernel smoothing
with the adaptive group LASSO. With appropriate selection of the tuning
parameters by the Bayesian information criterion (BIC), the theoretical
properties of the new procedure, including consistency in variable
selection and the oracle property in estimation, are established. The
finite sample performance of the new method is investigated through
simulation studies and the analysis of body fat data. Numerical studies
show that the new method performs better than, or at least as well as,
the least squares-based method in terms of both robustness and efficiency
for variable selection.
Journal: Journal of Applied Statistics
Pages: 2024-2040
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.804040
File-URL: http://hdl.handle.net/10.1080/02664763.2013.804040
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:2024-2040
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas T. Longford
Author-X-Name-First: Nicholas T.
Author-X-Name-Last: Longford
Title: Searching for contaminants
Abstract:
Decision theory is applied to the problem of identifying a
small fraction of observations that contaminate a random sample from a
specified distribution. The uncertainty about the parameters that
characterise the contamination is addressed by sensitivity analysis. The
analyst's (or the client's) perspective and priorities are incorporated in
the analysis by ranges of plausible loss functions. An application to
fraud detection is presented.
Journal: Journal of Applied Statistics
Pages: 2041-2055
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.804041
File-URL: http://hdl.handle.net/10.1080/02664763.2013.804041
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:2041-2055
Template-Type: ReDIF-Article 1.0
Author-Name: Hossein Zamani
Author-X-Name-First: Hossein
Author-X-Name-Last: Zamani
Author-Name: Noriszura Ismail
Author-X-Name-First: Noriszura
Author-X-Name-Last: Ismail
Title: Score test for testing zero-inflated Poisson regression against zero-inflated generalized Poisson alternatives
Abstract:
Count data often have an excessive number of zero
outcomes. This zero-inflated phenomenon is a specific cause of
overdispersion, and the zero-inflated Poisson (ZIP) regression model has
been proposed for accommodating zero-inflated data. However, if the data
continue to suggest additional overdispersion, the zero-inflated negative
binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression
models have been considered as alternatives. This study proposes the score
test for testing the ZIP regression model against ZIGP alternatives and
proves that it is equal to the score test for testing the ZIP regression
model against ZINB alternatives. The advantage of using the score test over
other alternative tests such as likelihood ratio and Wald is that the
score test can be used to determine whether a more complex model is
appropriate without fitting the more complex model. Applications of the
proposed score test on several datasets are also illustrated.
Journal: Journal of Applied Statistics
Pages: 2056-2068
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.804904
File-URL: http://hdl.handle.net/10.1080/02664763.2013.804904
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:2056-2068
Template-Type: ReDIF-Article 1.0
Author-Name: Riccardo Borgoni
Author-X-Name-First: Riccardo
Author-X-Name-Last: Borgoni
Author-Name: Valeria Tritto
Author-X-Name-First: Valeria
Author-X-Name-Last: Tritto
Author-Name: Daniela de Bartolo
Author-X-Name-First: Daniela
Author-X-Name-Last: de Bartolo
Title: Identifying radon-prone building typologies by marginal modelling
Abstract:
Radon is a naturally occurring decay product of uranium known
to be the main contributor to natural background radiation exposure. It
has been established that the health risk related to radon exposure is
lung cancer. In fact, radon is considered a leading cause of
lung cancer, second only to smoking. In this paper, we identified building
typologies that affect the probability of detecting indoor radon
concentrations above reference values, using the data collected within two
monitoring campaigns recently conducted in Northern Italy. This
information is fundamental both in prevention, i.e. when the construction
of a new building is planned, and in mitigation, i.e. when a high
concentration detected inside buildings has to be reduced. A spatial
regression approach for binary data was adopted for this goal where some
relevant covariates on the soil were retrieved by linking external spatial
databases.
Journal: Journal of Applied Statistics
Pages: 2069-2086
Issue: 9
Volume: 40
Year: 2013
Month: 9
X-DOI: 10.1080/02664763.2013.804906
File-URL: http://hdl.handle.net/10.1080/02664763.2013.804906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:9:p:2069-2086
Template-Type: ReDIF-Article 1.0
Author-Name: Marcelo Justus dos Santos
Author-X-Name-First: Marcelo Justus
Author-X-Name-Last: dos Santos
Author-Name: Ana Lúcia Kassouf
Author-X-Name-First: Ana Lúcia
Author-X-Name-Last: Kassouf
Title: A cointegration analysis of crime, economic activity, and police performance in São Paulo city
Abstract:
The main objective of this paper is to investigate possible
causes for the significant reduction observed in crime rates in São
Paulo city. By applying a cointegration analysis, we observed long-run
relationships between crime, economic activity, and police performance.
The results indicate that the lethal crime rate is positively related to
unemployment and negatively related to real wages and to the results of
law-enforcement activities, specifically arrests and seizure of firearms.
Moreover, the hypothesis that the Disarmament Statute led to a reduction
in the lethal crime rate is not rejected.
Journal: Journal of Applied Statistics
Pages: 2087-2109
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.804905
File-URL: http://hdl.handle.net/10.1080/02664763.2013.804905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2087-2109
Template-Type: ReDIF-Article 1.0
Author-Name: Wantanee Surapaitoolkorn
Author-X-Name-First: Wantanee
Author-X-Name-Last: Surapaitoolkorn
Title: Variable dimension via stochastic volatility model using FX rates
Abstract:
In this paper, changepoint analysis is applied to stochastic
volatility (SV) models which aim to understand the locations and movements
of high frequency FX financial time series. Bayesian inference using the
Markov Chain Monte Carlo method is performed using a process called
variable dimension for SV parameters. Interesting results
are that FX series have locations where one or more positions of the
sequence correspond to systemic changes, and overall non-stationarity, in
the returns process. Furthermore, we found that the changepoint locations
provide an informative estimate for all FX series. Importantly in most
cases, the detected changepoints can be identified with economic factors
relevant to the country concerned. This supports the view that
macroeconomic news and movements in financial prices are positively
related.
Journal: Journal of Applied Statistics
Pages: 2110-2128
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.807330
File-URL: http://hdl.handle.net/10.1080/02664763.2013.807330
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2110-2128
Template-Type: ReDIF-Article 1.0
Author-Name: Jose R.S. Santos
Author-X-Name-First: Jose R.S.
Author-X-Name-Last: Santos
Author-Name: Caio L.N. Azevedo
Author-X-Name-First: Caio L.N.
Author-X-Name-Last: Azevedo
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: A multiple group item response theory model with centered skew-normal latent trait distributions under a Bayesian framework
Abstract:
Very often in psychometric research, as in educational
assessment, it is necessary to analyze item responses from clustered
respondents. The multiple group item response theory (IRT) model proposed
by Bock and Zimowski [12] provides a useful framework for analyzing such
data. In this model, the selected groups of respondents are of
specific interest, such that group-specific population distributions need
to be defined. The usual assumption for parameter estimation in this
model, which is that the latent traits are random variables following
different symmetric normal distributions, has been questioned in many
works in the IRT literature. Furthermore, when this assumption does
not hold, misleading inference can result. In this paper, we consider that
the latent traits for each group follow different skew-normal
distributions, under the centered parameterization. We name this the skew
multiple group IRT model. This modeling extends the works of Azevedo
et al. [4], Bazán et al. [11] and Bock
and Zimowski [12] (concerning the latent trait distribution). Our approach
ensures that the model is identifiable. We propose and compare, concerning
convergence issues, two Markov chain Monte Carlo (MCMC) algorithms for
parameter estimation. A simulation study was performed to
evaluate parameter recovery for the proposed model and the selected
algorithm concerning convergence issues. The results reveal that the proposed
algorithm properly recovers all model parameters. Furthermore, we analyzed
a real data set which presents asymmetry in the latent trait
distributions. The results obtained using our approach confirmed the
presence of negative asymmetry in some latent trait distributions.
Journal: Journal of Applied Statistics
Pages: 2129-2149
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.807331
File-URL: http://hdl.handle.net/10.1080/02664763.2013.807331
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2129-2149
Template-Type: ReDIF-Article 1.0
Author-Name: S. Faria
Author-X-Name-First: S.
Author-X-Name-Last: Faria
Author-Name: F. Gonçalves
Author-X-Name-First: F.
Author-X-Name-Last: Gonçalves
Title: Financial data modeling by Poisson mixture regression
Abstract:
In many financial applications, Poisson mixture regression
models are commonly used to analyze heterogeneous count data. When fitting
these models, the observed counts are assumed to come from two or more
subpopulations and parameter estimation is typically performed by means of
maximum likelihood via the Expectation--Maximization algorithm. In this
study, we discuss briefly the procedure for fitting Poisson mixture
regression models by means of maximum likelihood, the model selection and
goodness-of-fit tests. These models are applied to a real data set for
credit-scoring purposes. We aim to reveal the impact of demographic and
financial variables in creating different groups of clients and to predict
the group to which each client belongs, as well as their expected number of
defaulted payments. The model's conclusions are very interesting,
revealing that the population consists of three groups, contrasting with
the traditional good versus bad categorization approach of the
credit-scoring systems.
Journal: Journal of Applied Statistics
Pages: 2150-2162
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.807332
File-URL: http://hdl.handle.net/10.1080/02664763.2013.807332
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2150-2162
Template-Type: ReDIF-Article 1.0
Author-Name: Jacques Bénasséni
Author-X-Name-First: Jacques
Author-X-Name-Last: Bénasséni
Title: A concentration approach to sensitivity studies in statistical estimation problems
Abstract:
It is shown that the concept of concentration is of potential
interest in the sensitivity study of some parameters and related
estimators. Basic ideas are introduced for a real parameter θ>0
together with graphical representations using Lorenz curves of
concentration. Examples based on the mean, standard deviation and variance
are provided for some classical distributions. This concentration approach
is also discussed in relation to influence functions. Special emphasis
is given to the average concentration of an estimator which provides a
sensitivity measure allowing one to compare several estimators of the same
parameter. Properties of this measure are investigated through simulation
studies and its practical interest is illustrated by examples based on the
trimmed mean and the Winsorized variance.
Journal: Journal of Applied Statistics
Pages: 2163-2180
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.808318
File-URL: http://hdl.handle.net/10.1080/02664763.2013.808318
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2163-2180
Template-Type: ReDIF-Article 1.0
Author-Name: Vesna Ćojbašić Rajić
Author-X-Name-First: Vesna Ćojbašić
Author-X-Name-Last: Rajić
Author-Name: J. Stanojević
Author-X-Name-First: J.
Author-X-Name-Last: Stanojević
Title: Confidence intervals for the ratio of two variances
Abstract:
In this paper we consider confidence intervals for the ratio
of two population variances. We propose a confidence interval for the
ratio of two variances based on the t-statistic by
deriving its Edgeworth expansion and considering Hall's and Johnson's
transformations. Then, we consider the coverage accuracy of suggested
intervals and intervals based on the F-statistic for some
distributions.
Journal: Journal of Applied Statistics
Pages: 2181-2187
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.808319
File-URL: http://hdl.handle.net/10.1080/02664763.2013.808319
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2181-2187
Template-Type: ReDIF-Article 1.0
Author-Name: Hongli Niu
Author-X-Name-First: Hongli
Author-X-Name-Last: Niu
Author-Name: Jun Wang
Author-X-Name-First: Jun
Author-X-Name-Last: Wang
Title: Power-law scaling behavior analysis of financial time series model by voter interacting dynamic system
Abstract:
We investigate the power-law scaling behaviors of returns for
a financial price process developed by the voter interacting
dynamic system, in comparison with a real financial market index
(the Shanghai Composite Index). The voter system is a continuous-time
Markov process which originally represents a voter's attitude on a
particular topic; that is, voters reconsider their opinions at times
distributed according to independent exponential random variables. In
this paper, the detrended fluctuation analysis method is employed to
explore the long-range power-law correlations of return time series for
different values of the parameters in the financial model. The findings
show no indication of, or only very weak, long-range power-law
correlations for the simulated returns, but strong long-range dependence
for the absolute returns. The multiplier distribution is studied to
demonstrate directly, by comparison, the existence of scale invariance in
the actual data of the Shanghai Stock Exchange and the simulation data of
the model. Moreover, the Zipf analysis is applied to investigate the
statistical behaviors of the frequency functions and the distributions of
the returns. In a comparative study, the simulation data for our
constructed price model exhibit behaviors very similar to those of the
real stock index, which indicates that the model is reasonably applicable
to the market.
Journal: Journal of Applied Statistics
Pages: 2188-2203
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.809515
File-URL: http://hdl.handle.net/10.1080/02664763.2013.809515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2188-2203
Template-Type: ReDIF-Article 1.0
Author-Name: Saima Altaf
Author-X-Name-First: Saima
Author-X-Name-Last: Altaf
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Title: Analysis of the amended Davidson model with order effect for paired comparison in the Bayesian paradigm
Abstract:
We commonly observe many types of paired
competitions in which objects are compared pairwise by respondents
in a subjective manner. Bayesian statistics, in contrast to classical
statistics, presents a generic tool to incorporate new experimental
evidence and update the existing information. These and other properties
have led statisticians to focus their attention on the Bayesian
analysis of different paired comparison models. The present article
focuses on the amended Davidson model for paired comparison, in which an
amendment has been introduced that accommodates the option of not
distinguishing the effects of two treatments when they are compared
pairwise. Bayesian analysis of the amended Davidson model is
performed using noninformative priors after another small
modification incorporating the parameter of an order-effect factor. The
joint and marginal posterior distributions of the parameters, their
posterior estimates, and the predictive and posterior probabilities for
comparing the treatment parameters are obtained.
Journal: Journal of Applied Statistics
Pages: 2204-2218
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.809516
File-URL: http://hdl.handle.net/10.1080/02664763.2013.809516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2204-2218
Template-Type: ReDIF-Article 1.0
Author-Name: M. Revan Özkale
Author-X-Name-First: M. Revan
Author-X-Name-Last: Özkale
Title: Influence measures in affine combination type regression
Abstract:
The detection of outliers and influential observations has
received a great deal of attention in the statistical literature in the
context of least-squares (LS) regression. However, the explanatory
variables can be correlated with each other, and alternatives to LS have
been proposed to address outliers/influential observations and
multicollinearity simultaneously. This paper proposes new influence measures based on the
affine combination type regression for the detection of influential
observations in the linear regression model when multicollinearity exists.
Approximate influence measures are also proposed for the affine
combination type regression. Since the affine combination type regression
includes the ridge, the Liu and the shrunken regressions as special cases,
influence measures under the ridge, the Liu and the shrunken regressions
are also examined to see the possible effect that multicollinearity can
have on the influence of an observation. The Longley data set is used to
illustrate the influence measures in affine combination type regression,
and also in the ridge, Liu and shrunken regressions, so that the
performance of different biased regressions in detecting and assessing
influential observations is examined.
Journal: Journal of Applied Statistics
Pages: 2219-2243
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.809568
File-URL: http://hdl.handle.net/10.1080/02664763.2013.809568
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2219-2243
Template-Type: ReDIF-Article 1.0
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Author-Name: Getachew A. Dagne
Author-X-Name-First: Getachew A.
Author-X-Name-Last: Dagne
Author-Name: Jeong-Gun Park
Author-X-Name-First: Jeong-Gun
Author-X-Name-Last: Park
Title: Segmental modeling of changing immunologic response for CD4 data with skewness, missingness and dropout
Abstract:
In clinical practice, the profile of each subject's CD4
response from a longitudinal study may follow a 'broken stick' like
trajectory, indicating multiple phases of increase and/or decline in
response. Such multiple phases (changepoints) may be important indicators
to help quantify treatment effect and improve management of patient care.
Although it is a common practice to analyze complex AIDS longitudinal data
using nonlinear mixed-effects (NLME) or nonparametric mixed-effects (NPME)
models in the literature, NLME or NPME models become a challenge to
estimate changepoint due to complicated structures of model formulations.
In this paper, we propose a changepoint mixed-effects model with random
subject-specific parameters, including the changepoint for the analysis of
longitudinal CD4 cell counts for HIV infected subjects following highly
active antiretroviral treatment. The longitudinal CD4 data in this study
may exhibit departures from symmetry, may encounter missing observations
due to various reasons, which are likely to be non-ignorable in the sense
that missingness may be related to the missing values, and may be censored
at the time of the subject going off study-treatment, which is a
potentially informative dropout mechanism. Inferential procedures can be
complicated dramatically when longitudinal CD4 data with asymmetry
(skewness), incompleteness and informative dropout are observed in
conjunction with an unknown changepoint. Our objective is to address the
simultaneous impact of skewness, missingness and informative censoring by
jointly modeling the CD4 response and dropout time processes under a
Bayesian framework. The method is illustrated using a real AIDS data set
to compare potential models under various scenarios, and some interesting
results are presented.
Journal: Journal of Applied Statistics
Pages: 2244-2258
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.809569
File-URL: http://hdl.handle.net/10.1080/02664763.2013.809569
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2244-2258
Template-Type: ReDIF-Article 1.0
Author-Name: Min Wang
Author-X-Name-First: Min
Author-X-Name-Last: Wang
Author-Name: Jing Zhao
Author-X-Name-First: Jing
Author-X-Name-Last: Zhao
Author-Name: Xiaoqian Sun
Author-X-Name-First: Xiaoqian
Author-X-Name-Last: Sun
Author-Name: Chanseok Park
Author-X-Name-First: Chanseok
Author-X-Name-Last: Park
Title: Robust explicit estimation of the two-parameter Birnbaum--Saunders distribution
Abstract:
The two-parameter Birnbaum--Saunders distribution is widely
applicable to model failure times of fatiguing materials. Its
maximum-likelihood estimators (MLEs) are very sensitive to outliers and
also have no closed-form expressions. This motivates us to develop some
alternative estimators. In this paper, we develop two robust estimators,
which are also explicit functions of sample observations and are thus easy
to compute. We derive their breakdown points and carry out extensive Monte
Carlo simulation experiments to compare the performance of all the
estimators under consideration. The simulation results show that the
proposed estimators perform approximately as well as the MLEs for clean
data, whereas they are far superior in the presence of data
contamination, which often occurs in practical
situations. A simple bias-reduction technique is presented to reduce the
bias of the recommended estimators. Finally, the practical application of
the developed procedures is illustrated with a real-data example.
Journal: Journal of Applied Statistics
Pages: 2259-2274
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.809570
File-URL: http://hdl.handle.net/10.1080/02664763.2013.809570
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2259-2274
Template-Type: ReDIF-Article 1.0
Author-Name: Mauro Costantini
Author-X-Name-First: Mauro
Author-X-Name-Last: Costantini
Title: Forecasting the industrial production using alternative factor models and business survey data
Abstract:
This paper compares the forecasting performance of three
alternative factor models based on business survey data for the industrial
production in Italy. The first model uses static principal component
analysis, while the other two apply dynamic principal component analysis
in frequency domain and subspace algorithms for state-space
representation, respectively. Once the factors are extracted from the
business survey data, they are included in a single equation to
predict the industrial production index. The forecast results show that
the three factor models have a better performance than that of a simple
autoregressive benchmark model regardless of the specification and
estimation methods. Furthermore, the state-space model yields superior
forecasts amongst the factor models.
Journal: Journal of Applied Statistics
Pages: 2275-2289
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.809870
File-URL: http://hdl.handle.net/10.1080/02664763.2013.809870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2275-2289
Template-Type: ReDIF-Article 1.0
Author-Name: Hossein Hassani
Author-X-Name-First: Hossein
Author-X-Name-Last: Hassani
Author-Name: Saeed Heravi
Author-X-Name-First: Saeed
Author-X-Name-Last: Heravi
Author-Name: Gary Brown
Author-X-Name-First: Gary
Author-X-Name-Last: Brown
Author-Name: Daniel Ayoubkhani
Author-X-Name-First: Daniel
Author-X-Name-Last: Ayoubkhani
Title: Forecasting before, during, and after recession with singular spectrum analysis
Abstract:
The aim of this research is to apply the singular spectrum
analysis (SSA) technique, which is a relatively new and powerful technique
in time series analysis and forecasting, to forecast the 2008 UK
recession, using eight economic time series. These time series were
selected as they represent the most important economic indicators in the
UK. The ability to understand the underlying structure of these series and
to quickly identify turning points such as the onset of the recent
recession is of key interest to users. In recent years, the SSA technique
has been further developed and applied to many practical problems. Hence,
these series will provide an ideal practical test of the potential
benefits from SSA during one of the most challenging periods for
econometric analyses of recent years. The results are compared with those
obtained using the ARIMA and Holt--Winters models as these methods are
currently used as standard forecasting methods in the Office for National
Statistics in the UK.
Journal: Journal of Applied Statistics
Pages: 2290-2302
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.810193
File-URL: http://hdl.handle.net/10.1080/02664763.2013.810193
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2290-2302
Template-Type: ReDIF-Article 1.0
Author-Name: Haydar Demirhan
Author-X-Name-First: Haydar
Author-X-Name-Last: Demirhan
Title: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells
Abstract:
In this article, a Bayesian approach is proposed for the
estimation of log odds ratios and intraclass correlations over a two-way
contingency table, including intraclass correlated cells. Required
likelihood functions of log odds ratios are obtained, and determination of
prior structures is discussed. Hypothesis testing for log odds ratios and
intraclass correlations by using the posterior simulations is outlined.
Because the proposed approach includes no asymptotic theory, it is useful
for the estimation and hypothesis testing of log odds ratios in the
presence of certain intraclass correlation patterns. A family health
status and limitations data set is analyzed by using the proposed approach
in order to figure out the impact of intraclass correlations on the
estimates and hypothesis tests of log odds ratios. Although intraclass
correlations are small in the data set, we find that even small
intraclass correlations can significantly affect the estimates and test
results, and our approach is useful for the estimation and testing of log
odds ratios in the presence of intraclass correlations.
Journal: Journal of Applied Statistics
Pages: 2303-2316
Issue: 10
Volume: 40
Year: 2013
Month: 10
X-DOI: 10.1080/02664763.2013.810196
File-URL: http://hdl.handle.net/10.1080/02664763.2013.810196
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:10:p:2303-2316
Template-Type: ReDIF-Article 1.0
Author-Name: Ludovic Seifert
Author-X-Name-First: Ludovic
Author-X-Name-Last: Seifert
Author-Name: Jean-François Coeurjolly
Author-X-Name-First: Jean-François
Author-X-Name-Last: Coeurjolly
Author-Name: Romain Hérault
Author-X-Name-First: Romain
Author-X-Name-Last: Hérault
Author-Name: Léo Wattebled
Author-X-Name-First: Léo
Author-X-Name-Last: Wattebled
Author-Name: Keith Davids
Author-X-Name-First: Keith
Author-X-Name-Last: Davids
Title: Temporal dynamics of inter-limb coordination in ice climbing revealed through change-point analysis of the geodesic mean of circular data
Abstract:
This study examined the temporal dynamics of the inter-limb
angles of skilled and less skilled ice climbers to determine how they
explored ice fall properties to adapt their coordination patterns during
performance. We observed two circular time series corresponding to the
upper- and lower-limbs of seven expert and eight inexperienced ice
climbers. We analyzed these data through a multiple change-point analysis
of the geodesic (or Fr�chet) mean on the circle. Guided by the nature of
the geodesic mean obtained by an optimization procedure, we extended the
filtered derivative method, known to be computationally very cheap and
fast, to circular data. Local estimation of the variability was assessed
through the number of change-points computed via the filtered derivatives
with p-value method for the time series and integrated
squared error (ISE). Results of this change-point analysis did not reveal
significant differences in the number of change-points between groups but
indicated a higher ISE, supporting the existence of plateaux for
beginners. These results emphasized higher local variability of limb
angles for experts than for beginners, suggesting greater dependence on the
properties of the performance environment and adaptive behaviors in the
former. Conversely, the lower local variance of limb angles assessed in
beginners may reflect their independence of the environmental constraints,
as they focused mainly on controlling body equilibrium.
Journal: Journal of Applied Statistics
Pages: 2317-2331
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.810194
File-URL: http://hdl.handle.net/10.1080/02664763.2013.810194
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2317-2331
Template-Type: ReDIF-Article 1.0
Author-Name: Chor Foon Tang
Author-X-Name-First: Chor Foon
Author-X-Name-Last: Tang
Title: A revisitation of the export-led growth hypothesis in Malaysia using the leveraged bootstrap simulation and rolling causality techniques
Abstract:
According to the neoclassical growth theory, export expansion
could stimulate economic growth because it promotes specialisation and
raises factor productivity. Thus, many developing countries depend heavily
on export-orientated businesses to accelerate economic growth.
Nevertheless, the causality evidence on the export-led growth hypothesis
remains elusive and controversial. Two primary empirical questions that
have emerged in the international trade and development literature are:
(a) Is the export-led growth hypothesis still valid? (b) Why is the
causality evidence inconsistent among studies? In light of these, the
present study attempts
to contribute to the export-led growth literature by using the Malaysian
data set. This study covers the monthly data set from January 1975 to
August 2010. To achieve the objectives of this study, we employ the
leveraged bootstrap simulation causality test and also the rolling
regression-based causality tests. The leveraged bootstrap simulation
causality results suggest that the causality between exports and output
growth is bilateral in nature. However, the rolling causality results demonstrate
that the causality inferences for export-led growth hypothesis are
unstable over time. For this reason, policy initiatives to promote exports
may not always stimulate economic growth and development in Malaysia.
Therefore, a balanced policy is urged to ensure that economic growth in
Malaysia can be materialised.
Journal: Journal of Applied Statistics
Pages: 2332-2340
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.810195
File-URL: http://hdl.handle.net/10.1080/02664763.2013.810195
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2332-2340
Template-Type: ReDIF-Article 1.0
Author-Name: Jing-Er Chiu
Author-X-Name-First: Jing-Er
Author-X-Name-Last: Chiu
Author-Name: Chih-Hsin Tsai
Author-X-Name-First: Chih-Hsin
Author-X-Name-Last: Tsai
Title: Properties and performance of one-sided cumulative count of conforming chart with parameter estimation in high-quality processes
Abstract:
The one-sided cumulative count of conforming (CCC) chart is a
useful method to monitor nonconforming fraction in high-quality
manufacturing processes. The nonconforming fraction parameter is assumed
to be known when implementing a one-sided CCC chart. In this study, we
investigated the impact of estimated nonconforming fraction,
[pcirc] 0, in a one-sided CCC
chart. The run length distribution is derived as well as the conditional
probability of a false alarm rate (CFAR), conditional average run length
(CARL) and its standard deviation (CSDRL). Simulation results are
conducted to evaluate the effect of [pcirc]
0 in a one-sided CCC chart. The results show that values of
CFAR, CARL and CSDRL are close to the nominal values for a large sample.
The impact of estimation errors was also studied. We find that CFAR
decreases for large [pcirc] 0.
Thus, a large value of [pcirc]
0 is suggested for fewer false alarms.
Journal: Journal of Applied Statistics
Pages: 2341-2353
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.811479
File-URL: http://hdl.handle.net/10.1080/02664763.2013.811479
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2341-2353
Template-Type: ReDIF-Article 1.0
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Author-Name: Mariana Rodrigues-Motta
Author-X-Name-First: Mariana
Author-X-Name-Last: Rodrigues-Motta
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Title: On a variance stabilizing model and its application to genomic data
Abstract:
In this paper, we propose a model based on a class of
symmetric distributions, which avoids the transformation of data,
stabilizes the variance of the observations, and provides robust
estimation of parameters and high flexibility for modeling different types
of data. Probabilistic and statistical aspects of this new model are
developed throughout the article, which include mathematical properties,
estimation of parameters and inference. The obtained results are
illustrated by means of real genomic data.
Journal: Journal of Applied Statistics
Pages: 2354-2371
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.811480
File-URL: http://hdl.handle.net/10.1080/02664763.2013.811480
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2354-2371
Template-Type: ReDIF-Article 1.0
Author-Name: Lei Wang
Author-X-Name-First: Lei
Author-X-Name-Last: Wang
Author-Name: Yukun Liu
Author-X-Name-First: Yukun
Author-X-Name-Last: Liu
Author-Name: Wei Wu
Author-X-Name-First: Wei
Author-X-Name-Last: Wu
Author-Name: Xiaolong Pu
Author-X-Name-First: Xiaolong
Author-X-Name-Last: Pu
Title: Sequential LND sensitivity test for binary response data
Abstract:
Sensitivity tests are used to make inferences about a
sensitivity, a characteristic property of some products that cannot be
observed directly. For binary response sensitivity data (dead or alive,
explode or not explode), the Langlie and Neyer tests are two well-known
sensitivity tests. The properties of the Langlie and Neyer tests are
investigated in this paper. It is shown that the Langlie test has an
advantage in getting an overlap, while the Neyer test has better
estimation precision. Aiming at improving both the speed of getting an
overlap and the estimation precision, we propose a new sensitivity test
which replaces the first part of the Neyer test with the Langlie test. Our
simulation studies indicate that the proposed test outperforms the
Langlie, Neyer and Dror and Steinberg tests from the viewpoints of
estimation precision and probability of obtaining an overlap.
Journal: Journal of Applied Statistics
Pages: 2372-2384
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.817546
File-URL: http://hdl.handle.net/10.1080/02664763.2013.817546
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2372-2384
Template-Type: ReDIF-Article 1.0
Author-Name: Lee Fawcett
Author-X-Name-First: Lee
Author-X-Name-Last: Fawcett
Author-Name: Neil Thorpe
Author-X-Name-First: Neil
Author-X-Name-Last: Thorpe
Title: Mobile safety cameras: estimating casualty reductions and the demand for secondary healthcare
Abstract:
We consider a fully Bayesian analysis of road casualty data
at 56 designated mobile safety camera sites in the Northumbria Police
Force area in the UK. It is well documented that regression to the mean
(RTM) can exaggerate the effectiveness of road safety measures and, since
the 1980s, an empirical Bayes (EB) estimation framework has become the
gold standard for separating real treatment effects from those of RTM. In
this paper we suggest some diagnostics to check the assumptions
underpinning the standard estimation framework. We also show that,
relative to a fully Bayesian treatment, the EB method is over-optimistic
when quantifying the variability of estimates of casualty frequency.
Implementing a fully Bayesian analysis via Markov chain Monte Carlo also
provides a more flexible and complete inferential procedure. We assess the
sensitivity of estimates of treatment effectiveness, as well as the
expected monetary value of prevention owing to the implementation of the
safety cameras, to different model specifications, which include the
estimation of trend and the construction of informative priors for some
parameters.
Journal: Journal of Applied Statistics
Pages: 2385-2406
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.817547
File-URL: http://hdl.handle.net/10.1080/02664763.2013.817547
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2385-2406
Template-Type: ReDIF-Article 1.0
Author-Name: M. H. Lee
Author-X-Name-First: M. H.
Author-X-Name-Last: Lee
Author-Name: H. J. Sadaei
Author-X-Name-First: H. J.
Author-X-Name-Last: Sadaei
Author-Name: Suhartono
Author-X-Name-First:
Author-X-Name-Last: Suhartono
Title: Improving TAIEX forecasting using fuzzy time series with Box--Cox power transformation
Abstract:
The Box--Cox transformation, together with our newly proposed
transformation, was applied to three different real-world empirical
problems to alleviate their noise and volatility effects. Consequently, a new domain was
constructed. Subsequently, universe of discourse for transformed data was
established and an approach for calculating effective length of the
intervals was then proposed. Considering the steps above, the initial
forecasts were performed using frequently used fuzzy time series (FTS)
methods on transformed data. Final forecasts were retrieved from initial
forecasted values by proper inverse operation. Comparisons of the results
demonstrate that the proposed method produced more accurate forecasts
compared with existing FTS on original data.
Journal: Journal of Applied Statistics
Pages: 2407-2422
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.817548
File-URL: http://hdl.handle.net/10.1080/02664763.2013.817548
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2407-2422
Template-Type: ReDIF-Article 1.0
Author-Name: Therese Graversen
Author-X-Name-First: Therese
Author-X-Name-Last: Graversen
Author-Name: Steffen Lauritzen
Author-X-Name-First: Steffen
Author-X-Name-Last: Lauritzen
Title: Estimation of parameters in DNA mixture analysis
Abstract:
In [7], a Bayesian network for analysis of mixed traces of
DNA was presented using gamma distributions for modelling peak sizes in
the electropherogram. It was demonstrated that the analysis was sensitive
to the choice of a variance factor and hence this should be adapted to any
new trace analysed. In this paper, we discuss how the variance parameter
can be estimated by maximum likelihood to achieve this. The unknown
proportions of DNA from each contributor can similarly be estimated by
maximum likelihood jointly with the variance parameter. Furthermore, we
discuss how to incorporate prior knowledge about the parameters in a
Bayesian analysis. The proposed estimation methods are illustrated through
a few examples of applications for calculating evidential value in
casework and for mixture deconvolution.
Journal: Journal of Applied Statistics
Pages: 2423-2436
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.817549
File-URL: http://hdl.handle.net/10.1080/02664763.2013.817549
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2423-2436
Template-Type: ReDIF-Article 1.0
Author-Name: Coskun Kus
Author-X-Name-First: Coskun
Author-X-Name-Last: Kus
Author-Name: Yunus Akdogan
Author-X-Name-First: Yunus
Author-X-Name-Last: Akdogan
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Title: Optimal progressive group censoring scheme under cost considerations for Pareto distribution
Abstract:
In this article, an optimal design under the restriction of a
pre-determined experimental budget is developed for the Pareto
distribution when the life test is progressively group censored. We use
the maximum-likelihood method to obtain the point estimator of the Pareto
parameter. We propose two approaches to decide the number of test units,
the number of inspections, and the length of the inspection interval
under a limited budget such that the asymptotic variance of the estimator
of the Pareto parameter is minimized. A numerical example is given to
illustrate the proposed method. A sensitivity analysis is also conducted.
Journal: Journal of Applied Statistics
Pages: 2437-2450
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818107
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818107
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2437-2450
Template-Type: ReDIF-Article 1.0
Author-Name: Ülkü Erisoglu
Author-X-Name-First: Ülkü
Author-X-Name-Last: Erisoglu
Author-Name: Murat Erisoglu
Author-X-Name-First: Murat
Author-X-Name-Last: Erisoglu
Author-Name: Nazif Çalis
Author-X-Name-First: Nazif
Author-X-Name-Last: Çalis
Title: Heterogeneous data modeling with two-component Weibull--Poisson distribution
Abstract:
Mixture distribution models are more useful than pure
distributions in modeling heterogeneous data sets. The aim of this
paper is to propose, for the first time, a mixture of Weibull--Poisson
(WP) distributions to model heterogeneous data sets. Thus, a powerful
alternative mixture distribution is created for modeling
heterogeneous data sets. In the study, many features of the proposed
mixture of WP distributions are examined. Also, the expectation
maximization (EM) algorithm is used to determine the maximum-likelihood
estimates of the parameters, and the simulation study is conducted for
evaluating the performance of the proposed EM scheme. Applications for two
real heterogeneous data sets are given to show the flexibility and
potentiality of the new mixture distribution.
Journal: Journal of Applied Statistics
Pages: 2451-2461
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818108
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818108
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2451-2461
Template-Type: ReDIF-Article 1.0
Author-Name: Robert Schall
Author-X-Name-First: Robert
Author-X-Name-Last: Schall
Author-Name: Dianne Weatherall
Author-X-Name-First: Dianne
Author-X-Name-Last: Weatherall
Title: Accuracy and fairness of rain rules for interrupted one-day cricket matches
Abstract:
In this paper, we investigate the relative merits of rain
rules for one-day cricket matches. We suggest that interrupted one-day
matches present a missing data problem: the outcome of the complete match
cannot be observed, and instead the outcome of the interrupted match, as
determined at least in part by the rain rule in question, is observed.
Viewing the outcome of the interrupted match as an imputation of the
missing outcome of the complete match, standard characteristics to assess
the performance of classification tests can be used to assess the
performance of a rain rule. In particular, we consider the overall and
conditional accuracy and the predictive value of a rain rule. We propose
two requirements for a 'fair' rain rule, and show that a fair rain rule
must satisfy an identity involving its conditional accuracies. Estimating
the performance characteristics of various rain rules from a sample of
complete one-day matches, our results suggest that the Duckworth--Lewis
method, currently adopted by the International Cricket Council, is
essentially as accurate as, and somewhat fairer than, its best
competitors. A rain rule based on the iso-probability principle also
performs well but might benefit from re-calibration using a more
representative database.
Journal: Journal of Applied Statistics
Pages: 2462-2479
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818623
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2462-2479
Template-Type: ReDIF-Article 1.0
Author-Name: Alexandros E. Milionis
Author-X-Name-First: Alexandros E.
Author-X-Name-Last: Milionis
Author-Name: Evangelia Papanagiotou
Author-X-Name-First: Evangelia
Author-X-Name-Last: Papanagiotou
Title: Decomposing the predictive performance of the moving average trading rule of technical analysis: the contribution of linear and non-linear dependencies in stock returns
Abstract:
The main purpose of this work is to decompose the predictive
performance of the moving average (MA) trading rule and find out the
portion that could be attributed to the possible exploitation of linear
and non-linear dependencies in stock returns. Data from the General Index
of the Athens Stock Exchange, from the Standard and Poor-500 Index of the
New York Stock Exchange and from the Austrian Traded Index of the Vienna
Stock Exchange are filtered by linear filters so that the resulting
simulated 'returns' exhibit no serial correlation. Applying MA trading
rules to both the original and the simulated indices, and using a new
statistical testing procedure that takes into account the sensitivity of
the performance of the trading rule as a function of the length of the MA,
it is found that the predictive performance of the trading rule is clearly
weakened when applied to the simulated indices, indicating that a
substantial part of the rule's predictive performance is due to the
exploitation of linear dependencies in stock returns. This weakening is
uneven; in general, the shorter the MA length, the more pronounced the
attenuation.
Journal: Journal of Applied Statistics
Pages: 2480-2494
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818624
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818624
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2480-2494
Template-Type: ReDIF-Article 1.0
Author-Name: David A. Wooff
Author-X-Name-First: David A.
Author-X-Name-Last: Wooff
Author-Name: Amin Jamalzadeh
Author-X-Name-First: Amin
Author-X-Name-Last: Jamalzadeh
Title: Robust and scale-free effect sizes for non-Normal two-sample comparisons, with applications in e-commerce
Abstract:
The effect size (ES) has been mainly
introduced and investigated for changes in location under an assumption of
Normality for the underlying population. However, there are many
circumstances where populations are non-Normal, or depend on scale and
shape and not just a location parameter. Our motivating application from
e-commerce requires an ES which is appropriate for long-tailed
distributions. We review some common ES measures. We then introduce two
novel alternative ES for two-sample comparisons, one scale-free and one on
the original scale of measurement, and analyse some theoretical
properties. We examine these ES for two-sample comparison studies under an
assumption of Normality and investigate what happens when both location
and scale parameters differ. We explore ES for phenomena for non-Normal
situations, using the Weibull family for illustration. Finally, for an
application, we assess differences in customer behaviour when browsing
e-commerce websites.
Journal: Journal of Applied Statistics
Pages: 2495-2515
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818625
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2495-2515
Template-Type: ReDIF-Article 1.0
Author-Name: Gang Han
Author-X-Name-First: Gang
Author-X-Name-Last: Han
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Author-Name: Qizhai Li
Author-X-Name-First: Qizhai
Author-X-Name-Last: Li
Author-Name: Lili Chen
Author-X-Name-First: Lili
Author-X-Name-Last: Chen
Author-Name: Xi Zhang
Author-X-Name-First: Xi
Author-X-Name-Last: Zhang
Title: Hybrid Bayesian inference on HIV viral dynamic models
Abstract:
Modelling of HIV dynamics in AIDS research has greatly
improved our understanding of the pathogenesis of HIV-1 infection and
guided the treatment of AIDS patients and the evaluation of antiretroviral
therapies. Some of the model parameters may have practical meanings with
prior knowledge available, but others might not have prior knowledge.
Incorporating priors can improve the statistical inference. Although there
have been extensive Bayesian and frequentist estimation methods for the
viral dynamic models, little work has been done on making simultaneous
inference about the Bayesian and frequentist parameters. In this article,
we propose a hybrid Bayesian inference approach for viral dynamic
nonlinear mixed-effects models using the Bayesian frequentist hybrid
theory developed in Yuan [Bayesian frequentist hybrid
inference, Ann. Statist. 37 (2009), pp. 2458--2501]. Compared
with frequentist inference in a real example and two simulation examples,
the hybrid Bayesian approach is able to improve the inference accuracy
without compromising the computational load.
Journal: Journal of Applied Statistics
Pages: 2516-2532
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818626
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2516-2532
Template-Type: ReDIF-Article 1.0
Author-Name: Yao Zhang
Author-X-Name-First: Yao
Author-X-Name-Last: Zhang
Author-Name: Eric T. Bradlow
Author-X-Name-First: Eric T.
Author-X-Name-Last: Bradlow
Author-Name: Dylan S. Small
Author-X-Name-First: Dylan S.
Author-X-Name-Last: Small
Title: New measures of clumpiness for incidence data
Abstract:
In recent years, growing attention has been placed on the
increasing pattern of 'clumpy data' in many empirical areas such as
financial market microstructure, criminology and seismology, and digital
media consumption to name just a few; but a well-defined and careful
measurement of clumpiness has remained somewhat elusive. The related 'hot
hand' effect has long been a widespread belief in sports, and has
triggered a branch of interesting research which could shed some light on
this domain. However, since many concerns have been raised about the low
power of the existing 'hot hand' significance tests, we propose a new
class of clumpiness measures which are shown to have higher statistical
power in extensive simulations under a wide variety of statistical models
for repeated outcomes. Finally, an empirical study is provided by using a
unique dataset obtained from Hulu.com, an increasingly popular video
streaming provider. Our results provide evidence that the 'clumpiness
phenomenon' is widely prevalent in digital content consumption, which
supports the lore of 'bingeability' of online content believed to exist
today.
Journal: Journal of Applied Statistics
Pages: 2533-2548
Issue: 11
Volume: 40
Year: 2013
Month: 11
X-DOI: 10.1080/02664763.2013.818627
File-URL: http://hdl.handle.net/10.1080/02664763.2013.818627
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:11:p:2533-2548
Template-Type: ReDIF-Article 1.0
Author-Name: Jiin-Huarng Guo
Author-X-Name-First: Jiin-Huarng
Author-X-Name-Last: Guo
Author-Name: Wei-Ming Luh
Author-X-Name-First: Wei-Ming
Author-X-Name-Last: Luh
Title: Efficient sample size allocation with cost constraints for heterogeneous-variance group comparison
Abstract:
When conducting research with controlled
experiments, sample size planning is one of the important decisions that
researchers have to make. However, current methods do not adequately
address this issue with regard to variance heterogeneity under cost
constraints when comparing several treatment means. This paper proposes a
sample size allocation ratio in the fixed-effect heterogeneous analysis of
variance when group variances are unequal and in cases where the sampling
and/or variable cost has some constraints. The efficient sample size
allocation is determined for the purpose of minimizing total cost with a
designated power or maximizing the power with a given total cost. Finally,
the proposed method is verified by using the index of relative efficiency
and the corresponding total cost and the total sample size needed. We also
apply our method in a pain management trial to decide an efficient sample
size. Simulation studies also show that the proposed sample size formulas
are efficient in terms of statistical power. SAS and R codes are provided
in the appendix for easy application.
Journal: Journal of Applied Statistics
Pages: 2549-2563
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.819417
File-URL: http://hdl.handle.net/10.1080/02664763.2013.819417
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2549-2563
Template-Type: ReDIF-Article 1.0
Author-Name: E. Androulakis
Author-X-Name-First: E.
Author-X-Name-Last: Androulakis
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Title: A new variable selection method for uniform designs
Abstract:
As an important class of space-filling
designs, uniform designs (UDs) choose a set of points over a certain
domain such that these points are uniformly scattered, under a specific
discrepancy measure. They have been applied successfully in many
industrial and scientific experiments since they appeared in 1980. A
noteworthy and practical advantage is their ability to investigate a large
number of high-level factors simultaneously with a fairly economical set
of experimental runs. As a result, UDs can be properly used as
experimental plans that are intended to derive the significant factors
from a list of many potential ones. To this end, a new screening procedure
is introduced via penalized least squares. A simulation study is conducted
to support the proposed method, which reveals that it can be considered
quite promising and expedient, as judged in terms of Type I and Type II
error rates.
Journal: Journal of Applied Statistics
Pages: 2564-2578
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.819568
File-URL: http://hdl.handle.net/10.1080/02664763.2013.819568
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2564-2578
Template-Type: ReDIF-Article 1.0
Author-Name: Ivo Alberink
Author-X-Name-First: Ivo
Author-X-Name-Last: Alberink
Author-Name: Annabel Bolck
Author-X-Name-First: Annabel
Author-X-Name-Last: Bolck
Author-Name: Sonja Menges
Author-X-Name-First: Sonja
Author-X-Name-Last: Menges
Title: Posterior likelihood ratios for evaluation of forensic trace evidence given a two-level model on the data
Abstract:
In forensic science, in order to determine
whether sets of traces are from the same source or not, it is widely
advocated to evaluate evidential value of similarity of the traces by
likelihood ratios (LRs). If traces are expressed by measurements following
a two-level model with random effects and known variances, closed LR
formulas are available given normality, or kernel density distributions,
on the effects. In practice, however, estimators are used for the known
variances, which leads to uncertainty in the resulting LRs that is hard to
quantify.
above is analyzed in an approach in which both effects and variances are
random, following standard prior distributions on univariate data, leading
to posterior LRs. For non-informative and conjugate priors, closed LR
formulas are obtained that are interesting in structure and generalize a
known result given fixed variance. A semi-conjugate prior on the model
seems usable in many applications. It is described how to obtain credible
intervals using Markov chain Monte Carlo and regular simulation, and an
example is described for comparison of XTC tablets based on MDMA content.
In this way, uncertainty on LR estimation is expressed more clearly which
makes the evidential value more transparent in a judicial context.
Journal: Journal of Applied Statistics
Pages: 2579-2600
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.822056
File-URL: http://hdl.handle.net/10.1080/02664763.2013.822056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2579-2600
Template-Type: ReDIF-Article 1.0
Author-Name: A.H.M. Rahmatullah Imon
Author-X-Name-First: A.H.M. Rahmatullah
Author-X-Name-Last: Imon
Author-Name: Ali S. Hadi
Author-X-Name-First: Ali S.
Author-X-Name-Last: Hadi
Title: Identification of multiple high leverage points in logistic regression
Abstract:
Leverage values are being used in
regression diagnostics as measures of unusual observations in the
X-space. Detection of high leverage observations or
points is crucial due to their responsibility for masking outliers. In
linear regression, high leverage points (HLP) are those that stand far
apart from the center (mean) of the data and hence the most extreme points
in the covariate space get the highest leverage. But Hosmer and Lemeshow
[Applied logistic regression, Wiley, New York, 1980]
pointed out that in logistic regression, the leverage measure contains a
component which can make the leverage values of genuine HLP misleadingly
very small and that creates problems in the correct identification of the
cases. Attempts have been made to identify the HLP based on the median
distances from the mean, but since they are designed for the
identification of a single high leverage point they may not be very
effective in the presence of multiple HLP due to their masking
(false-negative) and swamping (false-positive) effects. In this paper we
propose a new method for the identification of multiple HLP in logistic
regression where the suspect cases are identified by a robust group
deletion technique and they are confirmed using diagnostic techniques. The
usefulness of the proposed method is then investigated through several
well-known examples and a Monte Carlo simulation.
Journal: Journal of Applied Statistics
Pages: 2601-2616
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.822057
File-URL: http://hdl.handle.net/10.1080/02664763.2013.822057
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2601-2616
Template-Type: ReDIF-Article 1.0
Author-Name: B. M. Golam Kibria
Author-X-Name-First: B. M. Golam
Author-X-Name-Last: Kibria
Author-Name: Shipra Banik
Author-X-Name-First: Shipra
Author-X-Name-Last: Banik
Title: Parametric and nonparametric confidence intervals for estimating the difference of means of two skewed populations
Abstract:
In this paper, we have reviewed and
proposed several interval estimators for estimating the difference of
means of two skewed populations. Estimators include the
ordinary-t, two versions proposed by Welch [17] and
Satterthwaite [15], three versions proposed by Zhou and Dinh [18], Johnson
[9], Hall [8], empirical likelihood (EL), bootstrap version of EL, median
t proposed by Baklizi and Kibria [2] and bootstrap
version of median t. A Monte Carlo simulation study has
been conducted to compare the performance of the proposed interval
estimators. Some real life health related data have been considered to
illustrate the application of the paper. Based on our findings, some
possible good interval estimators for estimating the mean difference of
two populations have been recommended for researchers.
Journal: Journal of Applied Statistics
Pages: 2617-2636
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.822478
File-URL: http://hdl.handle.net/10.1080/02664763.2013.822478
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2617-2636
Template-Type: ReDIF-Article 1.0
Author-Name: Sanying Feng
Author-X-Name-First: Sanying
Author-X-Name-Last: Feng
Author-Name: Liugen Xue
Author-X-Name-First: Liugen
Author-X-Name-Last: Xue
Title: Variable selection for partially varying coefficient single-index model
Abstract:
In this paper, we consider the problem of
variable selection for partially varying coefficient single-index model,
and present a regularized variable selection procedure by combining basis
function approximations with smoothly clipped absolute deviation penalty.
The proposed procedure simultaneously selects significant variables in the
single-index parametric components and the nonparametric coefficient
function components. With appropriate selection of the tuning parameters,
the consistency of the variable selection procedure and the oracle
property of the estimators are established. Finite sample performance of
the proposed method is illustrated by a simulation study and real data
analysis.
Journal: Journal of Applied Statistics
Pages: 2637-2652
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.823919
File-URL: http://hdl.handle.net/10.1080/02664763.2013.823919
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2637-2652
Template-Type: ReDIF-Article 1.0
Author-Name: Akanksha S. Kashikar
Author-X-Name-First: Akanksha S.
Author-X-Name-Last: Kashikar
Author-Name: Neelabh Rohan
Author-X-Name-First: Neelabh
Author-X-Name-Last: Rohan
Author-Name: T.V. Ramanathan
Author-X-Name-First: T.V.
Author-X-Name-Last: Ramanathan
Title: Integer autoregressive models with structural breaks
Abstract:
Even though integer-valued time series are
common in practice, the methods for their analysis have been developed
only in recent past. Several models for stationary processes with discrete
marginal distributions have been proposed in the literature. Such
processes assume the parameters of the model to remain constant throughout
the time period. However, this need not be true in practice. In this
paper, we introduce non-stationary integer-valued autoregressive (INAR)
models with structural breaks to model a situation, where the parameters
of the INAR process do not remain constant over time. Such models are
useful while modelling count data time series with structural breaks. The
Bayesian and Markov Chain Monte Carlo (MCMC) procedures for the estimation
of the parameters and break points of such models are discussed. We
illustrate the model and estimation procedure with the help of a
simulation study. The proposed model is applied to two real
biometrical data sets.
Journal: Journal of Applied Statistics
Pages: 2653-2669
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.823920
File-URL: http://hdl.handle.net/10.1080/02664763.2013.823920
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2653-2669
Template-Type: ReDIF-Article 1.0
Author-Name: Sharif Mahmood
Author-X-Name-First: Sharif
Author-X-Name-Last: Mahmood
Author-Name: Begum Zainab
Author-X-Name-First: Begum
Author-X-Name-Last: Zainab
Author-Name: A.H.M. Mahbub Latif
Author-X-Name-First: A.H.M. Mahbub
Author-X-Name-Last: Latif
Title: Frailty modeling for clustered survival data: an application to birth interval in Bangladesh
Abstract:
The present work demonstrates an
application of random effects model for analyzing birth intervals that are
clustered into geographical regions. Observations from the same cluster
are assumed to be correlated because they usually share certain unobserved
characteristics. Ignoring the correlations among the
observations may lead to incorrect standard errors of the estimates of
parameters of interest. Besides making comparisons between Cox's
proportional hazards model and the random effects model for analyzing
geographically clustered time-to-event data, important demographic and
socioeconomic factors that may affect the length of birth intervals of
Bangladeshi women are also reported in this paper.
Journal: Journal of Applied Statistics
Pages: 2670-2680
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.825702
File-URL: http://hdl.handle.net/10.1080/02664763.2013.825702
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2670-2680
Template-Type: ReDIF-Article 1.0
Author-Name: Johannes Forkman
Author-X-Name-First: Johannes
Author-X-Name-Last: Forkman
Title: The use of a reference variety for comparisons in incomplete series of crop variety trials
Abstract:
In a series of crop variety trials, 'test
varieties' are compared with one another and with a 'reference' variety
that is included in all trials. The series is typically analyzed with a
linear mixed model and the method of generalized least squares. Usually,
the estimates of the expected differences between the test varieties and
the reference variety are presented. When the series is incomplete, i.e.
when not all test varieties were included in all trials, the method of
generalized least squares may give estimates of expected differences to
the reference variety that do not appear to accord with observed
differences. The present paper draws attention to this phenomenon and
explores the recurrent idea of comparing test varieties indirectly through
the use of the reference. A new 'reference treatment method' was specified
and compared with the method of generalized least squares when applied to
a five-year series of 85 spring wheat trials. The reference treatment
method provided estimates of differences to the reference variety that
agreed with observed differences, but was considerably less efficient than
the method of generalized least squares.
Journal: Journal of Applied Statistics
Pages: 2681-2698
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.825703
File-URL: http://hdl.handle.net/10.1080/02664763.2013.825703
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2681-2698
Template-Type: ReDIF-Article 1.0
Author-Name: Le Kang
Author-X-Name-First: Le
Author-X-Name-Last: Kang
Author-Name: Randy Carter
Author-X-Name-First: Randy
Author-X-Name-Last: Carter
Author-Name: Kathleen Darcy
Author-X-Name-First: Kathleen
Author-X-Name-Last: Darcy
Author-Name: James Kauderer
Author-X-Name-First: James
Author-X-Name-Last: Kauderer
Author-Name: Shu-Yuan Liao
Author-X-Name-First: Shu-Yuan
Author-X-Name-Last: Liao
Title: A fast Monte Carlo expectation--maximization algorithm for estimation in latent class model analysis with an application to assess diagnostic accuracy for cervical neoplasia in women with atypical glandular cells
Abstract:
In this article, we use a latent class
model (LCM) with prevalence modeled as a function of covariates to assess
diagnostic test accuracy in situations where the true disease status is
not observed, but observations on three or more conditionally independent
diagnostic tests are available. A fast Monte Carlo
expectation--maximization (MCEM) algorithm with binary (disease)
diagnostic data is implemented to estimate parameters of interest; namely,
sensitivity, specificity, and prevalence of the disease as a function of
covariates. To obtain standard errors for confidence interval construction
of estimated parameters, the missing information principle is applied to
adjust information matrix estimates. We compare the adjusted information
matrix-based standard error estimates with the bootstrap standard error
estimates both obtained using the fast MCEM algorithm through an extensive
Monte Carlo study. Simulation demonstrates that the adjusted information
matrix approach estimates the standard error similarly to the bootstrap
methods under certain scenarios. The bootstrap percentile intervals have
satisfactory coverage probabilities. We then apply the LCM analysis to a
real data set of 122 subjects from a Gynecologic Oncology Group study of
significant cervical lesion diagnosis in women with atypical glandular
cells of undetermined significance to compare the diagnostic accuracy of a
histology-based evaluation, a carbonic anhydrase-IX biomarker-based test
and a human papillomavirus DNA test.
Journal: Journal of Applied Statistics
Pages: 2699-2719
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.825704
File-URL: http://hdl.handle.net/10.1080/02664763.2013.825704
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2699-2719
Template-Type: ReDIF-Article 1.0
Author-Name: Saieed F. Ateya
Author-X-Name-First: Saieed F.
Author-X-Name-Last: Ateya
Title: Estimation under modified Weibull distribution based on right censored generalized order statistics
Abstract:
In this paper, the maximum likelihood (ML)
and Bayes (using Markov chain Monte Carlo (MCMC)) methods are
considered to estimate the parameters of the three-parameter modified Weibull
distribution (MWD(β, τ, λ)) based on a right censored
sample of generalized order statistics (gos). Simulation experiments are
conducted to demonstrate the efficiency of the proposed methods. Some
comparisons are carried out between the ML and Bayes methods by computing
the mean squared errors (MSEs), Akaike's information criterion (AIC) and
the Bayesian information criterion (BIC) of the estimates to illustrate the
paper. Three real data sets from the Weibull(α, β) distribution are
introduced and analyzed using the MWD(β, τ, λ) and also
using the Weibull(α, β) distribution. A comparison is carried
out between the mentioned models based on the corresponding
Kolmogorov--Smirnov (K--S) test
statistic, AIC and BIC, to emphasize that the MWD(β, τ,
λ) fits the data better than the other distribution. All parameters
are estimated based on a type-II censored sample, censored upper record
values and a progressively type-II censored sample, which are generated from
the real data sets.
Journal: Journal of Applied Statistics
Pages: 2720-2734
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.825705
File-URL: http://hdl.handle.net/10.1080/02664763.2013.825705
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2720-2734
Template-Type: ReDIF-Article 1.0
Author-Name: Abdul Aziz Karia
Author-X-Name-First: Abdul Aziz
Author-X-Name-Last: Karia
Author-Name: Imbarine Bujang
Author-X-Name-First: Imbarine
Author-X-Name-Last: Bujang
Author-Name: Ismail Ahmad
Author-X-Name-First: Ismail
Author-X-Name-Last: Ahmad
Title: Fractionally integrated ARMA for crude palm oil prices prediction: case of potentially overdifference
Abstract:
Dealing with stationarity remains an
unsolved problem. Some time series data, especially crude palm oil
(CPO) prices, persist in nonstationarity over the long run. This
dilemma forces researchers to apply first-order differencing, the idea
being that obtaining stationary data is considered a good strategy for
overcoming nonstationarity. Convenient as it is, this proxy may lead to
overdifferencing: the trend elements of CPO prices are then not merely
attenuated but nearly annihilated. Therefore, this paper presents the
usefulness of the autoregressive fractionally integrated moving average
(ARFIMA) model as a solution to the nonstationary persistence of CPO
prices in long-run data. In this study, we employed daily historical
Free-on-Board CPO prices in Malaysia. A comparison was made between the
ARFIMA model and the existing autoregressive integrated moving average
(ARIMA) model, using three statistical evaluation criteria to measure the
performance of the applied models. The general conclusion that can be
derived from this paper is that the ARFIMA model outperformed the
existing ARIMA model.
Journal: Journal of Applied Statistics
Pages: 2735-2748
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.825706
File-URL: http://hdl.handle.net/10.1080/02664763.2013.825706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2735-2748
Template-Type: ReDIF-Article 1.0
Author-Name: Gang Wang
Author-X-Name-First: Gang
Author-X-Name-Last: Wang
Author-Name: Jun Wang
Author-X-Name-First: Jun
Author-X-Name-Last: Wang
Author-Name: Mingyu Wang
Author-X-Name-First: Mingyu
Author-X-Name-Last: Wang
Title: Modular-transform based clustering
Abstract:
Spectral clustering uses eigenvectors of
the Laplacian of the similarity matrix. It is convenient for solving binary
clustering problems. When applied to multi-way clustering, either the
binary spectral clustering is recursively applied or an embedding to
spectral space is done and some other methods, such as K-means clustering,
are used to cluster the points. Here we propose and study a K-way
clustering algorithm -- spectral modular transformation, based on the fact
that the graph Laplacian has an equivalent representation, which has a
diagonal modular structure. The method first transforms the original
similarity matrix into a new one, which is nearly disconnected and reveals
a cluster structure clearly, then we apply a linearized cluster assignment
algorithm to split the clusters. In this way, we can find some samples for
each cluster recursively using the divide and conquer method. To get the
overall clustering results, we apply the cluster assignment obtained in
the previous step as the initialization of multiplicative update method
for spectral clustering. Examples show that our method outperforms
spectral clustering using other initializations.
Journal: Journal of Applied Statistics
Pages: 2749-2759
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.826638
File-URL: http://hdl.handle.net/10.1080/02664763.2013.826638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2749-2759
Template-Type: ReDIF-Article 1.0
Author-Name: Emilio Gómez Déniz
Author-X-Name-First: Emilio Gómez
Author-X-Name-Last: Déniz
Title: A new discrete distribution: properties and applications in medical care
Abstract:
This paper proposes a simple and flexible
count data regression model which is able to incorporate overdispersion
(the variance is greater than the mean) and which can be considered a
competitor to the Poisson model. As is well known, this classical model
imposes the restriction that the conditional mean of each count variable
must equal the conditional variance. Nevertheless, for the common case of
overdispersed counts the Poisson regression may not be appropriate, while
the count regression model proposed here is potentially useful. We
consider an application to model counts of medical care utilization by the
elderly in the USA using a well-known data set from the National Medical
Expenditure Survey (1987), where the dependent variable is the number of
stays after hospital admission, and where 10 explanatory variables are
analysed.
Journal: Journal of Applied Statistics
Pages: 2760-2770
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.827161
File-URL: http://hdl.handle.net/10.1080/02664763.2013.827161
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2760-2770
Template-Type: ReDIF-Article 1.0
Author-Name: Mikhail Moklyachuk
Author-X-Name-First: Mikhail
Author-X-Name-Last: Moklyachuk
Title: Advances in time series forecasting
Journal: Journal of Applied Statistics
Pages: 2771-2772
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.816023
File-URL: http://hdl.handle.net/10.1080/02664763.2013.816023
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2771-2772
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Laberge
Author-X-Name-First: Yves
Author-X-Name-Last: Laberge
Title: Handbook of statistics in clinical oncology
Journal: Journal of Applied Statistics
Pages: 2772-2773
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.816026
File-URL: http://hdl.handle.net/10.1080/02664763.2013.816026
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2772-2773
Template-Type: ReDIF-Article 1.0
Author-Name: Eugenia Stoimenova
Author-X-Name-First: Eugenia
Author-X-Name-Last: Stoimenova
Title: Methodology in robust and nonparametric statistics
Journal: Journal of Applied Statistics
Pages: 2773-2773
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.816029
File-URL: http://hdl.handle.net/10.1080/02664763.2013.816029
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2773-2773
Template-Type: ReDIF-Article 1.0
Author-Name: Mark Webster
Author-X-Name-First: Mark
Author-X-Name-Last: Webster
Title: Bayesian statistics
Journal: Journal of Applied Statistics
Pages: 2773-2774
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.816049
File-URL: http://hdl.handle.net/10.1080/02664763.2013.816049
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2773-2774
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: The BUGS book: a practical introduction to Bayesian analysis
Journal: Journal of Applied Statistics
Pages: 2774-2775
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.816061
File-URL: http://hdl.handle.net/10.1080/02664763.2013.816061
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2774-2775
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Introduction to linear regression analysis
Journal: Journal of Applied Statistics
Pages: 2775-2776
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.816069
File-URL: http://hdl.handle.net/10.1080/02664763.2013.816069
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2775-2776
Template-Type: ReDIF-Article 1.0
Author-Name: Giovanni C. Porzio
Author-X-Name-First: Giovanni C.
Author-X-Name-Last: Porzio
Title: Regression analysis by example
Journal: Journal of Applied Statistics
Pages: 2776-2777
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.817041
File-URL: http://hdl.handle.net/10.1080/02664763.2013.817041
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2776-2777
Template-Type: ReDIF-Article 1.0
Author-Name: Claire Keeble
Author-X-Name-First: Claire
Author-X-Name-Last: Keeble
Title: Maximum-likelihood estimation for sample surveys
Journal: Journal of Applied Statistics
Pages: 2777-2777
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.820437
File-URL: http://hdl.handle.net/10.1080/02664763.2013.820437
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2777-2777
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano
Author-X-Name-Last: Ruiz Espejo
Title: Confidence intervals for proportions and related measures of effect size
Journal: Journal of Applied Statistics
Pages: 2778-2778
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.820444
File-URL: http://hdl.handle.net/10.1080/02664763.2013.820444
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2778-2778
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano
Author-X-Name-Last: Ruiz Espejo
Title: Design and analysis of experiments in the health sciences
Journal: Journal of Applied Statistics
Pages: 2778-2779
Issue: 12
Volume: 40
Year: 2013
Month: 12
X-DOI: 10.1080/02664763.2013.820452
File-URL: http://hdl.handle.net/10.1080/02664763.2013.820452
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:12:p:2778-2779
Template-Type: ReDIF-Article 1.0
Author-Name: Robert G. Aykroyd
Author-X-Name-First: Robert G.
Author-X-Name-Last: Aykroyd
Title: Editorial
Journal: Journal of Applied Statistics
Pages: 1-1
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2014.859354
File-URL: http://hdl.handle.net/10.1080/02664763.2014.859354
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:1-1
Template-Type: ReDIF-Article 1.0
Author-Name: Nalan Gülpınar
Author-X-Name-First: Nalan
Author-X-Name-Last: Gülpınar
Author-Name: Kabir Katata
Author-X-Name-First: Kabir
Author-X-Name-Last: Katata
Title: Modelling oil and gas supply disruption risks using extreme-value theory and copula
Abstract:
In this paper, we are concerned with
modelling oil and gas supply disruption risks using extreme-value theory
and copula. We analyse financial and volumetric losses due to both oil and
gas supply disruptions and investigate their dependence structure using
real data. In order to illustrate the impact of crude oil and natural gas
supply disruptions on an energy-dependent economy, Nigeria is considered
as a case study. Computational studies illustrate that the generalized
extreme-value distribution anticipates higher financial losses and
extreme-value copulas produce the best fit for financial and volumetric
losses compared with normal copulas. Moreover, multivariate financial
losses exhibit stronger positive dependence than volumetric losses.
Journal: Journal of Applied Statistics
Pages: 2-25
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.827160
File-URL: http://hdl.handle.net/10.1080/02664763.2013.827160
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:2-25
Template-Type: ReDIF-Article 1.0
Author-Name: Sung Wan Han
Author-X-Name-First: Sung Wan
Author-X-Name-Last: Han
Author-Name: Rickson C. Mesquita
Author-X-Name-First: Rickson C.
Author-X-Name-Last: Mesquita
Author-Name: Theresa M. Busch
Author-X-Name-First: Theresa M.
Author-X-Name-Last: Busch
Author-Name: Mary E. Putt
Author-X-Name-First: Mary E.
Author-X-Name-Last: Putt
Title: A method for choosing the smoothing parameter in a semi-parametric model for detecting change-points in blood flow
Abstract:
In a smoothing spline model with unknown
change-points, the choice of the smoothing parameter strongly influences
the estimation of the change-point locations and the function at the
change-points. In a tumor biology example, where change-points in blood
flow in response to treatment were of interest, choosing the smoothing
parameter based on minimizing generalized cross-validation (GCV) gave
unsatisfactory estimates of the change-points. We propose a new method,
aGCV, that re-weights the residual sum of squares and generalized degrees
of freedom terms from GCV. The weight is chosen to maximize the decrease
in the generalized degrees of freedom as a function of the weight value,
while simultaneously minimizing aGCV as a function of the smoothing
parameter and the change-points. Compared with GCV, simulation studies
suggest that the aGCV method yields improved estimates of the change-point
and the value of the function at the change-point.
Journal: Journal of Applied Statistics
Pages: 26-45
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.830085
File-URL: http://hdl.handle.net/10.1080/02664763.2013.830085
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:26-45
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaowei Yang
Author-X-Name-First: Xiaowei
Author-X-Name-Last: Yang
Author-Name: Bin Peng
Author-X-Name-First: Bin
Author-X-Name-Last: Peng
Author-Name: Rongqi Chen
Author-X-Name-First: Rongqi
Author-X-Name-Last: Chen
Author-Name: Qian Zhang
Author-X-Name-First: Qian
Author-X-Name-Last: Zhang
Author-Name: Dianwen Zhu
Author-X-Name-First: Dianwen
Author-X-Name-Last: Zhu
Author-Name: Qing J. Zhang
Author-X-Name-First: Qing J.
Author-X-Name-Last: Zhang
Author-Name: Fuzhong Xue
Author-X-Name-First: Fuzhong
Author-X-Name-Last: Xue
Author-Name: Lihong Qi
Author-X-Name-First: Lihong
Author-X-Name-Last: Qi
Title: Statistical profiling methods with hierarchical logistic regression for healthcare providers with binary outcomes
Abstract:
Within the context of California's public
report of coronary artery bypass graft (CABG) surgery
outcomes, we first thoroughly review popular statistical methods for
profiling healthcare providers. Extensive simulation studies are then
conducted to compare profiling schemes based on hierarchical logistic
regression (LR) modeling under various conditions. Both Bayesian and
frequentist methods are evaluated in classifying hospitals into
'better', 'normal' or 'worse' service providers. The simulation results
suggest that no single method would dominate others on all accounts.
Traditional schemes based on LR tend to identify too many false outliers,
while those based on hierarchical modeling are relatively conservative.
The issue of over-shrinkage in hierarchical modeling is also investigated
using the 2005--2006 California CABG data set. The article provides
theoretical and empirical evidence in choosing the right methodology for
provider profiling.
Journal: Journal of Applied Statistics
Pages: 46-59
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.830086
File-URL: http://hdl.handle.net/10.1080/02664763.2013.830086
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:46-59
Template-Type: ReDIF-Article 1.0
Author-Name: Abidemi K. Adeniji
Author-X-Name-First: Abidemi K.
Author-X-Name-Last: Adeniji
Author-Name: Steven H. Belle
Author-X-Name-First: Steven H.
Author-X-Name-Last: Belle
Author-Name: Abdus S. Wahed
Author-X-Name-First: Abdus S.
Author-X-Name-Last: Wahed
Title: Incorporating diagnostic accuracy into the estimation of discrete survival function
Abstract:
Empirical distribution function (EDF) is a
commonly used estimator of population cumulative distribution function.
Survival function is estimated as the complement of EDF. However, clinical
diagnosis of an event is often subjected to misclassification, by which
the outcome is given with some uncertainty. In the presence of such
errors, the true distribution of the time to first event is unknown. We
develop a method to estimate the true survival distribution by
incorporating negative predictive values and positive predictive values of
the prediction process into a product-limit style construction. This will
allow us to quantify the bias of the EDF estimates due to the presence of
misclassified events in the observed data. We present an unbiased
estimator of the true survival rates and its variance. Asymptotic
properties of the proposed estimators are provided and these properties
are examined through simulations. We evaluate our methods using data from
the VIRAHEP-C study.
Journal: Journal of Applied Statistics
Pages: 60-72
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.830087
File-URL: http://hdl.handle.net/10.1080/02664763.2013.830087
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:60-72
Template-Type: ReDIF-Article 1.0
Author-Name: Xavier Puig
Author-X-Name-First: Xavier
Author-X-Name-Last: Puig
Author-Name: Josep Ginebra
Author-X-Name-First: Josep
Author-X-Name-Last: Ginebra
Title: A Bayesian cluster analysis of election results
Abstract:
A Bayesian cluster analysis for the
results of an election based on multinomial mixture models is proposed.
The number of clusters is chosen based on the careful comparison of the
results with predictive simulations from the models, and by checking
whether models capture most of the spatial dependence in the results. By
implementing the analysis on five recent elections in Barcelona, the
reader is walked through the choice of the best statistics and graphical
displays to help choose a model and present the results. Even though the
models do not use any information about the location of the areas into which
the results are broken, in the example they uncover a four-cluster
structure with a strong spatial dependence that is very stable over time
and relates to the demographic composition.
Journal: Journal of Applied Statistics
Pages: 73-94
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.830088
File-URL: http://hdl.handle.net/10.1080/02664763.2013.830088
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:73-94
Template-Type: ReDIF-Article 1.0
Author-Name: Weihua Zhao
Author-X-Name-First: Weihua
Author-X-Name-Last: Zhao
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Author-Name: Yazhao Lv
Author-X-Name-First: Yazhao
Author-X-Name-Last: Lv
Author-Name: Jicai Liu
Author-X-Name-First: Jicai
Author-X-Name-Last: Liu
Title: Variable selection for varying dispersion beta regression model
Abstract:
The beta regression models are commonly
used by practitioners to model variables that assume values in the
standard unit interval (0, 1). In this paper, we consider the issue of
variable selection for beta regression models with varying dispersion
(VBRM), in which both the mean and the dispersion depend upon predictor
variables. Based on a penalized likelihood method, the consistency and the
oracle property of the penalized estimators are established. Following the
coordinate descent algorithm idea of generalized linear models, we develop
a new variable selection procedure for the VBRM, which can simultaneously
and efficiently estimate and select important variables in both the mean
model and the dispersion model. Simulation studies and body fat data analysis are
presented to illustrate the proposed methods.
Journal: Journal of Applied Statistics
Pages: 95-108
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.830284
File-URL: http://hdl.handle.net/10.1080/02664763.2013.830284
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:95-108
Template-Type: ReDIF-Article 1.0
Author-Name: Hani M. Samawi
Author-X-Name-First: Hani M.
Author-X-Name-Last: Samawi
Author-Name: Robert Vogel
Author-X-Name-First: Robert
Author-X-Name-Last: Vogel
Title: Notes on two sample tests for partially correlated (paired) data
Abstract:
We provide several methods to compare two
Gaussian distributed means in the two sample location problems under the
assumption of partially dependent observations. Simulation studies
indicate that our test procedure is frequently more powerful than other
methods depending on the ratio of the unpaired data and the strength and
direction of the correlation between the two variables. The tests used in
our comparative study are illustrated with an example based on data from a
small gynecological study.
Journal: Journal of Applied Statistics
Pages: 109-117
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.830285
File-URL: http://hdl.handle.net/10.1080/02664763.2013.830285
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:109-117
Template-Type: ReDIF-Article 1.0
Author-Name: E.D. Lozano-Aguilera
Author-X-Name-First: E.D.
Author-X-Name-Last: Lozano-Aguilera
Author-Name: María Dolores Estudillo-Martínez
Author-X-Name-First: María Dolores
Author-X-Name-Last: Estudillo-Martínez
Author-Name: Sonia Castillo-Gutiérrez
Author-X-Name-First: Sonia
Author-X-Name-Last: Castillo-Gutiérrez
Title: A proposal for plotting positions in probability plots
Abstract:
Probability plots allow us to determine
whether a set of sample observations is distributed according to a
theoretical distribution. Plotting positions are fundamental elements in
statistics and, in particular, for the construction of probability plots.
In this paper, a new plotting position to construct different probability
plots, such as Q--Q Plot, P--P Plot and S--P Plot, is proposed. The
proposed definition is based on the median of the ith
order statistic of the theoretical distribution considered. The main
feature of this plotting position formula is that it is independent of the
theoretical distribution selected. Moreover, the procedure developed is
'almost' exact, reaching, without a high cost in time, an accuracy as
great as we want, which avoids the approximations proposed by other
authors.
Journal: Journal of Applied Statistics
Pages: 118-126
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.831814
File-URL: http://hdl.handle.net/10.1080/02664763.2013.831814
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:118-126
Template-Type: ReDIF-Article 1.0
Author-Name: Stacia M. DeSantis
Author-X-Name-First: Stacia M.
Author-X-Name-Last: DeSantis
Author-Name: Christos Lazaridis
Author-X-Name-First: Christos
Author-X-Name-Last: Lazaridis
Author-Name: Shuang Ji
Author-X-Name-First: Shuang
Author-X-Name-Last: Ji
Author-Name: Francis G. Spinale
Author-X-Name-First: Francis G.
Author-X-Name-Last: Spinale
Title: Analyzing propensity matched zero-inflated count outcomes in observational studies
Abstract:
Determining the effectiveness of different
treatments from observational data, which are characterized by imbalance
between groups due to lack of randomization, is challenging. Propensity
matching is often used to rectify imbalances among prognostic variables.
However, there are no guidelines on how to appropriately analyze
group-matched data when the outcome is a zero-inflated count. In addition, there
is debate over whether to account for correlation of responses induced by
matching and/or whether to adjust for variables used in generating the
propensity score in the final analysis. The aim of this research is to
compare covariate unadjusted and adjusted zero-inflated Poisson models
that do and do not account for the correlation. A simulation study is
conducted, demonstrating that it is necessary to adjust for potential
residual confounding, but that accounting for correlation is less
important. The methods are applied to a biomedical research data set.
Journal: Journal of Applied Statistics
Pages: 127-141
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.834296
File-URL: http://hdl.handle.net/10.1080/02664763.2013.834296
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:127-141
Template-Type: ReDIF-Article 1.0
Author-Name: Márcio Poletti Laurini
Author-X-Name-First: Márcio Poletti
Author-X-Name-Last: Laurini
Title: Dynamic functional data analysis with non-parametric state space models
Abstract:
In this article, we introduce a new method
for modelling curves with dynamic structures, using a non-parametric
approach formulated as a state space model. The non-parametric approach is
based on the use of penalised splines, represented as a dynamic mixed
model. This formulation can capture the dynamic evolution of curves using
a limited number of latent factors, allowing an accurate fit with a small
number of parameters. We also present a new method to determine the
optimal smoothing parameter through an adaptive procedure, using a
formulation analogous to a model of stochastic volatility (SV). The
non-parametric state space model allows unifying different methods applied
to data with a functional structure in finance. We present the advantages
and limitations of this method through simulation studies and also by
comparing its predictive performance with other parametric and
non-parametric methods used in financial applications using data on the
term structure of interest rates.
Journal: Journal of Applied Statistics
Pages: 142-163
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.838663
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838663
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:142-163
Template-Type: ReDIF-Article 1.0
Author-Name: Boudewijn F. Roukema
Author-X-Name-First: Boudewijn F.
Author-X-Name-Last: Roukema
Title: A first-digit anomaly in the 2009 Iranian presidential election
Abstract:
A local bootstrap method is proposed for
the analysis of electoral vote-count first-digit frequencies,
complementing the Benford's Law limit. The method is calibrated on five
presidential-election first rounds (2002--2006) and applied to the 2009
Iranian presidential-election first round. Candidate K has a highly
significant (p < 0.15%) excess of vote counts starting
with the digit 7. This leads to other anomalies, two of which are
individually significant at p ∼ 0.1% and one at
p ∼ 1%. Independently, Iranian pre-election opinion
polls significantly reject the official results unless the five polls
favouring candidate A are considered alone. If the latter represent
normalised data and a linear, least-squares, equal-weighted fit is used,
then either candidates R and K suffered a sudden, dramatic (70% ± 15%)
loss of electoral support just prior to the election, or the official
results are rejected (p ∼ 0.01%).
Journal: Journal of Applied Statistics
Pages: 164-199
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.838664
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:164-199
Template-Type: ReDIF-Article 1.0
Author-Name: Aamir Saghir
Author-X-Name-First: Aamir
Author-X-Name-Last: Saghir
Author-Name: Zhengyan Lin
Author-X-Name-First: Zhengyan
Author-X-Name-Last: Lin
Title: Control chart for monitoring multivariate COM-Poisson attributes
Abstract:
Statistical process control of
multi-attribute count data has received much attention with modern
data-acquisition equipment and online computers. The multivariate Poisson
distribution is often used to monitor multivariate attributes count data.
However, little work has been done so far on under- or over-dispersed
multivariate count data, which is common in many industrial processes,
with positive or negative correlation. In this study, a Shewhart-type
multivariate control chart is constructed to monitor such kind of data,
namely the multivariate COM-Poisson (MCP) chart, based on the MCP
distribution. The performance of the MCP chart is evaluated by the average
run length in simulation. The proposed chart generalizes some existing
multivariate attribute charts as its special cases. A real-life bivariate
process and a simulated trivariate Poisson process are used to illustrate
the application of the MCP chart.
Journal: Journal of Applied Statistics
Pages: 200-214
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.838666
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838666
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:200-214
Template-Type: ReDIF-Article 1.0
Author-Name: Hanieh Panahi
Author-X-Name-First: Hanieh
Author-X-Name-Last: Panahi
Author-Name: Abdolreza Sayyareh
Author-X-Name-First: Abdolreza
Author-X-Name-Last: Sayyareh
Title: Parameter estimation and prediction of order statistics for the Burr Type XII distribution with Type II censoring
Abstract:
This article deals with the statistical
inference and prediction on Burr Type XII parameters based on Type II
censored sample. It is observed that the maximum likelihood estimators
(MLEs) cannot be obtained in closed form. We use the
expectation-maximization algorithm to compute the MLEs. We also obtain the
Bayes estimators under symmetric and asymmetric loss functions such as
squared error and Linex by applying Lindley's approximation and Markov
chain Monte Carlo (MCMC) technique. Further, MCMC samples are used to
calculate the highest posterior density credible intervals. Monte Carlo
simulation study and two real-life data-sets are presented to illustrate
all of the methods developed here. Furthermore, we obtain a prediction of
future order statistics based on the observed ordered data because of its
important application in different fields such as the medical and engineering
sciences. A numerical example is carried out to illustrate the procedures
obtained for the prediction of future order statistics.
Journal: Journal of Applied Statistics
Pages: 215-232
Issue: 1
Volume: 41
Year: 2014
Month: 1
X-DOI: 10.1080/02664763.2013.838668
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838668
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:1:p:215-232
Template-Type: ReDIF-Article 1.0
Author-Name: H.E.T. Holgersson
Author-X-Name-First: H.E.T.
Author-X-Name-Last: Holgersson
Author-Name: L. Nordström
Author-X-Name-First: L.
Author-X-Name-Last: Nordström
Author-Name: Ö. Öner
Author-X-Name-First: Ö.
Author-X-Name-Last: Öner
Title: Dummy variables vs. category-wise models
Abstract:
Empirical research frequently involves
regression analysis with binary categorical variables, which are
traditionally handled through dummy explanatory variables. This paper
argues that separate category-wise models may provide a more logical and
comprehensive tool for analysing data with binary categories. Exploring
different aspects of both methods, we contrast the two with a Monte Carlo
simulation and an empirical example to provide a practical insight.
Journal: Journal of Applied Statistics
Pages: 233-241
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.838665
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838665
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:233-241
Template-Type: ReDIF-Article 1.0
Author-Name: Ting-Li Chen
Author-X-Name-First: Ting-Li
Author-X-Name-Last: Chen
Author-Name: Stuart Geman
Author-X-Name-First: Stuart
Author-X-Name-Last: Geman
Title: Image warping using radial basis functions
Abstract:
Image warping is the process of deforming
an image through a transformation of its domain, which is typically a
subset of R².
Given the destination of a collection of points, the problem becomes one
of finding a suitable smooth interpolation for the destinations of the
remaining points of the domain. A common solution is to use the thin plate
spline (TPS). We find that the TPS often introduces unintended distortions
of image structures. In this paper, we will analyze interpolation by TPS,
experiment with other radial basis functions, and suggest two alternative
functions that provide better results.
Journal: Journal of Applied Statistics
Pages: 242-258
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.838667
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838667
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:242-258
Template-Type: ReDIF-Article 1.0
Author-Name: S.H. Lin
Author-X-Name-First: S.H.
Author-X-Name-Last: Lin
Title: Comparing the mean vectors of two independent multivariate log-normal distributions
Abstract:
The multivariate log-normal distribution
is a good candidate to describe data that are not only positive and
skewed, but also contain many characteristic values. In this study, we
apply the generalized variable method to compare the mean vectors of two
independent multivariate log-normal populations that display
heteroscedasticity. Two generalized pivotal quantities are derived for
constructing the generalized confidence region and for testing the
difference between two mean vectors. Simulation results indicate that the
proposed procedures exhibit satisfactory performance regardless of the
sample sizes and heteroscedasticity. The type I error rates obtained are
consistent with expectations and the coverage probabilities are close to
the nominal level when compared with the other method which is currently
available. These features make the proposed method a worthy alternative
for inferential analysis of problems involving multivariate log-normal
means. The results are illustrated using three examples.
Journal: Journal of Applied Statistics
Pages: 259-274
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.838669
File-URL: http://hdl.handle.net/10.1080/02664763.2013.838669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:259-274
Template-Type: ReDIF-Article 1.0
Author-Name: Arash Nademi
Author-X-Name-First: Arash
Author-X-Name-Last: Nademi
Author-Name: Rahman Farnoosh
Author-X-Name-First: Rahman
Author-X-Name-Last: Farnoosh
Title: Mixtures of autoregressive-autoregressive conditionally heteroscedastic models: semi-parametric approach
Abstract:
We propose data generating structures
which can be represented as a mixture of autoregressive-autoregressive
conditionally heteroscedastic models. The switching between the states is
governed by a hidden Markov chain. We investigate semi-parametric
estimators for estimating the functions based on the quasi-maximum
likelihood approach and provide sufficient conditions for geometric
ergodicity of the process. We also present an expectation--maximization
algorithm for calculating the estimates numerically.
Journal: Journal of Applied Statistics
Pages: 275-293
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839129
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839129
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:275-293
Template-Type: ReDIF-Article 1.0
Author-Name: Tsukasa Hokimoto
Author-X-Name-First: Tsukasa
Author-X-Name-Last: Hokimoto
Author-Name: Kunio Shimizu
Author-X-Name-First: Kunio
Author-X-Name-Last: Shimizu
Title: A non-homogeneous hidden Markov model for predicting the distribution of sea surface elevation
Abstract:
The prediction problem of sea state based
on the field measurements of wave and meteorological factors is a topic of
interest from the standpoints of navigation safety and fisheries. Various
statistical methods have been considered for the prediction of the
distribution of sea surface elevation. However, prediction of sea state in
the transitional situation when waves are developing by blowing wind has
been a difficult problem until now, because the statistical expression of
the dynamic mechanism during this situation is very complicated. In this
article, we consider this problem through the development of a statistical
model. More precisely, we develop a model for the prediction of the
time-varying distribution of sea surface elevation, taking into account a
non-homogeneous hidden Markov model in which the time-varying structures
are influenced by wind speed and wind direction. Our prediction
experiments suggest the possibility that the proposed model improves
prediction accuracy compared with a homogeneous hidden
Markov model. Furthermore, we found that the prediction accuracy is
influenced by the circular distribution of the circular hidden Markov
model for the directional time series wind direction data.
Journal: Journal of Applied Statistics
Pages: 294-319
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839634
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839634
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:294-319
Template-Type: ReDIF-Article 1.0
Author-Name: José A. Fioruci
Author-X-Name-First: José A.
Author-X-Name-Last: Fioruci
Author-Name: Ricardo S. Ehlers
Author-X-Name-First: Ricardo S.
Author-X-Name-Last: Ehlers
Author-Name: Marinho G. Andrade Filho
Author-X-Name-First: Marinho G.
Author-X-Name-Last: Andrade Filho
Title: Bayesian multivariate GARCH models with dynamic correlations and asymmetric error distributions
Abstract:
The main goal in this paper is to develop
and apply stochastic simulation techniques for GARCH models with
multivariate skewed distributions using the Bayesian approach. Both
parameter estimation and model comparison are not trivial tasks and
several approximate and computationally intensive methods (Markov chain
Monte Carlo) will be used to this end. We consider a flexible class of
multivariate distributions which can model both skewness and heavy tails.
Also, we do not fix tail behaviour when dealing with fat tail
distributions but leave it subject to inference.
Journal: Journal of Applied Statistics
Pages: 320-331
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839635
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839635
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:320-331
Template-Type: ReDIF-Article 1.0
Author-Name: Michael McCullough
Author-X-Name-First: Michael
Author-X-Name-Last: McCullough
Author-Name: Thomas L Marsh
Author-X-Name-First: Thomas L
Author-X-Name-Last: Marsh
Author-Name: Ron C Mittelhammer
Author-X-Name-First: Ron C
Author-X-Name-Last: Mittelhammer
Title: Reconstructing nonlinear structure in regression residuals
Abstract:
Phase space reconstruction is investigated
as a diagnostic tool for uncovering structure of nonlinear processes in
regression residuals. Results in the form of phase portraits (e.g. scatter
plots of reconstructed dynamical systems) and descriptive statistics
provide information that can identify underlying structural components
from stochastic data outcomes, even in cases where such data appear
essentially random, and provide insights categorizing structural
components into functional classes to inform econometric/time series
modeling efforts. Empirical evidence supporting this approach is provided
using simulations from an Ikeda mapping. An application to US hops exports
illustrates the approach.
Journal: Journal of Applied Statistics
Pages: 332-350
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839636
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839636
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:332-350
Template-Type: ReDIF-Article 1.0
Author-Name: Ming Zhou
Author-X-Name-First: Ming
Author-X-Name-Last: Zhou
Author-Name: Yongzhao Shao
Author-X-Name-First: Yongzhao
Author-X-Name-Last: Shao
Title: A powerful test for multivariate normality
Abstract:
This paper investigates a new test for
normality that is easy for biomedical researchers to understand and easy
to implement in all dimensions. In terms of power comparison against a
broad range of alternatives, the new test outperforms the best known
competitors in the literature as demonstrated by simulation results. In
addition, the proposed test is illustrated using data from real biomedical
studies.
Journal: Journal of Applied Statistics
Pages: 351-363
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839637
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839637
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:351-363
Template-Type: ReDIF-Article 1.0
Author-Name: A. Asrat Atsedeweyn
Author-X-Name-First: A. Asrat
Author-X-Name-Last: Atsedeweyn
Author-Name: K. Srinivasa Rao
Author-X-Name-First: K.
Author-X-Name-Last: Srinivasa Rao
Title: Linear regression model with new symmetric distributed errors
Abstract:
Regression models play a dominant role in
analyzing several data sets arising from areas like agricultural
experiment, space experiment, biological experiment, financial modeling,
etc. One of the major steps in developing a regression model is the
assumption made about the distribution of the error terms. It is customary to
assume that the error terms follow the Gaussian distribution. However,
Gaussian errors have drawbacks, such as the distribution being
mesokurtic with kurtosis three. In many practical situations the
variables under study are not mesokurtic but
platykurtic. Hence, to analyze these sorts of platykurtic variables, a
two-variable regression model with new symmetric distributed errors is
developed and analyzed. The maximum likelihood (ML) estimators of the
model parameters are derived. The properties of the ML estimators with
respect to the new symmetrically distributed errors are also discussed. A
simulation study is carried out to compare the proposed model with that of
Gaussian errors and found that the proposed model performs better when the
variables are platykurtic. Some applications of the developed model are
also pointed out.
Journal: Journal of Applied Statistics
Pages: 364-381
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839638
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:364-381
Template-Type: ReDIF-Article 1.0
Author-Name: Ji-Liang Shiu
Author-X-Name-First: Ji-Liang
Author-X-Name-Last: Shiu
Author-Name: Chia-Hung D. Sun
Author-X-Name-First: Chia-Hung D.
Author-X-Name-Last: Sun
Title: The determinants of price in online auctions: more evidence from unbalanced panel data
Abstract:
This study provides an alternative
approach that takes account of the unobserved effects of each seller under
a sample selection framework while using online auction data. We use data
collected from Yahoo! Kimo Auction (Taiwan) to demonstrate that earlier
empirical results of online auction studies may be biased due to violating
the assumption of independence of the error terms between sample
observations. Empirical findings show that seller reputation is no longer
the most important factor for buyers bidding on items, while the sample
data confirm the unobserved heterogeneity of sellers and the sample
selection problem.
Journal: Journal of Applied Statistics
Pages: 382-392
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839639
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839639
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:382-392
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaoguang Wang
Author-X-Name-First: Xiaoguang
Author-X-Name-Last: Wang
Author-Name: Junhui Fan
Author-X-Name-First: Junhui
Author-X-Name-Last: Fan
Title: Variable selection for multivariate generalized linear models
Abstract:
Generalized linear models (GLMs) are
widely studied to deal with complex response variables. For the analysis
of categorical dependent variables with more than two response categories,
multivariate GLMs are presented to build the relationship between this
polytomous response and a set of regressors. Traditional variable
selection approaches have been proposed for the multivariate GLM with a
canonical link function when the number of parameters is fixed in the
literature. However, in many model selection problems, the number of
parameters may be large and grow with the sample size. In this paper, we
present a new selection criterion for the model with a diverging number of
parameters. Under suitable conditions, the criterion is shown to be model
selection consistent. A simulation study and a real data analysis are
conducted to support theoretical findings.
Journal: Journal of Applied Statistics
Pages: 393-406
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.839640
File-URL: http://hdl.handle.net/10.1080/02664763.2013.839640
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:393-406
Template-Type: ReDIF-Article 1.0
Author-Name: Saeid Amiri
Author-X-Name-First: Saeid
Author-X-Name-Last: Amiri
Title: The resampling of entropies with the application of biodiversity
Abstract:
This paper discusses the bootstrap test of
entropies. Since the comparison of entropies is of prime interest in
applied fields, finding an appropriate way to carry out such a comparison
is of utmost importance. This paper presents how resampling should be
performed to obtain an accurate p-value. Although the
test using a pair-wise bootstrap confidence interval (CI) has already been
dealt with in a few works, here the bootstrap tests are studied because they
may demand quite a different resampling algorithm compared with the CI.
Moreover, the multiple test is studied. The proposed tests offer
several appreciable advantages, including easy implementation and
good power. Here the entropy of the
discrete variable is studied. The proposed tests are examined using Monte
Carlo investigations and also evaluated using various distributions.
Journal: Journal of Applied Statistics
Pages: 407-422
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.840052
File-URL: http://hdl.handle.net/10.1080/02664763.2013.840052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:407-422
Template-Type: ReDIF-Article 1.0
Author-Name: Paul J. Plummer
Author-X-Name-First: Paul J.
Author-X-Name-Last: Plummer
Author-Name: Jie Chen
Author-X-Name-First: Jie
Author-X-Name-Last: Chen
Title: A Bayesian approach for locating change points in a compound Poisson process with application to detecting DNA copy number variations
Abstract:
This work examines the problem of locating
changes in the distribution of a compound Poisson process where the
variables being summed are iid normal and the number of variables follows
the Poisson distribution. A Bayesian approach is developed to identify the
location of significant changes in any of the parameters of the
distribution, and a sliding window algorithm is used to identify multiple
change points. These results can be applied in any field of study where
there is an interest in locating changes not only in the parameters of a
normally distributed data set but also in the rate of their occurrence. It has
direct application to the study of DNA copy number variations in cancer
research, where it is known that the distances between the genes can
affect their intensity level.
Journal: Journal of Applied Statistics
Pages: 423-438
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.840272
File-URL: http://hdl.handle.net/10.1080/02664763.2013.840272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:423-438
Template-Type: ReDIF-Article 1.0
Author-Name: Krishna K. Saha
Author-X-Name-First: Krishna K.
Author-X-Name-Last: Saha
Author-Name: Roger Bilisoly
Author-X-Name-First: Roger
Author-X-Name-Last: Bilisoly
Author-Name: Darius M. Dziuda
Author-X-Name-First: Darius M.
Author-X-Name-Last: Dziuda
Title: Hybrid-based confidence intervals for the ratio of two treatment means in the over-dispersed Poisson data
Abstract:
In many clinical trials and
epidemiological studies, comparing the mean count response of an exposed
group to a control group is often of interest. This type of data is often
over-dispersed with respect to Poisson variation, and previous studies
usually compared groups using confidence intervals (CIs) of the difference
between the two means. However, in some situations, especially when the
means are small, interval estimation of the mean ratio (MR) is preferable.
Moreover, Cox and Lewis [4] pointed out many other situations where the MR
is more relevant than the difference of means. In this paper, we consider
CI construction for the ratio of means between two treatments for
over-dispersed Poisson data. We develop several CIs for the situation by
hybridizing two separate CIs for two individual means. Extensive
simulations show that all hybrid-based CIs perform reasonably well in
terms of coverage. However, the CIs based on the delta method using the
logarithmic transformation perform better than other intervals in the
sense that they have slightly shorter interval lengths and show better
balance of tail errors. These proposed CIs are illustrated with three real
data examples.
Journal: Journal of Applied Statistics
Pages: 439-453
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.840273
File-URL: http://hdl.handle.net/10.1080/02664763.2013.840273
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:439-453
Template-Type: ReDIF-Article 1.0
Author-Name: Bulent Tutmez
Author-X-Name-First: Bulent
Author-X-Name-Last: Tutmez
Title: Analyzing non-stationarity in cement stone pit by median polish interpolation: a case study
Abstract:
The raw materials utilized in the
manufacture of cement consist mainly of lime, silica, alumina and iron
oxide. Spatial evaluation of these main chemical constituents of cement
is of crucial importance for effective production. Because these
components derive from raw materials such as limestone and marl,
the spatial relationships in a calcareous marl stone pit were taken into
consideration. In practice, spatial field data taken from a cement quarry
may include variations and trends. Median polish kriging was used to model
and remove the spatial trend in a cement raw material quarry and to provide
unbiased estimates. By using the variation of the
data itself, some approximations and interpolations were carried out. The
method was found to obtain outlier-resistant estimation of the
spatial trend without needing an external explanatory variable. In
addition, it provided very effective estimations and additional
information for analyzing spatial non-stationary data.
Journal: Journal of Applied Statistics
Pages: 454-466
Issue: 2
Volume: 41
Year: 2014
Month: 2
X-DOI: 10.1080/02664763.2013.840274
File-URL: http://hdl.handle.net/10.1080/02664763.2013.840274
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:2:p:454-466
Template-Type: ReDIF-Article 1.0
Author-Name: Guo-Liang Tian
Author-X-Name-First: Guo-Liang
Author-X-Name-Last: Tian
Author-Name: Mingqiu Wang
Author-X-Name-First: Mingqiu
Author-X-Name-Last: Wang
Author-Name: Lixin Song
Author-X-Name-First: Lixin
Author-X-Name-Last: Song
Title: Variable selection in the high-dimensional continuous generalized linear model with current status data
Abstract:
In survival studies, current status data
are frequently encountered when some individuals in a study are not
successively observed. This paper considers the problem of simultaneous
variable selection and parameter estimation in the high-dimensional
continuous generalized linear model with current status data. We apply the
penalized likelihood procedure with the smoothly clipped absolute
deviation penalty to select significant variables and estimate the
corresponding regression coefficients. With a proper choice of tuning
parameters, the resulting estimator is shown to be a
√(n/p_n)-consistent estimator under some mild
conditions. In addition, we show that the resulting estimator has the same
asymptotic distribution as the estimator obtained when the true model is
known. The finite sample behavior of the proposed estimator is evaluated
through simulation studies and a real example.
Journal: Journal of Applied Statistics
Pages: 467-483
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.840271
File-URL: http://hdl.handle.net/10.1080/02664763.2013.840271
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:467-483
Template-Type: ReDIF-Article 1.0
Author-Name: Alok Kumar Dwivedi
Author-X-Name-First: Alok Kumar
Author-X-Name-Last: Dwivedi
Author-Name: Indika Mallawaarachchi
Author-X-Name-First: Indika
Author-X-Name-Last: Mallawaarachchi
Author-Name: Soyoung Lee
Author-X-Name-First: Soyoung
Author-X-Name-Last: Lee
Author-Name: Patrick Tarwater
Author-X-Name-First: Patrick
Author-X-Name-Last: Tarwater
Title: Methods for estimating relative risk in studies of common binary outcomes
Abstract:
Studying the effect of exposure or
intervention on a dichotomous outcome is very common in medical research.
Logistic regression (LR) is often used to determine such an association,
and it provides the odds ratio (OR). The OR often overestimates the effect size for
prevalent outcome data. In such situations, use of relative risk (RR) has
been suggested. We propose modifications in Zhang and Yu and Diaz-Quijano
methods. These methods were compared with the stratified Mantel-Haenszel
method, LR, log binomial regression (LBR), Zhang and Yu method,
Poisson/Cox regression, modified Poisson/Cox regression, marginal
probability method, COPY method, inverse probability of treatment weighted
LBR, and Diaz-Quijano method. Our proposed modified Diaz-Quijano (MDQ)
method provides RR and its confidence interval similar to those estimated
by modified Poisson/Cox and LBRs. The proposed modifications in the Zhang and
Yu method provide a better estimate of the RR and its standard error as
compared to the original Zhang and Yu method in a variety of situations with prevalent
outcome. The MDQ method can be used easily to estimate the RR and its
confidence interval in the studies which require reporting of RRs.
Regression models which directly provide the estimate of RR without
convergence problems such as the MDQ method and modified Poisson/Cox
regression should be preferred.
Journal: Journal of Applied Statistics
Pages: 484-500
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.840772
File-URL: http://hdl.handle.net/10.1080/02664763.2013.840772
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:484-500
Template-Type: ReDIF-Article 1.0
Author-Name: P. Niloofar
Author-X-Name-First: P.
Author-X-Name-Last: Niloofar
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Title: A new multivariate imputation method based on Bayesian networks
Abstract:
Dealing with incomplete data is a
pervasive problem in statistical surveys. Bayesian networks have been
recently used in missing data imputation. In this research, we propose a
new methodology for the multivariate imputation of missing data using
discrete Bayesian networks and conditional Gaussian Bayesian networks.
Results from imputing missing values in a coronary artery disease data set
and a milk composition data set, as well as a simulation study on the
cancer-neapolitan network, are presented to demonstrate and compare the
performance of three Bayesian network-based imputation methods with those
of multivariate imputation by chained equations (MICE) and the classical
hot-deck imputation method. To assess the effect of the structure learning
algorithm on the performance of the Bayesian network-based methods, two
methods called Peter-Clark algorithm and greedy search-and-score have been
applied. Bayesian network-based methods are: first, the method introduced
by Di Zio et al. [Bayesian networks for
imputation, J. R. Stat. Soc. Ser. A 167 (2004), 309--322] in
which, each missing item of a variable is imputed using the information
given in the parents of that variable; second, the method of Di Zio
et al. [Multivariate techniques for imputation
based on Bayesian networks, Neural Netw. World 15 (2005),
303--310] which uses the information in the Markov blanket set of the
variable to be imputed; and finally, our newly proposed method, which applies
all the available knowledge of the variables of interest, consisting of the
Markov blanket and hence the parent set, to impute a missing item. Results
indicate the high quality of our proposed method, especially in the
presence of high missingness percentages and more connected networks. The
new method is also shown to be more efficient than the MICE method for
small sample sizes with high missing rates.
Journal: Journal of Applied Statistics
Pages: 501-518
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.842960
File-URL: http://hdl.handle.net/10.1080/02664763.2013.842960
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:501-518
Template-Type: ReDIF-Article 1.0
Author-Name: Çiğdem Arıcıgil Çilan
Author-X-Name-First: Çiğdem
Author-X-Name-Last: Arıcıgil Çilan
Title: Latent class analysis for measuring Turkish People's future expectations for Turkey
Abstract:
The aim of this study is to classify
Turkish people and measure the probability of their positive or negative
expectations according to their 5-year expectations on the Turkish Economy,
Social Rights and Freedom, Rendering of the Public Services, Government
Transparency and Turkey's Reputation. For this purpose the latest data from
the Turkish Statistical Institute's Life Satisfaction Survey 2011 were used
and latent class analysis (LCA) was applied to these data. Unrestricted
and restricted LCA models were fitted, and the three-class unrestricted
model was found to be the best fit. Latent class probabilities were
interpreted and each class was named based on the calculated conditional
probabilities.
Journal: Journal of Applied Statistics
Pages: 519-529
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.842961
File-URL: http://hdl.handle.net/10.1080/02664763.2013.842961
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:519-529
Template-Type: ReDIF-Article 1.0
Author-Name: Wenlin Dai
Author-X-Name-First: Wenlin
Author-X-Name-Last: Dai
Author-Name: Tiejun Tong
Author-X-Name-First: Tiejun
Author-X-Name-Last: Tong
Title: Variance estimation in nonparametric regression with jump discontinuities
Abstract:
Variance estimation is an important topic
in nonparametric regression. In this paper, we propose a pairwise
regression method for estimating the residual variance. Specifically, we
regress the squared difference between observations on the squared
distance between design points, and then estimate the residual variance as
the intercept. Unlike most existing difference-based estimators that
require a smooth regression function, our method applies to regression
models with jump discontinuities. Our method also applies to the
situations where the design points are unequally spaced. Finally, we
conduct extensive simulation studies to evaluate the finite-sample
performance of the proposed method and compare it with some existing
competitors.
Journal: Journal of Applied Statistics
Pages: 530-545
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.842962
File-URL: http://hdl.handle.net/10.1080/02664763.2013.842962
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:530-545
Template-Type: ReDIF-Article 1.0
Author-Name: Wali Ullah
Author-X-Name-First: Wali
Author-X-Name-Last: Ullah
Author-Name: Yasumasa Matsuda
Author-X-Name-First: Yasumasa
Author-X-Name-Last: Matsuda
Author-Name: Yoshihiko Tsukuda
Author-X-Name-First: Yoshihiko
Author-X-Name-Last: Tsukuda
Title: Dynamics of the term structure of interest rates and monetary policy: is monetary policy effective during zero interest rate policy?
Abstract:
The monetary policy targets the short
rates; however, during zero interest rate policy (ZIRP), the short end of
the yield curve cannot serve as a policy instrument. Relying on the joint
yields-macro latent factors model, this study empirically examines the
effect of monetary policy stances on term structure and the possible
feedback effect on the real sector using the Japanese experience of ZIRP.
The analysis indicates that it is the entire term structure that transmits
the policy shocks to the real economy rather than the yield spread only.
The monetary policy signals pass through the yield curve level and slope
factors to stimulate the economic activity. The curvature factor, besides
reflecting the cyclical fluctuations of the economy, acts as a leading
indicator for future inflation. In addition, policy influence tends to be
low as the short end becomes segmented toward the medium/long-term end of the
yield curve. Furthermore, volatility in bond markets is found to be
asymmetrically affected by positive and negative shocks, and the long end
tends to be less sensitive to stochastic shocks than the short maturities. The
expectation hypothesis of the term structure does not hold during the ZIRP
period.
Journal: Journal of Applied Statistics
Pages: 546-572
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.845142
File-URL: http://hdl.handle.net/10.1080/02664763.2013.845142
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:546-572
Template-Type: ReDIF-Article 1.0
Author-Name: Insuk Sohn
Author-X-Name-First: Insuk
Author-X-Name-Last: Sohn
Author-Name: Jooyong Shim
Author-X-Name-First: Jooyong
Author-X-Name-Last: Shim
Author-Name: Changha Hwang
Author-X-Name-First: Changha
Author-X-Name-Last: Hwang
Author-Name: Sujong Kim
Author-X-Name-First: Sujong
Author-X-Name-Last: Kim
Author-Name: Jae Won Lee
Author-X-Name-First: Jae Won
Author-X-Name-Last: Lee
Title: Transcription factor-binding site identification and gene classification via fusion of the supervised-weighted discrete kernel clustering and support vector machine
Abstract:
The genetic regulatory mechanism heavily
influences a substantial portion of biological functions and processes
needed to sustain life. For a comprehensive mechanistic understanding of
biological processes, it is important to identify the common transcription
factor (TF) binding sites (TFBSs) from a set of promoter sequences of
co-regulated genes and to classify genes that are co-regulated by certain
TFs, thereby providing insight into the mechanism that underlies the
interaction among the co-regulated genes and complicated genetic
regulation. We propose a new supervised-weighted discrete kernel
clustering (SWDKC) classification method for the identification of TFBSs
and the classification of genes. Our SWDKC method gave a smaller
misclassification error rate than the other methods on both the simulated
data and the real NF-κB data. We verify that the selected
over-represented TFBSs serve as informative TFBSs from a biological point of
view.
Journal: Journal of Applied Statistics
Pages: 573-581
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.845143
File-URL: http://hdl.handle.net/10.1080/02664763.2013.845143
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:573-581
Template-Type: ReDIF-Article 1.0
Author-Name: A.B. Schmiedt
Author-X-Name-First: A.B.
Author-X-Name-Last: Schmiedt
Author-Name: H.H. Dickert
Author-X-Name-First: H.H.
Author-X-Name-Last: Dickert
Author-Name: W. Bleck
Author-X-Name-First: W.
Author-X-Name-Last: Bleck
Author-Name: U. Kamps
Author-X-Name-First: U.
Author-X-Name-Last: Kamps
Title: Multivariate extreme value analysis and its relevance in a metallographical application
Abstract:
Motivated from extreme value (EV) analysis
for large non-metallic inclusions in engineering steels and a real data
set, the benefit of choosing a multivariate EV approach is discussed. An
extensive simulation study shows that the common univariate setup may lead
to a high proportion of mis-specifications of the true EV distribution, as
well as that the statistical analysis is considerably improved when being
based on the respective data of r largest observations,
with r appropriately chosen. Results for several
underlying distributions and various values of r are
presented along with effects on estimators for the parameters of the
generalized EV family of distributions.
Journal: Journal of Applied Statistics
Pages: 582-595
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.845872
File-URL: http://hdl.handle.net/10.1080/02664763.2013.845872
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:582-595
Template-Type: ReDIF-Article 1.0
Author-Name: Giancarlo Diana
Author-X-Name-First: Giancarlo
Author-X-Name-Last: Diana
Author-Name: Saba Riaz
Author-X-Name-First: Saba
Author-X-Name-Last: Riaz
Author-Name: Javid Shabbir
Author-X-Name-First: Javid
Author-X-Name-Last: Shabbir
Title: Hansen and Hurwitz estimator with scrambled response on the second call
Abstract:
In this paper we propose a modified
version of the estimator of Hansen and Hurwitz [12] for the case of a
quantitative sensitive variable and consider a randomization mechanism on
the second call that provides privacy protection to the respondents in
order to obtain truthful information. We use the variance of the modified
estimator as a tool to measure privacy protection, and it is observed that
the higher the variance, the lower the efficiency but the greater the privacy
protection. To overcome this efficiency loss, we consider a linear
regression estimator using known non-sensitive auxiliary information. With
consideration of four scrambled models, we try to make a trade-off between
efficiency and privacy protection. To show this compromise, analytical and
numerical comparisons are obtained.
Journal: Journal of Applied Statistics
Pages: 596-611
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.846305
File-URL: http://hdl.handle.net/10.1080/02664763.2013.846305
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:596-611
Template-Type: ReDIF-Article 1.0
Author-Name: Mahmoud Torabi
Author-X-Name-First: Mahmoud
Author-X-Name-Last: Torabi
Title: Hierarchical Bayesian bivariate disease mapping: analysis of children and adults asthma visits to hospital
Abstract:
In spatial epidemiology, detecting areas
with a high ratio of disease is important, as it may lead to identifying risk
factors associated with the disease. This in turn may lead to further
epidemiological investigations into the nature of the disease. Disease mapping
studies have typically been performed considering only one disease in
the estimated models. Simultaneous modelling of different diseases can
also be a valuable tool both from the epidemiological and also from the
statistical point of view. In particular, when we have several
measurements recorded at each spatial location, one can consider
multivariate models in order to handle the dependence among the
multivariate components and the spatial dependence between locations. In
this paper, spatial models that use multivariate conditionally
autoregressive smoothing across the spatial dimension are considered. We
study the patterns of incidence ratios and identify areas with
consistently high ratio estimates as areas for further investigation. A
hierarchical Bayesian approach using Markov chain Monte Carlo techniques
is employed to simultaneously examine spatial trends of asthma visits by
children and adults to hospital in the province of Manitoba, Canada,
during 2000--2010.
Journal: Journal of Applied Statistics
Pages: 612-621
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.847066
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847066
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:612-621
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Mário de Castro
Author-X-Name-First: Mário
Author-X-Name-Last: de Castro
Author-Name: Vera Tomazella
Author-X-Name-First: Vera
Author-X-Name-Last: Tomazella
Author-Name: Jhon F.B. Gonzales
Author-X-Name-First: Jhon F.B.
Author-X-Name-Last: Gonzales
Title: Modeling categorical covariates for lifetime data in the presence of cure fraction by Bayesian partition structures
Abstract:
In this paper, we propose a Bayesian
partition modeling for lifetime data in the presence of a cure fraction by
considering a local structure generated by a tessellation which depends on
covariates. In this modeling we include information of nominal qualitative
variables with more than two categories or ordinal qualitative variables.
The proposed modeling is based on a promotion time cure model structure
but assuming that the number of competing causes follows a geometric
distribution. It is an alternative modeling strategy to the conventional
survival regression modeling generally used for modeling lifetime data in
the presence of a cure fraction, which models the cure fraction through a
(generalized) linear model of the covariates. An advantage of our approach
is its ability to capture the effects of covariates in a local structure.
The flexibility of having a local structure is crucial to capture local
effects and features of the data. The modeling is illustrated on two real
melanoma data sets.
Journal: Journal of Applied Statistics
Pages: 622-634
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.847067
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847067
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:622-634
Template-Type: ReDIF-Article 1.0
Author-Name: Najeh Chaâbane
Author-X-Name-First: Najeh
Author-X-Name-Last: Chaâbane
Title: A novel auto-regressive fractionally integrated moving average--least-squares support vector machine model for electricity spot prices prediction
Abstract:
In the framework of a competitive
electricity market, price forecasting has become a real challenge for all
market participants. However, forecasting is a rather complex task, since
electricity prices involve many features comparable with those in
financial markets. Electricity markets are more unpredictable than other
commodity markets and are regarded as extremely volatile. Therefore, the
choice of the forecasting model has become even more important. In this paper, a new
hybrid model is proposed. This model exploits the feature and strength of
the auto-regressive fractionally integrated moving average model as well
as least-squares support vector machine model. The expected prediction
combination takes advantage of each model's strength or unique capability.
The proposed model is examined by using data from the Nordpool electricity
market. Empirical results showed that the proposed method has the best
prediction accuracy compared to other methods.
Journal: Journal of Applied Statistics
Pages: 635-651
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.847068
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847068
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:635-651
Template-Type: ReDIF-Article 1.0
Author-Name: Edgard Nyssen
Author-X-Name-First: Edgard
Author-X-Name-Last: Nyssen
Author-Name: Wolfgang Jacquet
Author-X-Name-First: Wolfgang
Author-X-Name-Last: Jacquet
Title: A statistical testing framework for evaluating the quality of measurement processes
Abstract:
In this paper we address the evaluation of
measurement process quality. We mainly focus on the evaluation procedure,
as far as it is based on the numerical outcomes for the measurement of a
single physical quantity. We challenge the approach where the 'exact'
value of the observed quantity is compared with the error interval
obtained from the measurements under test and we propose a procedure where
reference measurements are used as 'gold standard'. To this purpose, we
designed a specific t-test procedure, explained here. We
also describe and discuss a numerical simulation experiment demonstrating
the behaviour of our procedure.
Journal: Journal of Applied Statistics
Pages: 652-659
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.847069
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847069
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:652-659
Template-Type: ReDIF-Article 1.0
Author-Name: Yuzhu Tian
Author-X-Name-First: Yuzhu
Author-X-Name-Last: Tian
Author-Name: Qianqian Zhu
Author-X-Name-First: Qianqian
Author-X-Name-Last: Zhu
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Title: Inference for mixed generalized exponential distribution under progressively type-II censored samples
Abstract:
In industrial life tests, reliability
analysis and clinical trials, the type-II progressive censoring
methodology, which allows for random removals of the remaining survival
units at each failure time, has become quite popular for analyzing
lifetime data. Parameter estimation under progressively type-II censored
samples for many common lifetime distributions has been investigated
extensively. However, how to estimate unknown parameters of the mixed
distribution models under progressive type-II censoring schemes is still a
challenging and interesting problem. Based on progressively type-II
censored samples, this paper addresses the estimation problem of mixed
generalized exponential distributions. In addition, it is observed that
the maximum-likelihood estimates (MLEs) cannot be easily obtained in
closed form due to the complexity of the likelihood function. Thus, we
employ the expectation-maximization (EM) algorithm to obtain the
MLEs. Finally, some simulations are implemented in order to show the
performance of the proposed method under finite samples, and a case
analysis is presented.
Journal: Journal of Applied Statistics
Pages: 660-676
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.847070
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847070
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:660-676
Template-Type: ReDIF-Article 1.0
Author-Name: Edilberto Cepeda-Cuervo
Author-X-Name-First: Edilberto
Author-X-Name-Last: Cepeda-Cuervo
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Author-Name: Liliana Garrido Lopera
Author-X-Name-First: Liliana Garrido
Author-X-Name-Last: Lopera
Title: Bivariate beta regression models: joint modeling of the mean, dispersion and association parameters
Abstract:
In this paper a bivariate beta regression
model with joint modeling of the mean and dispersion parameters is
proposed, defining the bivariate beta distribution from
Farlie--Gumbel--Morgenstern (FGM) copulas. This model, that can be
generalized using other copulas, is a good alternative to analyze
non-independent pairs of proportions and can be fitted applying standard
Markov chain Monte Carlo methods. Results of two applications of the
proposed model in the analysis of structural and real data sets are
included.
Journal: Journal of Applied Statistics
Pages: 677-687
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.847071
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847071
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:677-687
Template-Type: ReDIF-Article 1.0
Author-Name: Yi Zhang
Author-X-Name-First: Yi
Author-X-Name-Last: Zhang
Author-Name: Haitao Chu
Author-X-Name-First: Haitao
Author-X-Name-Last: Chu
Author-Name: Donglin Zeng
Author-X-Name-First: Donglin
Author-X-Name-Last: Zeng
Title: Evaluation of incomplete multiple diagnostic tests, with an application in the colon cancer family registry study
Abstract:
Accurate diagnosis of a molecularly
defined subtype of cancer is often an important step toward its effective
control and treatment. For the diagnosis of some subtypes of a cancer, a
gold standard with perfect sensitivity and specificity may be unavailable.
In those scenarios, tumor subtype status is commonly measured by multiple
imperfect diagnostic markers. Additionally, in many such studies, some
subjects are only measured by a subset of diagnostic tests and the missing
probabilities may depend on the unknown disease status. In this paper, we
present statistical methods based on the EM algorithm to evaluate
incomplete multiple imperfect diagnostic tests under a missing at random
assumption and one missing not at random scenario. We apply the proposed
methods to a real data set from the National Cancer Institute (NCI) colon
cancer family registry on diagnosing microsatellite instability for
hereditary non-polyposis colorectal cancer to estimate diagnostic accuracy
parameters (i.e. sensitivities and specificities), prevalence, and
potential differential missing probabilities for 11 biomarker tests.
Simulations are also conducted to evaluate the small-sample performance of
our methods.
Journal: Journal of Applied Statistics
Pages: 688-700
Issue: 3
Volume: 41
Year: 2014
Month: 3
X-DOI: 10.1080/02664763.2013.849231
File-URL: http://hdl.handle.net/10.1080/02664763.2013.849231
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:3:p:688-700
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Cron
Author-X-Name-First: Andrew
Author-X-Name-Last: Cron
Author-Name: Liang Zhang
Author-X-Name-First: Liang
Author-X-Name-Last: Zhang
Author-Name: Deepak Agarwal
Author-X-Name-First: Deepak
Author-X-Name-Last: Agarwal
Title: Collaborative filtering for massive multinomial data
Abstract:
Content recommendation on a webpage
involves recommending content links (items) on multiple slots for each
user visit to maximize some objective function, typically the
click-through rate (CTR) which is the probability of clicking on an item
for a given user visit. Most existing approaches to this problem assume
that a user's responses (click/no click) on different slots are independent
of each other. This is problematic since in many scenarios the CTR on a slot may
depend on externalities like items recommended on other slots.
Incorporating the effects of such externalities in the modeling process is
important for better predictive accuracy. We therefore propose a
hierarchical model that assumes a multinomial response for each visit to
incorporate competition among slots and models complex interactions among
(user, item, slot) combinations through factor models via a tensor
approach. In addition, factors in our model are drawn with means that are
based on regression functions of user/item covariates, which helps us
obtain better estimates for users/items that are relatively new with
little past activity. We show marked gains in predictive accuracy by
various metrics.
Journal: Journal of Applied Statistics
Pages: 701-715
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.847072
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847072
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:701-715
Template-Type: ReDIF-Article 1.0
Author-Name: Jean-Paul Lucas
Author-X-Name-First: Jean-Paul
Author-X-Name-Last: Lucas
Author-Name: Véronique Sébille
Author-X-Name-First: Véronique
Author-X-Name-Last: Sébille
Author-Name: Alain Le Tertre
Author-X-Name-First: Alain
Author-X-Name-Last: Le Tertre
Author-Name: Yann Le Strat
Author-X-Name-First: Yann
Author-X-Name-Last: Le Strat
Author-Name: Lise Bellanger
Author-X-Name-First: Lise
Author-X-Name-Last: Bellanger
Title: Multilevel modelling of survey data: impact of the two-level weights used in the pseudolikelihood
Abstract:
Approaches that use the pseudolikelihood
to perform multilevel modelling on survey data have been presented in the
literature. To avoid biased estimates due to unequal selection
probabilities, conditional weights can be introduced at each level.
Less-biased estimators can also be obtained in a two-level linear model if
the level-1 weights are scaled. In this paper, we studied several level-2
weights that can be introduced into the pseudolikelihood when the sampling
design and the hierarchical structure of the multilevel model do not
match. Two-level and three-level models were studied. The present work was
motivated by a study that aims to estimate the contributions of lead
sources to the contamination of interior floor dust in rooms within
dwellings. We performed a simulation study using the real data collected
from a French survey to achieve our objective. We conclude that it is
preferable to use unweighted analyses or, at the most, to use conditional
level-2 weights in a two-level or a three-level model. We state some
warnings and make some recommendations.
Journal: Journal of Applied Statistics
Pages: 716-732
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.847404
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847404
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:716-732
Template-Type: ReDIF-Article 1.0
Author-Name: Ahmed N. Albatineh
Author-X-Name-First: Ahmed N.
Author-X-Name-Last: Albatineh
Author-Name: B.M. Golam Kibria
Author-X-Name-First: B.M. Golam
Author-X-Name-Last: Kibria
Author-Name: Meredith L. Wilcox
Author-X-Name-First: Meredith L.
Author-X-Name-Last: Wilcox
Author-Name: Bashar Zogheib
Author-X-Name-First: Bashar
Author-X-Name-Last: Zogheib
Title: Confidence interval estimation for the population coefficient of variation using ranked set sampling: a simulation study
Abstract:
In this paper, the performance of several
confidence interval estimators of the population coefficient of
variation (τ) using ranked set sampling is evaluated and compared with
simple random sampling. Two performance measures are used to
assess the confidence intervals for τ, namely: width and coverage
probabilities. Simulated data were generated from normal, log-normal, skew
normal, Gamma, and Weibull distributions with specified population
parameters so that the same values of τ are obtained for each
distribution, with sample sizes n=15, 20, 25, 50, 100. A
real data example representing birth weight of 189 newborns is used for
illustration and performance comparison.
Journal: Journal of Applied Statistics
Pages: 733-751
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.847405
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847405
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:733-751
Template-Type: ReDIF-Article 1.0
Author-Name: Essam A. Ahmed
Author-X-Name-First: Essam A.
Author-X-Name-Last: Ahmed
Title: Bayesian estimation based on progressive Type-II censoring from a two-parameter bathtub-shaped lifetime model: a Markov chain Monte Carlo approach
Abstract:
In this paper, maximum likelihood and
Bayes estimators of the parameters, reliability and hazard functions have
been obtained for the two-parameter bathtub-shaped lifetime distribution when
a sample is available from a progressive Type-II censoring scheme. The Markov
chain Monte Carlo (MCMC) method is used to compute the Bayes estimates of
the model parameters. It has been assumed that the parameters have gamma
priors and are independently distributed. The Gibbs within
Metropolis--Hastings algorithm has been applied to generate MCMC samples
from the posterior density function. Based on the generated samples, the
Bayes estimates and highest posterior density credible intervals of the
unknown parameters as well as reliability and hazard functions have been
computed. The results of Bayes estimators are obtained under both the
balanced-squared error loss and balanced linear-exponential (BLINEX) loss.
Moreover, based on the asymptotic normality of the maximum likelihood
estimators the approximate confidence intervals (CIs) are obtained. In
order to construct the asymptotic CIs of the reliability and hazard
functions, we need their variances, which are approximated by the
delta and bootstrap methods. Two real data sets have been analyzed to
demonstrate how the proposed methods can be used in practice.
Journal: Journal of Applied Statistics
Pages: 752-768
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.847907
File-URL: http://hdl.handle.net/10.1080/02664763.2013.847907
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:752-768
Template-Type: ReDIF-Article 1.0
Author-Name: Lingsong Zhang
Author-X-Name-First: Lingsong
Author-X-Name-Last: Zhang
Author-Name: Zhengyuan Zhu
Author-X-Name-First: Zhengyuan
Author-X-Name-Last: Zhu
Author-Name: J. S. Marron
Author-X-Name-First: J. S.
Author-X-Name-Last: Marron
Title: Multiresolution anomaly detection method for fractional Gaussian noise
Abstract:
Driven by network intrusion detection, we
propose a MultiResolution Anomaly Detection (MRAD) method, which
effectively utilizes the multiscale properties of Internet features and
network anomalies. In this paper, several theoretical properties of the
MRAD method are explored. A major new result is the mathematical
formulation of the notion that a two-scaled MRAD method has larger power
than the average power of the detection method based on the given two
scales. A test threshold is also developed. Comparisons between the MRAD method
and other classical outlier detectors in time series are reported as well.
Journal: Journal of Applied Statistics
Pages: 769-784
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.850065
File-URL: http://hdl.handle.net/10.1080/02664763.2013.850065
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:769-784
Template-Type: ReDIF-Article 1.0
Author-Name: Show-Lin Chen
Author-X-Name-First: Show-Lin
Author-X-Name-Last: Chen
Author-Name: Nen-Jing Chen
Author-X-Name-First: Nen-Jing
Author-X-Name-Last: Chen
Author-Name: Rwei-Ju Chuang
Author-X-Name-First: Rwei-Ju
Author-X-Name-Last: Chuang
Title: An empirical study on technical analysis: GARCH (1, 1) model
Abstract:
One of the deficits of the common
Bollinger band is that it fails to consider the fat tails/leptokurtosis
that often exist in financial time series. An adjusted Bollinger band
generated by a rolling GARCH regression method is proposed in this study.
The performance of the adjusted Bollinger band strategy on EUR, GBP, JPY,
and AUD vs. USD foreign exchange trading is evaluated. Results show that
in general, the adjusted Bollinger band performs better than the
traditional one in terms of success ratios, net successes, and profit. In
addition, regardless of whether transaction costs are present, only adjusted
Bollinger strategies are recommended for investors. Adjusted Bollinger
band strategies with MA 5 or 10 are recommended for EUR, GBP, and JPY; the
adjusted Bollinger strategy with MA 20 is the recommended strategy for
AUD.
Journal: Journal of Applied Statistics
Pages: 785-801
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.856383
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856383
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:785-801
Template-Type: ReDIF-Article 1.0
Author-Name: Hisham Hilow
Author-X-Name-First: Hisham
Author-X-Name-Last: Hilow
Title: Minimum cost linear trend-free 12-run fractional factorial designs
Abstract:
Time trend resistant fractional factorial
experiments have often been based on regular fractionated designs where
several algorithms exist for sequencing their runs in a minimum number of
factor-level changes (i.e. minimum cost) such that main effects and/or
two-factor interactions are orthogonal to and free from aliasing with the
time trend, which may be present in the sequentially generated responses.
On the other hand, only one algorithm exists for sequencing runs of the
more economical non-regular fractional factorial experiments, namely
Angelopoulos et al. [1]. This research studies sequential
factorial experimentation under non-regular fractionated designs and
constructs a catalog of 8 minimum cost linear trend-free 12-run designs
(of resolution III) in 4 up to 11 two-level factors by applying the
interactions-main effects assignment technique of Cheng and Jacroux [3] on
the standard 12-run Plackett--Burman design, where factor-level changes
between runs are minimal and where main effects are orthogonal to the
linear time trend. These eight 12-run designs are non-orthogonal but are
more economical than the linear trend-free designs of Angelopoulos
et al. [1], as they can accommodate a larger number of
two-level factors in a smaller number of experimental runs. These
non-regular designs are also more economical than many regular trend-free
designs. The following will be provided for each proposed systematic
design:
(1) The run order in minimum number of
factor-level changes.
(2) The total number of
factor-level changes between the 12 runs (i.e. the cost).
(3) The closed-form least-squares contrast estimates for all main effects
as well as their closed-form variance--covariance structure.
In addition, combined designs generated from
each of these 8 designs by either complete or
partial foldover allow for the estimation of two-factor interactions
involving one of the factors (i.e. the most influential).
Journal: Journal of Applied Statistics
Pages: 802-816
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.856384
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856384
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:802-816
Template-Type: ReDIF-Article 1.0
Author-Name: Keya Rani Das
Author-X-Name-First: Keya Rani
Author-X-Name-Last: Das
Author-Name: A.H.M. Rahmatullah Imon
Author-X-Name-First: A.H.M. Rahmatullah
Author-X-Name-Last: Imon
Title: Geometric median and its application in the identification of multiple outliers
Abstract:
The geometric mean (GM) has growing and
wider applications in statistical data analysis as a measure of central
tendency. It is generally believed that the GM is less sensitive to outliers
than the arithmetic mean (AM), but we suspect that, like the AM, the GM may
also suffer a major setback in the presence of outliers, especially when
multiple outliers occur in a data set. So far as we know, not much work has
been done on the robustness issue of GM. In quest of a simple robust
measure of central tendency, we propose the geometric median (GMed) in
this paper. We show that the classical GM has only a 0% breakdown point,
while it is 50% for the proposed GMed. Numerical examples also support our
claim that the proposed GMed is unaffected in the presence of multiple
outliers and can maintain the highest possible 50% breakdown point. Later, we
develop a new method for the identification of multiple outliers based on
this proposed GMed. A variety of numerical examples show that the proposed
method can successfully identify all potential outliers while the
traditional GM fails to do so.
Journal: Journal of Applied Statistics
Pages: 817-831
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.856385
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856385
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:817-831
Template-Type: ReDIF-Article 1.0
Author-Name: Sajid Ali
Author-X-Name-First: Sajid
Author-X-Name-Last: Ali
Author-Name: Muhammad Riaz
Author-X-Name-First: Muhammad
Author-X-Name-Last: Riaz
Title: On the generalized process capability under simple and mixture models
Abstract:
Process capability (PC) indices measure
the ability of a process of interest to meet the desired specifications
under certain restrictions. There is a variety of capability indices
available in the literature for different variables of interest, such as weights,
lengths, thicknesses, and the lifetimes of items, among many others. The goal
of this article is to study the generalized capability indices from the
Bayesian view point under different symmetric and asymmetric loss
functions for the simple and mixture of generalized lifetime models. For
our study purposes, we have covered a simple and two component mixture of
Maxwell distribution as a special case of the generalized class of models.
A comparative discussion of the PC with the mixture models under the Laplace
and inverse Rayleigh distributions is also included. Bayesian point estimation of
maintenance performance of the system is also part of the study
(considering the Maxwell failure lifetime model and the repair time
model). A real-life example is also included to illustrate the procedural
details of the proposed method.
Journal: Journal of Applied Statistics
Pages: 832-852
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.856386
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856386
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:832-852
Template-Type: ReDIF-Article 1.0
Author-Name: Sharmishtha Mitra
Author-X-Name-First: Sharmishtha
Author-X-Name-Last: Mitra
Author-Name: Amit Mitra
Author-X-Name-First: Amit
Author-X-Name-Last: Mitra
Title: M-estimator-based robust estimation of the number of components of a superimposed sinusoidal signal model
Abstract:
In this paper, we consider the problem of
estimating the number of components of a superimposed nonlinear sinusoidal
model of a signal in the presence of additive noise. We propose and
provide a detailed empirical comparison of robust methods for estimation
of the number of components. The proposed methods, which are robust
modifications of the commonly used information theoretic criteria, are
based on various M-estimator approaches and are robust with respect to
outliers present in the data and heavy-tailed noise. The proposed methods
are compared with the usual non-robust methods through extensive
simulations under varied model scenarios. We also present real signal
analysis of two speech signals to show the usefulness of the proposed
methodology.
Journal: Journal of Applied Statistics
Pages: 853-878
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.856387
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856387
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:853-878
Template-Type: ReDIF-Article 1.0
Author-Name: Ping Zeng
Author-X-Name-First: Ping
Author-X-Name-Last: Zeng
Author-Name: Yongyue Wei
Author-X-Name-First: Yongyue
Author-X-Name-Last: Wei
Author-Name: Yang Zhao
Author-X-Name-First: Yang
Author-X-Name-Last: Zhao
Author-Name: Jin Liu
Author-X-Name-First: Jin
Author-X-Name-Last: Liu
Author-Name: Liya Liu
Author-X-Name-First: Liya
Author-X-Name-Last: Liu
Author-Name: Ruyang Zhang
Author-X-Name-First: Ruyang
Author-X-Name-Last: Zhang
Author-Name: Jianwei Gou
Author-X-Name-First: Jianwei
Author-X-Name-Last: Gou
Author-Name: Shuiping Huang
Author-X-Name-First: Shuiping
Author-X-Name-Last: Huang
Author-Name: Feng Chen
Author-X-Name-First: Feng
Author-X-Name-Last: Chen
Title: Variable selection approach for zero-inflated count data via adaptive lasso
Abstract:
This article proposes a variable selection
approach for zero-inflated count data analysis based on the adaptive lasso
technique. Two models including the zero-inflated Poisson and the
zero-inflated negative binomial are investigated. An efficient algorithm
is used to minimize the penalized log-likelihood function in an
approximate manner. Both the generalized cross-validation and Bayesian
information criterion procedures are employed to determine the optimal
tuning parameter, and a consistent sandwich formula of standard errors for
nonzero estimates is given based on local quadratic approximation. We
evaluate the performance of the proposed adaptive lasso approach through
extensive simulation studies, and apply it to analyze real-life data about
doctor visits.
Journal: Journal of Applied Statistics
Pages: 879-894
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.858672
File-URL: http://hdl.handle.net/10.1080/02664763.2013.858672
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:879-894
Template-Type: ReDIF-Article 1.0
Author-Name: Ellinor Fackle-Fornius
Author-X-Name-First: Ellinor
Author-X-Name-Last: Fackle-Fornius
Author-Name: Linda Anna Wänström
Author-X-Name-First: Linda Anna
Author-X-Name-Last: Wänström
Title: Minimax D-optimal designs of contingent valuation experiments: willingness to pay for environmentally friendly clothes
Abstract:
This paper demonstrates how to plan a
contingent valuation experiment to assess the value of ecologically
produced clothes. First, an appropriate statistical model (the trinomial
spike model) that describes the probability that a randomly selected
individual will accept any positive bid, and if so, will accept the bid A,
is defined. Secondly, an optimization criterion that is a function of the
variances of the parameter estimators is chosen. However, the variances of
the parameter estimators in this model depend on the true parameter
values. Pilot study data are therefore used to obtain estimates of the
parameter values and a locally optimal design is found. Because this
design is only optimal given that the estimated parameter values are
correct, a design that minimizes the maximum of the criterion function
over a plausible parameter region (i.e. a minimax design) is then found.
Journal: Journal of Applied Statistics
Pages: 895-908
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.858670
File-URL: http://hdl.handle.net/10.1080/02664763.2013.858670
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:895-908
Template-Type: ReDIF-Article 1.0
Author-Name: Jonathan Gillard
Author-X-Name-First: Jonathan
Author-X-Name-Last: Gillard
Title: The R book, second edition
Journal: Journal of Applied Statistics
Pages: 909-909
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853909
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853909
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:909-909
Template-Type: ReDIF-Article 1.0
Author-Name: Isaac Dialsingh
Author-X-Name-First: Isaac
Author-X-Name-Last: Dialsingh
Title: Risk assessment and decision analysis with Bayesian networks
Journal: Journal of Applied Statistics
Pages: 910-910
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853911
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853911
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:910-910
Template-Type: ReDIF-Article 1.0
Author-Name: Isaac Dialsingh
Author-X-Name-First: Isaac
Author-X-Name-Last: Dialsingh
Title: Computational statistics, second edition
Journal: Journal of Applied Statistics
Pages: 910-911
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853912
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853912
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:910-911
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano
Author-X-Name-Last: Ruiz Espejo
Title: Medical biostatistics, third edition
Journal: Journal of Applied Statistics
Pages: 911-911
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853918
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853918
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:911-911
Template-Type: ReDIF-Article 1.0
Author-Name: Anshuman Sahu
Author-X-Name-First: Anshuman
Author-X-Name-Last: Sahu
Title: Statistical methods in customer relationship management
Journal: Journal of Applied Statistics
Pages: 912-912
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853920
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853920
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:912-912
Template-Type: ReDIF-Article 1.0
Author-Name: A.M. Mosammam
Author-X-Name-First: A.M.
Author-X-Name-Last: Mosammam
Title: Spatio-temporal design: advances in efficient data acquisition
Journal: Journal of Applied Statistics
Pages: 912-913
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853924
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853924
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:912-913
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Title: Using the Weibull distribution: Reliability, modeling and inference
Journal: Journal of Applied Statistics
Pages: 913-914
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853927
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853927
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:913-914
Template-Type: ReDIF-Article 1.0
Author-Name: Božidar V. Popović
Author-X-Name-First: Božidar V.
Author-X-Name-Last: Popović
Title: A course on statistics for finance
Journal: Journal of Applied Statistics
Pages: 914-915
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853931
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853931
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:914-915
Template-Type: ReDIF-Article 1.0
Author-Name: Isaac Dialsingh
Author-X-Name-First: Isaac
Author-X-Name-Last: Dialsingh
Title: Applied categorical and count data analysis
Journal: Journal of Applied Statistics
Pages: 915-915
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.853934
File-URL: http://hdl.handle.net/10.1080/02664763.2013.853934
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:915-915
Template-Type: ReDIF-Article 1.0
Author-Name: Cláudia Neves
Author-X-Name-First: Cláudia
Author-X-Name-Last: Neves
Title: Categorical data analysis, third edition
Journal: Journal of Applied Statistics
Pages: 915-916
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.854979
File-URL: http://hdl.handle.net/10.1080/02664763.2013.854979
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:915-916
Template-Type: ReDIF-Article 1.0
Author-Name: Mikhail Moklyachuk
Author-X-Name-First: Mikhail
Author-X-Name-Last: Moklyachuk
Title: Exercises in probability, second edition
Journal: Journal of Applied Statistics
Pages: 916-917
Issue: 4
Volume: 41
Year: 2014
Month: 4
X-DOI: 10.1080/02664763.2013.854981
File-URL: http://hdl.handle.net/10.1080/02664763.2013.854981
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:4:p:916-917
Template-Type: ReDIF-Article 1.0
Author-Name: Shang-Ling Ou
Author-X-Name-First: Shang-Ling
Author-X-Name-Last: Ou
Author-Name: Li-yu Daisy Liu
Author-X-Name-First: Li-yu Daisy
Author-X-Name-Last: Liu
Author-Name: Yih-Chang Ou
Author-X-Name-First: Yih-Chang
Author-X-Name-Last: Ou
Title: Using a genetic algorithm-based RAROC model for the performance and persistence of the funds
Abstract:
Assisting fund investors in making better
investment decisions when faced with market climate change is an important
subject. For this purpose, we adopt a genetic algorithm (GA) to search for
an optimal decay factor for an exponentially weighted moving average model,
which is used to calculate the value at risk combined with risk-adjusted
return on capital (RAROC). We then propose a GA-based RAROC model. Next,
using the model we find the optimal decay factor and investigate the
performance and persistence of 31 Taiwanese open-end equity mutual funds
over the period from November 2006 to October 2009, divided into three
periods: November 2006--October 2007, November 2007--October 2008, and
November 2008--October 2009, which includes the global financial crisis.
We find that for three periods, the optimal decay factors are 0.999,
0.951, and 0.990, respectively. The rankings of funds between bull and
bear markets are quite different. Moreover, the proposed model improves
performance persistence. That is, a fund's past performance will continue
into the future.
Journal: Journal of Applied Statistics
Pages: 929-943
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.856870
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:929-943
Template-Type: ReDIF-Article 1.0
Author-Name: K. Fačevicová
Author-X-Name-First: K.
Author-X-Name-Last: Fačevicová
Author-Name: K. Hron
Author-X-Name-First: K.
Author-X-Name-Last: Hron
Author-Name: V. Todorov
Author-X-Name-First: V.
Author-X-Name-Last: Todorov
Author-Name: D. Guo
Author-X-Name-First: D.
Author-X-Name-Last: Guo
Author-Name: M. Templ
Author-X-Name-First: M.
Author-X-Name-Last: Templ
Title: Logratio approach to statistical analysis of 2×2 compositional tables
Abstract:
Compositional tables represent a
continuous counterpart to the well-known contingency tables. Their cells
contain quantitatively expressed relative contributions of a whole, carry
exclusively relative information, and are popularly represented as
proportions or percentages. The resulting factors, corresponding to rows
and columns of the table, can be inspected similarly to contingency
tables, e.g. for their mutually independent behaviour. The nature of
compositional tables requires a specific geometrical treatment,
represented by the Aitchison geometry on the simplex. The properties of
the Aitchison geometry allow a decomposition of the original table into
its independent and interactive parts. Moreover, the specific case of
2×2 compositional tables allows the construction of easily
interpretable orthonormal coordinates (resulting from the isometric
logratio transformation) for the original table and its decompositions.
Consequently, for a sample of compositional tables, both exploratory
statistical analysis, such as graphical inspection of the independent and
interactive parts, and statistical inference (odds-ratio-like testing of
independence) can be performed. Theoretical advancements of the presented
approach are demonstrated using two economic applications.
Journal: Journal of Applied Statistics
Pages: 944-958
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.856871
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856871
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:944-958
Template-Type: ReDIF-Article 1.0
Author-Name: Lee Fawcett
Author-X-Name-First: Lee
Author-X-Name-Last: Fawcett
Author-Name: David Walshaw
Author-X-Name-First: David
Author-X-Name-Last: Walshaw
Title: Estimating the probability of simultaneous rainfall extremes within a region: a spatial approach
Abstract:
In this paper we investigate the impact of
model mis-specification, in terms of the dependence structure in the
extremes of a spatial process, on the estimation of key quantities that
are of interest to hydrologists and engineers. For example, it is often
the case that severe flooding occurs as a result of the observation of
rainfall extremes at several locations in a region simultaneously. Thus,
practitioners might be interested in estimates of the joint exceedance
probability of some high levels across these locations. It is likely that
there will be spatial dependence present between the extremes, and this
should be properly accounted for when estimating such probabilities. We
compare the use of standard models from the geostatistics literature with
max-stable models from extreme value theory. We find that, in some
situations, using an incorrect spatial model for our extremes results in a
significant under-estimation of these probabilities which -- in flood
defence terms -- could lead to substantial under-protection.
Journal: Journal of Applied Statistics
Pages: 959-976
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.856872
File-URL: http://hdl.handle.net/10.1080/02664763.2013.856872
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:959-976
Template-Type: ReDIF-Article 1.0
Author-Name: Martin X. Dunbar
Author-X-Name-First: Martin X.
Author-X-Name-Last: Dunbar
Author-Name: Hani M. Samawi
Author-X-Name-First: Hani M.
Author-X-Name-Last: Samawi
Author-Name: Robert Vogel
Author-X-Name-First: Robert
Author-X-Name-Last: Vogel
Author-Name: Lili Yu
Author-X-Name-First: Lili
Author-X-Name-Last: Yu
Title: Steady-state Gibbs sampler estimation for lung cancer data
Abstract:
This paper is based on the application of
a Bayesian model to a clinical trial study to determine a more effective
treatment to lower mortality rates and consequently to increase survival
times among patients with lung cancer. In that study, Qian et
al. [13] sought to determine whether a Weibull survival model could be
used to decide whether to stop a clinical trial. The traditional Gibbs
sampler was used to estimate the model parameters. This paper proposes to
use the independent steady-state Gibbs sampling (ISSGS) approach,
introduced by Dunbar et al. [3], to improve the original
Gibbs sampler in multidimensional problems. It is demonstrated that ISSGS
provides accurate, unbiased estimation and improves the performance
and convergence of the Gibbs sampler in this application.
Journal: Journal of Applied Statistics
Pages: 977-988
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.858671
File-URL: http://hdl.handle.net/10.1080/02664763.2013.858671
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:977-988
Template-Type: ReDIF-Article 1.0
Author-Name: Kun Li
Author-X-Name-First: Kun
Author-X-Name-Last: Li
Author-Name: YongSheng Qian
Author-X-Name-First: YongSheng
Author-X-Name-Last: Qian
Author-Name: WenBo Zhao
Author-X-Name-First: WenBo
Author-X-Name-Last: Zhao
Title: An auxiliary function approach for Lasso in music composition using cellular automata
Abstract:
In this paper, we present an auxiliary
function approach to solving the overlapping group Lasso problem. Our goal
is to solve a more general structure with overlapping groups that is
suitable for use in cellular automata (CA). CA are introduced into
algorithmic composition on the basis of their development and
classification. A concrete algorithm and a mapping from CA
to music series are also given. Experimental simulations show the effectiveness
of our algorithms, and using the auxiliary function approach to solve
Lasso with CA is a potentially useful music automatic-generation
algorithm.
Journal: Journal of Applied Statistics
Pages: 989-997
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.859233
File-URL: http://hdl.handle.net/10.1080/02664763.2013.859233
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:989-997
Template-Type: ReDIF-Article 1.0
Author-Name: M. Revan Özkale
Author-X-Name-First: M. Revan
Author-X-Name-Last: Özkale
Title: The relative efficiency of the restricted estimators in linear regression models
Abstract:
This paper deals with the problem of
multicollinearity in a multiple linear regression model with linear
equality restrictions. The restricted two parameter estimator which was
proposed in case of multicollinearity satisfies the restrictions. The
performance of the restricted two parameter estimator over the restricted
least squares (RLS) estimator and the ordinary least squares (OLS)
estimator is examined under the mean square error (MSE) matrix criterion
when the restrictions are correct and not correct. The necessary and
sufficient conditions for the restricted ridge regression, restricted Liu
and restricted shrunken estimators, which are the special cases of the
restricted two parameter estimator, to have a smaller MSE matrix than the
RLS and the OLS estimators are derived when the restrictions hold true and
do not hold true. Theoretical results are illustrated with numerical
examples based on Webster, Gunst and Mason data and Gorman and Toman data.
We conduct a final demonstration of the performance of the estimators by
running a Monte Carlo simulation which shows that when the variance of the
error term and the correlation between the explanatory variables are
large, the restricted two parameter estimator performs better than the RLS
estimator and the OLS estimator under the configurations examined.
Journal: Journal of Applied Statistics
Pages: 998-1027
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.859234
File-URL: http://hdl.handle.net/10.1080/02664763.2013.859234
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:998-1027
Template-Type: ReDIF-Article 1.0
Author-Name: Nanhua Zhang
Author-X-Name-First: Nanhua
Author-X-Name-Last: Zhang
Author-Name: Henian Chen
Author-X-Name-First: Henian
Author-X-Name-Last: Chen
Author-Name: Yuanshu Zou
Author-X-Name-First: Yuanshu
Author-X-Name-Last: Zou
Title: A joint model of binary and longitudinal data with non-ignorable missingness, with application to marital stress and late-life major depression in women
Abstract:
Understanding how long-term marital stress
affects major depressive disorder (MDD) in older women has clinical
implications for the treatment of women at risk. In this paper, we
consider the problem of predicting MDD in older women (mean age 60) from a
marital stress scale administered four times during the preceding 20-year
period, with a greater dropout by women experiencing marital stress or
MDD. To analyze these data, we propose a Bayesian joint model consisting
of: (1) a linear mixed effects model for the longitudinal measurements,
(2) a generalized linear model for the binary primary endpoint, and (3) a
shared parameter model for the missing data mechanism. Our analysis
indicates that MDD in older women is significantly associated with higher
levels of prior marital stress and increasing marital stress over time,
although there is a generally decreasing trend in marital stress. This is
the first study to propose a joint model for incompletely observed
longitudinal measurements, a binary primary endpoint, and non-ignorable
missing data; a comparison shows that the joint model yields better
predictive accuracy than a two-stage model. These findings suggest that
women who experience marital stress in mid-life need treatment to help
prevent late-life MDD, which has serious consequences for older persons.
Journal: Journal of Applied Statistics
Pages: 1028-1039
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.859235
File-URL: http://hdl.handle.net/10.1080/02664763.2013.859235
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1028-1039
Template-Type: ReDIF-Article 1.0
Author-Name: Melody S. Goodman
Author-X-Name-First: Melody S.
Author-X-Name-Last: Goodman
Author-Name: Yi Li
Author-X-Name-First: Yi
Author-X-Name-Last: Li
Author-Name: Anne M. Stoddard
Author-X-Name-First: Anne M.
Author-X-Name-Last: Stoddard
Author-Name: Glorian Sorensen
Author-X-Name-First: Glorian
Author-X-Name-Last: Sorensen
Title: Analysis of ordinal outcomes with longitudinal covariates subject to missingness
Abstract:
We propose a mixture model for data with
an ordinal outcome and a longitudinal covariate that is subject to
missingness. Data from a tailored telephone delivered, smoking cessation
intervention for construction laborers are used to illustrate the method,
which considers as an outcome a categorical measure of smoking cessation,
and evaluates the effectiveness of the motivational telephone interviews
on this outcome. We propose two model structures for the longitudinal
covariate, for the case when the missing data are missing at random, and
when the missing data mechanism is non-ignorable. A generalized EM
algorithm is used to obtain maximum likelihood estimates.
Journal: Journal of Applied Statistics
Pages: 1040-1052
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.859236
File-URL: http://hdl.handle.net/10.1080/02664763.2013.859236
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1040-1052
Template-Type: ReDIF-Article 1.0
Author-Name: M. Templ
Author-X-Name-First: M.
Author-X-Name-Last: Templ
Author-Name: P. Filzmoser
Author-X-Name-First: P.
Author-X-Name-Last: Filzmoser
Title: Simulation and quality of a synthetic close-to-reality employer--employee population
Abstract:
It is of essential importance that
researchers have access to linked employer--employee data, but such data
sets are rarely available to researchers or the public. Even in cases where
survey data have been made available, the evaluation of estimation methods
is usually done by complex design-based simulation studies. For this aim,
data at the population level are needed to know the true parameters that are
compared with the estimations derived from complex samples. These samples
are usually drawn from the population under various sampling designs,
missing values and outlier scenarios. The structural earnings statistics
sample survey provides accurate and harmonized data on the level and
structure of remuneration of employees, their individual characteristics
and the enterprise or place of employment to which they belong in EU
member states and candidate countries. On the basis of this data set, we
show how to simulate a synthetic close-to-reality population representing
the employer and employee structure of Austria. The proposed simulation is
based on work of A. Alfons, S. Kraft, M. Templ, and P. Filzmoser [{\em On
the simulation of complex universes in the case of applying the German
microcensus}, DACSEIS research paper series No. 4, University of Tübingen,
2003] and R. Münnich and J. Schürle [{\em Simulation of close-to-reality
population data for household surveys with application to EU-SILC},
Statistical Methods & Applications 20(3) (2011c), pp. 383--407]. However,
new challenges arise in accounting for the special structure of
employer--employee data and the complexity induced by the underlying
two-stage design of the survey. Using quality measures in the form of
simple summary statistics, benchmarking indicators and visualizations, the
simulated population is analysed and evaluated. An accompanying literature
study was conducted to select the most important benchmarking
indicators.
Journal: Journal of Applied Statistics
Pages: 1053-1072
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.859237
File-URL: http://hdl.handle.net/10.1080/02664763.2013.859237
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1053-1072
Template-Type: ReDIF-Article 1.0
Author-Name: Matthias Borowski
Author-X-Name-First: Matthias
Author-X-Name-Last: Borowski
Author-Name: Nikolaus Rudak
Author-X-Name-First: Nikolaus
Author-X-Name-Last: Rudak
Author-Name: Birger Hussong
Author-X-Name-First: Birger
Author-X-Name-Last: Hussong
Author-Name: Dominik Wied
Author-X-Name-First: Dominik
Author-X-Name-Last: Wied
Author-Name: Sonja Kuhnt
Author-X-Name-First: Sonja
Author-X-Name-Last: Kuhnt
Author-Name: Wolfgang Tillmann
Author-X-Name-First: Wolfgang
Author-X-Name-Last: Tillmann
Title: On- and offline detection of structural breaks in thermal spraying processes
Abstract:
We investigate and develop methods for
structural break detection, considering time series from thermal spraying
process monitoring. Since engineers induce technical malfunctions during
the processes, the time series exhibit structural breaks at known time
points, giving us valuable information to conduct the investigations.
First, we consider a recently developed robust online
(also real-time) filtering (i.e. smoothing) procedure
that comprises a test for local linearity. This test rejects when jumps
and trend changes are present, so that it can also be useful to detect
such structural breaks online. Second, based on the filtering procedure we
develop a robust method for the online detection of ongoing trends. We
investigate these two methods with respect to the online detection of structural
breaks by simulations and applications to the time series from the
manipulated spraying processes. Third, we consider a recently developed
fluctuation test for constant variances that can be applied
offline, i.e. after the whole time series has been
observed, to control the spraying results. Since this test is not reliable
when jumps are present in the time series, we suggest data transformation
based on filtering and demonstrate that this transformation makes the test
applicable.
Journal: Journal of Applied Statistics
Pages: 1073-1090
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.860957
File-URL: http://hdl.handle.net/10.1080/02664763.2013.860957
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1073-1090
Template-Type: ReDIF-Article 1.0
Author-Name: İlker Ünal
Author-X-Name-First: İlker
Author-X-Name-Last: Ünal
Author-Name: H. Refik Burgut
Author-X-Name-First: H. Refik
Author-X-Name-Last: Burgut
Title: Verification bias on sensitivity and specificity measurements in diagnostic medicine: a comparison of some approaches used for correction
Abstract:
Verification bias may occur when not all subjects' test
results are verified by using a gold standard. The
correction for this bias can be made using different approaches depending
on whether missing gold standard test results are random or not. Some of
these approaches with binary test and gold standard results include the
correction method by Begg and Greenes, lower and upper limits for
diagnostic measurements by Zhou, logistic regression method, multiple
imputation method, and neural networks. In this study, all these
approaches are compared by employing real and simulated data under
different conditions.
Journal: Journal of Applied Statistics
Pages: 1091-1104
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.862217
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862217
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1091-1104
Template-Type: ReDIF-Article 1.0
Author-Name: S. Mostafa Mokhtari
Author-X-Name-First: S. Mostafa
Author-X-Name-Last: Mokhtari
Author-Name: Hamid Alinejad-Rokny
Author-X-Name-First: Hamid
Author-X-Name-Last: Alinejad-Rokny
Author-Name: Hossein Jalalifar
Author-X-Name-First: Hossein
Author-X-Name-Last: Jalalifar
Title: Selection of the best well control system by using fuzzy multiple-attribute decision-making methods
Abstract:
There are numerous difficulties involved
in drilling operations of an oil well, one of the most important of them
being well control. Well control systems are applied when we have
irruption of liquids or unwanted intrusion of the reservoir's liquid (oil,
gas or brine) into the well during drilling, when the pressure of the well
fluid column is less than the formation pressure and the permeability of the
reservoir is high enough to let the liquid pass through. For this
purpose, a variety of methods, including the Driller, wait-and-weight, and
concurrent methods, have been used to control the well at different drilling
sites. In this study, we investigate the optimum method for well control
using a fuzzy method based on many parameters, including technical factors
(mud weight, drilling rate, blockage of pipes, sensitivity to drilling
network changes, etc.) and security factors (existence of effervescent
mud, drilling circuit control, etc.), and cost of selection, which is one
of the most important decisions that are made under critical conditions
such as irruption. Until now, these methods have been selected based on the
experience of field personnel at drilling sites. The technical criteria
and standards were influenced by experience, so a soft-computing
system (the fuzzy method) was used. Thus, both these criteria and standards
would be of greater importance and would indicate whether the optimum
numerical method is the same one that is expressed by human experience. The
concurrent method was selected as the best for well control, using the
fuzzy method at the end of the evaluation, while field personnel
experience suggests the Driller method.
Journal: Journal of Applied Statistics
Pages: 1105-1121
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.862218
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862218
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1105-1121
Template-Type: ReDIF-Article 1.0
Author-Name: A. Parchami
Author-X-Name-First: A.
Author-X-Name-Last: Parchami
Author-Name: B. Sadeghpour-Gildeh
Author-X-Name-First: B.
Author-X-Name-Last: Sadeghpour-Gildeh
Author-Name: M. Nourbakhsh
Author-X-Name-First: M.
Author-X-Name-Last: Nourbakhsh
Author-Name: M. Mashinchi
Author-X-Name-First: M.
Author-X-Name-Last: Mashinchi
Title: A new generation of process capability indices based on fuzzy measurements
Abstract:
Process capability indices (PCIs) provide
numerical measures on whether a process conforms to the defined
manufacturing capability prerequisite. These have been successfully
applied by companies to compete with and to lead high-profit markets by
evaluating the quality and productivity performance. The PCI C_p compares
the output of a process to the specification limits (SLs) by forming the
ratio of the width between the process SLs to the width of the natural
tolerance limits, which is measured in six process standard deviation
units. As another common PCI, C_pm incorporates two variation components,
namely variation about the process mean and deviation of the process mean from the
target. A meaningful generalized version of the above PCIs, which is able
to operate in a fuzzy environment, is introduced in this paper. These
generalized PCIs are able to measure the capability of a fuzzy-valued
process in producing products on the basis of a fuzzy quality. Fast
computing formulas for the generalized PCIs are derived for normal and
symmetric triangular fuzzy observations, where the fuzzy quality is
defined by linear and exponential fuzzy SLs. A practical example is
presented to show the performance of proposed indices.
Journal: Journal of Applied Statistics
Pages: 1122-1136
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.862219
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862219
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1122-1136
Template-Type: ReDIF-Article 1.0
Author-Name: Feridun Tasdan
Author-X-Name-First: Feridun
Author-X-Name-Last: Tasdan
Author-Name: Meral Cetin
Author-X-Name-First: Meral
Author-X-Name-Last: Cetin
Title: A simulation study on the influence of ties on uniform scores test for circular data
Abstract:
The uniform scores test is a rank-based method
that tests the homogeneity of k populations in circular
data problems. The influence of ties on the uniform scores test has been
emphasized by several authors in several articles and books. Moreover, it
is suggested that the uniform scores test should be used with caution if
ties are present in the data. This paper investigates the influence of
ties on the uniform scores test by computing the power of the test using
average, randomization, permutation, minimum, and maximum methods to break
ties. Monte Carlo simulation is performed to compute the power of the test
under several scenarios such as having 5% or 10% of ties and tie group
structures in the data. The simulation study shows no significant
difference among the methods in the presence of ties, but the test
loses its power when there are many ties or complicated group structures.
Thus, the randomization and average methods are equally powerful for
breaking ties when applying the uniform scores test. Also, it can be
concluded that the k-sample uniform scores test can be used safely without
sacrificing power if there are fewer than 5% ties or at most
two groups of a few ties.
Journal: Journal of Applied Statistics
Pages: 1137-1146
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.862224
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862224
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1137-1146
Template-Type: ReDIF-Article 1.0
Author-Name: Feridun Tasdan
Author-X-Name-First: Feridun
Author-X-Name-Last: Tasdan
Author-Name: Ozgur Yeniay
Author-X-Name-First: Ozgur
Author-X-Name-Last: Yeniay
Title: A shift parameter estimation based on smoothed Kolmogorov--Smirnov
Abstract:
A new procedure of shift parameter
estimation in the two-sample location problem is investigated and compared
with existing estimators. The proposed procedure smooths the empirical
distribution functions of each random sample and replaces empirical
distribution functions in the two-sample Kolmogorov--Smirnov method. The
smoothed Kolmogorov--Smirnov is minimized with respect to an arbitrary
shift variable in order to find an estimate of the shift parameter. The
proposed procedure can be considered the smoothed version of a
little-known method of shift parameter estimation from Rao-Schuster-Littell (RSL)
[Rao et al., Estimation of shift and center of
symmetry based on Kolmogorov--Smirnov statistics, Ann. Stat. 3(4)
(1975), pp. 862--873]. Their estimator will be discussed and compared with
the proposed estimator in this paper. An example and simulation studies
have been performed to compare the proposed procedure with existing shift
parameter estimators such as Hodges--Lehmann (H--L) and least squares in
addition to RSL's estimator. The results show that the proposed estimator
has lower mean-squared error as well as higher relative efficiency against
RSL's estimator under normal or contaminated normal model assumptions.
Moreover, the proposed estimator performs competitively against H--L and
least-squares shift estimators. Smoother function and bandwidth selections
are also discussed and several alternatives are proposed in the study.
Journal: Journal of Applied Statistics
Pages: 1147-1159
Issue: 5
Volume: 41
Year: 2014
Month: 5
X-DOI: 10.1080/02664763.2013.862225
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:5:p:1147-1159
Template-Type: ReDIF-Article 1.0
Author-Name: Jie Shen
Author-X-Name-First: Jie
Author-X-Name-Last: Shen
Author-Name: Colin M. Gallagher
Author-X-Name-First: Colin M.
Author-X-Name-Last: Gallagher
Author-Name: QiQi Lu
Author-X-Name-First: QiQi
Author-X-Name-Last: Lu
Title: Detection of multiple undocumented change-points using adaptive Lasso
Abstract:
The problem of detecting multiple undocumented change-points in a
historical temperature sequence with simple linear trend is formulated by
a linear model. We apply adaptive least absolute shrinkage and selection
operator (Lasso) to estimate the number and locations of change-points.
Model selection criteria are used to choose the Lasso smoothing parameter.
As adaptive Lasso may overestimate the number of change-points, we perform
post-selection on change-points detected by adaptive Lasso using
multivariate t simultaneous confidence intervals. Our
method is demonstrated on the annual temperature data (year: 1902-2000)
from Tuscaloosa, Alabama.
Journal: Journal of Applied Statistics
Pages: 1161-1173
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.862220
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862220
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1161-1173
Template-Type: ReDIF-Article 1.0
Author-Name: Marek Brabec
Author-X-Name-First: Marek
Author-X-Name-Last: Brabec
Author-Name: Viorel Badescu
Author-X-Name-First: Viorel
Author-X-Name-Last: Badescu
Author-Name: Marius Paulescu
Author-X-Name-First: Marius
Author-X-Name-Last: Paulescu
Title: Cloud shade by dynamic logistic modeling
Abstract:
During the daytime, the sun is shining or not at ground level depending on
cloud motion. Two binary variables may be used to quantify this process:
the sunshine number (SSN) and the sunshine stability number (SSSN). The
sequential features of SSN are treated in this paper by using Markovian
Logistic Regression models, which avoid usual weaknesses of autoregressive
integrated moving average modeling. The theory is illustrated with results
obtained by using measurements performed in 2010 at Timisoara (southern
Europe). Simple modeling taking into account internal dynamics with one
lag history brings substantial reduction of misclassification compared
with the persistence approach (to less than 57%). When longer history is
considered, all the lags up to at least 8 are important. The seasonal
changes are rather concentrated to low lags. Better performance is
associated with a more stable radiative regime. More involved models add
external influences (such as sun elevation angle or astronomic declination
as well as taking into account morning and afternoon effects separately).
Models including sun elevation effects are significantly better than those
ignoring them. Clearly, during the winter months, the effect of
declination is much more pronounced compared with the rest of the year.
SSSN is important in long-term considerations and it also plays a role in
retrospective assessment of the SSN. However, it is not easy to use SSSN
for predicting future SSN. Using more complicated past beam clearness
models does not necessarily provide better results than simpler models
based on past SSN.
Journal: Journal of Applied Statistics
Pages: 1174-1188
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.862221
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862221
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1174-1188
Template-Type: ReDIF-Article 1.0
Author-Name: Mohammad Reza Meshkani
Author-X-Name-First: Mohammad Reza
Author-X-Name-Last: Meshkani
Author-Name: Afshin Fallah
Author-X-Name-First: Afshin
Author-X-Name-Last: Fallah
Author-Name: Amir Kavousi
Author-X-Name-First: Amir
Author-X-Name-Last: Kavousi
Title: Analysis of covariance under inverse Gaussian model
Abstract:
This paper considers the problem of analysis of covariance (ANCOVA) under
the assumption of inverse Gaussian distribution for response variable. We
develop the essential methodology for estimating the model parameters via
maximum likelihood method. The general form of the maximum likelihood
estimator is obtained in closed form. Adjusted treatment effects and
adjusted covariate effects are given, too. We also provide the asymptotic
distribution of the proposed estimators. A simulation study and a real
world application are also performed to illustrate and evaluate the
proposed methodology.
Journal: Journal of Applied Statistics
Pages: 1189-1202
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.862222
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862222
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1189-1202
Template-Type: ReDIF-Article 1.0
Author-Name: Ngianga-Bakwin Kandala
Author-X-Name-First: Ngianga-Bakwin
Author-X-Name-Last: Kandala
Author-Name: Samuel O.M. Manda
Author-X-Name-First: Samuel O.M.
Author-X-Name-Last: Manda
Author-Name: William W. Tigbe
Author-X-Name-First: William W.
Author-X-Name-Last: Tigbe
Author-Name: Henry Mwambi
Author-X-Name-First: Henry
Author-X-Name-Last: Mwambi
Author-Name: Saverio Stranges
Author-X-Name-First: Saverio
Author-X-Name-Last: Stranges
Title: Geographic distribution of cardiovascular comorbidities in South Africa: a national cross-sectional analysis
Abstract:
Objectives: We sought to estimate the spatial coexistence
of hypertension, coronary heart disease (CHD), stroke and
hypercholesterolaemia in South Africa. Design:
Cross-sectional. Setting: Sub-Saharan Africa and South
Africa. Participants: Data were from 13,827 adults
(mean±SD age 39±18 years, 58.4% women) interviewed in the
1998 South African Health and Demographic Survey.
Interventions: N/A. Primary and secondary outcome
measures: We used multivariate spatial disease models to estimate
district-level shared and disease-specific spatial risk components,
controlling for known individual risk factors. Results:
In univariate analysis, the observed prevalence of hypertension and CHD
was high in the south-western parts and low in the north-east. Stroke and
high blood cholesterol prevalence appeared to be evenly distributed across
the country. In multivariate analysis (adjusting for age, gender,
ethnicity, education, urban-dwelling, smoking, alcohol consumption and
obesity), hypertension and stroke prevalence were highly concentrated in
the south-western parts, whilst CHD and hypercholesterolaemia were highly
prevalent in the central and far north-eastern corridors, respectively. The
shared component, which we took to represent nutrition and other lifestyle
factors not accounted for in the model, had a larger effect on
cardiovascular disease prevalence in the south-western areas of the
country. It appeared to have a greater effect on hypertension and CHD.
Conclusion: This study suggests a clear geographic
distribution of cardiovascular disease in South Africa, driven possibly by
shared lifestyle behaviours. These findings might be useful for public
health resource allocation in low-income settings.
Journal: Journal of Applied Statistics
Pages: 1203-1216
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.862223
File-URL: http://hdl.handle.net/10.1080/02664763.2013.862223
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1203-1216
Template-Type: ReDIF-Article 1.0
Author-Name: Kadri Ulas Akay
Author-X-Name-First: Kadri Ulas
Author-X-Name-Last: Akay
Title: A graphical evaluation of logistic ridge estimator in mixture experiments
Abstract:
In comparison to other experimental studies, multicollinearity appears
frequently in mixture experiments, a special study area of response
surface methodology, due to the constraints on the components composing
the mixture. In the analysis of mixture experiments by using a special
generalized linear model, logistic regression model, multicollinearity
causes precision problems in the maximum-likelihood logistic regression
estimate. Therefore, effects due to multicollinearity can be reduced to a
certain extent by using alternative approaches. One of these approaches is
to use biased estimators for the estimation of the coefficients. In this
paper, we suggest the use of logistic ridge regression (RR) estimator in
the cases where there is multicollinearity during the analysis of mixture
experiments using logistic regression. Also, for the selection of the
biasing parameter, we use fraction of design space plots for evaluating
the effect of the logistic RR estimator with respect to the scaled mean
squared error of prediction. The suggested graphical approaches are
illustrated on the tumor incidence data set.
Journal: Journal of Applied Statistics
Pages: 1217-1232
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.864261
File-URL: http://hdl.handle.net/10.1080/02664763.2013.864261
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1217-1232
Template-Type: ReDIF-Article 1.0
Author-Name: D.M. Sakate
Author-X-Name-First: D.M.
Author-X-Name-Last: Sakate
Author-Name: D.N. Kashid
Author-X-Name-First: D.N.
Author-X-Name-Last: Kashid
Title: Variable selection via penalized minimum φ-divergence estimation in logistic regression
Abstract:
We propose penalized minimum φ-divergence estimator for parameter
estimation and variable selection in logistic regression. Using an
appropriate penalty function, we show that penalized φ-divergence
estimator has oracle property. With probability tending to 1, penalized
φ-divergence estimator identifies the true model and estimates nonzero
coefficients as efficiently as if the sparsity of the true model was known
in advance. The advantage of the penalized φ-divergence estimator is
that it estimates nonzero parameters more efficiently than the penalized
maximum likelihood estimator when the sample size is small, and is
equivalent to it for large samples. Numerical simulations confirm our
findings.
Journal: Journal of Applied Statistics
Pages: 1233-1246
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.864262
File-URL: http://hdl.handle.net/10.1080/02664763.2013.864262
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1233-1246
Template-Type: ReDIF-Article 1.0
Author-Name: Stephanie Sapp
Author-X-Name-First: Stephanie
Author-X-Name-Last: Sapp
Author-Name: Mark J. van der Laan
Author-X-Name-First: Mark J.
Author-X-Name-Last: van der Laan
Author-Name: John Canny
Author-X-Name-First: John
Author-X-Name-Last: Canny
Title: Subsemble: an ensemble method for combining subset-specific algorithm fits
Abstract:
Ensemble methods using the same underlying algorithm trained on different
subsets of observations have recently received increased attention as
practical prediction tools for massive data sets. We propose Subsemble: a
general subset ensemble prediction method, which can be used for small,
moderate, or large data sets. Subsemble partitions the full data set into
subsets of observations, fits a specified underlying algorithm on each
subset, and uses a clever form of V-fold cross-validation
to output a prediction function that combines the subset-specific fits. We
give an oracle result that provides a theoretical performance guarantee
for Subsemble. Through simulations, we demonstrate that Subsemble can be a
beneficial tool for small- to moderate-sized data sets, and often has
better prediction performance than the underlying algorithm fit just once
on the full data set. We also describe how to include Subsemble as a
candidate in a SuperLearner library, providing a practical way to evaluate
the performance of Subsemble relative to the underlying algorithm fit just
once on the full data set.
Journal: Journal of Applied Statistics
Pages: 1247-1259
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.864263
File-URL: http://hdl.handle.net/10.1080/02664763.2013.864263
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1247-1259
Template-Type: ReDIF-Article 1.0
Author-Name: Guoyi Zhang
Author-X-Name-First: Guoyi
Author-X-Name-Last: Zhang
Title: Improved R and s control charts for monitoring the process variance
Abstract:
The Shewhart R control chart and s
control chart are widely used to monitor shifts in the process spread. One
fact is that the distributions of the range and sample standard deviation
are highly skewed. Therefore, the R chart and
s chart neither provide an in-control average run length
(ARL) of approximately 370 nor guarantee the desired type I error of
0.0027. Another disadvantage of these two charts is their failure in
detecting an improvement in the process variability. In order to overcome
these shortcomings, we propose the improved R chart (IRC)
and s chart (ISC) with accurate approximation of the
control limits by using cumulative distribution functions of the sample
range and standard deviation. Simulation studies show that the IRC and ISC
perform very well. We also compare the type II error risks and ARLs of the
IRC and ISC, and find that the s chart is generally more
efficient than the R chart. Examples are given to
illustrate the use of the developed charts.
Journal: Journal of Applied Statistics
Pages: 1260-1273
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.864264
File-URL: http://hdl.handle.net/10.1080/02664763.2013.864264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1260-1273
Template-Type: ReDIF-Article 1.0
Author-Name: Ying Dong
Author-X-Name-First: Ying
Author-X-Name-Last: Dong
Author-Name: Lixin Song
Author-X-Name-First: Lixin
Author-X-Name-Last: Song
Author-Name: Mingqiu Wang
Author-X-Name-First: Mingqiu
Author-X-Name-Last: Wang
Author-Name: Ying Xu
Author-X-Name-First: Ying
Author-X-Name-Last: Xu
Title: Combined-penalized likelihood estimations with a diverging number of parameters
Abstract:
In fields such as economics and biological gene expression, where a large
number of variables is involved, the maximum sample correlation can be
large when the dimension is high, even if the predictors are independent.
Variable selection is a fundamental method for dealing with such models.
Ridge regression performs well when the predictors are highly correlated,
and some nonconcave penalized thresholding estimators enjoy the nice
oracle property. In order to
provide a satisfactory solution to the collinearity problem, in this paper
we report the combined-penalization (CP) mixed by the nonconcave penalty
and ridge, with a diverging number of parameters. It is observed that the
CP estimator with a diverging number of parameters can correctly select
covariates with nonzero coefficients and can estimate parameters
simultaneously in the presence of multicollinearity. Simulation studies
and a real data example demonstrate the good performance of the proposed
method.
Journal: Journal of Applied Statistics
Pages: 1274-1285
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868415
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868415
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1274-1285
Template-Type: ReDIF-Article 1.0
Author-Name: Rafael Pimentel Maia
Author-X-Name-First: Rafael Pimentel
Author-X-Name-Last: Maia
Author-Name: Per Madsen
Author-X-Name-First: Per
Author-X-Name-Last: Madsen
Author-Name: Rodrigo Labouriau
Author-X-Name-First: Rodrigo
Author-X-Name-Last: Labouriau
Title: Multivariate survival mixed models for genetic analysis of longevity traits
Abstract:
A class of multivariate mixed survival models for continuous and discrete
time with a complex covariance structure is introduced in a context of
quantitative genetic applications. The methods introduced can be used in
many applications in quantitative genetics although the discussion
presented concentrates on longevity studies. The framework presented
allows models based on continuous time to be combined with models based
on discrete time in a joint analysis. The continuous time models are
approximations of the frailty model in which the baseline hazard function
will be assumed to be piece-wise constant. The discrete time models used
are multivariate variants of the discrete relative risk models. These
models allow for regular parametric likelihood-based inference by
exploiting a coincidence of their likelihood functions and the likelihood
functions of suitably defined multivariate generalized linear mixed
models. The models include a dispersion parameter, which is essential for
obtaining a decomposition of the variance of the trait of interest as a
sum of parcels representing the additive genetic effects, environmental
effects and unspecified sources of variability, as required in
quantitative genetic applications. The methods presented are implemented
in such a way that large and complex quantitative genetic data can be
analyzed. Some key model control techniques are discussed in a
supplementary online material.
Journal: Journal of Applied Statistics
Pages: 1286-1306
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868416
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868416
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1286-1306
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Dawson
Author-X-Name-First: Peter
Author-X-Name-Last: Dawson
Author-Name: Paul Downward
Author-X-Name-First: Paul
Author-X-Name-Last: Downward
Author-Name: Terence C. Mills
Author-X-Name-First: Terence C.
Author-X-Name-Last: Mills
Title: Olympic news and attitudes towards the Olympics: a compositional time-series analysis of how sentiment is affected by events
Abstract:
Sentiment affects the evolving economic valuation of companies through the
stock market. It is unclear how 'news' affects the sentiment towards major
public investments like the Olympics. In this paper we consider, from the
context of the pre-event stage of the 30th Olympiad, the relationship
between attitudes towards the Olympics and Olympic-related news;
specifically the bad news associated with an increase in the cost of
provision, and the good news associated with Team Great Britain's medal
success in 2008. Using a unique data set and an event-study approach that
involves compositional time-series analysis, it is found that 'good' news
affects sentiments much more than 'bad', but that the distribution of such
sentiment varies widely. For example, a much more pronounced effect of
good news is identified for females than males, but 'bad' news has less of
an impact on the young and older age groups.
Journal: Journal of Applied Statistics
Pages: 1307-1314
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868417
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868417
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1307-1314
Template-Type: ReDIF-Article 1.0
Author-Name: A.A.M. Nurunnabi
Author-X-Name-First: A.A.M.
Author-X-Name-Last: Nurunnabi
Author-Name: Ali S. Hadi
Author-X-Name-First: Ali S.
Author-X-Name-Last: Hadi
Author-Name: A.H.M.R. Imon
Author-X-Name-First: A.H.M.R.
Author-X-Name-Last: Imon
Title: Procedures for the identification of multiple influential observations in linear regression
Abstract:
Since the seminal paper by Cook (1977) in which he introduced Cook's
distance, the identification of influential observations has received a
great deal of interest and extensive investigation in linear regression.
It is well documented that most of the popular diagnostic measures that
are based on single-case deletion can mislead the analysis in the presence
of multiple influential observations because of the well-known masking
and/or swamping phenomena. Atkinson (1981) proposed a modification of
Cook's distance. In this paper we propose a further modification of
Cook's distance for the identification of a single influential
observation. We then propose new measures for the identification of
multiple influential observations, which are not affected by the masking
and swamping problems. The efficiency of the new statistics is presented
through several well-known data sets and a simulation study.
Journal: Journal of Applied Statistics
Pages: 1315-1331
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868418
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868418
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1315-1331
Template-Type: ReDIF-Article 1.0
Author-Name: Laurens Beran
Author-X-Name-First: Laurens
Author-X-Name-Last: Beran
Title: Hypothesis tests to determine if all true positives have been identified on a receiver operating characteristic curve
Abstract:
For classification problems where the test data are labeled sequentially,
the point at which all true positives are first identified is often of
critical importance. This article develops hypothesis tests to assess
whether all true positives have been labeled in the test data. The tests
use a partial receiver operating characteristic (ROC) that is generated
from a labeled subset of the test data. These methods are developed in the
context of unexploded ordnance (UXO) classification, but are applicable to
any binary classification problem. First, the likelihood of the observed
ROC given binormal model parameters is derived using order statistics,
leading to a nonlinear parameter estimation problem. I then derive the
approximate distribution of the point on the ROC at which all true
instances are found. Using estimated binormal parameters, this
distribution can be integrated up to a desired confidence level to define
a critical false alarm rate (FAR). If the selected operating point is
before this critical point, then additional labels out to the critical
point are required. A second test uses the uncertainty in binormal
parameters to determine the critical FAR. These tests are demonstrated
with UXO classification examples and both approaches are recommended for
testing operating points.
Journal: Journal of Applied Statistics
Pages: 1332-1341
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868598
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868598
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1332-1341
Template-Type: ReDIF-Article 1.0
Author-Name: Xiao-Feng Wang
Author-X-Name-First: Xiao-Feng
Author-X-Name-Last: Wang
Author-Name: Bo Hu
Author-X-Name-First: Bo
Author-X-Name-Last: Hu
Author-Name: Bin Wang
Author-X-Name-First: Bin
Author-X-Name-Last: Wang
Author-Name: Kuangnan Fang
Author-X-Name-First: Kuangnan
Author-X-Name-Last: Fang
Title: Bayesian generalized varying coefficient models for longitudinal proportional data with errors-in-covariates
Abstract:
This paper is motivated from a neurophysiological study of muscle fatigue,
in which biomedical researchers are interested in understanding the
time-dependent relationships of handgrip force and electromyography
measures. A varying coefficient model is appealing here to investigate the
dynamic pattern in the longitudinal data. The response variable in the
study is continuous but bounded on the standard unit interval (0, 1) over
time, while the longitudinal covariates are contaminated with measurement
errors. We propose a generalization of varying coefficient models for the
longitudinal proportional data with errors-in-covariates. We describe two
estimation methods with penalized splines, which are formalized under a
Bayesian inferential perspective. The first method is an adaptation of the
popular regression calibration approach. The second method is based on a
joint likelihood under the hierarchical Bayesian model. A simulation study
is conducted to evaluate the efficacy of the proposed methods under
different scenarios. The analysis of the neurophysiological data is
presented to demonstrate the use of the methods.
Journal: Journal of Applied Statistics
Pages: 1342-1357
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868870
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1342-1357
Template-Type: ReDIF-Article 1.0
Author-Name: Zhengming Xing
Author-X-Name-First: Zhengming
Author-X-Name-Last: Xing
Author-Name: Bradley Nicholson
Author-X-Name-First: Bradley
Author-X-Name-Last: Nicholson
Author-Name: Monica Jimenez
Author-X-Name-First: Monica
Author-X-Name-Last: Jimenez
Author-Name: Timothy Veldman
Author-X-Name-First: Timothy
Author-X-Name-Last: Veldman
Author-Name: Lori Hudson
Author-X-Name-First: Lori
Author-X-Name-Last: Hudson
Author-Name: Joseph Lucas
Author-X-Name-First: Joseph
Author-X-Name-Last: Lucas
Author-Name: David Dunson
Author-X-Name-First: David
Author-X-Name-Last: Dunson
Author-Name: Aimee K. Zaas
Author-X-Name-First: Aimee K.
Author-X-Name-Last: Zaas
Author-Name: Christopher W. Woods
Author-X-Name-First: Christopher W.
Author-X-Name-Last: Woods
Author-Name: Geoffrey S. Ginsburg
Author-X-Name-First: Geoffrey S.
Author-X-Name-Last: Ginsburg
Author-Name: Lawrence Carin
Author-X-Name-First: Lawrence
Author-X-Name-Last: Carin
Title: Bayesian modeling of temporal properties of infectious disease in a college student population
Abstract:
A Bayesian statistical model is developed for analysis of the
time-evolving properties of infectious disease, with a particular focus on
viruses. The model employs a latent semi-Markovian state process, and the
state-transition statistics are driven by three terms: (i) a general
time-evolving trend of the overall population, (ii) a semi-periodic term
that accounts for effects caused by the days of the week, and (iii) a
regression term that relates the probability of infection to covariates
(here, specifically, to the Google Flu Trends data). Computations are
performed using Markov Chain Monte Carlo sampling. Results are presented
using a novel data set: daily self-reported symptom scores from hundreds
of Duke University undergraduate students, collected over three academic
years. The illnesses associated with these students are (imperfectly)
labeled using real-time (RT) polymerase chain reaction (PCR) testing for
several viruses, and gene-expression data were also analyzed. The
statistical analysis is performed on the daily, self-reported symptom
scores, and the RT PCR and gene-expression data are employed for analysis
and interpretation of the model results.
Journal: Journal of Applied Statistics
Pages: 1358-1382
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.870138
File-URL: http://hdl.handle.net/10.1080/02664763.2013.870138
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1358-1382
Template-Type: ReDIF-Article 1.0
Author-Name: Feng-Chang Xie
Author-X-Name-First: Feng-Chang
Author-X-Name-Last: Xie
Author-Name: Jin-Guan Lin
Author-X-Name-First: Jin-Guan
Author-X-Name-Last: Lin
Author-Name: Bo-Cheng Wei
Author-X-Name-First: Bo-Cheng
Author-X-Name-Last: Wei
Title: Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics
Abstract:
Count data with excess zeros arise in many contexts. Here our concern is
to develop a Bayesian analysis for the zero-inflated generalized Poisson
(ZIGP) regression model to address this problem. This model provides a
useful generalization of the zero-inflated Poisson model, since the
generalized Poisson distribution is overdispersed/underdispersed relative
to the Poisson.
Due to the complexity of the ZIGP model, Markov chain Monte Carlo methods
are used to develop a Bayesian procedure for the considered model.
Additionally, some discussion of the model selection criteria is
presented, and a Bayesian case-deletion influence diagnostic is
investigated for the joint posterior distribution based on the
Kullback-Leibler divergence. Finally, a simulation study and a
psychological example are given to illustrate our methodology.
Journal: Journal of Applied Statistics
Pages: 1383-1392
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.871508
File-URL: http://hdl.handle.net/10.1080/02664763.2013.871508
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1383-1392
Template-Type: ReDIF-Article 1.0
Author-Name: Chih-Yueh Wang
Author-X-Name-First: Chih-Yueh
Author-X-Name-Last: Wang
Title: Partial differential equations for probabilists
Journal: Journal of Applied Statistics
Pages: 1393-1394
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.859806
File-URL: http://hdl.handle.net/10.1080/02664763.2013.859806
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1393-1394
Template-Type: ReDIF-Article 1.0
Author-Name: Chris Beeley
Author-X-Name-First: Chris
Author-X-Name-Last: Beeley
Title: Statistics
Journal: Journal of Applied Statistics
Pages: 1394-1394
Issue: 6
Volume: 41
Year: 2014
Month: 6
X-DOI: 10.1080/02664763.2013.868638
File-URL: http://hdl.handle.net/10.1080/02664763.2013.868638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:6:p:1394-1394
Template-Type: ReDIF-Article 1.0
Author-Name: J.F. Muñoz
Author-X-Name-First: J.F.
Author-X-Name-Last: Muñoz
Author-Name: E. Álvarez
Author-X-Name-First: E.
Author-X-Name-Last: Álvarez
Author-Name: M. Rueda
Author-X-Name-First: M.
Author-X-Name-Last: Rueda
Title: Optimum design-based ratio estimators of the distribution function
Abstract:
The ratio method is commonly used for the estimation of means and totals.
This method has been extended to the problem of estimating the distribution
function. An alternative ratio estimator of the distribution function is
defined. A result that compares the variances of the aforementioned ratio
estimators is used to define optimum design-based ratio estimators of the
distribution function. Different empirical results indicate that the
optimum ratio estimators can be more efficient than alternative ratio
estimators. In addition, we show by simulations that alternative ratio
estimators can have large biases, whereas biases of the optimum ratio
estimators are negligible in this situation.
Journal: Journal of Applied Statistics
Pages: 1395-1407
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2013.870983
File-URL: http://hdl.handle.net/10.1080/02664763.2013.870983
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1395-1407
Template-Type: ReDIF-Article 1.0
Author-Name: Bruno Chaves Franco
Author-X-Name-First: Bruno Chaves
Author-X-Name-Last: Franco
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Author-Name: Giovanni Celano
Author-X-Name-First: Giovanni
Author-X-Name-Last: Celano
Author-Name: Antonio Fernando Branco Costa
Author-X-Name-First: Antonio Fernando Branco
Author-X-Name-Last: Costa
Title: A new sampling strategy to reduce the effect of autocorrelation on a control chart
Abstract:
On-line monitoring of quality characteristics is essential to limit scrap
and rework costs due to bad quality in a manufacturing process. In several
manufacturing environments, data can be collected massively during the
production process, with high sampling rates and tight sampling
frequencies. As a consequence, natural autocorrelation may arise among
consecutive measures within a sample. Autocorrelation
significantly inflates the average run length of a control chart and
deteriorates its sensitivity to the occurrence of assignable causes. In
this paper, we propose a new mixed sampling strategy for
the Shewhart chart monitoring
the sample mean in a process where temporal autocorrelation between two
consecutive observations can be represented by means of a first order
autoregressive model AR(1). With this strategy, the sample mean at each
inspection time is computed by merging measures of a generic quality
characteristic from two consecutive samples taken h hours
apart. The statistical properties of a Shewhart
control chart
implementing the proposed strategy are compared to those implementing a
skipping strategy recently proposed in the literature. A
numerical analysis shows that the mixed sampling
outperforms the skipping sampling strategy for high
levels of autocorrelation.
Journal: Journal of Applied Statistics
Pages: 1408-1421
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2013.871507
File-URL: http://hdl.handle.net/10.1080/02664763.2013.871507
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1408-1421
Template-Type: ReDIF-Article 1.0
Author-Name: Tapio Nummi
Author-X-Name-First: Tapio
Author-X-Name-Last: Nummi
Author-Name: Tiina Hakanen
Author-X-Name-First: Tiina
Author-X-Name-Last: Hakanen
Author-Name: Liudmila Lipiäinen
Author-X-Name-First: Liudmila
Author-X-Name-Last: Lipiäinen
Author-Name: Ulla Harjunmaa
Author-X-Name-First: Ulla
Author-X-Name-Last: Harjunmaa
Author-Name: Matti K. Salo
Author-X-Name-First: Matti K.
Author-X-Name-Last: Salo
Author-Name: Marja-Terttu Saha
Author-X-Name-First: Marja-Terttu
Author-X-Name-Last: Saha
Author-Name: Nina Vuorela
Author-X-Name-First: Nina
Author-X-Name-Last: Vuorela
Title: A trajectory analysis of body mass index for Finnish children
Abstract:
The aim of this study is to investigate the early development of body mass
index (BMI), a standard tool for assessing the body shape and average
level of adiposity for children and adults. The main aim of the study is
to identify the primary trajectories of BMI development and to investigate
the changes of certain growth characteristics over time. Based on our
longitudinal data of 4223 Finnish children, we took anthropometric
measurements from birth up to 15 years of age for birth years 1974, 1981,
1991 and 1995, but only up to 11 years of age for the birth year 2001. As
a statistical method, we utilized trajectory analysis with the methods of
nonparametric regression. We identified four main trajectories of BMI
growth. Two of these trajectories do not seem to follow the normal growth
pattern. The highest growth track appears to lead to a track that may
result in overweight, and the low birth-BMI track shows that the girls'
track differs from that of boys on the same track, and on the normal
tracks.
The so-called adiposity rebound time decreased over time and started
earlier for those on the overweight track. According to our study, this
kind of acceleration of growth might be more of a general phenomenon that
also relates to the other phases of BMI development. The major change
seems to occur especially for those children on high growth tracks.
Journal: Journal of Applied Statistics
Pages: 1422-1435
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2013.872232
File-URL: http://hdl.handle.net/10.1080/02664763.2013.872232
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1422-1435
Template-Type: ReDIF-Article 1.0
Author-Name: Zhi-Sheng Ye
Author-X-Name-First: Zhi-Sheng
Author-X-Name-Last: Ye
Author-Name: Jian-Guo Li
Author-X-Name-First: Jian-Guo
Author-X-Name-Last: Li
Author-Name: Mengru Zhang
Author-X-Name-First: Mengru
Author-X-Name-Last: Zhang
Title: Application of ridge regression and factor analysis in design and production of alloy wheels
Abstract:
This study proposes using statistical approaches to help with both the
design and manufacture of wheels. The quality of a wheel is represented by
the mechanical properties of spokes. Variation in the mechanical
properties of different wheels is attributed to two sources, i.e.
between-model variation and within-model variation. The between-model
variation is due to different shapes of different wheel models. To model
the effect of shapes on the mechanical properties, we first specify eight
shape variables potentially critical to the mechanical properties, and
then we collect relevant data on 18 wheel models and perform ridge
regression to find the effects of these variables on the mechanical
properties. These results are linked to the solidification theory of the
A356 alloy. The within-model variation is due to natural variability and
process abnormality. We extract mechanical data of a particular wheel
model from the database. Factor analysis is employed to analyze the data
with a view to identifying the latent factors behind the mechanical
properties. We then look into the microstructure of the alloy to
corroborate the fact that these two latent factors are essentially the Si
phase and the Mg2Si phase, respectively. These results can be
used to efficiently identify the root cause when the manufacturing process
goes wrong.
Journal: Journal of Applied Statistics
Pages: 1436-1452
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2013.872233
File-URL: http://hdl.handle.net/10.1080/02664763.2013.872233
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1436-1452
Template-Type: ReDIF-Article 1.0
Author-Name: Saeed Heravi
Author-X-Name-First: Saeed
Author-X-Name-Last: Heravi
Author-Name: Peter Morgan
Author-X-Name-First: Peter
Author-X-Name-Last: Morgan
Title: Sampling schemes for price index construction: a performance comparison across the classification of individual consumption by purpose food groups
Abstract:
Five sampling schemes (SS) for price index construction - one cut-off
sampling technique and four probability-proportional-to-size
(pps) methods - are evaluated by comparing their
performance on a homescan market research data set across 21 months for
each of the 13 classification of individual consumption by purpose
(COICOP) food groups. Classifications are derived for each of the food
groups and the population index value is used as a reference to derive
performance error measures, such as root mean squared error, bias and
standard deviation for each food type. Repeated samples are taken for each
of the pps schemes and the resulting performance error
measures are analysed using regression, for three of the pps
schemes to assess the overall effect of SS and COICOP group whilst
controlling for sample size, month and population index value. Cut-off
sampling appears to perform less well than pps methods
and multistage pps seems to have no advantage over its
single-stage counterpart. The jackknife resampling technique is also
explored as a means of estimating the standard error of the index and
compared with the actual results from repeated sampling.
Journal: Journal of Applied Statistics
Pages: 1453-1470
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881466
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1453-1470
Template-Type: ReDIF-Article 1.0
Author-Name: Wei-Hwa Wu
Author-X-Name-First: Wei-Hwa
Author-X-Name-Last: Wu
Author-Name: Hsin-Neng Hsieh
Author-X-Name-First: Hsin-Neng
Author-X-Name-Last: Hsieh
Title: Generalized confidence interval estimation for the mean of delta-lognormal distribution: an application to New Zealand trawl survey data
Abstract:
Highly skewed and non-negative data can often be modeled by the
delta-lognormal distribution in fisheries research. However, the coverage
probabilities of existing interval estimation procedures are less
satisfactory for small sample sizes and highly skewed data. We propose a
heuristic method for estimating confidence intervals for the mean of the
delta-lognormal distribution, based on an asymptotic generalized pivotal
quantity that is used to construct a generalized
confidence interval for that mean.
Simulation results show that the proposed interval estimation procedure
yields satisfactory coverage probabilities, expected interval lengths and
reasonable relative biases. Finally, the proposed method is applied to
red cod density data for demonstration.
Journal: Journal of Applied Statistics
Pages: 1471-1485
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881780
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1471-1485
Template-Type: ReDIF-Article 1.0
Author-Name: J.S.K. Chan
Author-X-Name-First: J.S.K.
Author-X-Name-Last: Chan
Author-Name: W.Y. Wan
Author-X-Name-First: W.Y.
Author-X-Name-Last: Wan
Author-Name: P.L.H. Yu
Author-X-Name-First: P.L.H.
Author-X-Name-Last: Yu
Title: A Poisson geometric process approach for predicting drop-out and committed first-time blood donors
Abstract:
A Poisson geometric process (PGP) model is proposed to study individual
blood donation patterns for a blood donor retention program. Extended from
the geometric process (GP) model of Lam [16], the PGP model captures the
rather pronounced trend patterns across clusters of donors via the ratio
parameters in a mixture setting. Within the state-space modeling
framework, it allows for overdispersion by equating the mean of the
Poisson data distribution to a latent GP. Alternatively, by simply
setting the mean of the Poisson distribution to be the mean of a GP, it
has equidispersion. With the group-specific mean and ratio functions, the
mixture PGP model facilitates classification of donors into committed,
drop-out and one-time groups. Based on only two years of observations, the
PGP model predicts donors' future donations well, fostering timely
recruitment decisions. The model is implemented using a Bayesian approach
via the user-friendly software WinBUGS.
Journal: Journal of Applied Statistics
Pages: 1486-1503
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881781
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881781
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1486-1503
Template-Type: ReDIF-Article 1.0
Author-Name: D. Senthilkumar
Author-X-Name-First: D.
Author-X-Name-Last: Senthilkumar
Author-Name: B. Esha Raffie
Author-X-Name-First: B. Esha
Author-X-Name-Last: Raffie
Title: Designing and selection of two-plan variables scheme indexed by crossover point
Abstract:
In this paper, the tightened-normal-tightened inspection scheme
(n_T, n_N; k) is considered, and
procedures and the necessary tables are developed for the selection of the
variables sampling scheme, indexed through the crossover point (COP). The
importance of the COP, and the properties and advantages of the operating
characteristic curve with respect to the COP, are studied.
Journal: Journal of Applied Statistics
Pages: 1504-1515
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881782
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881782
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1504-1515
Template-Type: ReDIF-Article 1.0
Author-Name: Antonio Martín Andrés
Author-X-Name-First: Antonio Martín
Author-X-Name-Last: Andrés
Author-Name: María Álvarez Hernández
Author-X-Name-First: María
Author-X-Name-Last: Álvarez Hernández
Title: Two-tailed asymptotic inferences for a proportion
Abstract:
This paper evaluates 29 methods for obtaining a two-sided confidence
interval for a binomial proportion (16 of which are new proposals) and
comes to the conclusion that: Wilson's classic method is only optimal for
a confidence of 99%, although generally it can be applied when
n≥50; for a confidence of 95% or 90%, the optimal
method is the one based on the arcsine transformation (when this is
applied to the data incremented by 0.5), which behaves in a very similar
manner to Jeffreys' Bayesian method. A simpler option, though not so good
as those just mentioned, is the classic-adjusted Wald method of Agresti
and Coull.
Journal: Journal of Applied Statistics
Pages: 1516-1529
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881783
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1516-1529
Template-Type: ReDIF-Article 1.0
Author-Name: Jalmar M.F. Carrasco
Author-X-Name-First: Jalmar M.F.
Author-X-Name-Last: Carrasco
Author-Name: Silvia L.P. Ferrari
Author-X-Name-First: Silvia L.P.
Author-X-Name-Last: Ferrari
Author-Name: Reinaldo B. Arellano-Valle
Author-X-Name-First: Reinaldo B.
Author-X-Name-Last: Arellano-Valle
Title: Errors-in-variables beta regression models
Abstract:
Beta regression models provide an adequate approach for modeling
continuous outcomes limited to the interval (0, 1). This paper deals with
an extension of beta regression models that allow for explanatory
variables to be measured with error. The structural approach, in which the
covariates measured with error are assumed to be random variables, is
employed. Three estimation methods are presented, namely maximum
likelihood, maximum pseudo-likelihood and regression calibration. Monte
Carlo simulations are used to evaluate the performance of the proposed
estimators and the naïve estimator. Also, a residual analysis for beta
regression models with measurement errors is proposed. The results are
illustrated in a real data set.
Journal: Journal of Applied Statistics
Pages: 1530-1547
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881784
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881784
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1530-1547
Template-Type: ReDIF-Article 1.0
Author-Name: Ramesh C. Gupta
Author-X-Name-First: Ramesh C.
Author-X-Name-Last: Gupta
Author-Name: Jie Huang
Author-X-Name-First: Jie
Author-X-Name-Last: Huang
Title: Analysis of survival data by a Weibull-generalized Poisson distribution
Abstract:
In life-testing and survival analysis, sometimes the components are
arranged in series or parallel system and the number of components is
initially unknown. Thus, the number of components, say Z,
is considered as random with an appropriate probability mass function. In
this paper, we model the survival data with baseline distribution as
Weibull and the distribution of Z as generalized Poisson,
giving rise to a four-parameter model that accommodates increasing,
decreasing, bathtub and upside-down bathtub failure rates. Two examples are provided and
the maximum-likelihood estimation of the parameters is studied. Rao's
score test is developed to compare the results with the exponential
Poisson model studied by Kus [17] and the exponential-generalized Poisson
distribution with baseline distribution as exponential and the
distribution of Z as generalized Poisson. Simulation
studies are carried out to examine the performance of the estimates.
Journal: Journal of Applied Statistics
Pages: 1548-1564
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881785
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1548-1564
Template-Type: ReDIF-Article 1.0
Author-Name: Yazhao Lv
Author-X-Name-First: Yazhao
Author-X-Name-Last: Lv
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Author-Name: Weihua Zhao
Author-X-Name-First: Weihua
Author-X-Name-Last: Zhao
Author-Name: Jicai Liu
Author-X-Name-First: Jicai
Author-X-Name-Last: Liu
Title: Quantile regression and variable selection for the single-index model
Abstract:
In this paper, we propose a new full iteration estimation method for
quantile regression (QR) of the single-index model (SIM). The asymptotic
properties of the proposed estimator are derived. Furthermore, we propose
a variable selection procedure for the QR of SIM by combining the
estimation method with the adaptive LASSO penalized method to get sparse
estimation of the index parameter. The oracle properties of the variable
selection method are established. Simulations with various non-normal
errors are conducted to demonstrate the finite sample performance of the
estimation method and the variable selection procedure. Furthermore, we
illustrate the proposed method by analyzing a real data set.
Journal: Journal of Applied Statistics
Pages: 1565-1577
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881786
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881786
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1565-1577
Template-Type: ReDIF-Article 1.0
Author-Name: Rajat Malik
Author-X-Name-First: Rajat
Author-X-Name-Last: Malik
Author-Name: Rob Deardon
Author-X-Name-First: Rob
Author-X-Name-Last: Deardon
Author-Name: Grace P.S. Kwong
Author-X-Name-First: Grace P.S.
Author-X-Name-Last: Kwong
Author-Name: Benjamin J. Cowling
Author-X-Name-First: Benjamin J.
Author-X-Name-Last: Cowling
Title: Individual-level modeling of the spread of influenza within households
Abstract:
A class of individual-level models (ILMs) outlined by R. Deardon
et al., [Inference for individual level models of
infectious diseases in large populations, Statist. Sin. 20
(2010), pp. 239-261] can be used to model the spread of infectious
diseases in discrete time. The key feature of these ILMs is that they take
into account covariate information on susceptible and infectious
individuals as well as shared covariate information such as geography or
contact measures. Here, such ILMs are fitted in a Bayesian framework using
Markov chain Monte Carlo techniques to data sets from two studies on
influenza transmission within households in Hong Kong during 2008 to 2009
and 2009 to 2010. The focus of this paper is to estimate the effect of
vaccination on infection risk and choose a model that best fits the
infection data.
Journal: Journal of Applied Statistics
Pages: 1578-1592
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881787
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1578-1592
Template-Type: ReDIF-Article 1.0
Author-Name: Ufuk Beyaztas
Author-X-Name-First: Ufuk
Author-X-Name-Last: Beyaztas
Author-Name: Aylin Alin
Author-X-Name-First: Aylin
Author-X-Name-Last: Alin
Author-Name: Michael A. Martin
Author-X-Name-First: Michael A.
Author-X-Name-Last: Martin
Title: Robust BCa-JaB method as a diagnostic tool for linear regression models
Abstract:
The Jackknife-after-bootstrap (JaB) technique originally developed by
Efron [8] has been proposed as an approach to improve the detection of
influential observations in linear regression models by Martin and Roberts
[12] and Beyaztas and Alin [2]. The method is based on the use of
percentile-method confidence intervals to provide improved cut-off values
for several single case-deletion influence measures. In order to improve
JaB, we propose using robust versions of Efron's [7] bias-corrected and
accelerated (BCa) bootstrap confidence intervals. In this study, the
performances of robust BCa-JaB and conventional JaB methods are compared
in the cases of DFFITS, Welsch's distance and modified Cook's distance
influence diagnostics. Comparisons are based on both real data examples
and through a simulation study. Our results reveal that under a variety of
scenarios, our proposed method provides more accurate and reliable
results, and it is more robust to masking effects.
Journal: Journal of Applied Statistics
Pages: 1593-1610
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881788
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1593-1610
Template-Type: ReDIF-Article 1.0
Author-Name: P. Angelopoulos
Author-X-Name-First: P.
Author-X-Name-Last: Angelopoulos
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Author-Name: A. Skountzou
Author-X-Name-First: A.
Author-X-Name-Last: Skountzou
Title: A cusum control chart approach for screening active effects in orthogonal-saturated experiments
Abstract:
The analysis of designs based on saturated orthogonal arrays poses a very
difficult challenge since there are no degrees of freedom left to estimate
the error variance. In this paper we propose a heuristic approach for the
use of cumulative sum control chart for screening active effects in
orthogonal-saturated experiments. A comparative simulation study
demonstrates the power of the proposed method.
Journal: Journal of Applied Statistics
Pages: 1611-1618
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.881982
File-URL: http://hdl.handle.net/10.1080/02664763.2014.881982
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1611-1618
Template-Type: ReDIF-Article 1.0
Author-Name: Yang-Jin Kim
Author-X-Name-First: Yang-Jin
Author-X-Name-Last: Kim
Title: Regression analysis of recurrent events data with incomplete observation gaps
Abstract:
For analyzing recurrent event data, either total time scale or gap time
scale is adopted according to research interest. In particular, gap time
scale is known to be more appropriate for modeling a renewal process. In
this paper, we adopt gap time scale to analyze recurrent event data with
repeated observation gaps which cannot be observed completely because of
unknown termination times of observation gaps. In order to estimate the
termination times, an interval-censoring mechanism is applied. Simulation
studies are conducted to compare the suggested methods with the unadjusted
method that ignores incomplete observation gaps. As a real example, a
conviction data set with suspensions is analyzed with the suggested methods.
Journal: Journal of Applied Statistics
Pages: 1619-1626
Issue: 7
Volume: 41
Year: 2014
Month: 7
X-DOI: 10.1080/02664763.2014.885002
File-URL: http://hdl.handle.net/10.1080/02664763.2014.885002
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:7:p:1619-1626
Template-Type: ReDIF-Article 1.0
Author-Name: Siti Haslinda Mohd Din
Author-X-Name-First: Siti Haslinda
Author-X-Name-Last: Mohd Din
Author-Name: Marek Molas
Author-X-Name-First: Marek
Author-X-Name-Last: Molas
Author-Name: Jolanda Luime
Author-X-Name-First: Jolanda
Author-X-Name-Last: Luime
Author-Name: Emmanuel Lesaffre
Author-X-Name-First: Emmanuel
Author-X-Name-Last: Lesaffre
Title: Longitudinal profiles of bounded outcome scores as predictors for disease activity in rheumatoid arthritis patients: a joint modeling approach
Abstract:
A variety of statistical approaches have been suggested in the literature
for the analysis of bounded outcome scores (BOS). In this paper, we
suggest a statistical approach when BOSs are repeatedly measured over time
and used as predictors in a regression model. Instead of directly using
the BOS as a predictor, we propose to extend the approaches suggested in
[16,21,28] to a joint modeling setting. Our approach is illustrated on
longitudinal profiles of multiple patient-reported outcomes to predict
the current clinical status of rheumatoid arthritis patients by a disease
activities score of 28 joints (DAS28). Both a maximum likelihood as well
as a Bayesian approach is developed.
Journal: Journal of Applied Statistics
Pages: 1627-1644
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.882499
File-URL: http://hdl.handle.net/10.1080/02664763.2014.882499
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1627-1644
Template-Type: ReDIF-Article 1.0
Author-Name: Tao Lu
Author-X-Name-First: Tao
Author-X-Name-Last: Lu
Author-Name: Yangxin Huang
Author-X-Name-First: Yangxin
Author-X-Name-Last: Huang
Author-Name: Min Wang
Author-X-Name-First: Min
Author-X-Name-Last: Wang
Author-Name: Feng Qian
Author-X-Name-First: Feng
Author-X-Name-Last: Qian
Title: A refined parameter estimating approach for HIV dynamic model
Abstract:
HIV dynamic models, a set of ordinary differential equations (ODEs), have
provided new understanding of the pathogenesis of HIV infection and the
treatment effects of antiviral therapies. However, estimating parameters
of ODEs is very challenging due to the complexity of this nonlinear
system. In this article, we propose a comprehensive procedure to deal with
this issue. In the proposed procedure, a series of cutting-edge
statistical methods and techniques are employed, including nonparametric
mixed-effects smoothing-based methods for ODE models and stochastic
approximation expectation-maximization (EM) approach for mixed-effects ODE
models. A simulation study is performed to validate the proposed approach.
An application example from a real HIV clinical trial study is used to
illustrate the usefulness of the proposed method.
Journal: Journal of Applied Statistics
Pages: 1645-1657
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.885001
File-URL: http://hdl.handle.net/10.1080/02664763.2014.885001
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1645-1657
Template-Type: ReDIF-Article 1.0
Author-Name: Weihua Zhao
Author-X-Name-First: Weihua
Author-X-Name-Last: Zhao
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Author-Name: Jicai Liu
Author-X-Name-First: Jicai
Author-X-Name-Last: Liu
Title: Sparse group variable selection based on quantile hierarchical Lasso
Abstract:
The group Lasso is a penalized regression method, used in regression
problems where the covariates are partitioned into groups to promote
sparsity at the group level [27]. Quantile group Lasso, a natural
extension of quantile Lasso [25], is a good alternative when the data has
group information and has many outliers and/or heavy tails. Much attention
has been paid to discovering important features that are correlated with
outcomes of interest and immune to outliers. In many applications,
however, we may also want to keep the flexibility of selecting variables
within a group. In this paper, we develop a sparse group variable
selection method based on quantile methods, which selects important
covariates both at the group level and within groups, penalizing the
empirical check loss function by the sum of square-root group-wise
L1-norm penalties. The oracle properties are
established where the number of parameters diverges. We also apply our new
method to a varying coefficient model with categorical effect modifiers.
Simulations and real data example show that the newly proposed method has
robust and superior performance.
Journal: Journal of Applied Statistics
Pages: 1658-1677
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.888541
File-URL: http://hdl.handle.net/10.1080/02664763.2014.888541
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1658-1677
Template-Type: ReDIF-Article 1.0
Author-Name: Raffaella Calabrese
Author-X-Name-First: Raffaella
Author-X-Name-Last: Calabrese
Title: Optimal cut-off for rare events and unbalanced misclassification costs
Abstract:
This paper develops a method for handling two-class classification
problems with highly unbalanced class sizes and misclassification costs.
When the class sizes are highly unbalanced and the minority class
represents a rare event, conventional classification methods tend to
strongly favour the majority class, resulting in very low detection of the
minority class. A method is proposed to determine the optimal cut-off for
asymmetric misclassification costs and for unbalanced class sizes. Monte
Carlo simulations show that this proposal performs better than the method
based on the notion of classification accuracy. Finally, the proposed
method is applied to empirical data on Italian small and medium
enterprises to classify them into default and non-default groups.
Journal: Journal of Applied Statistics
Pages: 1678-1693
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.888542
File-URL: http://hdl.handle.net/10.1080/02664763.2014.888542
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1678-1693
Template-Type: ReDIF-Article 1.0
Author-Name: Alain Bensoussan
Author-X-Name-First: Alain
Author-X-Name-Last: Bensoussan
Author-Name: Pierre Bertrand
Author-X-Name-First: Pierre
Author-X-Name-Last: Bertrand
Author-Name: Alexandre Brouste
Author-X-Name-First: Alexandre
Author-X-Name-Last: Brouste
Title: A generalized linear model approach to seasonal aspects of wind speed modeling
Abstract:
The aim of the article is to identify the intraday seasonality in a wind
speed time series. Following the traditional approach, the marginal
probability law is Weibull and, consequently, we consider seasonal Weibull
law. A new estimation and decision procedure to estimate the seasonal
Weibull law intraday scale parameter is presented. We also give
statistical decision-making tools for deciding whether to discard the
trend parameter and for validating the seasonal model.
Journal: Journal of Applied Statistics
Pages: 1694-1707
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.888543
File-URL: http://hdl.handle.net/10.1080/02664763.2014.888543
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1694-1707
Template-Type: ReDIF-Article 1.0
Author-Name: Amjad D. Al-Nasser
Author-X-Name-First: Amjad D.
Author-X-Name-Last: Al-Nasser
Title: Two steps generalized maximum entropy estimation procedure for fitting linear regression when both covariates are subject to error
Abstract:
This paper presents a procedure utilizing the generalized maximum entropy
(GME) estimation method in two steps to quantify the uncertainty of the
simple linear structural measurement error model parameters exactly. The
first step estimates the unknowns from the horizontal line, and these
estimates are then used in a second step to estimate the unknowns from the
vertical line. The proposed estimation procedure has the ability to
minimize the number of unknown parameters in formulating the GME system
within each step, and hence reduce variability of the estimates.
Analytical and illustrative Monte Carlo simulation comparison experiments
with the maximum likelihood estimators and a one-step GME estimation
procedure are presented. Simulation experiments demonstrate that the
two-step estimation procedure produces parameter estimates that are more
accurate and more efficient than the classical estimation methods. An
application of the proposed method is illustrated using a data set
gathered from the Centre for Integrated Government Services in Delma
Island, UAE, to predict the association between perceived quality and
customer satisfaction.
Journal: Journal of Applied Statistics
Pages: 1708-1720
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.888544
File-URL: http://hdl.handle.net/10.1080/02664763.2014.888544
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1708-1720
Template-Type: ReDIF-Article 1.0
Author-Name: Tahani Coolen-Maturi
Author-X-Name-First: Tahani
Author-X-Name-Last: Coolen-Maturi
Title: A new weighted rank coefficient of concordance
Abstract:
There are many situations where n objects are ranked by
b>2 independent sources or observers and in which the
interest is focused on agreement on the top rankings. Kendall's
coefficient of concordance [10] assigns equal weights to all rankings. In
this paper, a new coefficient of concordance is introduced which is more
sensitive to agreement on the top rankings. The limiting distribution of
the new concordance coefficient under the null hypothesis of no
association among the rankings is presented, and a summary of the exact
and approximate quantiles for this coefficient is provided. A simulation
study is carried out to compare the performance of Kendall's, the top-down
and the new concordance coefficients in detecting the agreement on the top
rankings. Finally, examples are given for illustration purposes, including
a real data set from financial market indices.
Journal: Journal of Applied Statistics
Pages: 1721-1745
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.889664
File-URL: http://hdl.handle.net/10.1080/02664763.2014.889664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1721-1745
Template-Type: ReDIF-Article 1.0
Author-Name: M. Qamarul Islam
Author-X-Name-First: M. Qamarul
Author-X-Name-Last: Islam
Author-Name: Fetih Yildirim
Author-X-Name-First: Fetih
Author-X-Name-Last: Yildirim
Author-Name: Mehmet Yazici
Author-X-Name-First: Mehmet
Author-X-Name-Last: Yazici
Title: Inference in multivariate linear regression models with elliptically distributed errors
Abstract:
In this study we investigate the problem of estimation and testing of
hypotheses in multivariate linear regression models when the errors
involved are assumed to be non-normally distributed. We consider the class
of heavy-tailed distributions for this purpose. Although our method is
applicable for any distribution in this class, we take the multivariate
t-distribution for illustration. This distribution has
applications in many fields of applied research such as Economics,
Business, and Finance. For estimation purposes, we use the modified maximum
likelihood method in order to obtain the so-called modified maximum
likelihood estimates, which are available in closed form. We show that
these estimates are substantially more efficient than least-square
estimates. They are also found to be robust to reasonable deviations from
the assumed distribution and also many data anomalies such as the presence
of outliers in the sample, etc. We further provide test statistics for
testing the relevant hypothesis regarding the regression coefficients.
Journal: Journal of Applied Statistics
Pages: 1746-1766
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.890177
File-URL: http://hdl.handle.net/10.1080/02664763.2014.890177
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1746-1766
Template-Type: ReDIF-Article 1.0
Author-Name: Xavier de Luna
Author-X-Name-First: Xavier
Author-X-Name-Last: de Luna
Author-Name: Mathias Lundin
Author-X-Name-First: Mathias
Author-X-Name-Last: Lundin
Title: Sensitivity analysis of the unconfoundedness assumption with an application to an evaluation of college choice effects on earnings
Abstract:
We evaluate the effects of college choice on earnings using Swedish
register databases. This case study is used to motivate the introduction
of a novel procedure to analyse the sensitivity of such an observational
study to the assumption made that there are no unobserved confounders -
variables affecting both college choice and earnings. This assumption is
not testable without further information, and should be considered an
approximation of reality. To perform a sensitivity analysis, we measure
the departure from the unconfoundedness assumption with the correlation
between college choice and earnings when conditioning on observed
covariates. The use of a correlation as a measure of dependence allows us
to propose a standardised procedure by advocating the use of a fixed value
for the correlation, typically 1% or 5%, when checking the sensitivity of
an evaluation study. A correlation coefficient is, moreover, intuitive to
most empirical scientists, which makes the results of our sensitivity
analysis easier to communicate than those of previously proposed methods.
In our evaluation of the effects of college choice on earnings, the
significantly positive effect obtained could not be questioned by a
sensitivity analysis allowing for unobserved confounders inducing at most
5% correlation between college choice and earnings.
Journal: Journal of Applied Statistics
Pages: 1767-1784
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.890178
File-URL: http://hdl.handle.net/10.1080/02664763.2014.890178
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1767-1784
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Paulo H. Ferreira
Author-X-Name-First: Paulo H.
Author-X-Name-Last: Ferreira
Author-Name: Carlos A.R. Diniz
Author-X-Name-First: Carlos A.R.
Author-X-Name-Last: Diniz
Title: Skew-normal distribution for growth curve models in presence of a heteroscedasticity structure
Abstract:
In general, growth models are fitted under the assumptions that the
error terms are homoscedastic and normally distributed. However, these
assumptions are often not verified in practice. In this work we propose
four growth models (Morgan-Mercer-Flodin, von Bertalanffy, Gompertz, and
Richards), considering different distributions (normal, skew-normal) for
the error terms and three different covariance structures. A maximum
likelihood estimation procedure is addressed. A simulation study is
performed to verify the appropriateness of the proposed growth
curve models. The methodology is also illustrated on a real dataset.
Journal: Journal of Applied Statistics
Pages: 1785-1798
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.891005
File-URL: http://hdl.handle.net/10.1080/02664763.2014.891005
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1785-1798
Template-Type: ReDIF-Article 1.0
Author-Name: Abdelmalek Kouadri
Author-X-Name-First: Abdelmalek
Author-X-Name-Last: Kouadri
Author-Name: Karim Baiche
Author-X-Name-First: Karim
Author-X-Name-Last: Baiche
Author-Name: Mimoun Zelmat
Author-X-Name-First: Mimoun
Author-X-Name-Last: Zelmat
Title: Blind source separation filters-based-fault detection and isolation in a three tank system
Abstract:
Fault detection and isolation takes a strategic position in modern
industrial processes, for which various approaches have been proposed. These
approaches are usually developed and based on a consistency test between
an observed state of the process provided by sensors and an expected
behaviour provided by a mathematical model of the system. These methods
require a reliable model of the system to be monitored, the development of
which is a complex task. Alternatively, we propose in this paper to use
blind source separation filters (BSSFs) to detect and isolate faults in a
three-tank pilot plant. This technique is very beneficial as it uses blind
identification without an explicit mathematical model of the system. The
independent component analysis (ICA), relying on the assumption of the
statistical independence of the extracted sources, is used as a tool for
each BSSF to extract signals of the process under consideration. The
experimental results show the effectiveness and robustness of this
approach in detecting and isolating faults that are on sensors in the
system.
Journal: Journal of Applied Statistics
Pages: 1799-1813
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.891570
File-URL: http://hdl.handle.net/10.1080/02664763.2014.891570
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1799-1813
Template-Type: ReDIF-Article 1.0
Author-Name: Tsai-Hung Fan
Author-X-Name-First: Tsai-Hung
Author-X-Name-Last: Fan
Author-Name: Yi-Fu Wang
Author-X-Name-First: Yi-Fu
Author-X-Name-Last: Wang
Author-Name: Yi-Chen Zhang
Author-X-Name-First: Yi-Chen
Author-X-Name-Last: Zhang
Title: Bayesian model selection in linear mixed effects models with autoregressive(p) errors using mixture priors
Abstract:
In this article, we apply the Bayesian approach to linear mixed effects
models with autoregressive(p) random errors under mixture priors, with
posterior computation carried out by the Markov chain Monte Carlo (MCMC)
method. The mixture structure of a point mass and a continuous
distribution helps to select the variables in the fixed and random effects
models from the posterior sample generated using the MCMC method. Bayesian
prediction of future observations is also one of the major concerns. To
choose the best model, we consider the commonly used highest posterior
probability model and the median posterior probability model. Across the
entire simulation study, both criteria tend to be needed to choose the
best model. In terms of predictive accuracy, a real example confirms that
the proposed method provides accurate results.
Journal: Journal of Applied Statistics
Pages: 1814-1829
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.894001
File-URL: http://hdl.handle.net/10.1080/02664763.2014.894001
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1814-1829
Template-Type: ReDIF-Article 1.0
Author-Name: Luigi Ippoliti
Author-X-Name-First: Luigi
Author-X-Name-Last: Ippoliti
Author-Name: Simone Di Zio
Author-X-Name-First: Simone
Author-X-Name-Last: Di Zio
Author-Name: Arcangelo Merla
Author-X-Name-First: Arcangelo
Author-X-Name-Last: Merla
Title: Classification of biomedical signals for differential diagnosis of Raynaud's phenomenon
Abstract:
This paper discusses a supervised classification approach for the
differential diagnosis of Raynaud's phenomenon (RP). The classification of
data from healthy subjects and from patients suffering from primary and
secondary RP is obtained by means of a set of classifiers derived within
the framework of linear discriminant analysis. A set of functional
variables and shape measures extracted from rewarming/reperfusion curves
are proposed as discriminant features. Since the prediction of group
membership is based on a large number of these features, the high
dimension/small sample size problem is considered to overcome the
singularity problem of the within-group covariance matrix. Results on a
data set of 72 subjects demonstrate that a satisfactory classification of
the subjects can be achieved through the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 1830-1847
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.894002
File-URL: http://hdl.handle.net/10.1080/02664763.2014.894002
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1830-1847
Template-Type: ReDIF-Article 1.0
Author-Name: Pablo Martínez-Camblor
Author-X-Name-First: Pablo
Author-X-Name-Last: Martínez-Camblor
Author-Name: Carlos Carleos
Author-X-Name-First: Carlos
Author-X-Name-Last: Carleos
Author-Name: Jesus Á. Baro
Author-X-Name-First: Jesus Á.
Author-X-Name-Last: Baro
Author-Name: Javier Cañón
Author-X-Name-First: Javier
Author-X-Name-Last: Cañón
Title: Standard statistical tools for the breed allocation problem
Abstract:
Modern technologies are frequently used in order to deal with new genomic
problems. For instance, the STRUCTURE software is
usually employed for breed assignment based on genetic information.
However, standard statistical techniques offer a number of valuable tools
which can be successfully used for dealing with most problems. In this
paper, we investigated the capability of microsatellite markers for
individual identification and their potential use for breed assignment of
individuals in seventy Lidia breed lines and breeders. Traditional
binomial logistic regression is applied to each line and used to assign
one individual to a particular line. In addition, the area under receiver
operating curve (AUC) criterion is used to measure the capability of the
microsatellite-based models to separate the groups. This method allows us
to identify which microsatellite loci are related to each line. Overall,
only one subject was misclassified, corresponding to a correct allocation
rate of 99.94%. The minimum observed AUC was 0.986, with an average of
0.997. These results
suggest that our method is competitive for animal allocation and has some
interpretative advantages and a strong relationship with methods based on
SNPs and related techniques.
Journal: Journal of Applied Statistics
Pages: 1848-1856
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.898136
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898136
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1848-1856
Template-Type: ReDIF-Article 1.0
Author-Name: Konstantinos C. Fragkos
Author-X-Name-First: Konstantinos C.
Author-X-Name-Last: Fragkos
Title: Applied medical statistics using SAS
Journal: Journal of Applied Statistics
Pages: 1857-1858
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2013.877650
File-URL: http://hdl.handle.net/10.1080/02664763.2013.877650
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1857-1858
Template-Type: ReDIF-Article 1.0
Author-Name: Božidar V. Popović
Author-X-Name-First: Božidar V.
Author-X-Name-Last: Popović
Title: Exercises and solutions in statistical theory
Journal: Journal of Applied Statistics
Pages: 1858-1858
Issue: 8
Volume: 41
Year: 2014
Month: 8
X-DOI: 10.1080/02664763.2014.883685
File-URL: http://hdl.handle.net/10.1080/02664763.2014.883685
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:8:p:1858-1858
Template-Type: ReDIF-Article 1.0
Author-Name: Edwin M.M. Ortega
Author-X-Name-First: Edwin M.M.
Author-X-Name-Last: Ortega
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Elizabeth M. Hashimoto
Author-X-Name-First: Elizabeth M.
Author-X-Name-Last: Hashimoto
Author-Name: Kahadawala Cooray
Author-X-Name-First: Kahadawala
Author-X-Name-Last: Cooray
Title: A log-linear regression model for the odd Weibull distribution with censored data
Abstract:
We introduce the log-odd Weibull regression model based on the odd Weibull
distribution (Cooray, 2006). We derive some mathematical properties of the
log-transformed distribution. The new regression model represents a
parametric family of models that includes as sub-models some widely known
regression models that can be applied to censored survival data. We employ
a frequentist analysis and a parametric bootstrap for the parameters of
the proposed model. We derive the appropriate matrices for assessing local
influence on the parameter estimates under different perturbation schemes
and present some ways to assess global influence. Further, for different
parameter settings, sample sizes and censoring percentages, some
simulations are performed. In addition, the empirical distributions of
some modified residuals are given and compared with the standard normal
distribution. These studies suggest that the residual analysis usually
performed in normal linear regression models can be extended to a modified
deviance residual in the proposed regression model applied to censored
data. We define martingale and deviance residuals to check the model
assumptions. The extended regression model is very useful for the analysis
of real data.
Journal: Journal of Applied Statistics
Pages: 1859-1880
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.897689
File-URL: http://hdl.handle.net/10.1080/02664763.2014.897689
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1859-1880
Template-Type: ReDIF-Article 1.0
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Carolina Marchant
Author-X-Name-First: Carolina
Author-X-Name-Last: Marchant
Author-Name: Helton Saulo
Author-X-Name-First: Helton
Author-X-Name-Last: Saulo
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Fernando Rojas
Author-X-Name-First: Fernando
Author-X-Name-Last: Rojas
Title: Capability indices for Birnbaum-Saunders processes applied to electronic and food industries
Abstract:
Process capability indices (PCIs) are tools widely used by the industries
to determine the quality of their products and the performance of their
manufacturing processes. Classic versions of these indices were
constructed for processes whose quality characteristics have a normal
distribution. In practice, many of these characteristics do not follow
this distribution. In such a case, the classic PCIs must be modified to
take into account the non-normality. Ignoring the effect of this
non-normality can lead to misinterpretation of the process capability and
ill-advised business decisions. An asymmetric non-normal model that is
receiving considerable attention due to its good properties is the
Birnbaum-Saunders (BS) distribution. We propose, develop, implement and
apply a methodology based on PCIs for BS processes considering estimation,
parametric inference, bootstrap and optimization tools. This methodology
is implemented in the statistical software R. A simulation study is
conducted to evaluate its performance. Real-world case studies with
applications for three data sets are carried out to illustrate its
potentiality. One of these data sets was already published and is
associated with the electronic industry, whereas the other two are
unpublished and associated with the food industry.
Journal: Journal of Applied Statistics
Pages: 1881-1902
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.897690
File-URL: http://hdl.handle.net/10.1080/02664763.2014.897690
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1881-1902
Template-Type: ReDIF-Article 1.0
Author-Name: Shun Matsuura
Author-X-Name-First: Shun
Author-X-Name-Last: Matsuura
Title: Effectiveness of a random compound noise strategy for robust parameter design
Abstract:
Robust parameter design has been widely used to improve the quality of
products and processes. Although a product array, in which an orthogonal
array for control factors is crossed with an orthogonal array for noise
factors, is commonly used for parameter design experiments, this may lead
to an unacceptably large number of experimental runs. The compound noise
strategy proposed by Taguchi [30] can be used to reduce the number of
experimental runs. In this strategy, a compound noise factor is formed
based on the directionality of the effects of noise factors. However, the
directionality is usually unknown in practice. Recently, Singh et
al. [28] proposed a random compound noise strategy, in which a
compound noise factor is formed by randomly selecting a setting of the
levels of noise factors. The present paper evaluates the random compound
noise strategy in terms of the precision of the estimators of the response
mean and the response variance. In addition, the variances of the
estimators in the random compound noise strategy are compared with those
in the n-replication design. The random compound noise
strategy is shown to have smaller variances of the estimators than the
2-replication design, especially when the control-by-noise-interactions
are strong.
Journal: Journal of Applied Statistics
Pages: 1903-1918
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898130
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898130
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1903-1918
Template-Type: ReDIF-Article 1.0
Author-Name: Rita Esther Zapata-Vázquez
Author-X-Name-First: Rita Esther
Author-X-Name-Last: Zapata-Vázquez
Author-Name: Anthony O'Hagan
Author-X-Name-First: Anthony
Author-X-Name-Last: O'Hagan
Author-Name: Leonardo Soares Bastos
Author-X-Name-First: Leonardo
Author-X-Name-Last: Soares Bastos
Title: Eliciting expert judgements about a set of proportions
Abstract:
Eliciting expert knowledge about several uncertain quantities is a complex
task when those quantities exhibit associations. A well-known example of
such a problem is eliciting knowledge about a set of uncertain proportions
which must sum to 1. The usual approach is to assume that the expert's
knowledge can be adequately represented by a Dirichlet distribution, since
this is by far the simplest multivariate distribution that is appropriate
for such a set of proportions. It is also the most convenient,
particularly when the expert's prior knowledge is to be combined with a
multinomial sample since then the Dirichlet is the conjugate prior family.
Several methods have been described in the literature for eliciting
beliefs in the form of a Dirichlet distribution, which typically involve
eliciting from the expert enough judgements to identify uniquely the
Dirichlet hyperparameters. We describe here a new method which employs the
device of over-fitting, i.e. eliciting more than the minimal number of
judgements, in order to (a) produce a more carefully considered Dirichlet
distribution and (b) ensure that the Dirichlet distribution is indeed a
reasonable fit to the expert's knowledge. The method has been implemented
in a software extension of the Sheffield elicitation framework (SHELF) to
facilitate the multivariate elicitation process.
Journal: Journal of Applied Statistics
Pages: 1919-1933
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898131
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1919-1933
Template-Type: ReDIF-Article 1.0
Author-Name: T. Baghfalaki
Author-X-Name-First: T.
Author-X-Name-Last: Baghfalaki
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Author-Name: D. Berridge
Author-X-Name-First: D.
Author-X-Name-Last: Berridge
Title: Joint modeling of multivariate longitudinal mixed measurements and time to event data using a Bayesian approach
Abstract:
In many longitudinal studies multiple characteristics of each individual,
along with time to occurrence of an event of interest, are often
collected. In such data sets, some of the correlated characteristics may
be discrete and others continuous. In this paper, a joint model
for analysing multivariate longitudinal data comprising mixed continuous
and ordinal responses and a time to event variable is proposed. We model
the association structure between longitudinal mixed data and time to
event data using a multivariate zero-mean Gaussian process. For modeling
discrete ordinal data we assume a continuous latent variable follows the
logistic distribution and for continuous data a Gaussian mixed effects
model is used. For the event time variable, an accelerated failure time
model is considered under different distributional assumptions. For
parameter estimation, a Bayesian approach using Markov Chain Monte Carlo
is adopted. The performance of the proposed methods is illustrated using
some simulation studies. A real data set is also analyzed, where different
model structures are used. Model comparison is performed using a variety
of statistical criteria.
Journal: Journal of Applied Statistics
Pages: 1934-1955
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898132
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1934-1955
Template-Type: ReDIF-Article 1.0
Author-Name: Yun Lu
Author-X-Name-First: Yun
Author-X-Name-Last: Lu
Author-Name: Sridhar Hannenhalli
Author-X-Name-First: Sridhar
Author-X-Name-Last: Hannenhalli
Author-Name: Tom Cappola
Author-X-Name-First: Tom
Author-X-Name-Last: Cappola
Author-Name: Mary Putt
Author-X-Name-First: Mary
Author-X-Name-Last: Putt
Title: An evaluation of Monte-Carlo logic and logicFS motivated by a study of the regulation of gene expression in heart failure
Abstract:
Monte-Carlo (MC) Logic and Logic Feature Selection (logicFS) identify
binary predictors of outcome using repeated iterations of logic
regression, a variable selection method that identifies Boolean
combinations of predictors. Both methods compute the frequency with which
predictors appear in the model with the output of the logicFS program
providing specific summaries of predictor form. We sought to identify
variables related to transcription factor-related regulation of gene
expression differences in a study of failing and non-failing hearts.
Results based broadly on the frequency of occurrence of predictors in
the MC Logic or logicFS models were similar. However, key to logicFS are
the variable importance measures (VIMs), which augment the frequency
metrics and seek to evaluate a predictor's contribution to classification.
Analytic work and simulation studies indicate that the VIMs vary as a
function of the joint prevalence of the outcome and the predictor. Thus,
findings from logicFS have limited generalizability, particularly with
respect to case-control studies, where the prevalence of the outcome is
determined by the study design. Interpretation of VIMs for those variables
with near-zero or negative values is particularly ambiguous. Additional
issues with interpretability arise because the VIMs are strongly affected
by other variables selected into the model, but logicFS does not
explicitly identify these variables in its output.
Journal: Journal of Applied Statistics
Pages: 1956-1975
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898133
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898133
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1956-1975
Template-Type: ReDIF-Article 1.0
Author-Name: Chwu-Shiun Tarng
Author-X-Name-First: Chwu-Shiun
Author-X-Name-Last: Tarng
Title: Third-order likelihood-based inference for the log-normal regression model
Abstract:
This paper applies the general third-order theory to the log-normal
regression model. The parameter of interest is its conditional mean. For
inference, traditional first-order approximations need large sample sizes
and normal-like distributions. Some specific third-order methods need the
explicit forms of the nuisance parameter and ancillary statistic, which
are quite complicated. Note that this general third-order theory can be
applied to any continuous model with standard asymptotic properties. It
only needs the log-likelihood function. With small sample settings, the
simulation studies for confidence intervals of the conditional mean
illustrate that the general third-order theory is much superior to the
traditional first-order methods.
Journal: Journal of Applied Statistics
Pages: 1976-1988
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898134
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898134
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1976-1988
Template-Type: ReDIF-Article 1.0
Author-Name: W. Liu
Author-X-Name-First: W.
Author-X-Name-Last: Liu
Author-Name: J.C. Hsu
Author-X-Name-First: J.C.
Author-X-Name-Last: Hsu
Author-Name: F. Bretz
Author-X-Name-First: F.
Author-X-Name-Last: Bretz
Author-Name: A.J. Hayter
Author-X-Name-First: A.J.
Author-X-Name-Last: Hayter
Author-Name: Y. Han
Author-X-Name-First: Y.
Author-X-Name-Last: Han
Title: Shelf-life and its estimation in drug stability studies
Abstract:
One important property of any drug product is its stability over time.
Drug stability studies are routinely carried out in the pharmaceutical
industry in order to measure the degradation of an active pharmaceutical
ingredient of a drug product. One important study objective is to estimate
the shelf-life of the drug; the estimated shelf-life is required by the US
Food and Drug Administration to be printed on the package label of the
drug. This involves a suitable definition of the true shelf-life and the
construction of an appropriate estimate of the true shelf-life. In this
paper, the true shelf-life T_β is defined as the time point at which
100β% of all the individual dosage units (e.g. tablets) of the drug have
an active ingredient content no less than the lowest acceptable limit L,
where β and L are prespecified constants. The value of T_β depends on the
parameters of the assumed degradation model of the active ingredient
content and so is unknown. A lower confidence bound T̂_β for T_β is then
provided and used as the estimated shelf-life of the drug.
Journal: Journal of Applied Statistics
Pages: 1989-2000
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898135
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898135
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:1989-2000
Template-Type: ReDIF-Article 1.0
Author-Name: Le Chen
Author-X-Name-First: Le
Author-X-Name-Last: Chen
Author-Name: Ao Yuan
Author-X-Name-First: Ao
Author-X-Name-Last: Yuan
Author-Name: Aiyi Liu
Author-X-Name-First: Aiyi
Author-X-Name-Last: Liu
Author-Name: Guanjie Chen
Author-X-Name-First: Guanjie
Author-X-Name-Last: Chen
Title: Longitudinal data analysis using Bayesian-frequentist hybrid random effects model
Abstract:
The mixed random effects model is commonly used in longitudinal data
analysis within either the frequentist or the Bayesian framework. Here we
consider a case in which we have prior knowledge on some of the
parameters but no such information on the rest. Thus, we use a hybrid
approach for the random-effects model: the parameters with prior
knowledge are estimated via a Bayesian procedure, and the rest by
frequentist maximum likelihood estimation (MLE), simultaneously on the
same model. In practice, partial prior information is often available,
for example on covariates such as age and gender. This information can be
used to obtain accurate estimates in the mixed random-effects model. A
series of simulation studies was performed to compare the results with
those of the commonly used random-effects model with and without partial
prior information. The results from the hybrid estimation (HYB) and MLE
were very close to each other. The estimated θ values under the model
with partial prior information (HYB) were closer to the true θ values and
showed smaller variances than those from MLE without such information.
Compared with the true θ values, the mean squared errors were much
smaller for HYB than for MLE. This advantage of HYB is particularly
evident in longitudinal data with a small sample size. The HYB and MLE
methods are applied to a real longitudinal data set for illustration
purposes.
Journal: Journal of Applied Statistics
Pages: 2001-2010
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.898137
File-URL: http://hdl.handle.net/10.1080/02664763.2014.898137
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:2001-2010
Template-Type: ReDIF-Article 1.0
Author-Name: Robert Drake
Author-X-Name-First: Robert
Author-X-Name-Last: Drake
Author-Name: Apratim Guha
Author-X-Name-First: Apratim
Author-X-Name-Last: Guha
Title: A mutual information-based k-sample test for discrete distributions
Abstract:
The two-sample problem and its extension to the k-sample
problem are well known in the statistical literature. However, the
discrete version of the k-sample problem is relatively
less explored. In this work, we suggest a k-sample non-parametric test
procedure for discrete distributions based on mutual information. A
detailed power study, with comparisons against other alternatives, is
provided. Finally, a comparison of some English soccer
league teams based on their goal-scoring pattern is discussed.
Journal: Journal of Applied Statistics
Pages: 2011-2027
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.899325
File-URL: http://hdl.handle.net/10.1080/02664763.2014.899325
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:2011-2027
Template-Type: ReDIF-Article 1.0
Author-Name: Jelani Wiltshire
Author-X-Name-First: Jelani
Author-X-Name-Last: Wiltshire
Author-Name: Fred W. Huffer
Author-X-Name-First: Fred W.
Author-X-Name-Last: Huffer
Author-Name: William C. Parker
Author-X-Name-First: William C.
Author-X-Name-Last: Parker
Title: A general class of test statistics for Van Valen's Red Queen hypothesis
Abstract:
Van Valen's Red Queen hypothesis states that within a homogeneous
taxonomic group the age is statistically independent of the rate of
extinction. The case of the Red Queen hypothesis being addressed here is
when the homogeneous taxonomic group is a group of similar species. Since
Van Valen's work, various statistical approaches have been used to address
the relationship between taxon age and the rate of extinction. We propose
a general class of test statistics that can be used to test for the effect
of age on the rate of extinction. These test statistics allow for a
varying background rate of extinction and attempt to remove the effects of
other covariates when assessing the effect of age on extinction. No model
is assumed for the covariate effects. Instead we control for covariate
effects by pairing or grouping together similar species. Simulations are
used to compare the power of the statistics. We apply the test statistics
to data on Foram extinctions and find that age has a positive effect on
the rate of extinction. A derivation of the null distribution of one of
the test statistics is provided in the supplementary material.
Journal: Journal of Applied Statistics
Pages: 2028-2043
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.907394
File-URL: http://hdl.handle.net/10.1080/02664763.2014.907394
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:2028-2043
Template-Type: ReDIF-Article 1.0
Author-Name: Federico Andreis
Author-X-Name-First: Federico
Author-X-Name-Last: Andreis
Author-Name: Pier Alda Ferrari
Author-X-Name-First: Pier Alda
Author-X-Name-Last: Ferrari
Title: Multidimensional item response theory models for dichotomous data in customer satisfaction evaluation
Abstract:
In this paper, multidimensional item response theory models for
dichotomous data, developed in the fields of psychometrics and ability
assessment, are discussed in connection with the problem of evaluating
customer satisfaction. These models allow us to take into account latent
constructs at various degrees of complexity and provide interesting new
perspectives for service quality assessment. Markov chain Monte Carlo
techniques are considered for estimation. An application to a real data
set is also presented.
Journal: Journal of Applied Statistics
Pages: 2044-2055
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.907395
File-URL: http://hdl.handle.net/10.1080/02664763.2014.907395
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:2044-2055
Template-Type: ReDIF-Article 1.0
Author-Name: Josemar Rodrigues
Author-X-Name-First: Josemar
Author-X-Name-Last: Rodrigues
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Jorge Bazan
Author-X-Name-First: Jorge
Author-X-Name-Last: Bazan
Title: An extended exponentiated-G-negative binomial family with threshold effect
Abstract:
In this paper, we formulate a very flexible family of models which unifies
most recent lifetime distributions. The main idea is to obtain a
cumulative distribution function to transform the baseline distribution
with an activation mechanism characterized by a latent threshold variable.
The new family has a strong biological interpretation from the competitive
risks point of view, and the Box-Cox transformation provides an elegant
way to interpret the effect on the baseline distribution that yields this
alternative model. Several structural properties of the new model are
investigated. A Bayesian analysis using a Markov chain Monte Carlo
procedure is developed to illustrate, with real data, the usefulness of
the proposed family.
Journal: Journal of Applied Statistics
Pages: 2056-2074
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.907396
File-URL: http://hdl.handle.net/10.1080/02664763.2014.907396
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:2056-2074
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Thorburn
Author-X-Name-First: Daniel
Author-X-Name-Last: Thorburn
Author-Name: Can Tongur
Author-X-Name-First: Can
Author-X-Name-Last: Tongur
Title: Assessing direct and indirect seasonal decomposition in state space
Abstract:
The problem of whether seasonal decomposition should be used prior to or
after aggregation of time series is quite old. We tackle the problem by
using a state-space representation and the variance/covariance structure
of a simplified one-component model. The variances of the estimated
components in a two-series system are compared for direct and indirect
approaches and also to a multivariate method. The covariance structure
between the two time series is important for the relative efficiency.
Indirect estimation is always best when the series are independent, but
when the series or the measurement errors are negatively correlated,
direct estimation may be much better in the above sense. Some covariance
structures indicate that direct estimation should be used while others
indicate that an indirect approach is more efficient. Signal-to-noise
ratios and relative variances are used for inference.
Journal: Journal of Applied Statistics
Pages: 2075-2091
Issue: 9
Volume: 41
Year: 2014
Month: 9
X-DOI: 10.1080/02664763.2014.909779
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909779
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:9:p:2075-2091
Template-Type: ReDIF-Article 1.0
Author-Name: Weiyan Mu
Author-X-Name-First: Weiyan
Author-X-Name-Last: Mu
Author-Name: Shifeng Xiong
Author-X-Name-First: Shifeng
Author-X-Name-Last: Xiong
Title: Some notes on robust sure independence screening
Abstract:
Sure independence screening (SIS) proposed by Fan and Lv [4] uses marginal
correlations to select important variables, and has proven to be an
efficient method for ultrahigh-dimensional linear models. This paper
provides two robust versions of SIS against outliers. The two methods,
respectively, replace the sample correlation in SIS with two robust
measures, and screen variables by ranking them. Like SIS, the proposed
methods are simple and fast. In addition, they are highly robust against a
substantial fraction of outliers in the data. These features make them
applicable to large datasets which may contain outliers. Simulation
results are presented to show their effectiveness.
Journal: Journal of Applied Statistics
Pages: 2092-2102
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909777
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909777
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2092-2102
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph W. Sakshaug
Author-X-Name-First: Joseph W.
Author-X-Name-Last: Sakshaug
Author-Name: Trivellore E. Raghunathan
Author-X-Name-First: Trivellore E.
Author-X-Name-Last: Raghunathan
Title: Generating synthetic data to produce public-use microdata for small geographic areas based on complex sample survey data with application to the National Health Interview Survey
Abstract:
Small area statistics obtained from sample survey data provide a critical
source of information used to study health, economic, and sociological
trends. However, most large-scale sample surveys are not designed for the
purpose of producing small area statistics. Moreover, data disseminators
are prevented from releasing public-use microdata for small geographic
areas for disclosure reasons, thus limiting the utility of the data they
collect. This research evaluates a synthetic data method, intended for
data disseminators, for releasing public-use microdata for small
geographic areas based on complex sample survey data. The method replaces
all observed survey values with synthetic (or imputed) values generated
from a hierarchical Bayesian model that explicitly accounts for complex
sample design features, including stratification, clustering, and sampling
weights. The method is applied to restricted microdata from the National
Health Interview Survey and synthetic data are generated for both sampled
and non-sampled small areas. The analytic validity of the resulting small
area inferences is assessed by direct comparison with the actual data, a
simulation study, and a cross-validation study.
Journal: Journal of Applied Statistics
Pages: 2103-2122
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909778
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909778
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2103-2122
Template-Type: ReDIF-Article 1.0
Author-Name: A. Rouigueb
Author-X-Name-First: A.
Author-X-Name-Last: Rouigueb
Author-Name: S. Chitroub
Author-X-Name-First: S.
Author-X-Name-Last: Chitroub
Author-Name: A. Bouridane
Author-X-Name-First: A.
Author-X-Name-Last: Bouridane
Title: Bayesian inference over ICA models: application to multibiometric score fusion with quality estimates
Abstract:
Bayesian networks are not well-formulated for continuous variables. The
majority of recent works dealing with Bayesian inference are restricted
to special types of continuous variables such as the conditional
linear Gaussian model for Gaussian variables. In this context, an exact
Bayesian inference algorithm for clusters of continuous variables which
may be approximated by independent component analysis models is proposed.
The complexity in memory space is linear and the overfitting problem is
attenuated, while the inference time is still exponential. Experiments for
multibiometric score fusion with quality estimates are conducted, and it
is observed that the performance is satisfactory compared with some known
fusion techniques.
Journal: Journal of Applied Statistics
Pages: 2123-2140
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909780
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2123-2140
Template-Type: ReDIF-Article 1.0
Author-Name: Abdul Haq
Author-X-Name-First: Abdul
Author-X-Name-Last: Haq
Author-Name: Jennifer Brown
Author-X-Name-First: Jennifer
Author-X-Name-Last: Brown
Author-Name: Elena Moltchanova
Author-X-Name-First: Elena
Author-X-Name-Last: Moltchanova
Author-Name: Amer Ibrahim Al-Omari
Author-X-Name-First: Amer Ibrahim
Author-X-Name-Last: Al-Omari
Title: Mixed ranked set sampling design
Abstract:
The main focus of agricultural, ecological and environmental studies is to
develop well designed, cost-effective and efficient sampling designs.
Ranked set sampling (RSS) is one method that helps accomplish such
objectives by incorporating expert knowledge to its advantage. In this
paper, we propose an efficient sampling scheme, named mixed RSS (MxRSS),
for estimation of the population mean and median. The MxRSS scheme is a
suitable mixture of both simple random sampling (SRS) and RSS schemes. The
MxRSS scheme provides an unbiased estimator of the population mean, and
its variance is always less than the variance of the sample mean based on
SRS. For both symmetric and asymmetric populations, the mean and median
estimators based on the SRS, partial RSS (PRSS) and MxRSS schemes are
compared. It turns out that the mean and median estimates under the MxRSS
scheme are more precise than those based on the SRS scheme. Moreover, when
estimating the mean of symmetric and some asymmetric populations, the
mean estimates under the MxRSS scheme are found to be more efficient than
the mean estimates with the PRSS scheme. An application to real data is also provided
to illustrate the implementation of the proposed sampling scheme.
Journal: Journal of Applied Statistics
Pages: 2141-2156
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909781
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909781
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2141-2156
Template-Type: ReDIF-Article 1.0
Author-Name: Jaehee Kim
Author-X-Name-First: Jaehee
Author-X-Name-Last: Kim
Author-Name: Sooyoung Cheon
Author-X-Name-First: Sooyoung
Author-X-Name-Last: Cheon
Title: Stochastic approximation Monte Carlo Gibbs sampling for structural change inference in a Bayesian heteroscedastic time series model
Abstract:
We consider a Bayesian deterministically trending dynamic time series
model with heteroscedastic error variance, in which there exist multiple
structural changes in level, trend and error variance, but the number of
change-points and the timings are unknown. For a Bayesian analysis, a
truncated Poisson prior and conjugate priors are used for the number of
change-points and the distributional parameters, respectively. To identify
the best model and estimate the model parameters simultaneously, we
propose a new method by sequentially making use of the Gibbs sampler in
conjunction with stochastic approximation Monte Carlo simulations, as an
adaptive Monte Carlo algorithm. The numerical results are in favor of our
method in terms of the quality of estimates.
Journal: Journal of Applied Statistics
Pages: 2157-2177
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909782
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909782
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2157-2177
Template-Type: ReDIF-Article 1.0
Author-Name: Francis Pike
Author-X-Name-First: Francis
Author-X-Name-Last: Pike
Author-Name: Lisa A. Weissfeld
Author-X-Name-First: Lisa A.
Author-X-Name-Last: Weissfeld
Author-Name: Chung-Chou H. Chang
Author-X-Name-First: Chung-Chou H.
Author-X-Name-Last: Chang
Title: Joint modeling of multivariate censored longitudinal and event time data with application to the Genetic Markers of Inflammation Study
Abstract:
The Genetic Markers of Inflammation Study (GenIMS) was conceived to
investigate the role of severe sepsis, which is typically defined as
system-wide multi-organ failure, on survival. One major hypothesis for
this systemic collapse, and reduction in survival, is a cascade of
pro-inflammatory and anti-inflammatory cytokines. In this paper, we
devised a novel joint modeling strategy to evaluate the joint effect of
longitudinal anti-inflammatory marker IL-6 and pro-inflammatory marker
IL-10 on 90-day survival. We found that, on average, high initial values
of both IL-6 and IL-10 that tend to increase over time are associated with
reduced survival expectancy, and that accounting for their assumed
for their assumed correlation was justified.
Journal: Journal of Applied Statistics
Pages: 2178-2191
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909783
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2178-2191
Template-Type: ReDIF-Article 1.0
Author-Name: Jiin Choi
Author-X-Name-First: Jiin
Author-X-Name-Last: Choi
Author-Name: Stewart J. Anderson
Author-X-Name-First: Stewart J.
Author-X-Name-Last: Anderson
Author-Name: Thomas J. Richards
Author-X-Name-First: Thomas J.
Author-X-Name-Last: Richards
Author-Name: Wesley K. Thompson
Author-X-Name-First: Wesley K.
Author-X-Name-Last: Thompson
Title: Prediction of transplant-free survival in idiopathic pulmonary fibrosis patients using joint models for event times and mixed multivariate longitudinal data
Abstract:
We implement a joint model for mixed multivariate longitudinal
measurements, applied to the prediction of time until lung transplant or
death in idiopathic pulmonary fibrosis. Specifically, we formulate a
unified Bayesian joint model for the mixed longitudinal responses and
time-to-event outcomes. For the longitudinal model of continuous and
binary responses, we investigate multivariate generalized linear mixed
models using shared random effects. Longitudinal and time-to-event data
are assumed to be independent conditional on available covariates and
shared parameters. A Markov chain Monte Carlo algorithm, implemented in
OpenBUGS, is used for parameter estimation. To illustrate practical
considerations in choosing a final model, we fit 37 different candidate
models using all possible combinations of random effects and employ a
deviance information criterion to select a best-fitting model. We
demonstrate the prediction of future event probabilities within a fixed
time interval for patients utilizing baseline data, post-baseline
longitudinal responses, and the time-to-event outcome. The performance of
our joint model is also evaluated in simulation studies.
Journal: Journal of Applied Statistics
Pages: 2192-2205
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909784
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909784
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2192-2205
Template-Type: ReDIF-Article 1.0
Author-Name: Jason S. Bergtold
Author-X-Name-First: Jason S.
Author-X-Name-Last: Bergtold
Author-Name: Eberechukwu Onukwugha
Author-X-Name-First: Eberechukwu
Author-X-Name-Last: Onukwugha
Title: The probabilistic reduction approach to specifying multinomial logistic regression models in health outcomes research
Abstract:
The paper provides a novel application of the probabilistic reduction (PR)
approach to the analysis of multi-categorical outcomes. The PR approach,
which systematically takes account of heterogeneity and functional form
concerns, can improve the specification of binary regression models.
However, its utility for systematically enriching the specification of and
inference from models of multi-categorical outcomes has not been examined,
while multinomial logistic regression models are commonly used for
inference and, increasingly, prediction. Following a theoretical
derivation of the PR-based multinomial logistic model (MLM), we compare
functional specification and marginal effects from a traditional
specification and a PR-based specification in a model of post-stroke
hospital discharge disposition and find that the traditional MLM is
misspecified. Results suggest that the impact on the reliability of
substantive inferences from a misspecified model may be significant, even
when model fit statistics do not suggest a strong lack of fit compared
with a properly specified model using the PR approach. We identify
situations under which a PR-based MLM specification can be advantageous to
the applied researcher.
Journal: Journal of Applied Statistics
Pages: 2206-2221
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909785
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2206-2221
Template-Type: ReDIF-Article 1.0
Author-Name: M. Nitti
Author-X-Name-First: M.
Author-X-Name-Last: Nitti
Author-Name: E. Ciavolino
Author-X-Name-First: E.
Author-X-Name-Last: Ciavolino
Title: A deflated indicators approach for estimating second-order reflective models through PLS-PM: an empirical illustration
Abstract:
The paper provides a procedure aimed at obtaining more interpretable
second-order models estimated with the partial least squares-path
modeling. Advantages in interpretation stem from the separation of the two
sources of influence on the data. Indeed, in hierarchical
models, effects on manifest variables (MVs) are assigned to both
first-order (specific) factors and second-order (general) factors. In
order to separate these overlapping contributions, MVs are deflated from
the effect of the specific latent variables (LVs) and used as indicators
of the second-order LV. A case study is presented in order to illustrate
the application of the proposed method.
Journal: Journal of Applied Statistics
Pages: 2222-2239
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909786
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909786
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2222-2239
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Xiong
Author-X-Name-First: Wei
Author-X-Name-Last: Xiong
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Title: A new model selection procedure based on dynamic quantile regression
Abstract:
In this article, we propose a novel robust data-analytic procedure,
dynamic quantile regression (DQR), for model selection. It is robust in
the sense that it can simultaneously estimate the coefficients and the
distribution of errors over a large collection of error distributions,
even those that are heavy-tailed and may not possess variances or means;
and DQR is easy to implement in the sense that it does not need to decide
in advance which quantile(s) should be gathered. Asymptotic properties of
related estimators are derived. Simulations and illustrative real examples
are also given.
Journal: Journal of Applied Statistics
Pages: 2240-2256
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909787
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2240-2256
Template-Type: ReDIF-Article 1.0
Author-Name: Lin Liu
Author-X-Name-First: Lin
Author-X-Name-Last: Liu
Author-Name: Jianbo Li
Author-X-Name-First: Jianbo
Author-X-Name-Last: Li
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Title: General partially linear additive transformation model with right-censored data
Abstract:
We propose a class of general partially linear additive transformation
models (GPLATM) with right-censored survival data in this work. The class
of models is flexible enough to cover many commonly used parametric and
nonparametric survival analysis models as special cases. Based on the
B-spline interpolation technique, we estimate the unknown regression
parameters and functions by the maximum marginal likelihood estimation
method. One important feature of the estimation procedure is that it does
not need the baseline and censoring cumulative distribution functions. Some
numerical studies illustrate that this procedure can work very well for
the moderate sample size.
Journal: Journal of Applied Statistics
Pages: 2257-2269
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909788
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2257-2269
Template-Type: ReDIF-Article 1.0
Author-Name: Feng-Shou Ko
Author-X-Name-First: Feng-Shou
Author-X-Name-Last: Ko
Title: Identification of longitudinal biomarkers for survival by a score test derived from a joint model of longitudinal and competing risks data
Abstract:
In this paper, we consider joint modelling of repeated measurements and
competing risks failure time data. For the competing risks failure time
data, we use a semiparametric mixture model in which proportional hazards
models are specified for the failure times conditional on cause, and a
multinomial model for the marginal distribution of cause conditional on
covariates. We
also derive a score test based on joint modelling of repeated measurements
and competing risks failure time data to identify longitudinal biomarkers
or surrogates for a time-to-event outcome in competing risks data.
Journal: Journal of Applied Statistics
Pages: 2270-2281
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909789
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909789
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2270-2281
Template-Type: ReDIF-Article 1.0
Author-Name: N. Withanage
Author-X-Name-First: N.
Author-X-Name-Last: Withanage
Author-Name: A.R. de Leon
Author-X-Name-First: A.R.
Author-X-Name-Last: de Leon
Author-Name: C.J. Rudnisky
Author-X-Name-First: C.J.
Author-X-Name-Last: Rudnisky
Title: Joint estimation of disease-specific sensitivities and specificities in reader-based multi-disease diagnostic studies of paired organs
Abstract:
Binocular data typically arise in ophthalmology where pairs of eyes are
evaluated, through some diagnostic procedure, for the presence of certain
diseases or pathologies. Treating eyes as independent and adopting the
usual approach in estimating the sensitivity and specificity of a
diagnostic test ignores the correlation between fellow eyes. This may
consequently yield incorrect estimates, especially of the standard errors.
The paper is concerned with diagnostic studies wherein several diagnostic
tests, or the same test read by several readers, are administered to
identify one or more diseases. A likelihood-based method of estimating
disease-specific sensitivities and specificities via hierarchical
generalized linear mixed models is proposed to meaningfully delineate the
various correlations in the data. The efficiency of the estimates is
assessed in a simulation study. Data from a study on diabetic retinopathy
are analyzed to illustrate the methodology.
Journal: Journal of Applied Statistics
Pages: 2282-2297
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909790
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909790
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2282-2297
Template-Type: ReDIF-Article 1.0
Author-Name: Octavio Ramirez
Author-X-Name-First: Octavio
Author-X-Name-Last: Ramirez
Author-Name: Jeff Mullen
Author-X-Name-First: Jeff
Author-X-Name-Last: Mullen
Author-Name: Alba J. Collart
Author-X-Name-First: Alba J.
Author-X-Name-Last: Collart
Title: Insights into the appropriate level of disaggregation for efficient time series model forecasting
Abstract:
This paper provides a potentially valuable insight on how to assess if the
forecasts from an autoregressive moving average model based on aggregated
data could be substantially improved through disaggregation. It is argued
that, theoretically, the absence of moving average (MA) terms indicates
that no forecasting efficiency improvements can be achieved through
disaggregation. In practice, it is found that there is a strong
correlation between the statistical significance of the MA component in
the aggregate model and the magnitude of the forecast mean square error
(MSE) decreases that can be achieved through disaggregation. That is, if a
model includes significant MA terms, the forecast MSE improvements that
may be gained from disaggregation could be substantial. Otherwise, they
are more likely to be relatively small or non-existent.
Journal: Journal of Applied Statistics
Pages: 2298-2311
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909791
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909791
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2298-2311
Template-Type: ReDIF-Article 1.0
Author-Name: Marta Blangiardo
Author-X-Name-First: Marta
Author-X-Name-Last: Blangiardo
Author-Name: Gianluca Baio
Author-X-Name-First: Gianluca
Author-X-Name-Last: Baio
Title: Evidence of bias in the Eurovision song contest: modelling the votes using Bayesian hierarchical models
Abstract:
The Eurovision Song Contest is an annual musical competition held among
active members of the European Broadcasting Union since 1956. The event is
televised live across Europe. Each participating country presents a song
and receives a vote based on a combination of tele-voting and jury. Over
the years, this has led to speculations of tactical voting, discriminating
against some participants and thus inducing bias in the final results. In
this paper we investigate the presence of positive or negative bias (which
may roughly indicate favouritisms or discrimination) in the votes based on
geographical proximity, migration and cultural characteristics of the
participating countries through a Bayesian hierarchical model. Our
analysis found no evidence of negative bias, although mild positive bias
does seem to emerge systematically, linking voters to performers.
Journal: Journal of Applied Statistics
Pages: 2312-2322
Issue: 10
Volume: 41
Year: 2014
Month: 10
X-DOI: 10.1080/02664763.2014.909792
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:10:p:2312-2322
Template-Type: ReDIF-Article 1.0
Author-Name: R.A.B. Assumpção
Author-X-Name-First: R.A.B.
Author-X-Name-Last: Assumpção
Author-Name: M.A. Uribe-Opazo
Author-X-Name-First: M.A.
Author-X-Name-Last: Uribe-Opazo
Author-Name: M. Galea
Author-X-Name-First: M.
Author-X-Name-Last: Galea
Title: Analysis of local influence in geostatistics using Student's t-distribution
Abstract:
This article aims to estimate parameters of spatial variability with
Student's t-distribution by the EM algorithm and to present
a study of local influence by means of two methods known as likelihood
displacement and Q-displacement of likelihood, both using
Student's t-distribution with fixed degrees of freedom
(ν). The results showed that both methods are effective in the
identification of influential points.
Journal: Journal of Applied Statistics
Pages: 2323-2341
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.909793
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909793
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2323-2341
Template-Type: ReDIF-Article 1.0
Author-Name: Julian J. Faraway
Author-X-Name-First: Julian J.
Author-X-Name-Last: Faraway
Title: Regression for non-Euclidean data using distance matrices
Abstract:
Regression methods for common data types such as measured, count and
categorical variables are well understood but increasingly statisticians
need ways to model relationships between variable types such as shapes,
curves, trees, correlation matrices and images that do not fit into the
standard framework. Data types that lie in metric spaces but not in vector
spaces are difficult to use within the usual regression setting, as
the response and/or a predictor. We represent the information in these
variables using distance matrices which requires only the specification of
a distance function. A low-dimensional representation of such distance
matrices can be obtained using methods such as multidimensional scaling.
Once these variables have been represented as scores, an internal model
linking the predictors and the responses can be developed using standard
methods. We call the transformation from a new observation to a score
'scoring', whereas 'backscoring' represents a score as an observation in
the data space. Both methods are essential for prediction
and explanation. We illustrate the methodology for shape data,
unregistered curve data and correlation matrices using motion capture data
from an experiment to study the motion of children with cleft lip.
Journal: Journal of Applied Statistics
Pages: 2342-2357
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.909794
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909794
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2342-2357
Template-Type: ReDIF-Article 1.0
Author-Name: Maximilian Riedl
Author-X-Name-First: Maximilian
Author-X-Name-Last: Riedl
Author-Name: Ingo Geishecker
Author-X-Name-First: Ingo
Author-X-Name-Last: Geishecker
Title: Keep it simple: estimation strategies for ordered response models with fixed effects
Abstract:
By running Monte Carlo simulations, we compare different estimation
strategies of ordered response models in the presence of non-random
unobserved heterogeneity. We find that very simple binary recoding schemes
deliver parameter estimates with very low bias and high efficiency.
Furthermore, if the researcher is interested in the relative size of
parameters, the simple linear fixed effects model is the method of choice.
Journal: Journal of Applied Statistics
Pages: 2358-2374
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.909969
File-URL: http://hdl.handle.net/10.1080/02664763.2014.909969
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2358-2374
Template-Type: ReDIF-Article 1.0
Author-Name: Manoj Kumar Rastogi
Author-X-Name-First: Manoj Kumar
Author-X-Name-Last: Rastogi
Author-Name: Yogesh Mani Tripathi
Author-X-Name-First: Yogesh Mani
Author-X-Name-Last: Tripathi
Title: Estimation for an inverted exponentiated Rayleigh distribution under type II progressive censoring
Abstract:
In this paper, we consider estimation of unknown parameters of an inverted
exponentiated Rayleigh distribution under type II progressive censored
samples. Estimation of reliability and hazard functions is also
considered. Maximum likelihood estimators are obtained using the
Expectation--Maximization (EM) algorithm. Further, we obtain expected
Fisher information matrix using the missing value principle. Bayes
estimators are derived under squared error and linex loss functions. We
use the Lindley and the Tierney-Kadane methods to compute these
estimates. In addition, Bayes estimators are computed using importance
sampling scheme as well. Samples generated from this scheme are further
utilized for constructing highest posterior density intervals for unknown
parameters. For comparison purposes asymptotic intervals are also
obtained. A numerical comparison is made between proposed estimators using
simulations and observations are given. A real-life data set is analyzed
for illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 2375-2405
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.910500
File-URL: http://hdl.handle.net/10.1080/02664763.2014.910500
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2375-2405
Template-Type: ReDIF-Article 1.0
Author-Name: Eduardo Fé
Author-X-Name-First: Eduardo
Author-X-Name-Last: Fé
Title: Estimation and inference in regression discontinuity designs with asymmetric kernels
Abstract:
We study the behaviour of the Wald estimator of causal effects in
regression discontinuity design when local linear regression (LLR) methods
are combined with an asymmetric gamma kernel. We show that the resulting
statistic is no more complex to implement than existing methods, remains
consistent at the usual non-parametric rate, and maintains an asymptotic
normal distribution but, crucially, has bias and variance that do not
depend on kernel-related constants. As a result, the new estimator is more
efficient and yields more reliable inference. A limited Monte Carlo
experiment is used to illustrate the efficiency gains. As a by-product of
the main discussion, we extend previous published work by establishing the
asymptotic normality of the LLR estimator with a gamma kernel. Finally,
the new method is used in a substantive application.
Journal: Journal of Applied Statistics
Pages: 2406-2417
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.910638
File-URL: http://hdl.handle.net/10.1080/02664763.2014.910638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2406-2417
Template-Type: ReDIF-Article 1.0
Author-Name: Kalyan Das
Author-X-Name-First: Kalyan
Author-X-Name-Last: Das
Author-Name: Angshuman Sarkar
Author-X-Name-First: Angshuman
Author-X-Name-Last: Sarkar
Title: Robust inference for generalized partially linear mixed models that account for censored responses and missing covariates -- an application to Arctic data analysis
Abstract:
In this article, we propose a family of bounded influence robust estimates
for the parametric and non-parametric components of a generalized
partially linear mixed model that are subject to censored responses and
missing covariates. The asymptotic properties of the proposed estimates
are investigated. The estimates are obtained using a Monte Carlo
expectation--maximization algorithm. An approximate method that reduces
the computational time to a great extent is also proposed. A simulation
study shows that performances of the two approaches are similar in terms
of bias and mean square error. The analysis is illustrated through a study
on the effect of environmental factors on the phytoplankton cell count.
Journal: Journal of Applied Statistics
Pages: 2418-2436
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.910886
File-URL: http://hdl.handle.net/10.1080/02664763.2014.910886
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2418-2436
Template-Type: ReDIF-Article 1.0
Author-Name: Ji-Xia Wang
Author-X-Name-First: Ji-Xia
Author-X-Name-Last: Wang
Author-Name: Qing-Xian Xiao
Author-X-Name-First: Qing-Xian
Author-X-Name-Last: Xiao
Title: Local composite quantile regression estimation of time-varying parameter vector for multidimensional time-inhomogeneous diffusion models
Abstract:
This paper is dedicated to the study of the composite quantile regression
(CQR) estimations of time-varying parameter vectors for multidimensional
diffusion models. Based on the local linear fitting for parameter vectors,
we propose the local linear CQR estimations of the drift parameter
vectors, and verify their asymptotic biases, asymptotic variances and
asymptotic normality. Moreover, we discuss the asymptotic relative
efficiency (ARE) of the local linear CQR estimations with respect to the
local linear least-squares estimations. We show that the proposed local
estimations are much more efficient than the local linear least-squares
estimations. Simulation studies are conducted to show the performance of
the proposed estimations.
Journal: Journal of Applied Statistics
Pages: 2437-2449
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.911824
File-URL: http://hdl.handle.net/10.1080/02664763.2014.911824
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2437-2449
Template-Type: ReDIF-Article 1.0
Author-Name: Sean L. Simpson
Author-X-Name-First: Sean L.
Author-X-Name-Last: Simpson
Author-Name: Lloyd J. Edwards
Author-X-Name-First: Lloyd J.
Author-X-Name-Last: Edwards
Author-Name: Martin A. Styner
Author-X-Name-First: Martin A.
Author-X-Name-Last: Styner
Author-Name: Keith E. Muller
Author-X-Name-First: Keith E.
Author-X-Name-Last: Muller
Title: Separability tests for high-dimensional, low-sample size multivariate repeated measures data
Abstract:
Longitudinal imaging studies have moved to the forefront of medical
research due to their ability to characterize spatio-temporal features of
biological structures across the lifespan. Valid inference in longitudinal
imaging requires enough flexibility of the covariance model to allow
reasonable fidelity to the true pattern. On the other hand, the existence
of computable estimates demands a parsimonious parameterization of the
covariance structure. Separable (Kronecker product) covariance models
provide one such parameterization in which the spatial and temporal
covariances are modeled separately. However, evaluating the validity of
this parameterization in high dimensions remains a challenge. Here we
provide a scientifically informed approach to assessing the adequacy of
separable (Kronecker product) covariance models when the number of
observations is large relative to the number of independent sampling units
(sample size). We address both the general case, in which unstructured
matrices are considered for each covariance model, and the structured
case, which assumes a particular structure for each model. For the
structured case, we focus on the situation where the within-subject
correlation is believed to decrease exponentially in time and space as is
common in longitudinal imaging studies. However, the provided framework
equally applies to all covariance patterns used within the more general
multivariate repeated measures context. Our approach provides useful
guidance for high dimension, low-sample size data that preclude using
standard likelihood-based tests. Longitudinal medical imaging data of
caudate morphology in schizophrenia illustrate the approach's appeal.
Journal: Journal of Applied Statistics
Pages: 2450-2461
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.919251
File-URL: http://hdl.handle.net/10.1080/02664763.2014.919251
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2450-2461
Template-Type: ReDIF-Article 1.0
Author-Name: Amadou Sawadogo
Author-X-Name-First: Amadou
Author-X-Name-Last: Sawadogo
Author-Name: Dominique Lafon
Author-X-Name-First: Dominique
Author-X-Name-Last: Lafon
Author-Name: Simplice Dossou Gbété
Author-X-Name-First: Simplice Dossou
Author-X-Name-Last: Gbété
Title: Statistical analysis of rank data from a visual matching of colored textures
Abstract:
Nowadays, sensory properties of materials are subject to growing attention
both from a hedonic point of view and from a utilitarian one. Hence, the
formulation of the foundations of an instrumental metrological approach
that will allow for the characterization of visual similarities between
textures belonging to the same type becomes a challenge of the research
activities in the domain of perception. In this paper, our specific
objective is to link an instrumental approach of metrology of the
assessment of visual textures with a metrology approach based on a
softcopy experiment performed by human judges. The experiment consisted in
ranking isochromatic colored textures according to the visual contrast.
A fixed effects additive model is considered for the analysis of the rank
data collected from the softcopy experiment. The model is fitted to the
data using a least-squares criterion. The resulting data analysis gives
rise to a sensory scale that shows a non-linear correlation and a
monotonic functional relationship with the physical attribute on which the
ranking experiment is based. Furthermore, the capacity of the judges to
discriminate the textures according to the visual contrast varies
according to the color ranges and the texture types.
Journal: Journal of Applied Statistics
Pages: 2462-2482
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.920775
File-URL: http://hdl.handle.net/10.1080/02664763.2014.920775
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2462-2482
Template-Type: ReDIF-Article 1.0
Author-Name: Riten Mitra
Author-X-Name-First: Riten
Author-X-Name-Last: Mitra
Author-Name: Peter Müller
Author-X-Name-First: Peter
Author-X-Name-Last: Müller
Author-Name: Yuan Ji
Author-X-Name-First: Yuan
Author-X-Name-Last: Ji
Author-Name: Yitan Zhu
Author-X-Name-First: Yitan
Author-X-Name-Last: Zhu
Author-Name: Gordon Mills
Author-X-Name-First: Gordon
Author-X-Name-Last: Mills
Author-Name: Yiling Lu
Author-X-Name-First: Yiling
Author-X-Name-Last: Lu
Title: A Bayesian hierarchical model for inference across related reverse phase protein arrays experiments
Abstract:
We consider inference for functional proteomics experiments that record
protein activation over time following perturbation under different dose
levels of several drugs. The main inference goal is the dependence
structure of the selected proteins. A critical challenge is the lack of
sufficient data under any one drug and dose level to allow meaningful
inference on dependence structure. We propose a hierarchical model to
implement the desired inference. The key element of the model is a shared
dependence structure on (latent) binary indicators of protein activation.
Journal: Journal of Applied Statistics
Pages: 2483-2492
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.920776
File-URL: http://hdl.handle.net/10.1080/02664763.2014.920776
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2483-2492
Template-Type: ReDIF-Article 1.0
Author-Name: Maryam Karimi
Author-X-Name-First: Maryam
Author-X-Name-Last: Karimi
Author-Name: Sayed Mohammad Reza Alavi
Author-X-Name-First: Sayed Mohammad Reza
Author-X-Name-Last: Alavi
Title: The effect of weight function on hypothesis testing in weighted sampling
Abstract:
In this paper the problem of statistical hypothesis testing under weighted
sampling is considered for obtaining the most powerful test. Some
simulated powers of tests, using the Monte Carlo method, are performed.
Using a convenience sample of the specialist physicians of the Social
Security Organization of Ahvaz in Iran, two weighted samplings are tested
against random sampling. Among the three sampling schemes, size-biased
sampling of order 0.2 is the most appropriate for the mechanism of data
collection.
Journal: Journal of Applied Statistics
Pages: 2493-2503
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.920777
File-URL: http://hdl.handle.net/10.1080/02664763.2014.920777
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2493-2503
Template-Type: ReDIF-Article 1.0
Author-Name: Dengke Xu
Author-X-Name-First: Dengke
Author-X-Name-Last: Xu
Author-Name: Zhongzhan Zhang
Author-X-Name-First: Zhongzhan
Author-X-Name-Last: Zhang
Author-Name: Liucang Wu
Author-X-Name-First: Liucang
Author-X-Name-Last: Wu
Title: Bayesian analysis of joint mean and covariance models for longitudinal data
Abstract:
Efficient estimation of the regression coefficients in longitudinal data
analysis requires a correct specification of the covariance structure. If
misspecification occurs, it may lead to inefficient or biased estimators
of parameters in the mean. One of the most commonly used methods for
handling the covariance matrix is based on simultaneous modeling of the
Cholesky decomposition. Therefore, in this paper, we reparameterize
covariance structures in longitudinal data analysis through their modified
Cholesky decomposition. Based on this modified Cholesky
decomposition, the within-subject covariance matrix is decomposed into a
unit lower triangular matrix involving moving average coefficients and a
diagonal matrix involving innovation variances, which are modeled as
linear functions of covariates. Then, we propose a fully Bayesian
inference for joint mean and covariance models based on this
decomposition. A computationally efficient Markov chain Monte Carlo method
which combines the Gibbs sampler and Metropolis--Hastings algorithm is
implemented to simultaneously obtain the Bayesian estimates of unknown
parameters, as well as their standard deviation estimates. Finally,
several simulation studies and a real example are presented to illustrate
the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 2504-2514
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.920778
File-URL: http://hdl.handle.net/10.1080/02664763.2014.920778
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2504-2514
Template-Type: ReDIF-Article 1.0
Author-Name: Su Yun Kang
Author-X-Name-First: Su Yun
Author-X-Name-Last: Kang
Author-Name: James McGree
Author-X-Name-First: James
Author-X-Name-Last: McGree
Author-Name: Peter Baade
Author-X-Name-First: Peter
Author-X-Name-Last: Baade
Author-Name: Kerrie Mengersen
Author-X-Name-First: Kerrie
Author-X-Name-Last: Mengersen
Title: An investigation of the impact of various geographical scales for the specification of spatial dependence
Abstract:
Ecological studies are based on characteristics of groups of individuals,
which are common in various disciplines including epidemiology. It is of
great interest for epidemiologists to study the geographical variation of
a disease by accounting for the positive spatial dependence between
neighbouring areas. However, the choice of scale of the spatial
correlation requires much attention. In view of a lack of studies in this
area, this study aims to investigate the impact of differing definitions
of geographical scales using a multilevel model. We propose a new approach
-- grid-based partitions -- and compare it with the popular census region
approach. Unexplained geographical variation is accounted for via
area-specific unstructured random effects and spatially structured random
effects specified as an intrinsic conditional autoregressive process.
Using grid-based modelling of random effects in contrast to the census
region approach, we illustrate conditions where improvements are observed
in the estimation of the linear predictor, random effects, parameters, and
the identification of the distribution of residual risk and the aggregate
risk in a study region. The study has found that grid-based modelling is a
valuable approach for spatially sparse data while the statistical local
area-based and grid-based approaches perform equally well for spatially
dense data.
Journal: Journal of Applied Statistics
Pages: 2515-2538
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.920779
File-URL: http://hdl.handle.net/10.1080/02664763.2014.920779
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2515-2538
Template-Type: ReDIF-Article 1.0
Author-Name: N. Lu
Author-X-Name-First: N.
Author-X-Name-Last: Lu
Author-Name: T. Chen
Author-X-Name-First: T.
Author-X-Name-Last: Chen
Author-Name: P. Wu
Author-X-Name-First: P.
Author-X-Name-Last: Wu
Author-Name: D. Gunzler
Author-X-Name-First: D.
Author-X-Name-Last: Gunzler
Author-Name: H. Zhang
Author-X-Name-First: H.
Author-X-Name-Last: Zhang
Author-Name: H. He
Author-X-Name-First: H.
Author-X-Name-Last: He
Author-Name: X.M. Tu
Author-X-Name-First: X.M.
Author-X-Name-Last: Tu
Title: Functional response models for intraclass correlation coefficients
Abstract:
Intraclass correlation coefficients (ICC) are employed in a wide range of
behavioral, biomedical, psychosocial, and health care related research for
assessing reliability of continuous outcomes. The linear mixed-effects
model (LMM) is the most popular approach for inference about the ICC.
However, since LMM is a normal distribution-based model and non-normal
data are the norm rather than the exception in most studies, its
applications to real study data often call the validity of inference into
question. In this paper, we propose a distribution-free alternative to
provide robust inference based on the functional response models. We
illustrate the performance of the new approach using both real and
simulated data.
Journal: Journal of Applied Statistics
Pages: 2539-2556
Issue: 11
Volume: 41
Year: 2014
Month: 11
X-DOI: 10.1080/02664763.2014.920780
File-URL: http://hdl.handle.net/10.1080/02664763.2014.920780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:11:p:2539-2556
Template-Type: ReDIF-Article 1.0
Author-Name: Sanku Dey
Author-X-Name-First: Sanku
Author-X-Name-Last: Dey
Author-Name: Tanujit Dey
Author-X-Name-First: Tanujit
Author-X-Name-Last: Dey
Title: On progressively censored generalized inverted exponential distribution
Abstract:
A generalized version of inverted exponential distribution (IED) is
considered in this paper. This lifetime distribution is capable of
modeling various shapes of failure rates, and hence various shapes of
aging criteria. The model can be considered as another useful
two-parameter generalization of the IED. Maximum likelihood and Bayes
estimates for two parameters of the generalized inverted exponential
distribution (GIED) are obtained on the basis of a progressively type-II
censored sample. We also show the existence, uniqueness and finiteness
of the maximum likelihood estimates of the parameters of GIED based on
progressively type-II censored data. Bayesian estimates are obtained using
squared error loss function. These Bayesian estimates are evaluated by
applying Lindley's approximation method and via an importance sampling
technique. The importance sampling technique is used to compute the Bayes
estimates and the associated credible intervals. We further consider the
Bayes prediction problem based on the observed samples, and provide the
appropriate predictive intervals. Monte Carlo simulations are performed to
compare the performances of the proposed methods and a data set has been
analyzed for illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 2557-2576
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.922165
File-URL: http://hdl.handle.net/10.1080/02664763.2014.922165
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2557-2576
Template-Type: ReDIF-Article 1.0
Author-Name: Jiaqing Xu
Author-X-Name-First: Jiaqing
Author-X-Name-Last: Xu
Author-Name: Cheng Peng
Author-X-Name-First: Cheng
Author-X-Name-Last: Peng
Title: Fitting and testing the Marshall-Olkin extended Weibull model with randomly censored data
Abstract:
The random censorship model (RCM) is commonly used in biomedical science
for modeling life distributions. The popular non-parametric Kaplan-Meier
estimator and some semiparametric models such as Cox proportional hazard
models are extensively discussed in the literature. In this paper, we
propose to fit the RCM with the assumption that the actual life
distribution and the censoring distribution have a proportional odds
relationship. The parametric model is defined using Marshall-Olkin's
extended Weibull distribution. We utilize the maximum-likelihood procedure
to estimate model parameters, the survival distribution, the mean residual
life function, and the hazard rate as well. The proportional odds
assumption is also justified by the newly proposed bootstrap
Kolmogorov-Smirnov type goodness-of-fit test. A simulation study on the MLE
of model parameters and the median survival time is carried out to assess
the finite sample performance of the model. Finally, we implement the
proposed model on two real-life data sets.
Journal: Journal of Applied Statistics
Pages: 2577-2595
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.922166
File-URL: http://hdl.handle.net/10.1080/02664763.2014.922166
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2577-2595
Template-Type: ReDIF-Article 1.0
Author-Name: Athanasios Christou Micheas
Author-X-Name-First: Athanasios
Author-X-Name-Last: Christou Micheas
Title: Hierarchical Bayesian modeling of marked non-homogeneous Poisson processes with finite mixtures and inclusion of covariate information
Abstract:
We investigate marked non-homogeneous Poisson processes using finite
mixtures of bivariate normal components to model the spatial intensity
function. We employ a Bayesian hierarchical framework for estimation of
the parameters in the model, and propose an approach for including
covariate information in this context. The methodology is exemplified
through an application involving modeling of and inference for tornado
occurrences.
Journal: Journal of Applied Statistics
Pages: 2596-2615
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.922167
File-URL: http://hdl.handle.net/10.1080/02664763.2014.922167
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2596-2615
Template-Type: ReDIF-Article 1.0
Author-Name: Walmes Marques Zeviani
Author-X-Name-First: Walmes Marques
Author-X-Name-Last: Zeviani
Author-Name: Paulo Justiniano Ribeiro
Author-X-Name-First: Paulo Justiniano
Author-X-Name-Last: Ribeiro
Author-Name: Wagner Hugo Bonat
Author-X-Name-First: Wagner Hugo
Author-X-Name-Last: Bonat
Author-Name: Silvia Emiko Shimakura
Author-X-Name-First: Silvia Emiko
Author-X-Name-Last: Shimakura
Author-Name: Joel Augusto Muniz
Author-X-Name-First: Joel Augusto
Author-X-Name-Last: Muniz
Title: The Gamma-count distribution in the analysis of experimental underdispersed data
Abstract:
Event counts are response variables with non-negative integer values
representing the number of times that an event occurs within a fixed
domain such as a time interval, a geographical area or a cell of a
contingency table. Analysis of counts by Gaussian regression models
ignores the discreteness, asymmetry and heteroscedasticity and is
inefficient, providing unrealistic standard errors or possibly negative
predictions of the expected number of events. The Poisson regression is
the standard model for count data with underlying assumptions on the
generating process which may be implausible in many applications.
Statisticians have long recognized the limitation of imposing
equidispersion under the Poisson regression model. A typical situation is
when the conditional variance exceeds the conditional mean, in which case
models allowing for overdispersion are routinely used. Less reported is
the case of underdispersion with fewer modeling alternatives and
assessments available in the literature. One such alternative, the
Gamma-count model, is adopted here in the analysis of an agronomic
experiment designed to investigate the effect of levels of defoliation on
different phenological states upon the number of cotton bolls. Data set
and code for analysis are available as online supplements. Results show
improvements over the Poisson model and the semi-parametric quasi-Poisson
model in capturing the observed variability in the data. Estimating rather
than assuming the underlying variance process leads to important insights
into the process.
Journal: Journal of Applied Statistics
Pages: 2616-2626
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.922168
File-URL: http://hdl.handle.net/10.1080/02664763.2014.922168
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2616-2626
Template-Type: ReDIF-Article 1.0
Author-Name: Shin-Fu Tsai
Author-X-Name-First: Shin-Fu
Author-X-Name-Last: Tsai
Title: A generalized test variable approach for grain yield comparisons of rice
Abstract:
Traditionally, an assessment for grain yield of rice is to split it into
the yield components, including the number of panicles per plant, the
number of spikelets per panicle, the 1000-grain weight and the
filled-spikelet percentage, such that the yield performance can be
individually evaluated through each component, and the products of yield
components are employed for grain yield comparisons. However, when using
the standard statistical methods, such as the two-sample
t-test and analysis of variance, the assumptions of
normality and variance homogeneity cannot be fully justified for comparing
the grain yields, so that the empirical sizes cannot be adequately
controlled. In this study, based on the concepts of generalized test
variables and generalized p-values, a novel statistical
testing procedure is developed for grain yield comparisons of rice. The
proposed method is assessed by a series of numerical simulations.
According to the simulation results, the proposed method performs
reasonably well in Type I error control and empirical power. In addition,
a real-life field experiment is analyzed by the proposed method, and some
productive rice varieties are screened out and suggested for a follow-up
investigation.
Journal: Journal of Applied Statistics
Pages: 2627-2638
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.922169
File-URL: http://hdl.handle.net/10.1080/02664763.2014.922169
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2627-2638
Template-Type: ReDIF-Article 1.0
Author-Name: Yangyi Xu
Author-X-Name-First: Yangyi
Author-X-Name-Last: Xu
Author-Name: Inyoung Kim
Author-X-Name-First: Inyoung
Author-X-Name-Last: Kim
Author-Name: Patrick Schaumont
Author-X-Name-First: Patrick
Author-X-Name-Last: Schaumont
Title: Adaptive Bayes sum test for the equality of two nonparametric functions
Abstract:
The statistical difference among massive data sets or signals is of
interest to many diverse fields including neurophysiology, imaging,
engineering, and other related fields. However, such data often have
nonlinear curves, depending on spatial patterns, and have non-white noise
that leads to difficulties in testing the significant differences between
them. In this paper, we propose an adaptive Bayes sum test that can test
the significance between two nonlinear curves by taking into account
spatial dependence and by reducing the effect of non-white noise. Our
approach is developed by adapting the Bayes sum test statistic by Hart
[13]. The spatial pattern is treated through Fourier transformation.
Resampling techniques are employed to construct the empirical distribution
of test statistic to reduce the effect of non-white noise. A simulation
study suggests that our approach performs better than the alternative
method, the adaptive Neyman test by Fan and Lin [9]. The usefulness of our
approach is demonstrated with an application in the identification of
electronic chips as well as an application to test the change of pattern
of precipitations.
Journal: Journal of Applied Statistics
Pages: 2639-2657
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.925100
File-URL: http://hdl.handle.net/10.1080/02664763.2014.925100
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2639-2657
Template-Type: ReDIF-Article 1.0
Author-Name: R. Chen
Author-X-Name-First: R.
Author-X-Name-Last: Chen
Author-Name: T. Chen
Author-X-Name-First: T.
Author-X-Name-Last: Chen
Author-Name: N. Lu
Author-X-Name-First: N.
Author-X-Name-Last: Lu
Author-Name: H. Zhang
Author-X-Name-First: H.
Author-X-Name-Last: Zhang
Author-Name: P. Wu
Author-X-Name-First: P.
Author-X-Name-Last: Wu
Author-Name: C. Feng
Author-X-Name-First: C.
Author-X-Name-Last: Feng
Author-Name: X.M. Tu
Author-X-Name-First: X.M.
Author-X-Name-Last: Tu
Title: Extending the Mann-Whitney-Wilcoxon rank sum test to longitudinal regression analysis
Abstract:
Outliers are commonly observed in psychosocial research, generally
resulting in biased estimates when comparing group differences using
popular mean-based models such as the analysis of variance model.
Rank-based methods such as the popular Mann-Whitney-Wilcoxon (MWW) rank
sum test are more effective at addressing such outliers. However, available
methods for inference are limited to cross-sectional data and cannot be
applied to longitudinal studies under missing data. In this paper, we
propose a generalized MWW test for comparing multiple groups with
covariates within a longitudinal data setting, by utilizing the functional
response models. Inference is based on a class of U-statistics-based
weighted generalized estimating equations, providing consistent and
asymptotically normal estimates under both complete and missing data. The
proposed approach is illustrated with both real and simulated study data.
Journal: Journal of Applied Statistics
Pages: 2658-2675
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.925101
File-URL: http://hdl.handle.net/10.1080/02664763.2014.925101
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2658-2675
Template-Type: ReDIF-Article 1.0
Author-Name: Michael R. Crager
Author-X-Name-First: Michael R.
Author-X-Name-Last: Crager
Author-Name: Gong Tang
Author-X-Name-First: Gong
Author-X-Name-Last: Tang
Title: Patient-specific meta-analysis for risk assessment using multivariate proportional hazards regression
Abstract:
We propose a method for assessing an individual patient's risk of a future
clinical event using clinical trial or cohort data and Cox proportional
hazards regression, combining the information from several studies using
meta-analysis techniques. The method combines patient-specific estimates
of the log cumulative hazard across studies, weighting by the relative
precision of the estimates, using either fixed- or random-effects
meta-analysis calculations. Risk assessment can be done for any future
patient using a few key summary statistics determined once and for all
from each study. Generalizations of the method to logistic regression and
linear models are immediate. We evaluate the methods using simulation
studies and illustrate their application using real data.
Journal: Journal of Applied Statistics
Pages: 2676-2695
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.925102
File-URL: http://hdl.handle.net/10.1080/02664763.2014.925102
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2676-2695
Template-Type: ReDIF-Article 1.0
Author-Name: Delphine Maucort-Boulch
Author-X-Name-First: Delphine
Author-X-Name-Last: Maucort-Boulch
Author-Name: Pascal Roy
Author-X-Name-First: Pascal
Author-X-Name-Last: Roy
Author-Name: Janez Stare
Author-X-Name-First: Janez
Author-X-Name-Last: Stare
Title: On a measure of information gain for regression models in survival analysis
Abstract:
Papers dealing with measures of predictive power in survival analysis have
seen their independence of censoring, or their estimates being unbiased
under censoring, as the most important property. We argue that this
property has been wrongly understood. Discussing the so-called measure of
information gain, we point out that we cannot have unbiased estimates if
all values greater than a given time τ are censored. This is because
censoring before τ has a different effect than censoring
after τ. Such τ is often introduced by design of a study.
Independence can only be achieved under the assumption of the model being
valid after τ, which is impossible to verify. But if one is willing to
make such an assumption, we suggest using multiple imputation to obtain a
consistent estimate. We further show that censoring has different effects
on the estimation of the measure for the Cox model than for parametric
models, and we discuss them separately. We also give some warnings about
the usage of the measure, especially when it comes to comparing
essentially different models.
Journal: Journal of Applied Statistics
Pages: 2696-2708
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.926596
File-URL: http://hdl.handle.net/10.1080/02664763.2014.926596
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2696-2708
Template-Type: ReDIF-Article 1.0
Author-Name: Jonathan Jaeger
Author-X-Name-First: Jonathan
Author-X-Name-Last: Jaeger
Author-Name: Philippe Lambert
Author-X-Name-First: Philippe
Author-X-Name-Last: Lambert
Title: Bayesian penalized smoothing approaches in models specified using differential equations with unknown error distributions
Abstract:
A full Bayesian approach based on ordinary differential equation
(ODE)-penalized B-splines and penalized Gaussian mixture is proposed to
jointly estimate ODE-parameters, state function and error distribution
from the observation of some state functions involved in systems of affine
differential equations. Simulations inspired by pharmacokinetic (PK)
studies show that the proposed method provides comparable results to the
method based on the standard ODE-penalized B-spline approach (i.e. with
the Gaussian error distribution assumption) and outperforms the standard
ODE-penalized B-splines when the distribution is not Gaussian. This
methodology is illustrated on a PK data set.
Journal: Journal of Applied Statistics
Pages: 2709-2726
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.927839
File-URL: http://hdl.handle.net/10.1080/02664763.2014.927839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2709-2726
Template-Type: ReDIF-Article 1.0
Author-Name: Ram C. Kafle
Author-X-Name-First: Ram C.
Author-X-Name-Last: Kafle
Author-Name: Netra Khanal
Author-X-Name-First: Netra
Author-X-Name-Last: Khanal
Author-Name: Chris P. Tsokos
Author-X-Name-First: Chris P.
Author-X-Name-Last: Tsokos
Title: Bayesian age-stratified joinpoint regression model: an application to lung and brain cancer mortality
Abstract:
The joinpoint regression model identifies significant changes in the trends of
the incidence, mortality, and survival of a specific disease in a given
population. The purpose of the present study is to develop an
age-stratified Bayesian joinpoint regression model to describe mortality
trend assuming that the observed counts are probabilistically
characterized by the Poisson distribution. The proposed model is based on
Bayesian model selection criteria with the smallest number of joinpoints
that are sufficient to explain the Annual Percentage Change. The prior
probability distributions are chosen in such a way that they are
automatically derived from the model index contained in the model space.
The proposed model and methodology estimates the age-adjusted mortality
rates in different epidemiological studies to compare the trends by
accounting for the confounding effects of age. In developing the subject
methods, we use the cancer mortality counts of adult lung and bronchus
cancer, and brain and other Central Nervous System cancer patients
obtained from the Surveillance, Epidemiology, and End Results database of
the National Cancer Institute.
Journal: Journal of Applied Statistics
Pages: 2727-2742
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.927840
File-URL: http://hdl.handle.net/10.1080/02664763.2014.927840
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2727-2742
Template-Type: ReDIF-Article 1.0
Author-Name: Huaiye Zhang
Author-X-Name-First: Huaiye
Author-X-Name-Last: Zhang
Author-Name: Inyoung Kim
Author-X-Name-First: Inyoung
Author-X-Name-Last: Kim
Author-Name: Chun Gun Park
Author-X-Name-First: Chun Gun
Author-X-Name-Last: Park
Title: Semiparametric Bayesian hierarchical models for heterogeneous population in nonlinear mixed effect model: application to gastric emptying studies
Abstract:
Gastric emptying studies are frequently used in medical research, both
human and animal, when evaluating the effectiveness and determining the
unintended side-effects of new and existing medications, diets, and
procedures or interventions. It is essential that gastric emptying data be
appropriately summarized before comparisons are made between study groups
of interest. Since gastric emptying data
have a nonlinear emptying curve and are longitudinal data, nonlinear mixed
effect (NLME) models can accommodate both the variation among measurements
within individuals and the individual-to-individual variation. However,
the NLME model requires strong assumptions that are often not satisfied in
real applications that involve a relatively small number of subjects, have
heterogeneous measurement errors, or have large variation among subjects.
Therefore, we propose three semiparametric Bayesian NLMEs constructed with
Dirichlet process priors, which automatically cluster sub-populations and
estimate heterogeneous measurement errors. To compare the three
semiparametric models with the parametric model, we propose a penalized posterior Bayes
factor. We compare the performance of our semiparametric hierarchical
Bayesian approaches with that of the parametric Bayesian hierarchical
approach. Simulation results suggest that our semiparametric approaches
are more robust and flexible. Our gastric emptying studies from equine
medicine are used to demonstrate the advantage of our approaches.
Journal: Journal of Applied Statistics
Pages: 2743-2760
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.928848
File-URL: http://hdl.handle.net/10.1080/02664763.2014.928848
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2743-2760
Template-Type: ReDIF-Article 1.0
Author-Name: E. Bahrami Samani
Author-X-Name-First: E. Bahrami
Author-X-Name-Last: Samani
Title: Sensitivity analysis for the identifiability with application to latent random effect model for the mixed data
Abstract:
In this paper, we study the identifiability of a latent random effect
model for mixed correlated continuous and ordinal longitudinal
responses. We derive conditions for the identifiability of the covariance
parameters of the responses. We also propose a sensitivity analysis to
investigate perturbations arising from the non-identifiability of the
covariance parameters, showing how certain elements of the covariance
structure can be used. These elements provide conditions for the
identifiability of the covariance parameters of the responses. The
influence of small perturbations of these elements on the maximal normal
curvature is also studied. The model is illustrated using medical data.
Journal: Journal of Applied Statistics
Pages: 2761-2776
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.929641
File-URL: http://hdl.handle.net/10.1080/02664763.2014.929641
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2761-2776
Template-Type: ReDIF-Article 1.0
Author-Name: Božidar V. Popović
Author-X-Name-First: Božidar V.
Author-X-Name-Last: Popović
Title: Understanding advanced statistical methods
Journal: Journal of Applied Statistics
Pages: 2777-2777
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.913838
File-URL: http://hdl.handle.net/10.1080/02664763.2014.913838
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2777-2777
Template-Type: ReDIF-Article 1.0
Author-Name: Søren Feodor Nielsen
Author-X-Name-First: Søren Feodor
Author-X-Name-Last: Nielsen
Title: An introduction to analysis of financial data with R
Journal: Journal of Applied Statistics
Pages: 2777-2778
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.913839
File-URL: http://hdl.handle.net/10.1080/02664763.2014.913839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2777-2778
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Laberge
Author-X-Name-First: Yves
Author-X-Name-Last: Laberge
Title: Self-organised criticality: theory, models and characterisation
Journal: Journal of Applied Statistics
Pages: 2778-2779
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.913844
File-URL: http://hdl.handle.net/10.1080/02664763.2014.913844
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2778-2779
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano Ruiz
Author-X-Name-Last: Espejo
Title: Statistical methods for handling incomplete data
Journal: Journal of Applied Statistics
Pages: 2779-2780
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.913845
File-URL: http://hdl.handle.net/10.1080/02664763.2014.913845
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2779-2780
Template-Type: ReDIF-Article 1.0
Author-Name: Michail Tsagris
Author-X-Name-First: Michail
Author-X-Name-Last: Tsagris
Title: Statistics through resampling methods and R, second edition
Journal: Journal of Applied Statistics
Pages: 2780-2781
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.914130
File-URL: http://hdl.handle.net/10.1080/02664763.2014.914130
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2780-2781
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Laberge
Author-X-Name-First: Yves
Author-X-Name-Last: Laberge
Title: Risk modelling in general insurance from principles to practice
Journal: Journal of Applied Statistics
Pages: 2781-2782
Issue: 12
Volume: 41
Year: 2014
Month: 12
X-DOI: 10.1080/02664763.2014.913849
File-URL: http://hdl.handle.net/10.1080/02664763.2014.913849
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:41:y:2014:i:12:p:2781-2782
Template-Type: ReDIF-Article 1.0
Author-Name: Nilesh H. Shah
Author-X-Name-First: Nilesh H.
Author-X-Name-Last: Shah
Author-Name: Alison E. Hipwell
Author-X-Name-First: Alison E.
Author-X-Name-Last: Hipwell
Author-Name: Stephanie D. Stepp
Author-X-Name-First: Stephanie D.
Author-X-Name-Last: Stepp
Author-Name: Chung-Chou H. Chang
Author-X-Name-First: Chung-Chou H.
Author-X-Name-Last: Chang
Title: Measures of discrimination for latent group-based trajectory models
Abstract:
In clinical research, patient care decisions are often easier to make if
patients are classified into a manageable number of groups based on
homogeneous risk patterns. Investigators can use latent group-based
trajectory modeling to estimate the posterior probabilities that an
individual will be classified into a particular group of risk patterns.
Although this method is increasingly used in clinical research, there is
currently no measure that can be used to determine whether an individual's
group assignment has a high level of discrimination. In this study, we
propose a discrimination index and provide confidence intervals of the
probability of the assigned group for each individual. We also propose a
modified form of entropy to measure discrimination. The two proposed
measures were applied to assess the group assignments of the longitudinal
patterns of conduct disorders among young adolescent girls.
Journal: Journal of Applied Statistics
Pages: 1-11
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.928849
File-URL: http://hdl.handle.net/10.1080/02664763.2014.928849
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:1-11
Template-Type: ReDIF-Article 1.0
Author-Name: Arief Gusnanto
Author-X-Name-First: Arief
Author-X-Name-Last: Gusnanto
Author-Name: Yudi Pawitan
Author-X-Name-First: Yudi
Author-X-Name-Last: Pawitan
Title: Sparse alternatives to ridge regression: a random effects approach
Abstract:
In the calibration of a near-infrared (NIR) instrument, we regress some
chemical compositions of interest as a function of their NIR spectra. In
this process, we have two immediate challenges: first, the number of
variables exceeds the number of observations and, second, the
multicollinearity between variables is extremely high. To deal with these
challenges, prediction models that produce sparse solutions have recently
been proposed. The term 'sparse' means that some model parameters are
estimated to be exactly zero, while the other parameters are naturally estimated away from zero.
In effect, a variable selection is embedded in the model to potentially
achieve a better prediction. Many studies have investigated sparse
solutions for latent variable models, such as partial least squares and
principal component regression, and for direct regression models such as
ridge regression (RR). However, the latter mainly involves adding an
L1 norm penalty to the objective function,
as in lasso regression. In this study, we investigate new sparse alternative
models for RR within a random effects model framework, where we consider
Cauchy and mixture-of-normals distributions on the random effects. The
results indicate that the mixture-of-normals model produces a sparse
solution with good prediction and better interpretation. We illustrate the
methods using NIR spectra datasets from milk and corn specimens.
Journal: Journal of Applied Statistics
Pages: 12-26
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.929640
File-URL: http://hdl.handle.net/10.1080/02664763.2014.929640
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:12-26
Template-Type: ReDIF-Article 1.0
Author-Name: P. Elliott
Author-X-Name-First: P.
Author-X-Name-Last: Elliott
Author-Name: K. Riggs
Author-X-Name-First: K.
Author-X-Name-Last: Riggs
Title: Confidence regions for two proportions from independent negative binomial distributions
Abstract:
The negative binomial distribution offers an alternative view to the
binomial distribution for modeling count data. This alternative view is
particularly useful when the probability of success is very small,
because, unlike the fixed sampling scheme of the binomial distribution,
the inverse sampling approach allows one to collect enough data in order
to adequately estimate the proportion of success. However, despite work
that has been done on the joint estimation of two binomial proportions
from independent samples, there is little, if any, similar work for
negative binomial proportions. In this paper, we construct and investigate
three confidence regions for two negative binomial proportions based on
three statistics: the Wald (W), score (S) and likelihood ratio (LR)
statistics. For large-to-moderate sample sizes, this paper finds that all
three regions have good coverage properties, with comparable average areas
for large sample sizes but with the S method producing the smaller regions
for moderate sample sizes. In the small sample case, the LR method has
good coverage properties, but often at the expense of comparatively larger
areas. Finally, we apply these three regions to some real data for the
joint estimation of liver damage rates in patients taking one of two
drugs.
Journal: Journal of Applied Statistics
Pages: 27-36
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.929642
File-URL: http://hdl.handle.net/10.1080/02664763.2014.929642
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:27-36
Template-Type: ReDIF-Article 1.0
Author-Name: Stan Lipovetsky
Author-X-Name-First: Stan
Author-X-Name-Last: Lipovetsky
Title: Analytical closed-form solution for binary logit regression by categorical predictors
Abstract:
In contrast to the common belief that the logit model has no analytical
presentation, it is possible to find such a solution in the case of
categorical predictors. This paper shows that a binary logistic regression
by categorical explanatory variables can be constructed in a closed-form
solution. No special software and no iterative procedures of nonlinear
estimation are needed to obtain a model with all its parameters and
characteristics, including coefficients of regression, their standard
errors and t-statistics, as well as the residual and null
deviances. The derivation is performed for logistic models with one binary
or categorical predictor, and several binary or categorical predictors.
The analytical formulae can be used for arithmetical calculation of all
the parameters of the logit regression. The explicit expressions for the
characteristics of logit regression are convenient for the analysis and
interpretation of the results of logistic modeling.
Journal: Journal of Applied Statistics
Pages: 37-49
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.932760
File-URL: http://hdl.handle.net/10.1080/02664763.2014.932760
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:37-49
Template-Type: ReDIF-Article 1.0
Author-Name: Leena Pasanen
Author-X-Name-First: Leena
Author-X-Name-Last: Pasanen
Author-Name: Lasse Holmström
Author-X-Name-First: Lasse
Author-X-Name-Last: Holmström
Title: Bayesian scale space analysis of temporal changes in satellite images
Abstract:
We consider the detection of land cover changes using pairs of Landsat
ETM+ satellite images. The images consist of eight spectral bands and to
simplify the multidimensional change detection task, the image pair is
first transformed to a one-dimensional image. When the transformation is
non-linear, the true change in the images may be masked by complex noise.
For example, when changes in the Normalized Difference Vegetation Index are
considered, the variance of the noise may not be constant over the image and
methods based on image thresholding can be ineffective. To facilitate
detection of change in such situations, we propose an approach that uses
Bayesian statistical modeling and simulation-based inference. In order to
detect both large and small scale changes, our method uses a scale space
approach that employs multi-level smoothing. We demonstrate the technique
using artificial test images and two pairs of real Landsat ETM+ satellite
images.
Journal: Journal of Applied Statistics
Pages: 50-70
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.932761
File-URL: http://hdl.handle.net/10.1080/02664763.2014.932761
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:50-70
Template-Type: ReDIF-Article 1.0
Author-Name: Evgeny D. Maslennikov
Author-X-Name-First: Evgeny D.
Author-X-Name-Last: Maslennikov
Author-Name: Alexey V. Sulimov
Author-X-Name-First: Alexey V.
Author-X-Name-Last: Sulimov
Author-Name: Igor A. Savkin
Author-X-Name-First: Igor A.
Author-X-Name-Last: Savkin
Author-Name: Marina A. Evdokimova
Author-X-Name-First: Marina A.
Author-X-Name-Last: Evdokimova
Author-Name: Dmitry A. Zateyshchikov
Author-X-Name-First: Dmitry A.
Author-X-Name-Last: Zateyshchikov
Author-Name: Valery V. Nosikov
Author-X-Name-First: Valery V.
Author-X-Name-Last: Nosikov
Author-Name: Vladimir B. Sulimov
Author-X-Name-First: Vladimir B.
Author-X-Name-Last: Sulimov
Title: An intuitive risk factors search algorithm: usage of the Bayesian network technique in personalized medicine
Abstract:
The article focuses on the application of the Bayesian networks (BN)
technique to problems of personalized medicine. A simple (intuitive)
algorithm for BN optimization with respect to the number of nodes, using a
naive network topology, is developed. This algorithm makes it possible to
increase the BN prediction quality and to identify the most important
variables of the network. The parallel program implementing the algorithm
has demonstrated good scalability as the number of computational cores
increases, and it can be applied to large patient databases containing
thousands of variables. The program is applied to predicting the
unfavorable outcome of coronary artery disease (CAD) for patients who
survived acute coronary syndrome (ACS). As a result, the quality of the
predictions of the investigated networks was significantly improved and
the most important risk factors were detected. The significance of the
tumor necrosis factor-alpha gene polymorphism for predicting the
unfavorable outcome of CAD in patients who survived ACS was revealed
for the first time.
Journal: Journal of Applied Statistics
Pages: 71-87
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.934664
File-URL: http://hdl.handle.net/10.1080/02664763.2014.934664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:71-87
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas M. Hawkins
Author-X-Name-First: Douglas M.
Author-X-Name-Last: Hawkins
Author-Name: F. Lombard
Author-X-Name-First: F.
Author-X-Name-Last: Lombard
Title: Segmentation of circular data
Abstract:
Circular data - data whose values lie in the interval [0,2π) - are
important in a number of application areas. In some, there is a suspicion
that a sequence of circular readings may contain two or more segments
following different models. An analysis may then seek to decide whether
there are multiple segments, and if so, to estimate the changepoints
separating them. This paper presents an optimal method for segmenting
sequences of data following the von Mises distribution. It is shown by
example that the method is also successful in data following a
distribution with much heavier tails.
Journal: Journal of Applied Statistics
Pages: 88-97
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.934665
File-URL: http://hdl.handle.net/10.1080/02664763.2014.934665
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:88-97
Template-Type: ReDIF-Article 1.0
Author-Name: Peter Malave
Author-X-Name-First: Peter
Author-X-Name-Last: Malave
Author-Name: Arkadiusz Sitek
Author-X-Name-First: Arkadiusz
Author-X-Name-Last: Sitek
Title: Bayesian analysis of a one-compartment kinetic model used in medical imaging
Abstract:
Kinetic models are used extensively in science, engineering, and medicine.
Mathematically, they are a set of coupled differential equations including
a source function, otherwise known as an input function. We investigate
whether parametric modeling of a noisy input function offers any benefit
over the non-parametric input function in estimating kinetic parameters.
Our analysis includes four formulations of Bayesian posteriors of model
parameters where noise is taken into account in the likelihood functions.
Posteriors are determined numerically with a Markov chain Monte Carlo
simulation. We compare point estimates derived from the posteriors to a
weighted non-linear least squares estimate. Results imply that parametric
modeling of the input function does not improve the accuracy of model
parameters, even with perfect knowledge of the functional form. Posteriors
are validated using an unconventional application of the χ² test. We
demonstrate that if the noise in the input function
is not taken into account, the resulting posteriors are incorrect.
Journal: Journal of Applied Statistics
Pages: 98-113
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.934666
File-URL: http://hdl.handle.net/10.1080/02664763.2014.934666
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:98-113
Template-Type: ReDIF-Article 1.0
Author-Name: Afsane Rastegaran
Author-X-Name-First: Afsane
Author-X-Name-Last: Rastegaran
Author-Name: Mohammad Reza Zadkarami
Author-X-Name-First: Mohammad Reza
Author-X-Name-Last: Zadkarami
Title: A skew-normal random effects model for longitudinal ordinal categorical responses with missing data
Abstract:
Missing values are common in longitudinal data studies. The missing data
mechanism is termed non-ignorable (NI) if the probability of missingness
depends on the non-response (missing) observations. This paper presents a
model for the ordinal categorical longitudinal data with NI non-monotone
missing values. We assumed two separate models for the response and
missing procedure. The response is modeled as ordinal logistic, whereas
the logistic binary model is considered for the missing process. We employ
these models in the context of so-called shared-parameter models, where
the outcome and missing data models are connected by a common set of
random effects. It is commonly assumed that the random effect follows the
normal distribution in longitudinal data with or without missing data.
This can be extremely restrictive in practice, and it may result in
misleading statistical inferences. In this paper, we instead adopt a more
flexible alternative distribution which is called the skew-normal
distribution. The methodology is illustrated through an application to
Schizophrenia Collaborative Study data [19] and a simulation.
Journal: Journal of Applied Statistics
Pages: 114-126
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.938223
File-URL: http://hdl.handle.net/10.1080/02664763.2014.938223
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:114-126
Template-Type: ReDIF-Article 1.0
Author-Name: Ronaldo Dias
Author-X-Name-First: Ronaldo
Author-X-Name-Last: Dias
Author-Name: Nancy L. Garcia
Author-X-Name-First: Nancy L.
Author-X-Name-Last: Garcia
Author-Name: Guilherme Ludwig
Author-X-Name-First: Guilherme
Author-X-Name-Last: Ludwig
Author-Name: Marley A. Saraiva
Author-X-Name-First: Marley A.
Author-X-Name-Last: Saraiva
Title: Aggregated functional data model for near-infrared spectroscopy calibration and prediction
Abstract:
Calibration and prediction for NIR spectroscopy data are performed based
on a functional interpretation of the Beer-Lambert formula. Considering
that, for each chemical sample, the resulting spectrum is a continuous
curve obtained as the summation of overlapped absorption spectra from each
analyte plus a Gaussian error, we assume that each individual spectrum can
be expanded as a linear combination of B-spline basis functions. Calibration is
then performed using two procedures for estimating the individual
analytes' curves: basis smoothing and smoothing splines. Prediction is
done by minimizing the square error of prediction. To assess the variance
of the predicted values, we use a leave-one-out jackknife technique.
Departures from the standard error models are discussed through a
simulation study, in particular, how correlated errors impact on the
calibration step and consequently on the analytes' concentration
prediction. Finally, the performance of our methodology is demonstrated
through the analysis of two publicly available datasets.
Journal: Journal of Applied Statistics
Pages: 127-143
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.938224
File-URL: http://hdl.handle.net/10.1080/02664763.2014.938224
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:127-143
Template-Type: ReDIF-Article 1.0
Author-Name: Ying-Ju Chen
Author-X-Name-First: Ying-Ju
Author-X-Name-Last: Chen
Author-Name: Wei Ning
Author-X-Name-First: Wei
Author-X-Name-Last: Ning
Author-Name: Arjun K. Gupta
Author-X-Name-First: Arjun K.
Author-X-Name-Last: Gupta
Title: Jackknife empirical likelihood method for testing the equality of two variances
Abstract:
In this paper, we propose a nonparametric method based on jackknife
empirical likelihood ratio to test the equality of two variances. The
asymptotic distribution of the test statistic has been shown to follow a
χ² distribution with one degree of freedom. Simulations have
been conducted to show the type I error and the power compared to Levene's
test and the F test under different distribution settings.
The proposed method has been applied to a real data set to illustrate the
testing procedure.
Journal: Journal of Applied Statistics
Pages: 144-160
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.938225
File-URL: http://hdl.handle.net/10.1080/02664763.2014.938225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:144-160
Template-Type: ReDIF-Article 1.0
Author-Name: Enrico Ciavolino
Author-X-Name-First: Enrico
Author-X-Name-Last: Ciavolino
Author-Name: Maurizio Carpita
Author-X-Name-First: Maurizio
Author-X-Name-Last: Carpita
Author-Name: Amjad Al-Nasser
Author-X-Name-First: Amjad
Author-X-Name-Last: Al-Nasser
Title: Modelling the quality of work in the Italian social co-operatives combining NPCA-RSM and SEM-GME approaches
Abstract:
The objective of this paper is to describe and analyse with appropriate
statistical models the links between work quality latent factors. Due to
the complexity of the task, the analysis is carried out through a two-step
approach. In the first step, we construct some multidimensional measures
of the subjective quality of work, using nonlinear principal component
analysis (NPCA) and Rasch analysis with the Rating Scale Model
(NPCA-RSM). In the second step, we adopt a Structural Equation Model
based on generalized maximum entropy (SEM-GME) to integrate the measures
obtained in the previous step and to evaluate the relationships between
the subjective work quality latent factors.
Therefore, the novel aspects of this paper are the following: (i) The
integration between the NPCA-RSM and SEM-GME, which allows reduction of
the variables analysed and evaluation of the measurement errors; (ii) The
formalization of a Job Satisfaction Model for the study of the
relationships between the subjective work quality latent factors in the
Italian social services sector.
Journal: Journal of Applied Statistics
Pages: 161-179
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.938226
File-URL: http://hdl.handle.net/10.1080/02664763.2014.938226
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:161-179
Template-Type: ReDIF-Article 1.0
Author-Name: Krystallenia Drosou
Author-X-Name-First: Krystallenia
Author-X-Name-Last: Drosou
Author-Name: Andreas Artemiou
Author-X-Name-First: Andreas
Author-X-Name-Last: Artemiou
Author-Name: Christos Koukouvinos
Author-X-Name-First: Christos
Author-X-Name-Last: Koukouvinos
Title: A comparative study of the use of large margin classifiers on seismic data
Abstract:
In this work we present a study on the analysis of a large data set from
seismology. A set of different large margin classifiers based on the
well-known support vector machine (SVM) algorithm is used to classify the
data into two classes based on their magnitude on the Richter scale. Due
to the imbalanced nature of the two classes, reweighting techniques
are used to show the importance of reweighting algorithms. Moreover, we
present an incremental algorithm to explore the possibility of predicting
the strength of an earthquake with incremental techniques.
Journal: Journal of Applied Statistics
Pages: 180-201
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.938619
File-URL: http://hdl.handle.net/10.1080/02664763.2014.938619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:180-201
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas Apergis
Author-X-Name-First: Nicholas
Author-X-Name-Last: Apergis
Author-Name: Christina Christou
Author-X-Name-First: Christina
Author-X-Name-Last: Christou
Author-Name: James E. Payne
Author-X-Name-First: James E.
Author-X-Name-Last: Payne
Author-Name: James W. Saunoris
Author-X-Name-First: James W.
Author-X-Name-Last: Saunoris
Title: The change in real interest rate persistence in OECD countries: evidence from modified panel ratio tests
Abstract:
This study examines whether real interest rates exhibit changes in
persistence for a panel of Organisation for Economic Co-operation and
Development (OECD) countries. The findings show that for long-term real interest
rates there are changes in persistence from I(0) to I(1). For short-term
real interest rates, the results display the absence of changes in
persistence, while under cross-sectional dependence there is only weak
evidence of changes in persistence from I(1) to I(0). The evidence of
changes in persistence when the direction is considered unknown is even
weaker.
Journal: Journal of Applied Statistics
Pages: 202-213
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.938620
File-URL: http://hdl.handle.net/10.1080/02664763.2014.938620
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:202-213
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Martin Barrios
Author-X-Name-First: Juan Martin
Author-X-Name-Last: Barrios
Author-Name: Eliane R. Rodrigues
Author-X-Name-First: Eliane R.
Author-X-Name-Last: Rodrigues
Title: A queueing model to study the occurrence and duration of ozone exceedances in Mexico City
Abstract:
It is well known that long-term exposure to high levels of pollution is
hazardous to human health. Therefore, it is important to study and
understand the behavior of pollutants in general. In this work, we study
the occurrence of a pollutant concentration's surpassing a given threshold
(an exceedance) as well as the length of time that the concentration stays
above it. A general
N(t)/D/1 queueing model
is considered to jointly analyze those problems. A non-homogeneous Poisson
process is used to model the arrivals of clusters of exceedances.
Geometric and generalized negative binomial distributions are used to
model the amount of time (cluster size) that the pollutant concentration
stays above the threshold. A mixture model is also used for the cluster
size distribution. The rate function of the non-homogeneous Poisson
process is assumed to be of either the Weibull or the Musa-Okumoto type.
The selection of the model that best fits the data is performed using the
Bayes discrimination method and the sum of absolute differences as well as
using a graphical criterion. Results are applied to the daily maximum
ozone measurements provided by the monitoring network of the Metropolitan
Area of Mexico City.
Journal: Journal of Applied Statistics
Pages: 214-230
Issue: 1
Volume: 42
Year: 2015
Month: 1
X-DOI: 10.1080/02664763.2014.939613
File-URL: http://hdl.handle.net/10.1080/02664763.2014.939613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:1:p:214-230
Template-Type: ReDIF-Article 1.0
Author-Name: Zeinab Amin
Author-X-Name-First: Zeinab
Author-X-Name-Last: Amin
Author-Name: Maram Salem
Author-X-Name-First: Maram
Author-X-Name-Last: Salem
Title: Bayesian modelling of health insurance losses
Abstract:
The purpose of this paper is to build a model for aggregate losses which
constitutes a crucial step in evaluating premiums for health insurance
systems. It aims at obtaining the predictive distribution of the aggregate
loss within each age class of insured persons over the time horizon
involved in planning, employing Bayesian methodology. The model
proposed using the Bayesian approach is a generalization of the collective
risk model, a commonly used model for analysing risk of an insurance
system. Aggregate loss prediction is based on past information on size of
loss, number of losses and size of population at risk. In modelling the
frequency and severity of losses, the number of losses is assumed to
follow a negative binomial distribution, individual loss sizes are
independent and identically distributed exponential random variables,
while the number of insured persons in a finite number of possible age
groups is assumed to follow the multinomial distribution. Prediction of
aggregate losses is based on the Gibbs sampling algorithm which
incorporates the missing data approach.
Journal: Journal of Applied Statistics
Pages: 231-251
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.947247
File-URL: http://hdl.handle.net/10.1080/02664763.2014.947247
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:231-251
Template-Type: ReDIF-Article 1.0
Author-Name: Wagner Hugo Bonat
Author-X-Name-First: Wagner Hugo
Author-X-Name-Last: Bonat
Author-Name: Paulo Justiniano Ribeiro
Author-X-Name-First: Paulo Justiniano
Author-X-Name-Last: Ribeiro
Author-Name: Walmes Marques Zeviani
Author-X-Name-First: Walmes Marques
Author-X-Name-Last: Zeviani
Title: Likelihood analysis for a class of beta mixed models
Abstract:
Beta regression is a suitable choice for modelling continuous response
variables taking values on the unit interval. Hierarchical, repeated
measures and longitudinal data structures typically induce extra
variability and/or dependence, which can be accounted for by the inclusion
of random effects. Consequently, statistical inference typically requires
numerical methods, possibly combined with sampling algorithms. A class of
Beta mixed models is adopted for the analysis of two real problems with
grouped data structures. We focus on likelihood inference and describe the
implemented algorithms. The first is a study on the life quality index of
industry workers with data collected according to a hierarchical sampling
scheme. The second is a study assessing the impact of hydroelectric power
plants upon water quality indexes upstream, downstream and at the
reservoirs of the dammed rivers, with a nested and longitudinal data
structure. Results from different algorithms, including data cloning (an
alternative to numerical approximations that also allows identifiability
to be assessed), are reported for comparison. Confidence intervals based on
profiled likelihoods are compared with those obtained by asymptotic
quadratic approximations, showing relevant differences for parameters
related to the random effects. In both cases, the scientific hypothesis of
interest was investigated by comparing alternative models, leading to
relevant interpretations of the results within each context.
Journal: Journal of Applied Statistics
Pages: 252-266
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.947248
File-URL: http://hdl.handle.net/10.1080/02664763.2014.947248
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:252-266
Template-Type: ReDIF-Article 1.0
Author-Name: Jairo Alberto Fúquene Patiño
Author-X-Name-First: Jairo Alberto
Author-X-Name-Last: Fúquene Patiño
Title: A semi-parametric Bayesian extreme value model using a Dirichlet process mixture of gamma densities
Abstract:
In this paper, we propose a model with a Dirichlet process mixture of
gamma densities in the bulk part below threshold and a generalized Pareto
density in the tail for extreme value estimation. The proposed model is
simple and flexible for posterior density estimation and posterior
inference for high quantiles. The model works well even for small sample
sizes and in the absence of prior information. We evaluate the performance
of the proposed model through a simulation study. Finally, the proposed
model is applied to a real environmental data set.
Journal: Journal of Applied Statistics
Pages: 267-280
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.947357
File-URL: http://hdl.handle.net/10.1080/02664763.2014.947357
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:267-280
Template-Type: ReDIF-Article 1.0
Author-Name: Young-Ju Kim
Author-X-Name-First: Young-Ju
Author-X-Name-Last: Kim
Title: Nonparametric estimation of varying-coefficient single-index models
Abstract:
The varying-coefficient single-index model has two distinguishing
features: partially linear varying-coefficient functions and a
single-index structure. This paper proposes a nonparametric method based
on smoothing splines for estimating varying-coefficient functions and an
unknown link function. Moreover, the average derivative estimation method
is applied to obtain the single-index parameter estimates. For interval
inference, Bayesian confidence intervals are obtained based on Bayes
models for varying-coefficient functions and the link function. The
performance of the proposed method is examined both through simulations
and by applying it to Boston housing data.
Journal: Journal of Applied Statistics
Pages: 281-291
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.947358
File-URL: http://hdl.handle.net/10.1080/02664763.2014.947358
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:281-291
Template-Type: ReDIF-Article 1.0
Author-Name: Xinzhong Bao
Author-X-Name-First: Xinzhong
Author-X-Name-Last: Bao
Author-Name: Qiuyan Tao
Author-X-Name-First: Qiuyan
Author-X-Name-Last: Tao
Author-Name: Hongyu Fu
Author-X-Name-First: Hongyu
Author-X-Name-Last: Fu
Title: Dynamic financial distress prediction based on Kalman filtering
Abstract:
In models for predicting financial distress, ranging from traditional
statistical models to artificial intelligence models, scholars have
primarily paid attention to improving predictive accuracy and to the
sophistication of the forecasting methods. However,
the extant models use static or short-term data rather than time-series
data to draw inferences on future financial distress. If financial
distress occurs at the end of a progressive process, then omitting time
series of historical financial ratios from the analysis ignores the
cumulative effect of previous financial ratios on the current
consequences. This study incorporated the cumulative characteristics of
financial distress by using the characteristics of a state space model
that is able to perform long-term forecasts to dynamically predict an
enterprise's financial distress. Kalman filtering is used to estimate the
model parameters. Thus, the model constructed in this paper is a dynamic
financial prediction model that has the benefit of forecasting over the
long term. Additionally, current data are used to forecast the future
annual financial position and to judge whether the enterprise will be
in financial distress.
Journal: Journal of Applied Statistics
Pages: 292-308
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.947359
File-URL: http://hdl.handle.net/10.1080/02664763.2014.947359
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:292-308
Template-Type: ReDIF-Article 1.0
Author-Name: Haiyan Zhao
Author-X-Name-First: Haiyan
Author-X-Name-Last: Zhao
Author-Name: Fred Huffer
Author-X-Name-First: Fred
Author-X-Name-Last: Huffer
Author-Name: Xu-Feng Niu
Author-X-Name-First: Xu-Feng
Author-X-Name-Last: Niu
Title: Time-varying coefficient models with ARMA-GARCH structures for longitudinal data analysis
Abstract:
Time-varying coefficient models with autoregressive and
moving-average-generalized autoregressive conditional heteroscedasticity
structure are proposed for examining the time-varying effects of risk
factors in longitudinal studies. Compared with existing models in the
literature, the proposed models give explicit patterns for the
time-varying coefficients. Maximum likelihood and marginal likelihood
(based on a Laplace approximation) are used to estimate the parameters in
the proposed models. Simulation studies are conducted to evaluate the
performance of these two estimation methods, which is measured in terms of
the Kullback-Leibler divergence and the root mean square error. The
marginal likelihood approach leads to the more accurate parameter
estimates, although it is more computationally intensive. The proposed
models are applied to the Framingham Heart Study to investigate the
time-varying effects of covariates on coronary heart disease incidence.
The Bayesian information criterion is used for specifying the time series
structures of the coefficients of the risk factors.
Journal: Journal of Applied Statistics
Pages: 309-326
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.949638
File-URL: http://hdl.handle.net/10.1080/02664763.2014.949638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:309-326
Template-Type: ReDIF-Article 1.0
Author-Name: Yan Fang
Author-X-Name-First: Yan
Author-X-Name-Last: Fang
Author-Name: Ling Liu
Author-X-Name-First: Ling
Author-X-Name-Last: Liu
Author-Name: JinZhi Liu
Author-X-Name-First: JinZhi
Author-X-Name-Last: Liu
Title: A dynamic double asymmetric copula generalized autoregressive conditional heteroskedasticity model: application to China's and US stock market
Abstract:
Modeling the relationship between multiple financial markets has received
a great deal of attention in both the literature and real-life
applications. One state-of-the-art technique models each individual
financial market by a generalized autoregressive conditional
heteroskedasticity (GARCH) process, while market dependence is modeled by
a copula, e.g. the dynamic asymmetric copula-GARCH. As an extension, we propose a dynamic
double asymmetric copula (DDAC)-GARCH model to allow for the joint
asymmetry caused by the negative shocks as well as by the copula model.
Furthermore, our model adopts a more intuitive way of constructing the
sample correlation matrix. Our new model nevertheless satisfies the
positive-definite condition as found in dynamic conditional
correlation-GARCH and constant conditional correlation-GARCH models. The
simulation study shows the performance of the maximum likelihood estimate
for DDAC-GARCH model. As a case study, we apply this model to examine the
dependence between the Chinese and US stock markets since the 1990s. We
conduct a series of likelihood ratio tests that demonstrate our extension
(dynamic double joint asymmetry) is adequate in dynamic dependence
modeling. Also, we propose a simulation method involving the DDAC-GARCH
model to estimate the value at risk (VaR) of a portfolio. Our study shows
that the proposed method depicts VaR much better than the well-established
variance-covariance method.
Journal: Journal of Applied Statistics
Pages: 327-346
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.949639
File-URL: http://hdl.handle.net/10.1080/02664763.2014.949639
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:327-346
Template-Type: ReDIF-Article 1.0
Author-Name: Kofi Placid Adragni
Author-X-Name-First: Kofi Placid
Author-X-Name-Last: Adragni
Title: Independent screening in high-dimensional exponential family predictors' space
Abstract:
We present a methodology for screening predictors that, given the
response, follow a one-parameter exponential family distribution.
Screening predictors can be an important step in regression when the
number of predictors p is excessively large or larger
than the number of observations n. We consider instances
where a large number of predictors are suspected irrelevant for having no
information about the response. The proposed methodology helps remove
these irrelevant predictors while capturing those linearly or nonlinearly
related to the response.
Journal: Journal of Applied Statistics
Pages: 347-359
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.949640
File-URL: http://hdl.handle.net/10.1080/02664763.2014.949640
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:347-359
Template-Type: ReDIF-Article 1.0
Author-Name: Pao-Sheng Shen
Author-X-Name-First: Pao-Sheng
Author-X-Name-Last: Shen
Title: Median regression model with doubly truncated data
Abstract:
We study the problem of fitting a heteroscedastic median regression model
with doubly truncated data. A self-consistency equation is proposed to
obtain an estimator. We set up a least absolute deviation estimating
function. We establish the consistency and asymptotic normality for the
case when covariates are discrete. The finite sample performance of the
proposed estimators is investigated through simulation studies. The
proposed method is illustrated using the AIDS Blood Transfusion Data.
Journal: Journal of Applied Statistics
Pages: 360-370
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.951602
File-URL: http://hdl.handle.net/10.1080/02664763.2014.951602
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:360-370
Template-Type: ReDIF-Article 1.0
Author-Name: Kouji Tahata
Author-X-Name-First: Kouji
Author-X-Name-Last: Tahata
Author-Name: Takuya Yoshimoto
Author-X-Name-First: Takuya
Author-X-Name-Last: Yoshimoto
Title: Marginal asymmetry model for square contingency tables with ordered categories
Abstract:
For the analysis of square contingency tables with ordered categories,
this paper proposes a model which indicates the structure of marginal
asymmetry. The model states that the absolute value of the logarithm of
the ratio of the cumulative probability that an observation will fall in
row category i or below and column category
i+1 or above to the corresponding cumulative probability
that the observation falls in column category i or below
and row category i+1 or above is constant for every
i. We deal with the estimation problem for the model
parameter and goodness-of-fit tests. We also discuss the relationship
between the model and a measure which represents the degree of departure
from marginal homogeneity. Examples are given.
Journal: Journal of Applied Statistics
Pages: 371-379
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.951603
File-URL: http://hdl.handle.net/10.1080/02664763.2014.951603
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:371-379
Template-Type: ReDIF-Article 1.0
Author-Name: Bertil Wegmann
Author-X-Name-First: Bertil
Author-X-Name-Last: Wegmann
Title: Bayesian comparison of private and common values in structural second-price auctions
Abstract:
Private and common values (CVs) are the two main competing valuation
models in auction theory and empirical work. In the framework of
second-price auctions, we compare the empirical performance of the
independent private value (IPV) model to the CV model on a number of
different dimensions, both on real data from eBay coin auctions and on
simulated data. Both models fit the eBay data well with a slight edge for
the CV model. However, the differences between the fit of the models seem
to depend to some extent on the complexity of the models. According to the
log predictive score, the IPV model predicts auction prices slightly better in
most auctions, while the more robust CV model is much better at predicting
auction prices in more unusual auctions. In terms of posterior odds, the
CV model is clearly more supported by the eBay data.
Journal: Journal of Applied Statistics
Pages: 380-397
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.951604
File-URL: http://hdl.handle.net/10.1080/02664763.2014.951604
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:380-397
Template-Type: ReDIF-Article 1.0
Author-Name: T. Górecki
Author-X-Name-First: T.
Author-X-Name-Last: Górecki
Title: Sequential combining in discriminant analysis
Abstract:
In practice, it often happens that we have a number of base methods of
classification. We are not able to clearly determine which method is
optimal in the sense of the smallest error rate. Combining methods allow
us to consolidate information from multiple sources into a better
classifier. I propose a different, sequential approach. Sequentiality is
understood here in the sense of adding posterior probabilities to the
original data set, and the data so created are used during the
classification process. We combine posterior probabilities
obtained from base classifiers using all combining methods. Finally, we
combine these probabilities using a mean combining method. To the original
data set we add obtained posterior probabilities as additional features.
In each step we change our additional probabilities to achieve the minimum
error rate for base methods. Experimental results on different data sets
demonstrate that the method is efficient and that this approach
outperforms the base methods, providing a reduction in the mean
classification error rate.
Journal: Journal of Applied Statistics
Pages: 398-408
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.951605
File-URL: http://hdl.handle.net/10.1080/02664763.2014.951605
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:398-408
Template-Type: ReDIF-Article 1.0
Author-Name: Sterling McPherson
Author-X-Name-First: Sterling
Author-X-Name-Last: McPherson
Author-Name: Celestina Barbosa-Leiker
Author-X-Name-First: Celestina
Author-X-Name-Last: Barbosa-Leiker
Title: Biomarker classification derived from finite growth mixture modeling with a time-varying covariate: an example with phosphorus and glomerular filtration rate
Abstract:
Finite growth mixture modeling may prove extremely useful for identifying
initial pharmacotherapeutic targets for clinical intervention purposes in
chronic kidney disease. The primary goal of this research is to
demonstrate and describe the process of identifying a longitudinal
classification scheme to guide timing and dose of treatment in future
randomized clinical trials. After discussing the statistical architecture,
we describe the model selection and fit criteria in detail before
selecting our final 4-class solution (BIC = 1612.577, BLRT of
p > .001). The first class (highly elevated group) had an
average starting point of 3.969 mg/dl of phosphorus at Visit 1, and
increased 0.143 every two years until Visit 4. The second, elevated class
had an average starting point of 3.460 mg/dl of phosphorus at Visit 1, and
increased 0.101 every two years until Visit 4. The normative class had an
average starting point of 3.019 mg/dl of phosphorus at Visit 1, and
increased 0.099 every two years until Visit 4. Lastly, the low class had
an average starting point of 2.525 mg/dl of phosphorus at Visit 1, and
increased 0.158 every two years until Visit 4. We hope that this example
will spur future applications in biomedical sciences in order to refine
therapeutic targets and/or construct long-term risk categories.
Journal: Journal of Applied Statistics
Pages: 409-427
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.957263
File-URL: http://hdl.handle.net/10.1080/02664763.2014.957263
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:409-427
Template-Type: ReDIF-Article 1.0
Author-Name: Lei Shi
Author-X-Name-First: Lei
Author-X-Name-Last: Shi
Author-Name: Md. Mostafizur Rahman
Author-X-Name-First: Md. Mostafizur
Author-X-Name-Last: Rahman
Author-Name: Wen Gan
Author-X-Name-First: Wen
Author-X-Name-Last: Gan
Author-Name: Jianhua Zhao
Author-X-Name-First: Jianhua
Author-X-Name-Last: Zhao
Title: Stepwise local influence in generalized autoregressive conditional heteroskedasticity models
Abstract:
Detection of outliers or influential observations is an important task in
statistical modeling, especially for correlated time series data. In
this paper we propose a new procedure to detect patches of influential
observations in the generalized autoregressive conditional
heteroskedasticity (GARCH) model. Firstly we compare the performance of
innovative perturbation scheme, additive perturbation scheme and data
perturbation scheme in local influence analysis. We find that the
innovative perturbation scheme gives better results than the other two
schemes, although it may suffer from masking effects. Then we
use the stepwise local influence method under innovative perturbation
scheme to detect patch of influential observations and uncover the masking
effects. The simulation studies show that the new technique can
successfully detect a patch of influential observations or outliers under
the innovative perturbation scheme. The analysis based on simulation
studies and two real data sets shows that the stepwise local influence
method under the innovative perturbation scheme is efficient for
detecting multiple influential observations and dealing with masking
effects in the GARCH model.
Journal: Journal of Applied Statistics
Pages: 428-444
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.957661
File-URL: http://hdl.handle.net/10.1080/02664763.2014.957661
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:428-444
Template-Type: ReDIF-Article 1.0
Author-Name: Rahim Alhamzawi
Author-X-Name-First: Rahim
Author-X-Name-Last: Alhamzawi
Title: Model selection in quantile regression models
Abstract:
Lasso methods are regularisation and shrinkage methods widely used for
subset selection and estimation in regression problems. From a Bayesian
perspective, the Lasso-type estimate can be viewed as a Bayesian posterior
mode when specifying independent Laplace prior distributions for the
coefficients of independent variables [32]. A scale mixture of normal
priors can also provide an adaptive regularisation method and represents
an alternative model to the Bayesian Lasso-type model. In this paper, we
assign a normal prior with mean zero and unknown variance to each
quantile coefficient of the independent variables. Then, a simple Markov
chain Monte Carlo-based computation technique is developed for quantile
regression (QReg) models, including continuous, binary and left-censored
outcomes. Based on the proposed prior, we propose a criterion for model
selection in QReg models. The proposed criterion can be applied to
classical least-squares, classical QReg, classical Tobit QReg and many
others. For example, the proposed criterion can be applied to
rq(), lm() and
crq(), and it is available in the R package
Brq. Through simulation studies and analysis of a prostate cancer data
set, we assess the performance of the proposed methods. The simulation
studies and the prostate cancer data set analysis confirm that our methods
perform well, compared with other approaches.
Journal: Journal of Applied Statistics
Pages: 445-458
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.959905
File-URL: http://hdl.handle.net/10.1080/02664763.2014.959905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:445-458
Template-Type: ReDIF-Article 1.0
Author-Name: William Hughes
Author-X-Name-First: William
Author-X-Name-Last: Hughes
Title: Paradoxes in scientific inference
Journal: Journal of Applied Statistics
Pages: 459-460
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.942770
File-URL: http://hdl.handle.net/10.1080/02664763.2014.942770
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:459-460
Template-Type: ReDIF-Article 1.0
Author-Name: Abhay Kumar Tiwari
Author-X-Name-First: Abhay Kumar
Author-X-Name-Last: Tiwari
Title: Getting started with business analytics
Journal: Journal of Applied Statistics
Pages: 460-461
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.942771
File-URL: http://hdl.handle.net/10.1080/02664763.2014.942771
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:460-461
Template-Type: ReDIF-Article 1.0
Author-Name: Prabhanjan Tattar
Author-X-Name-First: Prabhanjan
Author-X-Name-Last: Tattar
Title: Statistical methods with applications to demography and life insurance
Journal: Journal of Applied Statistics
Pages: 461-462
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.942772
File-URL: http://hdl.handle.net/10.1080/02664763.2014.942772
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:461-462
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Title: Generalized linear models for categorical and continuous limited dependent variables
Journal: Journal of Applied Statistics
Pages: 462-462
Issue: 2
Volume: 42
Year: 2015
Month: 2
X-DOI: 10.1080/02664763.2014.942773
File-URL: http://hdl.handle.net/10.1080/02664763.2014.942773
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:2:p:462-462
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Liu
Author-X-Name-First: Wei
Author-X-Name-Last: Liu
Author-Name: Shuyou Li
Author-X-Name-First: Shuyou
Author-X-Name-Last: Li
Title: A multiple imputation approach to nonlinear mixed-effects models with covariate measurement errors and missing values
Abstract:
In longitudinal studies, nonlinear mixed-effects models have been widely
applied to describe the intra- and the inter-subject variations in data.
The inter-subject variation usually receives great attention and it may be
partially explained by time-dependent covariates. However, some covariates
may be measured with substantial errors and may contain missing values. We
propose a multiple imputation method, implemented via a Markov chain Monte
Carlo procedure with Gibbs sampling, to address the covariate
measurement errors and missing data in nonlinear mixed-effects models. The
multiple imputation method is illustrated in a real data example.
Simulation studies show that the multiple imputation method outperforms
the commonly used naive methods.
Journal: Journal of Applied Statistics
Pages: 463-476
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.960372
File-URL: http://hdl.handle.net/10.1080/02664763.2014.960372
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:463-476
Template-Type: ReDIF-Article 1.0
Author-Name: Ahmed Hossain
Author-X-Name-First: Ahmed
Author-X-Name-Last: Hossain
Author-Name: Joseph Beyene
Author-X-Name-First: Joseph
Author-X-Name-Last: Beyene
Title: Application of skew-normal distribution for detecting differential expression to microRNA data
Abstract:
Traditional statistical modeling of continuous outcome variables relies
heavily on the assumption of a normal distribution. However, in some
applications, such as analysis of microRNA (miRNA) data, normality may not
hold. Skewed distributions play an important role in such studies and
might lead to robust results in the presence of extreme outliers. We apply
a skew-normal (SN) distribution, which is indexed by three parameters
(location, scale and shape), in the context of miRNA studies. We developed
a test statistic for comparing means of two conditions replacing the
normal assumption with SN distribution. We compared the performance of the
statistic with other Wald-type statistics through simulations. Two real
miRNA datasets are analyzed to illustrate the methods. Our simulation
findings showed that the use of a SN distribution can result in improved
identification of differentially expressed miRNAs, especially with
markedly skewed data and when the two groups have different variances. It
also appeared that the statistic with SN assumption performs comparably
with other Wald-type statistics irrespective of the sample size or
distribution. Moreover, the real dataset analyses suggest that the
statistic with SN assumption can be used effectively for identification of
important miRNAs. Overall, the statistic with SN distribution is useful
when data are asymmetric and when the samples have different variances for
the two groups.
Journal: Journal of Applied Statistics
Pages: 477-491
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.962490
File-URL: http://hdl.handle.net/10.1080/02664763.2014.962490
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:477-491
Template-Type: ReDIF-Article 1.0
Author-Name: Himadri Ghosh
Author-X-Name-First: Himadri
Author-X-Name-Last: Ghosh
Author-Name: Bishal Gurung
Author-X-Name-First: Bishal
Author-X-Name-Last: Gurung
Author-Name: Prajneshu
Author-X-Name-First:
Author-X-Name-Last: Prajneshu
Title: Kalman filter-based modelling and forecasting of stochastic volatility with threshold
Abstract:
We propose a parametric nonlinear time-series model, namely the
Autoregressive-Stochastic volatility with threshold (AR-SVT) model with
mean equation for forecasting the level and volatility. A methodology for
estimating the parameters of this model is developed by first obtaining the
recursive Kalman filter time-update equations and then employing the
unrestricted quasi-maximum likelihood method. Furthermore, optimal
one-step and two-step-ahead out-of-sample forecast formulae along with
forecast error variances are derived analytically by recursive use of
conditional expectation and variance. As an illustration, volatile
all-India monthly spices export during the period January 2006 to January
2012 is considered. The entire data analysis is carried out using the
EViews and MATLAB software packages. The AR-SVT model is fitted
and interval forecasts for 10 hold-out data points are obtained.
The superiority of this model for describing and forecasting over other
competing volatility models, namely the AR-Generalized autoregressive
conditional heteroscedastic (GARCH), AR-Exponential GARCH, AR-Threshold
GARCH, and AR-Stochastic volatility models, is shown for the data under
consideration.
Finally, for the AR-SVT model, optimal out-of-sample forecasts along with
forecasts of one-step-ahead variances are obtained.
Journal: Journal of Applied Statistics
Pages: 492-507
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.963524
File-URL: http://hdl.handle.net/10.1080/02664763.2014.963524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:492-507
Template-Type: ReDIF-Article 1.0
Author-Name: Babulal Seal
Author-X-Name-First: Babulal
Author-X-Name-Last: Seal
Author-Name: Sk Jakir Hossain
Author-X-Name-First: Sk Jakir
Author-X-Name-Last: Hossain
Title: Empirical Bayes estimation of parameters in Markov transition probability matrix with computational methods
Abstract:
An empirical Bayes estimator for the transition probability matrix is
worked out for cases where we have prior beliefs regarding the parameters,
for example, whether the states seem to be equal or not. In both cases,
priors are chosen in accordance with these beliefs. Using the EM algorithm,
computational methods for the different hyperparameters of the empirical
Bayes procedure are described. The robustness of the empirical Bayes
procedure is also investigated.
Journal: Journal of Applied Statistics
Pages: 508-519
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.963525
File-URL: http://hdl.handle.net/10.1080/02664763.2014.963525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:508-519
Template-Type: ReDIF-Article 1.0
Author-Name: E. Ciavolino
Author-X-Name-First: E.
Author-X-Name-Last: Ciavolino
Author-Name: A. Calcagnì
Author-X-Name-First: A.
Author-X-Name-Last: Calcagnì
Title: Generalized cross entropy method for analysing the SERVQUAL model
Abstract:
The aim of this paper is to define a new approach for the analysis of data
collected by means of SERVQUAL questionnaires which is based on the
generalized cross entropy (GCE) approach. In this respect, we first give
a short review of the important role that SERVQUAL plays in the
analysis of service quality as well as in the assessment of the
competitiveness of public and private organizations. Second, we provide
a formal definition of the GCE approach together with a brief discussion of
its features and usefulness. Finally, we show the application of GCE to a
SERVQUAL model based on a patients' satisfaction case study, and we
discuss the results obtained by using the proposed GCE-SERVQUAL
methodology.
Journal: Journal of Applied Statistics
Pages: 520-534
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.963526
File-URL: http://hdl.handle.net/10.1080/02664763.2014.963526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:520-534
Template-Type: ReDIF-Article 1.0
Author-Name: Paola Annoni
Author-X-Name-First: Paola
Author-X-Name-Last: Annoni
Author-Name: Rainer Bruggemann
Author-X-Name-First: Rainer
Author-X-Name-Last: Bruggemann
Author-Name: Lars Carlsen
Author-X-Name-First: Lars
Author-X-Name-Last: Carlsen
Title: A multidimensional view on poverty in the European Union by partial order theory
Abstract:
Poverty can be seen as a multidimensional phenomenon described by a set of
indicators, the poverty components. A one-dimensional measure of poverty
serving as a ranking index can be obtained by combining the component
indicators via aggregation techniques. Ranking indices are thought of as
supporting political decisions. This paper proposes an alternative to
aggregation based on simple concepts of partial order theory and
illustrates the pros and cons of this approach taking as case study a
multidimensional measure of poverty comprising three components - absolute
poverty, relative poverty and income - computed for the European Union
regions. The analysis enables one to highlight conflicts across the
components with some regions detected as controversial, with, for example,
low levels of relative poverty and high levels of monetary poverty. The
partial order approach enables one to point to the regions with the most
severe data conflicts and to the component indicators that cause these
conflicts.
Journal: Journal of Applied Statistics
Pages: 535-554
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.978269
File-URL: http://hdl.handle.net/10.1080/02664763.2014.978269
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:535-554
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas Beyler
Author-X-Name-First: Nicholas
Author-X-Name-Last: Beyler
Author-Name: Wayne Fuller
Author-X-Name-First: Wayne
Author-X-Name-Last: Fuller
Author-Name: Sarah Nusser
Author-X-Name-First: Sarah
Author-X-Name-Last: Nusser
Author-Name: Gregory Welk
Author-X-Name-First: Gregory
Author-X-Name-Last: Welk
Title: Predicting objective physical activity from self-report surveys: a model validation study using estimated generalized least-squares regression
Abstract:
Physical activity measurements derived from self-report surveys are prone
to measurement errors. Monitoring devices like accelerometers offer more
objective measurements of physical activity, but are impractical for use
in large-scale surveys. A model capable of predicting objective
measurements of physical activity from self-reports would offer a
practical alternative to obtaining measurements directly from monitoring
devices. Using data from National Health and Nutrition Examination Survey
2003-2006, we developed and validated models for predicting objective
physical activity from self-report variables and other demographic
characteristics. The prediction intervals produced by the models were
large, suggesting that the ability to predict objective physical activity
for individuals from self-reports is limited.
Journal: Journal of Applied Statistics
Pages: 555-565
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.978271
File-URL: http://hdl.handle.net/10.1080/02664763.2014.978271
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:555-565
Template-Type: ReDIF-Article 1.0
Author-Name: Tong Siu Tung Wong
Author-X-Name-First: Tong Siu Tung
Author-X-Name-Last: Wong
Author-Name: Wai Keung Li
Author-X-Name-First: Wai Keung
Author-X-Name-Last: Li
Title: Extreme values identification in regression using a peaks-over-threshold approach
Abstract:
The problem of heavy tails in regression models is studied. It is proposed
that regression models be estimated by a standard procedure and that a
statistical check for heavy tails using the residuals be conducted as a
regression diagnostic. Using the peaks-over-threshold approach, the
generalized Pareto distribution quantifies the degree of heavy-tailedness by the
extreme value index. The number of excesses is determined by means of an
innovative threshold model which partitions the random sample into extreme
values and ordinary values. The overall decision on a significant heavy
tail is justified by both a statistical test and a quantile-quantile plot.
The usefulness of the approach includes justification of goodness of fit
of the estimated regression model and quantification of the occurrence of
extremal events. The proposed methodology is illustrated using surface
ozone levels in the city centre of Leeds.
Journal: Journal of Applied Statistics
Pages: 566-576
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.978843
File-URL: http://hdl.handle.net/10.1080/02664763.2014.978843
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:566-576
Template-Type: ReDIF-Article 1.0
Author-Name: Luo Yong
Author-X-Name-First: Luo
Author-X-Name-Last: Yong
Author-Name: Zhu Bo
Author-X-Name-First: Zhu
Author-X-Name-Last: Bo
Author-Name: Tang Yong
Author-X-Name-First: Tang
Author-X-Name-Last: Yong
Title: Dynamic optimal capital growth of diversified investment
Abstract:
We investigate the problem of dynamic optimal capital growth of
diversified investment. A general framework is developed in which the
trader maximizes the expected log utility of the long-term growth rate of
initial wealth. We show that the trader's fortune will exceed any fixed
bound when the invested fraction is chosen to be less than a critical
value, whereas if the fraction is larger than that value, ruin is almost
sure. To maximize wealth, the optimal fraction should be chosen at each
trade. Empirical results with real financial data illustrate the feasible
allocation: the larger the fraction, the larger the chance of falling
below the desired wealth growth path.
Journal: Journal of Applied Statistics
Pages: 577-588
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980783
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:577-588
Template-Type: ReDIF-Article 1.0
Author-Name: Guglielmo Maria Caporale
Author-X-Name-First: Guglielmo Maria
Author-X-Name-Last: Caporale
Author-Name: Luis A. Gil-Alana
Author-X-Name-First: Luis A.
Author-X-Name-Last: Gil-Alana
Title: Infant mortality rates: time trends and fractional integration
Abstract:
This paper examines the existence of time trends in the infant mortality
rates in a number of countries in the twentieth century. We test for the
presence of deterministic trends by adopting a linear model for the
log-transformed data. Instead of assuming that the error term is a
stationary I(0), or alternatively, a non-stationary
I(1) process, we allow for the possibility of fractional
integration and hence for a much greater degree of flexibility in the
dynamic specification of the series. Indeed, once the linear trend is
removed, all series appear to be I(d)
with 0 &lt; d &lt; 1, implying long-range dependence. As expected,
the time trend coefficients are significantly negative, although of a
different magnitude from those obtained assuming integer orders of
differentiation.
Journal: Journal of Applied Statistics
Pages: 589-602
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980785
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:589-602
Template-Type: ReDIF-Article 1.0
Author-Name: Guoyi Zhang
Author-X-Name-First: Guoyi
Author-X-Name-Last: Zhang
Author-Name: Zhongxue Chen
Author-X-Name-First: Zhongxue
Author-X-Name-Last: Chen
Title: Inferences on correlation coefficients of bivariate log-normal distributions
Abstract:
This article considers inference on correlation coefficients of bivariate
log-normal distributions. We developed generalized confidence intervals
and hypothesis tests for the correlation coefficients, and extended the
results to compare two independent correlations. Simulation studies show
that the suggested methods work well. Two practical examples are used to
illustrate the application of the proposed methods.
Journal: Journal of Applied Statistics
Pages: 603-613
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980786
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980786
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:603-613
Template-Type: ReDIF-Article 1.0
Author-Name: Pavel Krupskii
Author-X-Name-First: Pavel
Author-X-Name-Last: Krupskii
Author-Name: Harry Joe
Author-X-Name-First: Harry
Author-X-Name-Last: Joe
Title: Tail-weighted measures of dependence
Abstract:
Multivariate copula models are commonly used in place of Gaussian
dependence models when plots of the data suggest tail dependence and tail
asymmetry. In these cases, it is useful to have simple statistics to
summarize the strength of dependence in different joint tails. Measures of
monotone association such as Kendall's tau and Spearman's rho are
insufficient to distinguish commonly used parametric bivariate families
with different tail properties. We propose lower and upper tail-weighted
bivariate measures of dependence as additional scalar measures to
distinguish bivariate copulas with roughly the same overall monotone
dependence. These measures allow the efficient estimation of strength of
dependence in the joint tails and can be used as a guide for selection of
bivariate linking copulas in vine and factor models as well as for
assessing the adequacy of fit of multivariate copula models. We apply the
tail-weighted measures of dependence to a financial data set and show that
the measures better discriminate models with different tail properties
compared to commonly used risk measures - the portfolio value-at-risk and
conditional tail expectation.
Journal: Journal of Applied Statistics
Pages: 614-629
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980787
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:614-629
Template-Type: ReDIF-Article 1.0
Author-Name: Chun-Xia Zhang
Author-X-Name-First: Chun-Xia
Author-X-Name-Last: Zhang
Author-Name: Guan-Wei Wang
Author-X-Name-First: Guan-Wei
Author-X-Name-Last: Wang
Author-Name: Jun-Min Liu
Author-X-Name-First: Jun-Min
Author-X-Name-Last: Liu
Title: RandGA: injecting randomness into parallel genetic algorithm for variable selection
Abstract:
Recently, the ensemble learning approaches have been proven to be quite
effective for variable selection in linear regression models. In general,
a good variable selection ensemble should consist of a diverse collection
of strong members. Based on the parallel genetic algorithm (PGA) proposed
in [41], in this paper we propose a novel method, RandGA, which injects
randomness into PGA with the aim of increasing the diversity among ensemble
members. Using a number of simulated data sets, we show that the newly
proposed method RandGA compares favorably with other variable selection
techniques. As a real example, the new method is applied to the diabetes
data.
Journal: Journal of Applied Statistics
Pages: 630-647
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980788
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:630-647
Template-Type: ReDIF-Article 1.0
Author-Name: C.B. García
Author-X-Name-First: C.B.
Author-X-Name-Last: García
Author-Name: J. García
Author-X-Name-First: J.
Author-X-Name-Last: García
Author-Name: M.M. López Martín
Author-X-Name-First: M.M.
Author-X-Name-Last: López Martín
Author-Name: R. Salmerón
Author-X-Name-First: R.
Author-X-Name-Last: Salmerón
Title: Collinearity: revisiting the variance inflation factor in ridge regression
Abstract:
Ridge regression has been widely applied to estimate under collinearity by
defining a class of estimators that are dependent on the parameter
k. The variance inflation factor (VIF) is applied to
detect the presence of collinearity and also as an objective method to
obtain the value of k in ridge regression. Contrary to
the definition of the VIF, the expressions traditionally applied in ridge
regression do not necessarily lead to values of VIFs equal to or greater
than 1. This work presents an alternative expression to calculate the VIF
in ridge regression that satisfies the aforementioned condition and also
presents other interesting properties.
Journal: Journal of Applied Statistics
Pages: 648-661
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980789
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980789
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:648-661
Template-Type: ReDIF-Article 1.0
Author-Name: Aviral Kumar Tiwari
Author-X-Name-First: Aviral Kumar
Author-X-Name-Last: Tiwari
Author-Name: Alexander Ludwig
Author-X-Name-First: Alexander
Author-X-Name-Last: Ludwig
Title: Short- and long-run rolling causality techniques and optimal window-wise lag selection: an application to the export-led growth hypothesis
Abstract:
The literature devoted to the export-led growth (ELG) hypothesis, which is
of utmost importance for policymaking in emerging countries, provides
mixed evidence for the validity of the hypothesis. Recent contributions
focus on the time-dependence of the relationship between export and output
growth using rolling causality techniques based on vector autoregressive
models. These models take a short-term view, capturing single
policy-induced developments. However, long-term structural changes cannot
be captured by short-term analyses. This paper hence
examines the time-varying validity of the ELG hypothesis for India for the
period 1960-2011 using rolling causality techniques for both the short-run
and long-run horizon. For the first time, window-wise optimal
lag-selection procedures are applied in connection with these techniques.
We find that exports long-run caused output growth from 1997 until 2009
which can be seen as a consequence of political reforms of the 1990s that
boosted economic growth by generating foreign direct investment
opportunities and higher exports. For the short-run, export significantly
caused output in the period 1998-2003 which followed a concentration of
liberalization measures in 1997. Causality in the reversed direction, from
output to exports, only seems to be relevant in the short-run.
Journal: Journal of Applied Statistics
Pages: 662-675
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.980790
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980790
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:662-675
Template-Type: ReDIF-Article 1.0
Author-Name: J. Lee
Author-X-Name-First: J.
Author-X-Name-Last: Lee
Author-Name: Y. Wu
Author-X-Name-First: Y.
Author-X-Name-Last: Wu
Author-Name: H. Kim
Author-X-Name-First: H.
Author-X-Name-Last: Kim
Title: Unbalanced data classification using support vector machines with active learning on scleroderma lung disease patterns
Abstract:
Unbalanced data classification has been a long-standing issue in the field
of medical vision science. We introduce the method of support vector
machines (SVM) with active learning (AL) to improve the prediction of
unbalanced classes in the medical imaging field. A standard SVM algorithm
with four different AL approaches is proposed: (1) the first uses
random sampling to select the initial pool for the AL algorithm; (2) the
second doubles the training instances of the rare category to reduce the
unbalanced ratio before the AL algorithm; (3) the third uses a balanced
pool with equal number from each category; and (4) the fourth uses a
balanced pool and implements balanced sampling throughout the AL
algorithm. Grid pixel data of two scleroderma lung disease patterns, lung
fibrosis (LF), and honeycomb (HC) were extracted from computed tomography
images of 71 patients to produce a training set of 348 HC and 3009 LF
instances and a test set of 291 HC and 2665 LF. From our research, SVM
with AL using balanced sampling compared to random sampling increased the
test sensitivity of HC by 56% (17.5% vs. 73.5%) and 47% (23% vs. 70%) for
the original and denoised dataset, respectively. SVM with AL with balanced
sampling can improve the classification performances of unbalanced data.
Journal: Journal of Applied Statistics
Pages: 676-689
Issue: 3
Volume: 42
Year: 2015
Month: 3
X-DOI: 10.1080/02664763.2014.978270
File-URL: http://hdl.handle.net/10.1080/02664763.2014.978270
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:3:p:676-689
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Shahbaz
Author-X-Name-First: Muhammad
Author-X-Name-Last: Shahbaz
Author-Name: Aviral Kumar Tiwari
Author-X-Name-First: Aviral Kumar
Author-X-Name-Last: Tiwari
Author-Name: Mohammad Iqbal Tahir
Author-X-Name-First: Mohammad Iqbal
Author-X-Name-Last: Tahir
Title: Analyzing time-frequency relationship between oil price and exchange rate in Pakistan through wavelets
Abstract:
This study analyzes the time-frequency relationship between the oil price
and the exchange rate for Pakistan using continuous wavelet measures such
as wavelet power, cross-wavelet power, and cross-wavelet coherency (WTC).
The results of the cross-wavelet analysis indicate that the covariance
between the oil price and the exchange rate does not give clear-cut
results, but the two variables have been in phase (cyclical) and out of
phase (anti-cyclical) over various durations. However, the squared wavelet
coherence discloses that the two variables are out of phase and that the
real exchange rate was leading during the entire period studied, at the
10-15 months' scale. These results would not have been obtained with any
other time-series or frequency-domain approach. This finding provides
evidence of an anti-cyclical relationship between the oil price and the
real effective exchange rate; in most of the period studied, the real
exchange rate was leading and passed anti-cyclical effects on to oil price
shocks, which is the major contribution of the study.
Journal: Journal of Applied Statistics
Pages: 690-704
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.980784
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980784
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:690-704
Template-Type: ReDIF-Article 1.0
Author-Name: Resit Çelik
Author-X-Name-First: Resit
Author-X-Name-Last: Çelik
Title: Stabilizing heteroscedasticity for butterfly-distributed residuals by the weighting absolute centered external variable
Abstract:
In the current study, a new method of weighting by the absolute centered
external variable (WCEV) is proposed to stabilize heteroscedasticity for
butterfly-distributed residuals (BDRs). After brief background on
heteroscedasticity and BDRs, the WCEV is introduced. The WCEV and commonly
used variance-stabilizing methods are compared on a simple and a multiple
regression model. The WCEV is also tested for other types of
heteroscedasticity patterns. In addition to heteroscedasticity, other
regression assumptions are checked for the WCEV.
Journal: Journal of Applied Statistics
Pages: 705-721
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.980791
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980791
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:705-721
Template-Type: ReDIF-Article 1.0
Author-Name: L. Alamá
Author-X-Name-First: L.
Author-X-Name-Last: Alamá
Author-Name: D. Conesa
Author-X-Name-First: D.
Author-X-Name-Last: Conesa
Author-Name: A. Forte
Author-X-Name-First: A.
Author-X-Name-Last: Forte
Author-Name: E. Tortosa-Ausina
Author-X-Name-First: E.
Author-X-Name-Last: Tortosa-Ausina
Title: The geography of Spanish bank branches
Abstract:
This article analyzes the determinants of bank branch location in Spain
taking the role of geography explicitly into account. After a long period
of intense territorial expansion, especially by savings banks, many of
these firms are now involved in merger processes triggered off by the
financial crisis, most of which entail the closing of many branches.
However, given the contributions of this type of banks to limit financial
exclusion, this process might exacerbate the consequences of the crisis
for some disadvantaged social groups. Related issues such as new banking
regulation initiatives (Basel III) or the current excess capacity in the
sector add further relevance to this problem. We address this issue from a
Bayesian perspective, using a Poisson regression model within the
framework of generalized linear mixed models. This proposal allows us to
assess whether over-branching or under-branching has taken place. Our
results suggest, among other findings, that both phenomena are present in
the Spanish banking sector, although the implications for the three types
of banks in the industry, namely commercial banks, savings banks or credit
unions, vary a great deal.
Journal: Journal of Applied Statistics
Pages: 722-744
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.980792
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:722-744
Template-Type: ReDIF-Article 1.0
Author-Name: Ying-zi Fu
Author-X-Name-First: Ying-zi
Author-X-Name-Last: Fu
Author-Name: Pei-xiao Chu
Author-X-Name-First: Pei-xiao
Author-X-Name-Last: Chu
Author-Name: Li-ying Lu
Author-X-Name-First: Li-ying
Author-X-Name-Last: Lu
Title: A Bayesian approach of joint models for clustered zero-inflated count data with skewness and measurement errors
Abstract:
Count data with excess zeros are widely encountered in biomedical, public
health, and social survey research. Zero-inflated Poisson (ZIP) regression
models with mixed effects are useful tools for analyzing such data;
covariates are usually incorporated in the model to explain inter-subject
variation, and a normal distribution is assumed for both random effects
and random errors. However, in many practical applications, such
assumptions may be violated, as the data often exhibit skewness and some
covariates may be subject to measurement error. In this paper, we deal
with these issues simultaneously by developing a Bayesian joint
hierarchical modeling approach. Specifically, by treating the intercepts
and slopes in the logistic and Poisson regressions as random, a flexible
two-level ZIP regression model is proposed, in which a covariate process
with measurement errors is established and a skew-t distribution is
assumed for both random errors and random effects. Under the Bayesian
framework, model selection is carried out using the deviance information
criterion (DIC), and a goodness-of-fit statistic is also developed for
assessing the plausibility of the posited model. The main advantage of our
method is that it is more robust and correct for investigating
heterogeneity at different levels, while accommodating skewness and
measurement errors simultaneously. An application to the Shanghai Youth
Fitness Survey is used as an illustrative example, showing that our
approach is useful in applications.
Journal: Journal of Applied Statistics
Pages: 745-761
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.980941
File-URL: http://hdl.handle.net/10.1080/02664763.2014.980941
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:745-761
Template-Type: ReDIF-Article 1.0
Author-Name: Jing Chang
Author-X-Name-First: Jing
Author-X-Name-Last: Chang
Author-Name: Herbert K.H. Lee
Author-X-Name-First: Herbert K.H.
Author-X-Name-Last: Lee
Title: Variable selection via a multi-stage strategy
Abstract:
Variable selection for nonlinear regression is a complex problem, made
even more difficult when there are a large number of potential covariates
and a limited number of datapoints. We propose herein a multi-stage method
that combines state-of-the-art techniques at each stage to best discover
the relevant variables. At the first stage, an extension of Bayesian
additive regression trees is adopted to reduce the total number of
variables to around 30. At the second stage, sensitivity analysis in the
treed Gaussian process is adopted to further reduce the total number of
variables. Two stopping rules are designed and sequential design is
adopted to make best use of previous information. We demonstrate our
approach on two simulated examples and one real data set.
Journal: Journal of Applied Statistics
Pages: 762-774
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.985640
File-URL: http://hdl.handle.net/10.1080/02664763.2014.985640
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:762-774
Template-Type: ReDIF-Article 1.0
Author-Name: Mohamed El Ghourabi
Author-X-Name-First: Mohamed
Author-X-Name-Last: El Ghourabi
Author-Name: Amira Dridi
Author-X-Name-First: Amira
Author-X-Name-Last: Dridi
Author-Name: Mohamed Limam
Author-X-Name-First: Mohamed
Author-X-Name-Last: Limam
Title: A new financial stress index model based on support vector regression and control chart
Abstract:
The financial stress index (FSI) is considered an important risk
management tool for quantifying financial vulnerabilities. This paper proposes
a new framework based on a hybrid classifier model that integrates rough
set theory (RST), FSI, support vector regression (SVR) and a control chart
to identify stressed periods. First, the RST method is applied to select
variables. The outputs are used as input data for the FSI-SVR computation.
Empirical analysis is conducted based on the monthly FSI of the Federal
Reserve Bank of Saint Louis from January 1992 to June 2011. A comparison
study is performed between an FSI based on principal component analysis
and the FSI-SVR. A control chart based on the FSI-SVR and extreme value theory is
proposed to identify extremely stressed periods. Our approach
identified several stressed periods, including the internet bubble, the subprime
crisis and actual financial stress episodes, along with the calmest
periods, agreeing with those given by Federal Reserve System reports.
Journal: Journal of Applied Statistics
Pages: 775-788
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.986076
File-URL: http://hdl.handle.net/10.1080/02664763.2014.986076
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:775-788
Template-Type: ReDIF-Article 1.0
Author-Name: M. Liu
Author-X-Name-First: M.
Author-X-Name-Last: Liu
Author-Name: T.I. Lin
Author-X-Name-First: T.I.
Author-X-Name-Last: Lin
Title: Skew-normal factor analysis models with incomplete data
Abstract:
Traditional factor analysis (FA) rests on the assumption of multivariate
normality. However, in some practical situations, the data do not meet
this assumption; thus, the statistical inference made from such data may
be misleading. This paper aims at providing some new tools for the
skew-normal (SN) FA model when missing values occur in the data. In such a
model, the latent factors are assumed to follow a restricted version of
multivariate SN distribution with additional shape parameters for
accommodating skewness. We develop an analytically feasible expectation
conditional maximization algorithm for carrying out parameter estimation
and imputation of missing values under missing at random mechanisms. The
practical utility of the proposed methodology is illustrated with two real
data examples and the results are compared with those obtained from the
traditional FA counterparts.
Journal: Journal of Applied Statistics
Pages: 789-805
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.986437
File-URL: http://hdl.handle.net/10.1080/02664763.2014.986437
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:789-805
Template-Type: ReDIF-Article 1.0
Author-Name: Fei Yang
Author-X-Name-First: Fei
Author-X-Name-Last: Yang
Author-Name: Lin Chen
Author-X-Name-First: Lin
Author-X-Name-Last: Chen
Author-Name: Yang Cheng
Author-X-Name-First: Yang
Author-X-Name-Last: Cheng
Author-Name: Zhenxing Yao
Author-X-Name-First: Zhenxing
Author-X-Name-Last: Yao
Author-Name: Xu Zhang
Author-X-Name-First: Xu
Author-X-Name-Last: Zhang
Title: Urban public transport choice behavior analysis and service improvement policy-making: a case study from the metropolitan city, Chengdu, China
Abstract:
As a metropolitan city in western China, Chengdu has suffered from
serious traffic congestion. The strategy of urban public transport
priority was put on the agenda to relieve traffic congestion, but the public
transport share in Chengdu is only 27%, which is much lower than in
developed countries. Consequently, it is of great importance to study
measures to improve the service and provide technical support to
policy-makers. This paper selected the traffic corridor between the Southwest
Jiaotong University district and downtown as the experimental subject. An
orthogonal design was used to generate stated preference questionnaires in
order to achieve reliable parameter estimates. Several variables were
used to define the utility of the three alternatives and construct the
logit models. Then, the relationships between the cost and time variables and
the choice probability of public transport were analyzed. According to
the results, we found that the orthogonal design does improve the
goodness-of-fit, and that the multinomial logit model performed better
than the nested logit model. We also put forward some effective measures to
improve the service level of public transit, including reducing the access
time to the Metro and limiting parking supply to control car use.
Journal: Journal of Applied Statistics
Pages: 806-816
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.986438
File-URL: http://hdl.handle.net/10.1080/02664763.2014.986438
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:806-816
Template-Type: ReDIF-Article 1.0
Author-Name: Gurprit Grover
Author-X-Name-First: Gurprit
Author-X-Name-Last: Grover
Author-Name: Vinay K. Gupta
Author-X-Name-First: Vinay K.
Author-X-Name-Last: Gupta
Title: Multiple imputation of censored survival data in the presence of missing covariates using restricted mean survival time
Abstract:
Missing covariate data with censored outcomes pose a challenge in the
analysis of clinical data, especially in small-sample settings. Multiple
imputation (MI) techniques are popularly used to impute missing covariates,
and the data are then analyzed through methods that can handle censoring.
MI-based techniques for imputing censored outcomes are also available,
but they are not widely used in practice. In the present study, we applied a
method based on multiple imputation by chained equations to impute missing
covariate values and censored outcomes using restricted mean
survival time in small-sample settings. The completed data were then
analyzed using linear regression models. Simulation studies and a real
example of CHD data show that, on data with missing covariate values and
censored outcomes, the present method produced better estimates and lower
standard errors than either the analysis that retains censored outcomes
but excludes cases with missing covariates, or the analysis that excludes
cases with missing covariate values or censored outcomes
(complete case analysis).
Journal: Journal of Applied Statistics
Pages: 817-827
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.986439
File-URL: http://hdl.handle.net/10.1080/02664763.2014.986439
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:817-827
Template-Type: ReDIF-Article 1.0
Author-Name: M.S. Hamada
Author-X-Name-First: M.S.
Author-X-Name-Last: Hamada
Author-Name: B.L. Mitchell
Author-X-Name-First: B.L.
Author-X-Name-Last: Mitchell
Author-Name: C.T. Necker
Author-X-Name-First: C.T.
Author-X-Name-Last: Necker
Title: On uncertainty of a proportion from a stratified random sample of a small population
Abstract:
This article considers the uncertainty of a proportion based on a
stratified random sample of a small population. Using the hypergeometric
distribution, a Clopper-Pearson type upper confidence bound is presented.
Another frequentist approach that uses the estimated variance of the
proportion estimator is also considered as well as a Bayesian alternative.
These methods are demonstrated with an illustrative example. One aspect
of planning, namely the impact of the specified stratum sample sizes on
uncertainty, is studied through simulation.
Journal: Journal of Applied Statistics
Pages: 828-833
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.987651
File-URL: http://hdl.handle.net/10.1080/02664763.2014.987651
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:828-833
Template-Type: ReDIF-Article 1.0
Author-Name: Andrew Hoegh
Author-X-Name-First: Andrew
Author-X-Name-Last: Hoegh
Author-Name: Scotland Leman
Author-X-Name-First: Scotland
Author-X-Name-Last: Leman
Title: A spatio-temporal model for assessing winter damage risk to east coast vineyards
Abstract:
Climate is an essential component in site suitability for agriculture in
general, and specifically in viticulture. With the recent increase in
vineyards on the East Coast, an important climatic consideration in site
suitability is extreme winter temperature. Often, maps of annual minimum
temperatures are used to determine cold hardiness. However, cold hardiness
of grapes is a more complicated process, since the temperature that grapes
can withstand without damage is not constant. Rather, recent
temperatures cause acclimation or deacclimation and hence have a large
influence on cold hardiness. By combining National Oceanic and Atmospheric
Administration (NOAA) weather station data and leveraging recently created
cold hardiness models for grapes, we develop a dynamic spatio-temporal
model to determine the risk of winter damage due to extreme cold for
several grape varieties commonly grown in the eastern United States. This
analysis provides maps of winter damage risk to three grape varieties,
Chardonnay, Cabernet Sauvignon, and Concord.
Journal: Journal of Applied Statistics
Pages: 834-845
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.987652
File-URL: http://hdl.handle.net/10.1080/02664763.2014.987652
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:834-845
Template-Type: ReDIF-Article 1.0
Author-Name: Costas Panagiotakis
Author-X-Name-First: Costas
Author-X-Name-Last: Panagiotakis
Author-Name: Georgios Tziritas
Author-X-Name-First: Georgios
Author-X-Name-Last: Tziritas
Title: A minimum spanning tree equipartition algorithm for microaggregation
Abstract:
In this paper, we propose a solution to the microaggregation problem based on
the hierarchical tree equi-partition (HTEP) algorithm. Microaggregation is
a family of methods for statistical disclosure control of microdata, that
is, for masking microdata so that they can be released without disclosing
private information on the underlying individuals. The
microaggregation problem, which is NP-hard, is to partition the
N given data points into groups of at least K items each,
so that the sum of the within-partition squared
error is minimized. The proposed method is general and can be applied
to any tree partition problem aiming at the minimization of a total score.
The method is divisive: the tree with the highest 'score' is split
into two trees, resulting in a hierarchical forest of trees with almost
equal 'score' (equipartition). We propose a version of HTEP for
microaggregation (HTEPM) that is applied to the minimum spanning tree
(MST) of the graph defined by the data. The merit of the HTEPM algorithm
is that it solves some instances of the multivariate
microaggregation problem optimally over the MST search space. Experimental
results and comparisons with existing methods from the literature demonstrate
the high performance and robustness of HTEPM.
Journal: Journal of Applied Statistics
Pages: 846-865
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.993361
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993361
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:846-865
Template-Type: ReDIF-Article 1.0
Author-Name: Kouji Yamamoto
Author-X-Name-First: Kouji
Author-X-Name-Last: Yamamoto
Author-Name: Fumika Shimada
Author-X-Name-First: Fumika
Author-X-Name-Last: Shimada
Author-Name: Sadao Tomizawa
Author-X-Name-First: Sadao
Author-X-Name-Last: Tomizawa
Title: Measure of departure from symmetry for the analysis of collapsed square contingency tables with ordered categories
Abstract:
For square contingency tables with ordered categories, one may wish to
analyze collapsed tables obtained by combining some adjacent categories
of the original table. This paper
considers the symmetry model for collapsed square contingency tables and
proposes a measure to represent the degree of departure from symmetry. The
proposed measure is defined as the arithmetic mean of submeasures each of
which represents the degree of departure from symmetry for each collapsed
3×3 table. Each submeasure also represents the mean of
power-divergence or diversity index for each collapsed table. Examples are
given.
Journal: Journal of Applied Statistics
Pages: 866-875
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.993362
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993362
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:866-875
Template-Type: ReDIF-Article 1.0
Author-Name: Wali Ullah
Author-X-Name-First: Wali
Author-X-Name-Last: Ullah
Author-Name: Yasumasa Matsuda
Author-X-Name-First: Yasumasa
Author-X-Name-Last: Matsuda
Author-Name: Yoshihiko Tsukuda
Author-X-Name-First: Yoshihiko
Author-X-Name-Last: Tsukuda
Title: Generalized Nelson-Siegel term structure model: do the second slope and curvature factors improve the in-sample fit and out-of-sample forecasts?
Abstract:
The dynamic Nelson-Siegel (DNS) model, and even the Svensson generalization
of the model, has trouble fitting the short-maturity yields and fails to
grasp the characteristics of the Japanese government bond yield curve,
which is flat at the short end and has multiple inflection points.
Therefore, a closely related generalized dynamic Nelson-Siegel (GDNS)
model that has two slopes and curvatures is considered and compared
empirically to the traditional DNS in terms of in-sample fit as well as
out-of-sample forecasts. Furthermore, the GDNS with a time-varying
volatility component, modeled as a standard EGARCH process, is also
considered to evaluate its performance relative to the GDNS. The GDNS
model unanimously outperforms the DNS in terms of in-sample fit as well as
out-of-sample forecasts. Moreover, the extended model that accounts for
time-varying volatility outperforms the other models in fitting the yield
curve and produces relatively more accurate 6- and 12-month-ahead
forecasts, while the GDNS model provides more precise forecasts for very
short forecast horizons.
Journal: Journal of Applied Statistics
Pages: 876-904
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.993363
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993363
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:876-904
Template-Type: ReDIF-Article 1.0
Author-Name: S. Rao Jammalamadaka
Author-X-Name-First: S. Rao
Author-X-Name-Last: Jammalamadaka
Author-Name: Elvynna Leong
Author-X-Name-First: Elvynna
Author-X-Name-Last: Leong
Title: Analysis of discrete lifetime data under middle-censoring and in the presence of covariates
Abstract:
'Middle censoring' is a very general censoring scheme where the actual
value of an observation in the data becomes unobservable if it falls
inside a random interval (L, R) and includes both left
and right censoring. In this paper, we consider discrete lifetime data
that follow a geometric distribution that is subject to middle censoring.
Two major innovations in this paper, compared to the earlier work of
Davarzani and Parsian [3], include (i) an extension and generalization to
the case where covariates are present along with the data and (ii) an
alternate approach and proofs which exploit the simple relationship
between the geometric and the exponential distributions, so that the
theory is more in line with the work of Iyer et al. [6].
It is also demonstrated that this kind of discretization of life times
gives results that are close to the original data involving exponential
life times. Maximum likelihood estimation of the parameters is studied for
this middle-censoring scheme with covariates, and the large-sample
distributions of the estimators are discussed. Simulation results indicate how well the proposed
estimation methods work and an illustrative example using
time-to-pregnancy data from Baird and Wilcox [1] is included.
Journal: Journal of Applied Statistics
Pages: 905-913
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.993364
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993364
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:905-913
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Pallmann
Author-X-Name-First: Philip
Author-X-Name-Last: Pallmann
Title: Applied meta-analysis with R
Journal: Journal of Applied Statistics
Pages: 914-915
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.989464
File-URL: http://hdl.handle.net/10.1080/02664763.2014.989464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:914-915
Template-Type: ReDIF-Article 1.0
Author-Name: Marina A.P. Andrade
Author-X-Name-First: Marina A.P.
Author-X-Name-Last: Andrade
Title: Statistical analysis of human growth and development
Journal: Journal of Applied Statistics
Pages: 915-915
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.989465
File-URL: http://hdl.handle.net/10.1080/02664763.2014.989465
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:915-915
Template-Type: ReDIF-Article 1.0
Author-Name: Jonathan Gillard
Author-X-Name-First: Jonathan
Author-X-Name-Last: Gillard
Title: Constrained principal component analysis and related techniques
Journal: Journal of Applied Statistics
Pages: 916-916
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.989466
File-URL: http://hdl.handle.net/10.1080/02664763.2014.989466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:916-916
Template-Type: ReDIF-Article 1.0
Author-Name: Paul M. Ramsay
Author-X-Name-First: Paul M.
Author-X-Name-Last: Ramsay
Title: Handbook of spatial point-pattern analysis in ecology
Journal: Journal of Applied Statistics
Pages: 916-917
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.989467
File-URL: http://hdl.handle.net/10.1080/02664763.2014.989467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:916-917
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano Ruiz
Author-X-Name-Last: Espejo
Title: Modern survey sampling
Journal: Journal of Applied Statistics
Pages: 917-918
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.991071
File-URL: http://hdl.handle.net/10.1080/02664763.2014.991071
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:917-918
Template-Type: ReDIF-Article 1.0
Author-Name: Giuseppe Pandolfo
Author-X-Name-First: Giuseppe
Author-X-Name-Last: Pandolfo
Title: Circular statistics in R
Journal: Journal of Applied Statistics
Pages: 918-919
Issue: 4
Volume: 42
Year: 2015
Month: 4
X-DOI: 10.1080/02664763.2014.991072
File-URL: http://hdl.handle.net/10.1080/02664763.2014.991072
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:4:p:918-919
Template-Type: ReDIF-Article 1.0
Author-Name: M. Brabec
Author-X-Name-First: M.
Author-X-Name-Last: Brabec
Author-Name: O. Konár
Author-X-Name-First: O.
Author-X-Name-Last: Konár
Author-Name: M. Malý
Author-X-Name-First: M.
Author-X-Name-Last: Malý
Author-Name: I. Kasanický
Author-X-Name-First: I.
Author-X-Name-Last: Kasanický
Author-Name: E. Pelikán
Author-X-Name-First: E.
Author-X-Name-Last: Pelikán
Title: Statistical models for disaggregation and reaggregation of natural gas consumption data
Abstract:
In this paper, we present a unified framework for natural gas consumption
modeling and forecasting. It consists of models of the GAM class and their
nonlinear extensions, tailored for easy estimation, aggregation and
treatment of the delayed relationship between temperature and consumption.
Since the consumption data for households and small commercial customers
are routinely available in many countries only as long-term sum meter
readings, their disaggregation and possibly reaggregation to different
time intervals is necessary for a variety of purposes. We show some
examples of specific models based on the presented framework and then we
demonstrate their use in practice, especially for the disaggregation and
reaggregation tasks.
Journal: Journal of Applied Statistics
Pages: 921-937
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.993365
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993365
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:921-937
Template-Type: ReDIF-Article 1.0
Author-Name: Ekele Alih
Author-X-Name-First: Ekele
Author-X-Name-Last: Alih
Author-Name: Hong Choon Ong
Author-X-Name-First: Hong Choon
Author-X-Name-Last: Ong
Title: Cluster-based multivariate outlier identification and re-weighted regression in linear models
Abstract:
A cluster methodology, motivated by a robust similarity matrix, is proposed
for identifying likely multivariate outlier structure and for estimating
weighted least-squares (WLS) regression parameters in
linear models. The proposed method is an agglomeration of procedures that
begins by clustering the n observations, proceeds through a test
of the 'no-outlier hypothesis' (TONH), and ends with a weighted
least-squares regression estimation. The cluster phase partitions the
n observations into an h-set called the main
cluster and a minor cluster of size n -
h. A robust distance emerges from the main cluster, upon
which the test of the no-outlier hypothesis is conducted. An initial
WLS regression estimate is computed from the robust
distance obtained from the main cluster. Until convergence, a re-weighted
least-squares (RLS) regression estimate is updated with
weights based on the normalized residuals. The proposed procedure blends
an agglomerative hierarchical cluster analysis with complete linkage,
through the TONH, into the re-weighted regression estimation
phase; hence, we call it cluster-based re-weighted regression
(CBRR). The CBRR is compared with three
existing procedures using two data sets known to exhibit masking and
swamping. The performance of CBRR is further examined
through a simulation experiment. The results obtained from the data
illustrations and the Monte Carlo study show that
CBRR is effective in detecting multivariate outliers
where other methods fail. The CBRR does
not require enormous computation and is largely immune to
masking and swamping.
Journal: Journal of Applied Statistics
Pages: 938-955
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.993366
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993366
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:938-955
Template-Type: ReDIF-Article 1.0
Author-Name: Monica Billio
Author-X-Name-First: Monica
Author-X-Name-Last: Billio
Author-Name: Silvio Di Sanzo
Author-X-Name-First: Silvio
Author-X-Name-Last: Di Sanzo
Title: Granger-causality in Markov switching models
Abstract:
In this paper, we propose a new approach for characterizing and testing
Granger-causality, which is well equipped to handle models where the
change in regime evolves according to multiple Markov chains. In contrast
to the existing literature, we propose a method for analysing causal
links that specifically takes the Markov chains into account. Tests for
independence are also provided. We illustrate the methodology with an
empirical application: we investigate the causality and
interdependence between financial and economic cycles in the USA using the
bivariate Markov switching model proposed by Hamilton and Lin [13]. We
find that financial variables are useful in forecasting the aggregate
economic activity, and vice versa.
Journal: Journal of Applied Statistics
Pages: 956-966
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.993367
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993367
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:956-966
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Cribari-Neto
Author-X-Name-First: Francisco
Author-X-Name-Last: Cribari-Neto
Author-Name: Sadraque E.F. Lucena
Author-X-Name-First: Sadraque E.F.
Author-X-Name-Last: Lucena
Title: Nonnested hypothesis testing in the class of varying dispersion beta regressions
Abstract:
Oftentimes practitioners have at their disposal two or more competing
models with different parametric structures. Whenever each model cannot be
obtained as a particular case of the remaining models through a set of
parametric restrictions the models are said to be nonnested. Tests that
can be used to select a model from a set of nonnested linear regression
models are available in the literature. Particularly, useful tests are the
J and MJ tests. In this paper, we extend
these two tests to the class of beta regression models, which is useful
for modeling responses that assume values in the standard unit
interval. We report Monte
Carlo evidence on the finite sample behavior of the tests. Bootstrap-based
testing inference is also considered. Overall, the best performing test is
the bootstrap MJ test. Two empirical applications are
presented and discussed.
Journal: Journal of Applied Statistics
Pages: 967-985
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.993368
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993368
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:967-985
Template-Type: ReDIF-Article 1.0
Author-Name: Alper Sinan
Author-X-Name-First: Alper
Author-X-Name-Last: Sinan
Author-Name: B. Barıs Alkan
Author-X-Name-First: B. Barıs
Author-X-Name-Last: Alkan
Title: A useful approach to identify the multicollinearity in the presence of outliers
Abstract:
The presence of outliers in a data set affects the structure of
multicollinearity, which arises from a high degree of correlation between
explanatory variables in a linear regression analysis. This effect can
be seen as an increase or decrease in the diagnostics used to determine
multicollinearity. Thus, outliers reduce the reliability of
diagnostics such as variance inflation factors, condition numbers and
variance decomposition proportions. In this study, we propose to use a
robust estimation of the correlation matrix obtained by the minimum
covariance determinant method to determine the diagnostics of
multicollinearity in the presence of outliers. As a result, the present
paper demonstrates that the diagnostics of multicollinearity obtained by
the robust estimation of the correlation matrix are more reliable in the
presence of outliers.
Journal: Journal of Applied Statistics
Pages: 986-993
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.993369
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993369
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:986-993
Template-Type: ReDIF-Article 1.0
Author-Name: Fernando A. Otero
Author-X-Name-First: Fernando A.
Author-X-Name-Last: Otero
Author-Name: Helcio R. Barreto Orlande
Author-X-Name-First: Helcio R.
Author-X-Name-Last: Barreto Orlande
Author-Name: Gloria L. Frontini
Author-X-Name-First: Gloria L.
Author-X-Name-Last: Frontini
Author-Name: Guillermo E. Eliçabe
Author-X-Name-First: Guillermo E.
Author-X-Name-Last: Eliçabe
Title: Bayesian approach to the inverse problem in a light scattering application
Abstract:
In this article, static light scattering (SLS) measurements are processed
to estimate the particle size distribution of particle systems
incorporating prior information obtained from an alternative experimental
technique: scanning electron microscopy (SEM). For this purpose we propose
two Bayesian schemes (one parametric and another non-parametric) to solve
the stated light scattering problem and take advantage of the obtained
results to summarize some features of the Bayesian approach within the
context of inverse problems. The features presented in this article
include the improvement of the results when some useful prior information
from an alternative experiment is considered instead of a non-informative
prior as it occurs in a deterministic maximum likelihood estimation. This
improvement will be shown in terms of accuracy and precision in the
corresponding results and also in terms of minimizing the effect of
multiple minima by including significant information in the optimization.
Both Bayesian schemes are implemented using Markov Chain Monte Carlo
methods. They have been developed on the basis of the Metropolis-Hastings
(MH) algorithm using Matlab® and are tested with the analysis
of simulated and experimental examples of concentrated and
semi-concentrated particles. In the simulated examples, SLS measurements
were generated using a rigorous model, while the inversion stage was
solved using an approximate model in both schemes and also using the
rigorous model in the parametric scheme. Priors from SEM micrographs were
either simulated or obtained experimentally; the simulated ones were generated
using a Monte Carlo routine. In addition to the presentation of these
features of the Bayesian approach, some other topics will be discussed,
such as regularization and some implementation issues of the proposed
schemes, among which we remark the selection of the parameters used in the
MH algorithm.
Journal: Journal of Applied Statistics
Pages: 994-1016
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.993370
File-URL: http://hdl.handle.net/10.1080/02664763.2014.993370
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:994-1016
Template-Type: ReDIF-Article 1.0
Author-Name: Stan Lipovetsky
Author-X-Name-First: Stan
Author-X-Name-Last: Lipovetsky
Author-Name: W. Michael Conklin
Author-X-Name-First: W. Michael
Author-X-Name-Last: Conklin
Title: Predictor relative importance and matching regression parameters
Abstract:
Predictor importance in applied regression modeling provides the main
operational tools for managers and decision-makers. The paper considers
estimation of predictors' importance in regression using measures
introduced in works by Gibson and R. Johnson (GJ), then modified by Green,
Carroll, and DeSarbo, and developed further by J. Johnson (JJ). These
indices of importance are based on the orthonormal decomposition of the
data matrix, and the work shows how to improve this approximation. Using
predictor importance, the regression coefficients can also be adjusted to
reach the best data fit and to be meaningful and interpretable. The
results are compared with Shapley value regression (SVR), which is robust
to multicollinearity but computationally demanding. They show that
the JJ index is good for importance estimation, but the GJ index
outperforms it if both predictor importance and coefficients of regression
are needed; hence, this index (GJ) can be used in place of the more
computationally intensive estimation by SVR. The results can be easily
obtained by the considered approach, which is very useful in practical
regression modeling and analysis, especially for big data.
Journal: Journal of Applied Statistics
Pages: 1017-1031
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.994480
File-URL: http://hdl.handle.net/10.1080/02664763.2014.994480
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1017-1031
Template-Type: ReDIF-Article 1.0
Author-Name: Shazia Ghufran
Author-X-Name-First: Shazia
Author-X-Name-Last: Ghufran
Author-Name: Saman Khowaja
Author-X-Name-First: Saman
Author-X-Name-Last: Khowaja
Author-Name: M.J. Ahsan
Author-X-Name-First: M.J.
Author-X-Name-Last: Ahsan
Title: Optimum multivariate stratified double sampling design: Chebyshev's Goal Programming approach
Abstract:
In stratified sampling when strata weights are unknown a double sampling
technique may be used to estimate them. A large simple random sample from
the unstratified population is drawn and units falling in each stratum are
recorded. A stratified random sample is then selected and simple random
subsamples are obtained out of the previously selected units of the
strata. This procedure is called double sampling for stratification. If
non-response is present, the subsamples are divided into classes of
respondents and non-respondents. A second subsample is then obtained from
the non-respondents and an attempt is made to obtain the information
through increased effort, persuasion and callbacks. In this
paper, the problem of obtaining a compromise allocation in multivariate
stratified random sampling is discussed when strata weights are unknown
and non-response is present. The problem turns out to be a multiobjective
non-linear integer programming problem. An approximation of the problem to
an integer linear programming problem by linearizing the non-linear
objective functions at their individual optima is worked out. Chebyshev's
goal programming technique is then used to solve the approximated problem.
A numerical example is also presented to exhibit the practical application
of the developed procedure.
Journal: Journal of Applied Statistics
Pages: 1032-1042
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995603
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995603
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1032-1042
Template-Type: ReDIF-Article 1.0
Author-Name: M. Pilar Alonso
Author-X-Name-First: M. Pilar
Author-X-Name-Last: Alonso
Author-Name: Asunción Beamonte
Author-X-Name-First: Asunción
Author-X-Name-Last: Beamonte
Author-Name: Pilar Gargallo
Author-X-Name-First: Pilar
Author-X-Name-Last: Gargallo
Author-Name: Manuel Salvador
Author-X-Name-First: Manuel
Author-X-Name-Last: Salvador
Title: Local labour markets delineation: an approach based on evolutionary algorithms and classification methods
Abstract:
In this paper a methodology for the delineation of local labour markets
(LLMs) using evolutionary algorithms is proposed. This procedure, based on
that in Flórez-Revuelta et al. [13,14], introduces three
modifications. First, initial groups of municipalities with a minimum size
requirement are built using the travel time between them. Second, a not
fully random initiation algorithm is proposed. And third, as a final stage
of the procedure, a contiguity step is implemented. These modifications
significantly decrease the computational times of the algorithm (by up to
99%) without any deterioration in the quality of the solutions. The
optimization algorithm may give a set of potential solutions with very
similar objective-function values, which would lead to different
partitions, both in terms of the number of markets and their composition.
In order to capture their common aspects, an algorithm based on a cluster
partitioning of k-means type is presented.
This stage of the procedure also provides a ranking of LLM foci useful
for planners and administrations in decision-making processes on issues
related to labour activities. Finally, to evaluate the performance of the
algorithm, a toy example with artificial data is analysed. The full
methodology is illustrated through a real commuting data set of the region
of Aragón (Spain).
Journal: Journal of Applied Statistics
Pages: 1043-1063
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995604
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995604
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1043-1063
Template-Type: ReDIF-Article 1.0
Author-Name: S.M. Najibi
Author-X-Name-First: S.M.
Author-X-Name-Last: Najibi
Author-Name: M.R. Faghihi
Author-X-Name-First: M.R.
Author-X-Name-Last: Faghihi
Author-Name: M. Golalizadeh
Author-X-Name-First: M.
Author-X-Name-Last: Golalizadeh
Author-Name: S.S. Arab
Author-X-Name-First: S.S.
Author-X-Name-Last: Arab
Title: Bayesian alignment of proteins via Delaunay tetrahedralization
Abstract:
An active area of research in bioinformatics is finding structural
similarity of proteins by alignment. Among many methods, a popular one
is to find similarity based on statistical features. This method
involves gathering information from the complex biomolecule structure and
obtaining the best alignment by maximizing the number of matched features.
In this paper, after reviewing statistical models for matching the
structural biomolecule, it is shown that local alignment based on the
Delaunay tetrahedralization (DT) can be used for Bayesian alignment of
proteins. In this method, we use DT to incorporate a priori structural
information about the protein into the Bayesian methodology. We demonstrate that this method
shows advantages over competing methods in achieving a global alignment of
proteins, accelerating the convergence rate and improving the parameter
estimates.
Journal: Journal of Applied Statistics
Pages: 1064-1079
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995605
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995605
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1064-1079
Template-Type: ReDIF-Article 1.0
Author-Name: Sheng Luo
Author-X-Name-First: Sheng
Author-X-Name-Last: Luo
Author-Name: Xiao Su
Author-X-Name-First: Xiao
Author-X-Name-Last: Su
Author-Name: Min Yi
Author-X-Name-First: Min
Author-X-Name-Last: Yi
Author-Name: Kelly K. Hunt
Author-X-Name-First: Kelly K.
Author-X-Name-Last: Hunt
Title: Simultaneous inference of a misclassified outcome and competing risks failure time data
Abstract:
Ipsilateral breast tumor relapse (IBTR) often occurs in breast cancer
patients after their breast conservation therapy. Classification of IBTR
status (true local recurrence versus new ipsilateral primary tumor) is
subject to error, and there is no widely accepted gold standard.
Time to IBTR is likely informative for IBTR classification because a new
primary tumor tends to have a longer mean time to IBTR and is associated
with improved survival compared with true local recurrence.
Moreover, some patients may die from breast cancer or other causes in a
competing risk scenario during the follow-up period. Because the time to
death can be correlated to the unobserved true IBTR status and time to
IBTR (if relapse occurs), this terminal mechanism is non-ignorable. In
this paper, we propose a unified framework that addresses these issues
simultaneously by modeling the misclassified binary outcome without a gold
standard and the correlated time to IBTR, subject to dependent competing
terminal events. We evaluate the proposed framework by a simulation study
and apply it to a real data set consisting of 4477 breast cancer patients.
The adaptive Gaussian quadrature tools in SAS
procedure NLMIXED can be conveniently used to fit
the proposed model. We expect to see broad applications of our model in
other studies with a similar data structure.
Journal: Journal of Applied Statistics
Pages: 1080-1090
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995606
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995606
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1080-1090
Template-Type: ReDIF-Article 1.0
Author-Name: B. Ganguli
Author-X-Name-First: B.
Author-X-Name-Last: Ganguli
Author-Name: M. Naskar
Author-X-Name-First: M.
Author-X-Name-Last: Naskar
Author-Name: E.J. Malloy
Author-X-Name-First: E.J.
Author-X-Name-Last: Malloy
Author-Name: E.A. Eisen
Author-X-Name-First: E.A.
Author-X-Name-Last: Eisen
Title: Determination of the functional form of the relationship of covariates to the log hazard ratio in a Cox model
Abstract:
In this paper, we review available methods for determination of the
functional form of the relation between a covariate and the log hazard
ratio for a Cox model. We pay special attention to the detection of
influential observations to the extent that they influence the estimated
functional form of the relation between a covariate and the log hazard
ratio. Our paper is motivated by a data set from a cohort study of lung
cancer and silica exposure, where the nonlinear shape of the estimated log
hazard ratio for silica exposure plotted against cumulative exposure
(hereafter referred to as the exposure-response curve) was greatly affected
by whether or not the two individuals with the highest exposures were included
in the analysis. Formal influence diagnostics did not identify these two
individuals but did identify the three highest exposed cases. Removal of
these three cases resulted in a biologically plausible exposure-response
curve.
Journal: Journal of Applied Statistics
Pages: 1091-1105
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995607
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995607
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1091-1105
Template-Type: ReDIF-Article 1.0
Author-Name: Adam J. Branscum
Author-X-Name-First: Adam J.
Author-X-Name-Last: Branscum
Author-Name: Dunlei Cheng
Author-X-Name-First: Dunlei
Author-X-Name-Last: Cheng
Author-Name: J. Jack Lee
Author-X-Name-First: J. Jack
Author-X-Name-Last: Lee
Title: Testing hypotheses about medical test accuracy: considerations for design and inference
Abstract:
Developing new medical tests and identifying single biomarkers or panels
of biomarkers with superior accuracy over existing classifiers promotes
lifelong health of individuals and populations. Before a medical test can
be routinely used in clinical practice, its accuracy within diseased and
non-diseased populations must be rigorously evaluated. We introduce a
method for sample size determination for studies designed to test
hypotheses about medical test or biomarker sensitivity and specificity. We
show how a sample size can be determined to guard against making type I
and/or type II errors by calculating Bayes factors from multiple data sets
simulated under null and/or alternative models. The approach can be
implemented across a variety of study designs, including investigations
into one test or two conditionally independent or dependent tests. We
focus on a general setting that involves non-identifiable models for data
when true disease status is unavailable due to the nonexistence of or
undesirable side effects from a perfectly accurate (i.e. 'gold standard')
test; special cases of the general method apply to identifiable models
with or without gold-standard data. Calculation of Bayes factors is
performed by incorporating prior information for model parameters (e.g.
sensitivity, specificity, and disease prevalence) and augmenting the
observed test-outcome data with unobserved latent data on disease status
to facilitate Gibbs sampling from posterior distributions. We illustrate
our methods using a thorough simulation study and an application to
toxoplasmosis.
Journal: Journal of Applied Statistics
Pages: 1106-1119
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995608
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995608
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1106-1119
Template-Type: ReDIF-Article 1.0
Author-Name: Huei-Wen Teng
Author-X-Name-First: Huei-Wen
Author-X-Name-Last: Teng
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: Yen-Ju Chao
Author-X-Name-First: Yen-Ju
Author-X-Name-Last: Chao
Title: Bayesian Markov chain Monte Carlo imputation for the transiting exoplanets with an application in clustering analysis
Abstract:
To impute the missing values of mass in the transiting exoplanet data,
this paper uses the Frank copula to combine two Pareto marginal
distributions. Next, a Bayesian Markov chain Monte Carlo (MCMC) imputation
method is proposed. The proposed Bayesian MCMC imputation method is found
to outperform the mean imputation method. Clustering analysis can shed
light on the formation and evolution of exoplanets. After imputing the
missing values of mass in the transiting exoplanet data using the proposed
approach, the similarity-based clustering method (SCM) clustering
algorithm is applied to the logarithm of mass and period for this complete
data set. The SCM clustering result indicates two clusters. Furthermore,
the intracluster Spearman rank-order correlation coefficients for mass and
period in these two clusters are 0.401 and , respectively,
at a significance level of 0.01. This result illustrates that mass and
period correlate in opposite ways in the two clusters, implying that the
formation and evolution processes of these two clusters
are different.
Journal: Journal of Applied Statistics
Pages: 1120-1132
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.995609
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995609
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1120-1132
Template-Type: ReDIF-Article 1.0
Author-Name: Hongxia Yang
Author-X-Name-First: Hongxia
Author-X-Name-Last: Yang
Author-Name: Aurelie Lozano
Author-X-Name-First: Aurelie
Author-X-Name-Last: Lozano
Title: Multi-relational learning via hierarchical nonparametric Bayesian collective matrix factorization
Abstract:
Relational learning addresses problems where the data come from multiple
sources and are linked together through complex relational networks. Two
important goals are pattern discovery (e.g. by (co)-clustering) and
predicting unknown values of a relation, given a set of entities and
observed relations among entities. In the presence of multiple relations,
combining information from different but related relations can lead to
better insights and improved prediction. For this purpose, we propose a
nonparametric hierarchical Bayesian model that improves on existing
collaborative factorization models and frames a large number of relational
learning problems. The proposed model naturally incorporates
(co)-clustering and prediction analysis in a single unified framework, and
allows for the estimation of entire missing row or column vectors. We
develop an efficient Gibbs algorithm and a hybrid Gibbs using Newton's
method to enable fast computation in high dimensions. We demonstrate the
value of our framework on simulated experiments and on two real-world
problems: discovering kinship systems and predicting the authors of
certain articles based on article-word co-occurrence features.
Journal: Journal of Applied Statistics
Pages: 1133-1147
Issue: 5
Volume: 42
Year: 2015
Month: 5
X-DOI: 10.1080/02664763.2014.999028
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999028
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:5:p:1133-1147
Template-Type: ReDIF-Article 1.0
Author-Name: Aldo M. Garay
Author-X-Name-First: Aldo M.
Author-X-Name-Last: Garay
Author-Name: Victor H. Lachos
Author-X-Name-First: Victor H.
Author-X-Name-Last: Lachos
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: Bayesian estimation and case influence diagnostics for the zero-inflated negative binomial regression model
Abstract:
In recent years, there has been considerable interest in regression models
based on zero-inflated distributions. These models are commonly
encountered in many disciplines, such as medicine, public health, and
environmental sciences, among others. The zero-inflated Poisson (ZIP)
model has been typically considered for these types of problems. However,
the ZIP model can fail if the non-zero counts are overdispersed in
relation to the Poisson distribution, hence the zero-inflated negative
binomial (ZINB) model may be more appropriate. In this paper, we present a
Bayesian approach for fitting the ZINB regression model. This model
considers that an observed zero may come from a point mass distribution at
zero or from the negative binomial model. The likelihood function is
utilized to compute not only some Bayesian model selection measures, but
also to develop Bayesian case-deletion influence diagnostics based on
q-divergence measures. The approach can be easily
implemented using standard Bayesian software, such as WinBUGS. The
performance of the proposed method is evaluated with a simulation study.
Further, a real data set is analyzed, where we show that the ZINB regression
model seems to fit the data better than its Poisson counterpart.
Journal: Journal of Applied Statistics
Pages: 1148-1165
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.995610
File-URL: http://hdl.handle.net/10.1080/02664763.2014.995610
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1148-1165
Template-Type: ReDIF-Article 1.0
Author-Name: Tatjana Miljkovic
Author-X-Name-First: Tatjana
Author-X-Name-Last: Miljkovic
Author-Name: Nikita Barabanov
Author-X-Name-First: Nikita
Author-X-Name-Last: Barabanov
Title: Modeling veterans' health benefit grants using the expectation maximization algorithm
Abstract:
A novel application of the expectation maximization (EM) algorithm is
proposed for modeling right-censored multiple regression. Parameter
estimates, variability assessment, and model selection are summarized in a
multiple regression setting assuming a normal model. The performance of
this method is assessed through a simulation study. New formulas for
measuring model utility and diagnostics are derived based on the EM
algorithm. They include a reconstructed coefficient of determination and
influence diagnostics based on a one-step deletion method. A real data
set, provided by the North Dakota Department of Veterans Affairs, is modeled
using the proposed methodology. Empirical findings should be of benefit to
government policy-makers.
Journal: Journal of Applied Statistics
Pages: 1166-1182
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999029
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999029
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1166-1182
Template-Type: ReDIF-Article 1.0
Author-Name: Hamid Shahriari
Author-X-Name-First: Hamid
Author-X-Name-Last: Shahriari
Author-Name: Orod Ahmadi
Author-X-Name-First: Orod
Author-X-Name-Last: Ahmadi
Title: Robust estimation of the mean vector for high-dimensional data set using robust clustering
Abstract:
The first step in statistical analysis is parameter estimation. In
multivariate analysis, one of the parameters of interest to be estimated
is the mean vector. In multivariate statistical analysis, it is usually
assumed that the data come from a multivariate normal distribution. In
this situation, the maximum likelihood estimator (MLE), that is, the
sample mean vector, is the best estimator. However, when outliers exist in
the data, the use of sample mean vector will result in poor estimation.
So, other estimators which are robust to the existence of outliers should
be used. The most popular robust multivariate estimator of the mean
vector is the S-estimator, which has desirable properties. However, computing
this estimator requires a robust estimate of the mean vector as a
starting point. Usually the minimum volume ellipsoid (MVE) is used as a
starting point in computing the S-estimator. For high-dimensional data,
computing the MVE takes too much time. In some cases, this time is so
large that existing computers cannot perform the computation. In
addition to the computation time, for high-dimensional data sets the MVE
method is not precise. In this paper, a robust starting point for
S-estimator based on robust clustering is proposed which could be used for
estimating the mean vector of the high-dimensional data. The performance
of the proposed estimator in the presence of outliers is studied and the
results indicate that the proposed estimator performs precisely and much
better than some of the existing robust estimators for high-dimensional
data.
Journal: Journal of Applied Statistics
Pages: 1183-1205
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999030
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999030
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1183-1205
Template-Type: ReDIF-Article 1.0
Author-Name: Cindy Xin Feng
Author-X-Name-First: Cindy Xin
Author-X-Name-Last: Feng
Title: Bayesian joint modeling of correlated counts data with application to adverse birth outcomes
Abstract:
In disease mapping, health outcomes measured at the same spatial locations
may be correlated, so one can consider joint modeling the multivariate
health outcomes accounting for their dependence. The general approaches
often used for joint modeling include shared component models and
multivariate models. An alternative way to model the association between
two health outcomes, when one outcome can naturally serve as a covariate
of the other, is to use an ecological regression model. For example, in our
application, preterm birth (PTB) can be treated as a predictor for low
birth weight (LBW) and vice versa. Therefore, we proposed to blend the
ideas from joint modeling and ecological regression methods to jointly
model the relative risks for LBW and PTBs over the health districts in
Saskatchewan, Canada, in 2000-2010. This approach is helpful when proxies
for areal-level contextual factors can be derived from the outcomes
themselves because direct information on risk factors is not readily
available. Our results indicate that the proposed approach improves the
model fit when compared with the conventional joint modeling methods.
Further, we showed that when no strong spatial autocorrelation is present,
joint outcome modeling using only independent error terms can still
provide a better model fit when compared with the separate modeling.
Journal: Journal of Applied Statistics
Pages: 1206-1222
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999031
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999031
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1206-1222
Template-Type: ReDIF-Article 1.0
Author-Name: Carles Serrat
Author-X-Name-First: Carles
Author-X-Name-Last: Serrat
Author-Name: Montserrat Rué
Author-X-Name-First: Montserrat
Author-X-Name-Last: Rué
Author-Name: Carmen Armero
Author-X-Name-First: Carmen
Author-X-Name-Last: Armero
Author-Name: Xavier Piulachs
Author-X-Name-First: Xavier
Author-X-Name-Last: Piulachs
Author-Name: Héctor Perpiñán
Author-X-Name-First: Héctor
Author-X-Name-Last: Perpiñán
Author-Name: Anabel Forte
Author-X-Name-First: Anabel
Author-X-Name-Last: Forte
Author-Name: Álvaro Páez
Author-X-Name-First: Álvaro
Author-X-Name-Last: Páez
Author-Name: Guadalupe Gómez
Author-X-Name-First: Guadalupe
Author-X-Name-Last: Gómez
Title: Frequentist and Bayesian approaches for a joint model for prostate cancer risk and longitudinal prostate-specific antigen data
Abstract:
The paper describes the use of frequentist and Bayesian shared-parameter
joint models of longitudinal measurements of prostate-specific antigen
(PSA) and the risk of prostate cancer (PCa). The motivating dataset
corresponds to the screening arm of the Spanish branch of the European
Randomized Screening for Prostate Cancer study. The results show that PSA
is highly associated with the risk of being diagnosed with PCa and that
there is an age-varying effect of PSA on PCa risk. Both the frequentist
and Bayesian paradigms produced very close parameter estimates and
subsequent 95% confidence and credibility intervals. Dynamic estimations
of disease-free probabilities obtained using Bayesian inference highlight
the potential of joint models to guide personalized risk-based screening
strategies.
Journal: Journal of Applied Statistics
Pages: 1223-1239
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999032
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999032
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1223-1239
Template-Type: ReDIF-Article 1.0
Author-Name: Guoyou Qin
Author-X-Name-First: Guoyou
Author-X-Name-Last: Qin
Author-Name: Zhongyi Zhu
Author-X-Name-First: Zhongyi
Author-X-Name-Last: Zhu
Title: Robust estimation of mean and covariance for longitudinal data with dropouts
Abstract:
In this paper, we study estimation of linear models in the framework of
longitudinal data with dropouts. Under the assumptions that random errors
follow an elliptical distribution and all the subjects share the same
within-subject covariance matrix which does not depend on covariates, we
develop a robust method for simultaneous estimation of mean and
covariance. The proposed method is robust against outliers and does not
require modeling of the covariance or the missing-data process. Theoretical
properties of the proposed estimator are established and simulation
studies show its good performance. In the end, the proposed method is
applied to a real data analysis for illustration.
Journal: Journal of Applied Statistics
Pages: 1240-1254
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999033
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999033
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1240-1254
Template-Type: ReDIF-Article 1.0
Author-Name: Thierry Chekouo
Author-X-Name-First: Thierry
Author-X-Name-Last: Chekouo
Author-Name: Alejandro Murua
Author-X-Name-First: Alejandro
Author-X-Name-Last: Murua
Title: The penalized biclustering model and related algorithms
Abstract:
Biclustering is the simultaneous clustering of two related dimensions, for
example, of individuals and features, or genes and experimental
conditions. Very few statistical models for biclustering have been
proposed in the literature. Instead, most of the research has focused on
algorithms to find biclusters. The models underlying them have not
received much attention. Hence, very little is known about the adequacy
and limitations of the models and the efficiency of the algorithms. In
this work, we shed light on associated statistical models behind the
algorithms. This allows us to generalize most of the known popular
biclustering techniques, and to justify, and many times improve on, the
algorithms used to find the biclusters. It turns out that most of the
known techniques have a hidden Bayesian flavor. Therefore, we adopt a
Bayesian framework to model biclustering. We propose a measure of
biclustering complexity (number of biclusters and their overlap) through a
penalized plaid model, and present a suitable version of the deviance
information criterion to choose the number of biclusters, a problem that
has not been adequately addressed yet. Our ideas are motivated by the
analysis of gene expression data.
Journal: Journal of Applied Statistics
Pages: 1255-1277
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999647
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999647
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1255-1277
Template-Type: ReDIF-Article 1.0
Author-Name: Antoine Dany
Author-X-Name-First: Antoine
Author-X-Name-Last: Dany
Author-Name: Emmanuelle Dantony
Author-X-Name-First: Emmanuelle
Author-X-Name-Last: Dantony
Author-Name: Mad-Hélénie Elsensohn
Author-X-Name-First: Mad-Hélénie
Author-X-Name-Last: Elsensohn
Author-Name: Emmanuel Villar
Author-X-Name-First: Emmanuel
Author-X-Name-Last: Villar
Author-Name: Cécile Couchoud
Author-X-Name-First: Cécile
Author-X-Name-Last: Couchoud
Author-Name: René Ecochard
Author-X-Name-First: René
Author-X-Name-Last: Ecochard
Title: Using repeated-prevalence data in multi-state modeling of renal replacement therapy
Abstract:
Multi-state models help predict future numbers of patients requiring
specific treatments but these models require exhaustive incidence data.
Deriving reliable predictions from repeated-prevalence data would be
helpful. A new method to model the number of patients that switch between
therapeutic modalities using repeated-prevalence data is presented and
illustrated. The parameters and goodness of fit obtained with the new
method and repeated-prevalence data were compared to those obtained with
the classical method and incidence data. The multi-state model parameters'
confidence intervals obtained with annually collected repeated-prevalence
data were wider than those obtained with incidence data and six out of
nine pairs of confidence intervals did not overlap. However, most
parameters were of the same order of magnitude and the predicted patient
distributions among various renal replacement therapies were similar
regardless of the type of data used. In the absence of incidence data, a
multi-state model can still be successfully built with annually collected
repeated-prevalence data to predict the numbers of patients requiring
specific treatments. This modeling technique can be extended to other
chronic diseases.
Journal: Journal of Applied Statistics
Pages: 1278-1290
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999648
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999648
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1278-1290
Template-Type: ReDIF-Article 1.0
Author-Name: Gurprit Grover
Author-X-Name-First: Gurprit
Author-X-Name-Last: Grover
Author-Name: Ravi Vajala
Author-X-Name-First: Ravi
Author-X-Name-Last: Vajala
Author-Name: Prafulla Kumar Swain
Author-X-Name-First: Prafulla Kumar
Author-X-Name-Last: Swain
Title: On the assessment of various factors affecting the improvement in CD4 count of AIDS patients undergoing antiretroviral therapy using generalized Poisson regression
Abstract:
An important marker for identifying the progression of human
immunodeficiency virus (HIV) infection in an individual is the CD4 cell
count. Antiretroviral therapy (ART) is a treatment for HIV/AIDS (AIDS,
acquired immune-deficiency syndrome) which prolongs and improves the lives
of patients by improving the CD4 cell count and strengthening the immune
system. This strengthening of the immune system, in terms of CD4 count,
depends not only on various biological factors but also on behavioral
factors. Previous studies have shown the effect of CD4 count on
mortality, but none has attempted to study the factors that are likely
to influence the improvement in the CD4 count of patients diagnosed with
AIDS and undergoing ART. In this paper, we use the generalized Poisson
regression (GPR) model to explore the effect of various socio-demographic
covariates, such as age, gender, geographical location, and drug usage,
on the improvement in the CD4 count of AIDS patients. The GPR model
accommodates under- or overdispersion in the CD4 count data, and we
compare it with the negative binomial model. Finally, the model is
applied to the analysis of data on patients undergoing ART in the Ram
Manohar Lohia Hospital, Delhi, India. The data exhibited overdispersion
and hence the GPR model provided the best fit.
Journal: Journal of Applied Statistics
Pages: 1291-1305
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999649
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999649
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1291-1305
Template-Type: ReDIF-Article 1.0
Author-Name: Peter A. Dowd
Author-X-Name-First: Peter A.
Author-X-Name-Last: Dowd
Author-Name: Eulogio Pardo-Igúzquiza
Author-X-Name-First: Eulogio
Author-X-Name-Last: Pardo-Igúzquiza
Author-Name: Juan José Egozcue
Author-X-Name-First: Juan José
Author-X-Name-Last: Egozcue
Title: The total bootstrap median: a robust and efficient estimator of location and scale for small samples
Abstract:
We propose the total bootstrap median (TBM) as a robust and efficient
estimator of location and scale for small samples. We demonstrate its
performance by estimating the mean and variance of a variety of
distributions. We also show that, if the underlying distribution is
unknown and there is either no contamination or low to moderate
contamination, the TBM provides a better estimate of the mean, in mean
square terms, than the sample mean or the sample median. In addition, the
TBM is a better estimator of the variance of the underlying distribution
than the sample variance or the square of the bias-corrected median
absolute deviation from the median estimator. We also show that the TBM is
an explicit L-estimator, which allows a direct study of its properties.
Journal: Journal of Applied Statistics
Pages: 1306-1321
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999650
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999650
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1306-1321
Template-Type: ReDIF-Article 1.0
Author-Name: Thiago Rezende dos Santos
Author-X-Name-First: Thiago Rezende
Author-X-Name-Last: dos Santos
Author-Name: Enrico A. Colosimo
Author-X-Name-First: Enrico A.
Author-X-Name-Last: Colosimo
Title: A modified approximate method for analysis of degradation data
Abstract:
Estimation of the lifetime distribution of industrial components and
systems yields very important information for manufacturers and consumers.
However, obtaining reliability data is time consuming and costly. In this
context, degradation tests are a useful alternative approach to lifetime
and accelerated life tests in reliability studies. The approximate method
is one of the most used techniques for degradation data analysis. It is
very simple to understand and easy to implement numerically in any
statistical software package. This paper uses time series techniques in
order to propose a modified approximate method (MAM). The MAM improves the
standard one in two aspects: (1) it uses previous observations in the
degradation path as a Markov process for future prediction and (2) it is
not necessary to specify a parametric form for the degradation path.
Characteristics of interest such as mean or median time to failure and
percentiles, among others, are obtained by using the modified method. A
simulation study is performed in order to show the improved properties of
the modified method over the standard one. Both methods are also used to
estimate the failure time distribution of the fatigue-crack-growth data
set.
Journal: Journal of Applied Statistics
Pages: 1322-1331
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999651
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999651
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1322-1331
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaorong Yang
Author-X-Name-First: Xiaorong
Author-X-Name-Last: Yang
Title: Bootstrap unit root test based on least absolute deviation estimation under dependence assumptions
Abstract:
In this paper, a bootstrap test based on the least absolute deviation
(LAD) estimation for the unit root test in first-order autoregressive
models with dependent residuals is considered. The convergence in
probability of the bootstrap distribution function is established. Within
the framework of dependence assumptions, the asymptotic behavior of the
bootstrap LAD estimator is independent of the covariance matrix of the
residuals, which automatically approximates the target distribution.
Journal: Journal of Applied Statistics
Pages: 1332-1347
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999652
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999652
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1332-1347
Template-Type: ReDIF-Article 1.0
Author-Name: Xu Tang
Author-X-Name-First: Xu
Author-X-Name-Last: Tang
Author-Name: Fah Fatt Gan
Author-X-Name-First: Fah Fatt
Author-X-Name-Last: Gan
Author-Name: Lingyun Zhang
Author-X-Name-First: Lingyun
Author-X-Name-Last: Zhang
Title: Standardized mortality ratio for an estimated number of deaths
Abstract:
The traditional standardized mortality ratio (SMR) compares the mortality
rate of a study population with that of a reference population. In order
to measure the performance of a surgeon or a group of surgeons in a
hospital performing a particular type of surgical operation, a different
SMR is used. This SMR compares the observed number of deaths in a sample
with an estimated number of deaths usually calculated based on the average
performance of a group of surgeons. The estimated number of deaths
involved in the new SMR is not a constant but a random variable. This
means that all existing results for the traditional SMR may no longer be
valid for the new SMR. In this paper, the asymptotic distribution of the
SMR based on an estimated number of deaths is derived. We also use the
bootstrap procedure to estimate the finite-sample distribution. A
simulation study is used to compare both probabilities of type I error and
powers of existing confidence intervals and confidence intervals
constructed using the asymptotic and bootstrap distributions of SMR. Our
study reveals that, in general, existing confidence intervals are
conservative in terms of probability of type I error, and the two new
confidence intervals are more accurate. To perform a fair power
comparison, the coverage probabilities of existing confidence intervals
are recalibrated to match that based on the asymptotic distribution of
SMR, and then our study shows that the powers of the asymptotic and
bootstrap approaches are lower than existing approaches when the odds
ratio of death Q is greater than the odds ratio of death
under the null hypothesis, Q0, but higher when Q is smaller than
Q0. The effect of
patients' risk distribution on the SMR is also investigated.
Journal: Journal of Applied Statistics
Pages: 1348-1366
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999653
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999653
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1348-1366
Template-Type: ReDIF-Article 1.0
Author-Name: Shuling Wang
Author-X-Name-First: Shuling
Author-X-Name-Last: Wang
Author-Name: Xiaoyan Wang
Author-X-Name-First: Xiaoyan
Author-X-Name-Last: Wang
Author-Name: Jiangtao Dai
Author-X-Name-First: Jiangtao
Author-X-Name-Last: Dai
Title: Statistical diagnosis for non-parametric regression models with random right censorship based on the empirical likelihood method
Abstract:
In this paper, we consider statistical diagnostic for non-parametric
regression models with right-censored data based on empirical likelihood.
First, the primary model is transformed to the non-parametric regression
model. Then, based on empirical likelihood methodology, we define some
diagnostic statistics. Finally, simulation studies show that our
proposed procedure can work fairly well.
Journal: Journal of Applied Statistics
Pages: 1367-1373
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999656
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999656
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1367-1373
Template-Type: ReDIF-Article 1.0
Author-Name: Kung-Jong Lui
Author-X-Name-First: Kung-Jong
Author-X-Name-Last: Lui
Title: Notes on estimation of the intraclass correlation under the AB/BA crossover trial
Abstract:
Under the AB/BA crossover trial, we focus our attention on estimation of
the intraclass correlation in normal data. We develop both point and
interval estimators in closed form for the intraclass correlation. We
employ Monte Carlo simulation to study the performance of these estimators
in a variety of situations. We note that the estimators developed here for
the intraclass correlation remain valid even when there are possibly
unexpected carry-over effects.
Journal: Journal of Applied Statistics
Pages: 1374-1381
Issue: 6
Volume: 42
Year: 2015
Month: 6
X-DOI: 10.1080/02664763.2014.999657
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999657
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:6:p:1374-1381
Template-Type: ReDIF-Article 1.0
Author-Name: Jung-Yu Cheng
Author-X-Name-First: Jung-Yu
Author-X-Name-Last: Cheng
Author-Name: Shinn-Jia Tzeng
Author-X-Name-First: Shinn-Jia
Author-X-Name-Last: Tzeng
Title: Buckley-James-type estimation of quantile regression with recurrent gap time data
Abstract:
In longitudinal studies, an individual may potentially undergo a series of
repeated recurrence events. The gap times, which are referred to as the
times between successive recurrent events, are typically the outcome
variables of interest. Various regression models have been developed in
order to evaluate covariate effects on gap times based on recurrence event
data. The proportional hazards model, additive hazards model, and the
accelerated failure time model are all notable examples. Quantile
regression is a useful alternative to the aforementioned models for
survival analysis since it can provide great flexibility to assess
covariate effects on the entire distribution of the gap time. In order to
analyze recurrent gap time data, we must overcome the problem of the last
gap time being subject to induced dependent censoring when the number of
recurrent events exceeds one. In this paper, we adopt the
Buckley-James-type estimation method in order to construct a weighted
estimation equation for regression coefficients under the quantile model,
and develop an iterative procedure to obtain the estimates. We use
extensive simulation studies to evaluate the finite-sample performance of
the proposed estimator. Finally, analysis of bladder cancer data is
presented as an illustration of our proposed methodology.
Journal: Journal of Applied Statistics
Pages: 1383-1401
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.999654
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999654
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1383-1401
Template-Type: ReDIF-Article 1.0
Author-Name: Cuizhen Niu
Author-X-Name-First: Cuizhen
Author-X-Name-Last: Niu
Author-Name: Qiang Xia
Author-X-Name-First: Qiang
Author-X-Name-Last: Xia
Title: Testing the rate ratio under inverse sampling based on gradient statistic
Abstract:
Inverse sampling is widely applied in studies with dichotomous outcomes,
especially when the subjects arrive sequentially or the response of
interest is difficult to obtain. In this paper, we investigate the rate
ratio test problem under inverse sampling based on gradient statistic with
the asymptotic method and parametric bootstrap technique. The gradient
statistic has many advantages, for example, it is simple to calculate and
competitive with Wald-type, score and likelihood ratio tests in terms of
local power. Numerical studies are carried out to evaluate the performance
of our gradient test and the existing tests, namely Wald-type, score and
likelihood ratio tests. The simulation results suggest that the gradient
test based on the parametric bootstrap method has excellent type I error
control and large powers even in small sample design. Two real examples,
from a heart disease study and a drug comparison study, are applied to
illustrate our methods.
Journal: Journal of Applied Statistics
Pages: 1402-1420
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.999655
File-URL: http://hdl.handle.net/10.1080/02664763.2014.999655
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1402-1420
Template-Type: ReDIF-Article 1.0
Author-Name: Derek S. Young
Author-X-Name-First: Derek S.
Author-X-Name-Last: Young
Author-Name: David R. Hunter
Author-X-Name-First: David R.
Author-X-Name-Last: Hunter
Title: Random effects regression mixtures for analyzing infant habituation
Abstract:
Random effects regression mixture models are a way to classify
longitudinal data (or trajectories) having possibly varying lengths. The
mixture structure of the traditional random effects regression mixture
model arises through the distribution of the random regression
coefficients, which is assumed to be a mixture of multivariate normals. An
extension of this standard model is presented that accounts for various
levels of heterogeneity among the trajectories, depending on their assumed
error structure. A standard likelihood ratio test is presented for testing
this error structure assumption. Full details of an
expectation-conditional maximization algorithm for maximum likelihood
estimation are also presented. This model is used to analyze data from an
infant habituation experiment, where it is desirable to assess whether
infants comprise different populations in terms of their habituation time.
Journal: Journal of Applied Statistics
Pages: 1421-1441
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1000272
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1000272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1421-1441
Template-Type: ReDIF-Article 1.0
Author-Name: Bijamma Thomas
Author-X-Name-First: Bijamma
Author-X-Name-Last: Thomas
Author-Name: N.N. Midhu
Author-X-Name-First: N.N.
Author-X-Name-Last: Midhu
Author-Name: P.G. Sankaran
Author-X-Name-First: P.G.
Author-X-Name-Last: Sankaran
Title: A software reliability model using mean residual quantile function
Abstract:
In this paper, we propose a class of distributions with the inverse linear
mean residual quantile function. The distributional properties of the
family of distributions are studied. We then discuss the reliability
characteristics of the family of distributions. Some characterizations of
the class of distributions are also discussed. The parameters of the class
of distributions are estimated using the method of L-moments. The proposed
class of distributions is applied to a real data set.
Journal: Journal of Applied Statistics
Pages: 1442-1457
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1000273
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1000273
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1442-1457
Template-Type: ReDIF-Article 1.0
Author-Name: Meihui Guo
Author-X-Name-First: Meihui
Author-X-Name-Last: Guo
Author-Name: Yi-Ting Guo
Author-X-Name-First: Yi-Ting
Author-X-Name-Last: Guo
Author-Name: Chi-Jeng Wang
Author-X-Name-First: Chi-Jeng
Author-X-Name-Last: Wang
Author-Name: Liang-Ching Lin
Author-X-Name-First: Liang-Ching
Author-X-Name-Last: Lin
Title: Assessing influential trade effects via high-frequency market reactions
Abstract:
In the literature, traders are often classified into informed and
uninformed and the trades from informed traders have market impacts. We
investigate these trades by first establishing a scheme to identify the
influential trades from the ordinary trades under certain criteria. The
differential properties between these two types of trades are examined via
the four transaction states classified by the trade price, trade volume,
quotes, and quoted depth. Marginal distribution of the four states and the
transition probability between different states are shown to be distinct
for informed trades and ordinary liquidity trades. Furthermore, four
market reaction factors are introduced and logistic regression models of
the influential trades are established based on these four factors.
Empirical study on the high-frequency transaction data from the NYSE TAQ
database shows supportive evidence for high correct classification rates of
the logistic regression models.
Journal: Journal of Applied Statistics
Pages: 1458-1471
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1000274
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1000274
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1458-1471
Template-Type: ReDIF-Article 1.0
Author-Name: Aydin Karakoca
Author-X-Name-First: Aydin
Author-X-Name-Last: Karakoca
Author-Name: Ulku Erisoglu
Author-X-Name-First: Ulku
Author-X-Name-Last: Erisoglu
Author-Name: Murat Erisoglu
Author-X-Name-First: Murat
Author-X-Name-Last: Erisoglu
Title: A comparison of the parameter estimation methods for bimodal mixture Weibull distribution with complete data
Abstract:
Bimodal mixture Weibull distribution being a special case of mixture
Weibull distribution has been used recently as a suitable model for
heterogeneous data sets in many practical applications. The bimodal
mixture Weibull term represents a mixture of two Weibull distributions.
Although many estimation methods have been proposed for the bimodal
mixture Weibull distribution, no comprehensive comparison has been made.
This paper presents a detailed comparison of five numerical estimation
methods, namely maximum likelihood estimation, the least-squares method,
the method of moments, the method of logarithmic moments and the
percentile method (PM), in terms of several criteria via a simulation
study. The parameter
estimation methods are applied to real data.
Journal: Journal of Applied Statistics
Pages: 1472-1489
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1000275
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1000275
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1472-1489
Template-Type: ReDIF-Article 1.0
Author-Name: Hasan Ertas
Author-X-Name-First: Hasan
Author-X-Name-Last: Ertas
Author-Name: Selma Toker
Author-X-Name-First: Selma
Author-X-Name-Last: Toker
Author-Name: Selahattin Kaçıranlar
Author-X-Name-First: Selahattin
Author-X-Name-Last: Kaçıranlar
Title: Robust two parameter ridge M-estimator for linear regression
Abstract:
The problems of multicollinearity and outliers in the data set produce
undesirable effects on the ordinary least squares estimator. Therefore,
robust two parameter ridge estimation based on M-estimator (ME) is
introduced to deal with multicollinearity and outliers in the
y-direction. The proposed estimator outperforms ME, two
parameter ridge estimator and robust ridge M-estimator according to mean
square error criterion. Moreover, a numerical example and a Monte Carlo
simulation experiment are presented.
Journal: Journal of Applied Statistics
Pages: 1490-1502
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1000577
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1000577
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1490-1502
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: Jenn-Hwai Yang
Author-X-Name-First: Jenn-Hwai
Author-X-Name-Last: Yang
Title: Automatic clustering algorithm for fuzzy data
Abstract:
Coppi et al. [7] applied Yang and Wu's [20] idea to
propose a possibilistic k-means (PkM)
clustering algorithm for LR-type fuzzy numbers. The
memberships in the objective function of PkM no longer
need to satisfy the fuzzy k-means constraint that the memberships of a
data point across classes sum to one. However, the clustering performance
of PkM depends on the initializations and weighting
exponent. In this paper, we propose a robust clustering method based on a
self-updating procedure. The proposed algorithm not only solves the
initialization problems but also obtains a good clustering result. Several
numerical examples also demonstrate the effectiveness and accuracy of the
proposed clustering method, especially the robustness to initial values
and noise. Finally, three real fuzzy data sets are used to illustrate the
superiority of this proposed algorithm.
Journal: Journal of Applied Statistics
Pages: 1503-1518
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001326
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001326
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1503-1518
Template-Type: ReDIF-Article 1.0
Author-Name: Jahida Gulshan
Author-X-Name-First: Jahida
Author-X-Name-Last: Gulshan
Author-Name: Md. Mejbahuddin Mina
Author-X-Name-First: Md. Mejbahuddin
Author-X-Name-Last: Mina
Author-Name: Syed Shahadat Hossain
Author-X-Name-First: Syed Shahadat
Author-X-Name-Last: Hossain
Title: Migration pattern in Bangladesh: a covariate-dependent Markov model
Abstract:
Internal migration is one of the major components of rapid and unplanned
growth of towns and cities especially in the developing countries. This
paper describes the transition pattern of internal out migration in
Bangladesh and some sociodemographic factors influencing such migration in
the country using a covariate-dependent Markov model. Four types of
migration behavior, namely rural to rural, rural to urban, urban to rural
and urban to urban, are considered in this paper. Defining two
discrete states, urban and rural, each of such transition can be
characterized by a stochastic process; hence we use a two-state Markov
chain for this purpose. We find that age, sex, division and reason of
migration are significantly associated with internal migration in
Bangladesh. The major findings include that any type of migration, rural
to rural, rural to urban, urban to rural and urban to urban, mostly takes
place at the ages of 15-30 as well as at the ages of 0-15; females have
higher odds than males to make a migration; Dhaka, Rajshahi and Chittagong
divisions have remarkably higher migration rate as compared to Barisal and
Sylhet division; and the professional reason is the main reason for rural
to urban migration.
Journal: Journal of Applied Statistics
Pages: 1519-1530
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001327
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001327
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1519-1530
Template-Type: ReDIF-Article 1.0
Author-Name: Luis A. Gil-Alana
Author-X-Name-First: Luis A.
Author-X-Name-Last: Gil-Alana
Title: Linear and segmented trends in sea surface temperature data
Abstract:
This paper deals with the analysis of the MET Office Hadley Centre's sea
surface temperature data set (HadSST3) by using long-range dependence
techniques. We incorporate linear and segmented trends using fractional
integration, thus permitting long memory behavior in the detrended
series. The results indicate the existence of warming trends in the three
series examined (Northern and Southern Hemispheres along with global
temperatures), with orders of integration which are in the range (0.5, 1)
and thus implying nonstationary long memory and mean reverting behavior.
This is innovative compared with other works that assume short memory
behavior in the detrended series. Allowing for segmented trends, two
features are observed: increasing values in the degree of dependence of
the series across time and significant warming trends from 1940 onwards.
Journal: Journal of Applied Statistics
Pages: 1531-1546
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001328
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001328
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1531-1546
Template-Type: ReDIF-Article 1.0
Author-Name: Sugata Sen Roy
Author-X-Name-First: Sugata Sen
Author-X-Name-Last: Roy
Author-Name: Moumita Chatterjee
Author-X-Name-First: Moumita
Author-X-Name-Last: Chatterjee
Title: Estimating the hazard functions of two alternately occurring recurrent events
Abstract:
Often two recurrent events of equal importance can occur alternately. The
life-time patterns of the two events can then be of considerable interest.
In this paper, we consider two such events, the inclusion and exclusion of
players in a team sport, and study whether there is any inherent pattern
in the time-lengths between these events. The life-time distributions are
modelled and methods of estimating the model parameters suggested, taking
into account any relationship in the pattern of recurrence. The results
are then applied to the inclusion and exclusion of players in the Indian
national cricket team. As further illustration, a simulation study is
made. Broad application areas are identified both in the introduction and
conclusion.
Journal: Journal of Applied Statistics
Pages: 1547-1555
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001329
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001329
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1547-1555
Template-Type: ReDIF-Article 1.0
Author-Name: Ayten Yiğiter
Author-X-Name-First: Ayten
Author-X-Name-Last: Yiğiter
Author-Name: Jie Chen
Author-X-Name-First: Jie
Author-X-Name-Last: Chen
Author-Name: Lingling An
Author-X-Name-First: Lingling
Author-X-Name-Last: An
Author-Name: Nazan Danacioğlu
Author-X-Name-First: Nazan
Author-X-Name-Last: Danacioğlu
Title: An online copy number variant detection method for short sequencing reads
Abstract:
The availability of the next generation sequencing (NGS) technology in
today's biomedical research has provided new opportunities in scientific
discovery of genetic information. The high-throughput NGS technology,
especially DNA-seq, is particularly useful in profiling a genome for the
analysis of DNA copy number variants (CNVs). The read count (RC) data
resulting from NGS technology are massive and information rich. How to
exploit the RC data for accurate CNV detection has become a computational
and statistical challenge. We provide a statistical online change point
method to help detect CNVs in the sequencing RC data in this paper. This
method uses the idea of online searching for change point (or breakpoint)
with a Markov chain assumption on the breakpoints loci and an iterative
computing process via a Bayesian framework. We illustrate that an online
change-point detection method is particularly suitable for identifying
CNVs in the RC data. The algorithm is applied to the publicly available
NCI-H2347 lung cancer cell line sequencing reads data for locating the
breakpoints. Extensive simulation studies have been carried out and
results show the good behavior of the proposed algorithm. The algorithm is
implemented in R and the codes are available upon request.
Journal: Journal of Applied Statistics
Pages: 1556-1571
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001330
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001330
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1556-1571
Template-Type: ReDIF-Article 1.0
Author-Name: E. Cene
Author-X-Name-First: E.
Author-X-Name-Last: Cene
Author-Name: F. Karaman
Author-X-Name-First: F.
Author-X-Name-Last: Karaman
Title: Analysing organic food buyers' perceptions with Bayesian networks: a case study in Turkey
Abstract:
A Bayesian network (BN) is an efficient graphical method that uses directed
acyclic graphs (DAGs) to provide information about a set of data. BNs
consist of nodes and arcs (or edges), where nodes represent variables and
arcs represent relations and influences between nodes. Interest in organic
food has been increasing worldwide during the last decade, and the same
trend holds in Turkey. Although there are numerous studies that deal with
customer perception of organic food and customer characteristics, none of
them used BNs. Thus, this study, which shows a new application area of
BNs, aims to reveal the perceptions and characteristics of organic food
buyers. In this work, a survey is designed and administered in seven
different organic bazaars in Turkey. Afterwards, BNs are constructed with
the data gathered from 611 organic food consumers. The findings match
those of previous studies: factors such as health, environmental concerns,
food availability, product price, consumers' income and trust in the
organization are found to influence consumers effectively.
Journal: Journal of Applied Statistics
Pages: 1572-1590
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001331
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001331
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1572-1590
Template-Type: ReDIF-Article 1.0
Author-Name: Karan Veer
Author-X-Name-First: Karan
Author-X-Name-Last: Veer
Author-Name: Ravinder Agarwal
Author-X-Name-First: Ravinder
Author-X-Name-Last: Agarwal
Title: Wavelet and short-time Fourier transform comparison-based analysis of myoelectric signals
Abstract:
In this investigation, extracted features of signals have been analyzed
for the recognition of arm movements. The short-time Fourier transform and
the wavelet transform based on Euclidean distance were applied to the
recorded signals. Results show that the wavelet is the more useful and
powerful tool for analyzing such signals, since it has the multiresolution
property and significantly reduces computation time while avoiding
resolution problems. Finally, a repeated-measures factorial analysis of
variance of the experimentally recorded data was implemented to
investigate the effect of class separability for multiple motions, toward
establishing the surface electromyogram-muscular force relationship.
Journal: Journal of Applied Statistics
Pages: 1591-1601
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2014.1001728
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1001728
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1591-1601
Template-Type: ReDIF-Article 1.0
Author-Name: Jin-Jian Hsieh
Author-X-Name-First: Jin-Jian
Author-X-Name-Last: Hsieh
Author-Name: Wei-Cheng Huang
Author-X-Name-First: Wei-Cheng
Author-X-Name-Last: Huang
Title: Nonparametric estimation and test of conditional Kendall's tau under semi-competing risks data and truncated data
Abstract:
In this article, we focus on estimation and testing of conditional
Kendall's tau under semi-competing risks data and truncated data. We apply
the inverse probability censoring weighted technique to construct an
estimator of conditional Kendall's tau. This study then provides a test
statistic for the null hypothesis that conditional Kendall's tau equals
zero. When two random variables are quasi-independent, conditional
Kendall's tau is zero; thus, it serves as a proxy for quasi-independence.
Tsai [12], and Martin and Betensky [10] considered the testing problem for
quasi-independence. Via simulation studies, we compare the three test
statistics for quasi-independence and examine the finite-sample
performance of the proposed estimator and the suggested test statistic.
Furthermore, we provide the large-sample properties of our proposed
estimator. Finally, we provide two real data examples for illustration.
Journal: Journal of Applied Statistics
Pages: 1602-1616
Issue: 7
Volume: 42
Year: 2015
Month: 7
X-DOI: 10.1080/02664763.2015.1004624
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1004624
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:7:p:1602-1616
Template-Type: ReDIF-Article 1.0
Author-Name: Ekele Alih
Author-X-Name-First: Ekele
Author-X-Name-Last: Alih
Author-Name: Hong Choon Ong
Author-X-Name-First: Hong Choon
Author-X-Name-Last: Ong
Title: An outlier-resistant test for heteroscedasticity in linear models
Abstract:
Contamination, often called outliers, is a very common attribute of data.
Among other effects, outliers in a homoscedastic model make the model
heteroscedastic. Moreover, outliers distort diagnostic tools for
heteroscedasticity, so that it may not be correctly identified. In this
article, we show how outliers affect heteroscedasticity diagnostics. We
then propose a robust procedure for detecting heteroscedasticity in the
presence of outliers by robustifying the non-robust component of the
Goldfeld-Quandt (GQ) test. The performance of the proposed procedure is
examined using simulation experiments and real data sets. The proposed
procedure offers great improvement where the conventional GQ and other
procedures fail.
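As context for what is being robustified, the classical GQ statistic is simply a ratio of residual sums of squares from OLS fits on the low and high ends of a sample sorted by a suspect regressor. A minimal sketch (not the authors' outlier-resistant procedure; the simulated data and the 20% middle-drop fraction are illustrative choices):

```python
import numpy as np

def goldfeld_quandt(x, y, drop_frac=0.2):
    """Classical Goldfeld-Quandt statistic: ratio of residual sums of squares
    from OLS fits on the high-x and low-x subsamples, after dropping a
    middle fraction. Values far above 1 suggest variance increasing with x."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(x)
    k = int(n * drop_frac / 2)
    lo, hi = slice(0, n // 2 - k), slice(n // 2 + k, n)

    def rss(xs, ys):
        # OLS fit of ys on an intercept and xs; return sum of squared residuals.
        X = np.column_stack([np.ones_like(xs), xs])
        beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
        resid = ys - X @ beta
        return float(resid @ resid)

    return rss(x[hi], y[hi]) / rss(x[lo], y[lo])

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 300)
y_homo = 2 + 0.5 * x + rng.normal(0, 1, 300)                      # constant variance
y_hetero = 2 + 0.5 * x + rng.normal(0, 1, 300) * (0.2 + 0.3 * x)  # variance grows with x
gq_homo = goldfeld_quandt(x, y_homo)
gq_hetero = goldfeld_quandt(x, y_hetero)
```

Under homoscedasticity the ratio hovers near 1; under the growing-variance model it is far larger, which is the signal the (non-robust) RSS component carries.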
Journal: Journal of Applied Statistics
Pages: 1617-1634
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1004623
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1004623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1617-1634
Template-Type: ReDIF-Article 1.0
Author-Name: Guanfu Liu
Author-X-Name-First: Guanfu
Author-X-Name-Last: Liu
Author-Name: Xiaolong Pu
Author-X-Name-First: Xiaolong
Author-X-Name-Last: Pu
Author-Name: Lei Wang
Author-X-Name-First: Lei
Author-X-Name-Last: Wang
Author-Name: Dongdong Xiang
Author-X-Name-First: Dongdong
Author-X-Name-Last: Xiang
Title: CUSUM chart for detecting range shifts when monotonicity of likelihood ratio is invalid
Abstract:
It is often noted in the literature that the log-likelihood ratios (LLR)
of some distributions (e.g. Student's t distribution)
are not monotonic. Existing charts for monitoring such processes may
suffer from the fact that the average run length (ARL) curve is a
discontinuous function of the control limit. This implies that some
pre-specified in-control (IC) ARLs of these charts may not be attainable.
To guarantee a false alarm rate lower than the nominal level, a larger IC
ARL is usually suggested in the literature. However, the large IC ARL may
weaken the performance of a control chart when the process is out of
control (OC), compared with a just-right IC ARL. To overcome this, we
adjust the LLR to be monotonic in this paper. Based on the adjusted LLR, a
multiple CUSUM chart is developed to detect range shifts in the IC
distribution. A theoretical result in this paper ensures the continuity of
its ARL curve. Numerical results show that our proposed chart performs
well under range shifts, especially large shifts. Finally, a real data
example illustrates the proposed chart.
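For orientation, the one-sided CUSUM recursion that such charts build on can be sketched as follows (this is the textbook mean-shift chart, not the authors' adjusted-LLR multiple CUSUM; the reference value k and control limit h are illustrative):

```python
# Minimal one-sided CUSUM for detecting an upward mean shift.
def cusum_upper(samples, target_mean, k, h):
    """Return the index of the first alarm, or None if no alarm is raised.

    k : reference value (allowance), h : control limit.
    The statistic accumulates excesses over target_mean + k, floored at 0.
    """
    s = 0.0
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - target_mean) - k)
        if s > h:
            return i
    return None

# Five in-control observations followed by a shifted segment.
data = [0.1, -0.2, 0.0, 0.3, -0.1] + [1.5, 1.8, 1.6, 1.7, 1.9]
alarm = cusum_upper(data, target_mean=0.0, k=0.5, h=3.0)  # alarms at index 7
```

The discontinuity issue raised in the abstract concerns how the ARL of such a recursion responds to h when the underlying LLR is not monotonic; the recursion itself is unchanged.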
Journal: Journal of Applied Statistics
Pages: 1635-1644
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1004625
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1004625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1635-1644
Template-Type: ReDIF-Article 1.0
Author-Name: Y. Ziane
Author-X-Name-First: Y.
Author-X-Name-Last: Ziane
Author-Name: S. Adjabi
Author-X-Name-First: S.
Author-X-Name-Last: Adjabi
Author-Name: N. Zougab
Author-X-Name-First: N.
Author-X-Name-Last: Zougab
Title: Adaptive Bayesian bandwidth selection in asymmetric kernel density estimation for nonnegative heavy-tailed data
Abstract:
In this paper, we consider an interesting problem on adaptive
Birnbaum-Saunders-power-exponential (BS-PE) kernel density estimation for
nonnegative heavy-tailed (HT) data. Treating the variable bandwidths of
the adaptive BS-PE kernel as parameters, we propose a conjugate prior and
estimate the bandwidths by using the popular quadratic and entropy loss
functions. Explicit formulas are obtained for the posterior and Bayes
estimators. Simulations comparing the method with the global unbiased
cross-validation bandwidth selection technique were conducted under four
HT distributions. Finally, two applications based on HT real data are
presented and analyzed.
Journal: Journal of Applied Statistics
Pages: 1645-1658
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1004626
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1004626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1645-1658
Template-Type: ReDIF-Article 1.0
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Author-Name: Camila Borelli Zeller
Author-X-Name-First: Camila Borelli
Author-X-Name-Last: Zeller
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Title: The sinh-normal/independent nonlinear regression model
Abstract:
The normal/independent family of distributions is an attractive class of
symmetric heavy-tailed density functions. These distributions have a nice
hierarchical representation that makes inference easy. We propose the
Sinh-normal/independent distribution, which extends the Sinh-normal (SN)
distribution [23]. We discuss some of its properties and propose the
Sinh-normal/independent nonlinear regression model, based on a setup
similar to that of Lemonte and Cordeiro [18], who applied the
Birnbaum-Saunders distribution. We develop an EM algorithm for maximum
likelihood estimation of the model parameters. To examine the robustness
of this flexible class against outlying observations, we perform a
simulation study, and we analyze a real data set to illustrate the
usefulness of the new model.
Journal: Journal of Applied Statistics
Pages: 1659-1676
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005059
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005059
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1659-1676
Template-Type: ReDIF-Article 1.0
Author-Name: Meesun Sun
Author-X-Name-First: Meesun
Author-X-Name-Last: Sun
Author-Name: Kwanghyun Choi
Author-X-Name-First: Kwanghyun
Author-X-Name-Last: Choi
Author-Name: Sungzoon Cho
Author-X-Name-First: Sungzoon
Author-X-Name-Last: Cho
Title: Estimating the minority class proportion with the ROC curve using Military Personality Inventory data of the ROK Armed Forces
Abstract:
The Republic of Korea Armed Forces includes maladjusted conscripts such as
the mentally ill, the suicidal, the imprisoned, and those determined by
the military commander to be maladjusted. To counteract these problems, it
is necessary to identify the maladjusted conscripts to determine who among
them would qualify for exemption from active military service or need
special attention. We use the Military Personality Inventory (MPI) to make
this prediction. Such a prediction presents a kind of class imbalance and
class overlap problem, where the majority fulfil active service and the
minority are maladjusted, the latter being discharged early from active
service. Therefore, most classification algorithms are likely to show low
classification performance. As an alternative, this study demonstrates the
effective utilization of the receiver operating characteristic (ROC) curve
using MPI data to estimate the maladjusted proportion of persons sharing
similar MPI test results. We confirm that the suggested method performs
well using the real-world MPI data set. The suggested method is very
useful to estimate the proportion of conscripts maladjusted to military
life and can help in the management of such persons subject to
conscription.
Journal: Journal of Applied Statistics
Pages: 1677-1689
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005060
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005060
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1677-1689
Template-Type: ReDIF-Article 1.0
Author-Name: Deidra A. Coleman
Author-X-Name-First: Deidra A.
Author-X-Name-Last: Coleman
Author-Name: Donald E.K. Martin
Author-X-Name-First: Donald E.K.
Author-X-Name-Last: Martin
Author-Name: Brian J. Reich
Author-X-Name-First: Brian J.
Author-X-Name-Last: Reich
Title: Multiple window discrete scan statistic for higher-order Markovian sequences
Abstract:
Accurate and efficient methods to detect unusual clusters of abnormal
activity are needed in many fields such as medicine and business. Often
the size of clusters is unknown; hence, multiple (variable) window scan
statistics are used to identify clusters using a set of different
potential cluster sizes. We give an efficient method to compute the exact
distribution of multiple window discrete scan statistics for higher-order,
multi-state Markovian sequences. We define a Markov chain to efficiently
keep track of probabilities needed to compute p-values
for the statistic. The state space of the Markov chain is set up by a
criterion developed to identify strings that are associated with observing
the specified values of the statistic. Using our algorithm, we identify
cases where the available approximations do not perform well. We
demonstrate our methods by detecting unusual clusters of made free throw
shots by National Basketball Association players during the 2009-2010
regular season.
Journal: Journal of Applied Statistics
Pages: 1690-1705
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005061
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005061
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1690-1705
Template-Type: ReDIF-Article 1.0
Author-Name: Tsung-Shan Tsou
Author-X-Name-First: Tsung-Shan
Author-X-Name-Last: Tsou
Author-Name: Hsiao-Yun Liu
Author-X-Name-First: Hsiao-Yun
Author-X-Name-Last: Liu
Title: Testing the homogeneity of proportions for clustered binary data without knowing the correlation structure
Abstract:
A robust generalized score test for comparing groups of clustered binary
data is proposed. This novel test is asymptotically valid for practically
any underlying correlation configurations including the situation when
correlation coefficients vary within or between clusters. This structure
generally undermines the validity of the typical large sample properties
of the method of maximum likelihood. Simulations and real data analysis
are used to demonstrate the merit of this parametric robust method.
Results show that our test is superior to two recently proposed test
statistics advocated by other researchers.
Journal: Journal of Applied Statistics
Pages: 1706-1715
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005062
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005062
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1706-1715
Template-Type: ReDIF-Article 1.0
Author-Name: B. Baris Alkan
Author-X-Name-First: B. Baris
Author-X-Name-Last: Alkan
Author-Name: Cemal Atakan
Author-X-Name-First: Cemal
Author-X-Name-Last: Atakan
Author-Name: Nesrin Alkan
Author-X-Name-First: Nesrin
Author-X-Name-Last: Alkan
Title: A comparison of different procedures for principal component analysis in the presence of outliers
Abstract:
Principal component analysis (PCA) is a popular technique that is useful
for dimensionality reduction but it is affected by the presence of
outliers. The outlier sensitivity of classical PCA (CPCA) has caused the
development of new approaches. Effects of using estimates obtained by
expectation-maximization - EM and multiple imputation - MI instead of
outliers were examined on the artificial and a real data set. Furthermore,
robust PCA based on minimum covariance determinant (MCD), PCA based on
estimates obtained by EM instead of outliers and PCA based on estimates
obtained by MI instead of outliers were compared with the results of CPCA.
In this study, we tried to show the effects of using estimates obtained by
MI and EM instead of outliers, depending on the ratio of outliers in data
set. Finally, when the ratio of outliers exceeds 20%, we suggest the use
of estimates obtained by MI and EM instead of outliers as an alternative
approach.
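The outlier sensitivity of CPCA that motivates these comparisons is easy to demonstrate: a few extreme points can rotate the leading principal component. A minimal sketch (plain eigendecomposition PCA only, not the MCD, EM, or MI procedures compared in the paper; the data are illustrative):

```python
import numpy as np

def pca_first_component(X):
    """First principal component (unit vector) via covariance eigendecomposition."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    return vecs[:, np.argmax(vals)]

rng = np.random.default_rng(0)
# Clean data: strongly correlated along the (1, 1) direction.
z = rng.normal(size=200)
clean = np.column_stack([z, z + 0.1 * rng.normal(size=200)])
# Contaminate with a few extreme points lying along the orthogonal direction.
outliers = np.array([[-15.0, 15.0], [-18.0, 18.0], [-20.0, 20.0]])
contaminated = np.vstack([clean, outliers])

v_clean = pca_first_component(clean)
v_cont = pca_first_component(contaminated)
# |cosine| between the two leading components; near 1 means the same direction.
agreement = abs(float(v_clean @ v_cont))
```

Three points out of 203 are enough to swing the leading component from the (1, 1) direction toward (1, -1), which is why the paper replaces or downweights such points before extracting components.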
Journal: Journal of Applied Statistics
Pages: 1716-1722
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005063
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005063
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1716-1722
Template-Type: ReDIF-Article 1.0
Author-Name: Waleed Dhhan
Author-X-Name-First: Waleed
Author-X-Name-Last: Dhhan
Author-Name: Sohel Rana
Author-X-Name-First: Sohel
Author-X-Name-Last: Rana
Author-Name: Habshah Midi
Author-X-Name-First: Habshah
Author-X-Name-Last: Midi
Title: Non-sparse ϵ-insensitive support vector regression for outlier detection
Abstract:
To estimate the approximate relationship between the dependent variable
and its independent variables, it is necessary to diagnose outliers,
commonly present in numerous real applications, before constructing the
model. The techniques of standard ϵ-insensitive support vector regression
(ϵ-SVR) and modified support vector regression achieve good performance in
outlier detection for nonlinear functions with high-dimensional inputs.
However, they still suffer from the costs of time and the setting of
parameters. In this study, we propose a practical method for detecting
outliers using non-sparse ϵ-SVR, which minimizes time cost and uses fixed
parameters. We apply this approach to real and simulated data sets to test
its effectiveness.
Journal: Journal of Applied Statistics
Pages: 1723-1739
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005064
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005064
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1723-1739
Template-Type: ReDIF-Article 1.0
Author-Name: Aaron Anderson
Author-X-Name-First: Aaron
Author-X-Name-Last: Anderson
Title: A Monte Carlo comparison of alternative methods of maximum likelihood ranking in racing sports
Abstract:
Applications of maximum likelihood techniques to rank competitors in
sports are commonly based on the assumption that each competitor's
performance is a function of a deterministic component that represents
inherent ability and a stochastic component that the competitor has
limited control over. Perhaps based on an appeal to the central limit
theorem, the stochastic component of performance has often been assumed to
be a normal random variable. However, in the context of a racing sport,
this assumption is problematic because the resulting model is the
computationally difficult rank-ordered probit. Although a rank-ordered
logit is a viable alternative, a Thurstonian paired-comparison model could
also be applied. The purpose of this analysis was to compare the
performance of the rank-ordered logit and Thurstonian paired-comparison
models given the objective of ranking competitors based on ability. Monte
Carlo simulations were used to generate race results based on a known
ranking of competitors, assign rankings from the results of the two
models, and judge performance based on Spearman's rank correlation
coefficient. Results suggest that in many applications, a Thurstonian
model can outperform a rank-ordered logit if each competitor's performance
is normally distributed.
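The simulation design described above can be sketched in miniature: generate races from known abilities plus normal noise, rank competitors, and score the recovered ranking with Spearman's coefficient. (A toy average-finishing-position ranking stands in here for the maximum likelihood models actually compared in the paper; competitor count, ability gaps, and noise scale are illustrative.)

```python
import random

def spearman(rank_a, rank_b):
    """Spearman's rank correlation between two rankings (no ties)."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

random.seed(42)
n_competitors, n_races = 10, 200
ability = [i * 0.5 for i in range(n_competitors)]   # true skill, increasing in i

# Accumulate finishing positions over simulated races (higher performance wins).
totals = [0.0] * n_competitors
for _ in range(n_races):
    perf = [a + random.gauss(0.0, 1.0) for a in ability]
    order = sorted(range(n_competitors), key=lambda i: perf[i], reverse=True)
    for place, i in enumerate(order):
        totals[i] += place

# Rank competitors by average finishing position (best average -> rank 0).
est_order = sorted(range(n_competitors), key=lambda i: totals[i])
est_rank = [0] * n_competitors
for r, i in enumerate(est_order):
    est_rank[i] = r
true_rank = list(range(n_competitors - 1, -1, -1))  # best ability -> rank 0
rho = spearman(true_rank, est_rank)                 # close to 1 when recovered
```

In the paper, the estimated ranking comes from the rank-ordered logit or Thurstonian model rather than average place, but the evaluation loop — known ranking in, Spearman's rho out — is the same.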
Journal: Journal of Applied Statistics
Pages: 1740-1756
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005065
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005065
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1740-1756
Template-Type: ReDIF-Article 1.0
Author-Name: Mario Hasler
Author-X-Name-First: Mario
Author-X-Name-Last: Hasler
Title: Comment on multiple comparisons with a control under heteroscedasticity
Journal: Journal of Applied Statistics
Pages: 1757-1758
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005582
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005582
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1757-1758
Template-Type: ReDIF-Article 1.0
Author-Name: Javier Fernández-Macho
Author-X-Name-First: Javier
Author-X-Name-Last: Fernández-Macho
Title: Comment on testing for spurious and cointegrated regressions: a wavelet approach
Abstract:
In a recent paper, Leong and Huang [6] proposed a
wavelet-correlation-based approach to test for cointegration between two
time series. However, correlation and cointegration are two different
concepts even when wavelet analysis is used. It is known that statistics
based on non-stationary integrated variables have non-standard asymptotic
distributions. However, wavelet analysis offsets the integrating order of
non-stationary series so that traditional asymptotics on stationary
variables suffices to ascertain the statistical properties of
wavelet-based statistics. On this basis, this note shows that wavelet
correlations cannot be used as a test of cointegration.
Journal: Journal of Applied Statistics
Pages: 1759-1769
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1005583
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1005583
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1759-1769
Template-Type: ReDIF-Article 1.0
Author-Name: Chee Kian Leong
Author-X-Name-First: Chee Kian
Author-X-Name-Last: Leong
Title: Response to the comment on testing for spurious and cointegrated regressions: a wavelet approach
Journal: Journal of Applied Statistics
Pages: 1770-1772
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1020006
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1020006
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1770-1772
Template-Type: ReDIF-Article 1.0
Author-Name: Oguz Akbilgic
Author-X-Name-First: Oguz
Author-X-Name-Last: Akbilgic
Title: Classification trees aided mixed regression model
Abstract:
This paper introduces a novel hybrid regression method (MixReg) combining
two linear regression methods: ordinary least squares (OLS) and least
squares ratio (LSR) regression. LSR regression finds the regression
coefficients minimizing the sum of squared error rates, while OLS
minimizes the sum of squared errors itself. The goal of this study is to
combine the two methods so that the proposed method is superior to both
OLS and LSR regression in terms of the R-super-2
statistic and the relative error rate. Applications of MixReg to both
simulated and real data show that MixReg outperforms both OLS and LSR
regression.
Journal: Journal of Applied Statistics
Pages: 1773-1781
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1006394
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1006394
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1773-1781
Template-Type: ReDIF-Article 1.0
Author-Name: Ehab F. Abd-Elfattah
Author-X-Name-First: Ehab F.
Author-X-Name-Last: Abd-Elfattah
Title: Saddlepoint p-values for a class of tests for comparing competing risks with censored data
Abstract:
One of the general problems in clinical trials and mortality rates is the
comparison of competing risks. Most of the test statistics used for
independent and dependent risks with censored data belong to the class of
weighted linear rank tests in its multivariate version. In this paper, we
introduce the saddlepoint approximations as accurate and fast
approximations for the exact p-values of this class of
tests instead of the asymptotic and permutation simulated calculations.
Real data examples and extensive simulation studies show the accuracy and
stability of the saddlepoint approximations across different scenarios of
lifetime distributions, sample sizes and censoring.
Journal: Journal of Applied Statistics
Pages: 1782-1791
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1006590
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1006590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1782-1791
Template-Type: ReDIF-Article 1.0
Author-Name: S.P. Singh
Author-X-Name-First: S.P.
Author-X-Name-Last: Singh
Author-Name: S. Mukhopadhyay
Author-X-Name-First: S.
Author-X-Name-Last: Mukhopadhyay
Author-Name: A. Roy
Author-X-Name-First: A.
Author-X-Name-Last: Roy
Title: Comparison of three-level cluster randomized trials using quantile dispersion graphs
Abstract:
The purpose of this article is to evaluate and compare several three-level
cluster randomized designs on the basis of their power functions. The
power function of cluster designs depends on the intracluster correlations
(ICCs), which are generally unknown at the planning stage. Thus, to
compare these designs a prior knowledge of the ICCs is required. Three
interval estimation methods are proposed for assigning joint confidence
intervals to the two ICCs (corresponding to each cluster level). A
detailed simulation study comparing the confidence intervals attained by
the different techniques is given. The technique of quantile dispersion
graphs is used for comparing the three-level cluster designs. For a given
design, quantiles of the power function are obtained for various effect
sizes. These quantiles are functions of the unknown ICC coefficients. To
address the dependence of the quantiles on the correlations, a confidence
region is computed and used as a parameter space. A three-level nested
data set collected by the University of Michigan to study various school
reforms on the achievements of students is used to illustrate the proposed
methodology.
Journal: Journal of Applied Statistics
Pages: 1792-1812
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1010491
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1010491
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1792-1812
Template-Type: ReDIF-Article 1.0
Author-Name: Sanghoo Yoon
Author-X-Name-First: Sanghoo
Author-X-Name-Last: Yoon
Author-Name: Bungon Kumphon
Author-X-Name-First: Bungon
Author-X-Name-Last: Kumphon
Author-Name: Jeong-Soo Park
Author-X-Name-First: Jeong-Soo
Author-X-Name-Last: Park
Title: Spatial modeling of extreme rainfall in northeast Thailand
Abstract:
It is well recognized that the generalized extreme value (GEV)
distribution is widely used for extreme events. This notion originates in
the study of discrete choice behavior; however, fitting the distribution
site by site is of limited use for prediction at ungauged sites. Hence,
there have been studies of spatial dependence within extreme events in
continuous space using recorded observations. We model annual maximum
daily rainfall data from 25 locations for the period 1982 to 2013. The
spatial GEV model is established under the assumption that observations
are mutually independent, i.e. that there is no spatial dependency between
the stations. Furthermore, we divide the region into two subregions for a
better model fit and identify the best model for each subregion. We show
that the regional spatial GEV model reflects the spatial pattern better
than a spatial GEV model over the entire region or the local GEV
distribution. The advantage of spatial extreme modeling is that more
robust return levels and indices of extreme rainfall can be obtained both
for observed stations and for locations without observed data. Thus, the
model helps to determine the effects and assess the vulnerability due to
heavy rainfall in northeast Thailand.
Journal: Journal of Applied Statistics
Pages: 1813-1828
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1010492
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1010492
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1813-1828
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Lang Chen
Author-X-Name-First: Chien-Lang
Author-X-Name-Last: Chen
Author-Name: Shang-Ling Ou
Author-X-Name-First: Shang-Ling
Author-X-Name-Last: Ou
Author-Name: Chen-Tuo Liao
Author-X-Name-First: Chen-Tuo
Author-X-Name-Last: Liao
Title: Interval estimation for conformance proportions of multiple quality characteristics
Abstract:
A conformance proportion is an important and useful index to assess
industrial quality improvement. Statistical confidence limits for a
conformance proportion are usually required not only to perform
statistical significance tests, but also to provide useful information for
determining practical significance. In this article, we propose approaches
for constructing statistical confidence limits for a conformance
proportion of multiple quality characteristics. Under the assumption that
the variables of interest are distributed with a multivariate normal
distribution, we develop an approach based on the concept of a fiducial
generalized pivotal quantity (FGPQ). Without any distribution assumption
on the variables, we apply some confidence interval construction methods
for the conformance proportion by treating it as the probability of a
success in a binomial distribution. The performance of the proposed
methods is evaluated through detailed simulation studies. The results
reveal that the simulated coverage probability (cp) for the FGPQ-based
method is generally larger than the claimed value. On the other hand, one
of the binomial distribution-based methods, that is, the standard method
suggested in classical textbooks, appears to have smaller simulated cps
than the nominal level. Two alternatives to the standard method are found
to maintain their simulated cps sufficiently close to the claimed level,
and hence their performances are judged to be satisfactory. In addition,
three examples are given to illustrate the application of the proposed
methods.
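The under-coverage of the standard textbook interval and the better behavior of alternatives can be checked directly by simulation. A minimal sketch comparing the Wald interval with the Wilson score interval for a high conformance proportion (the Wilson interval is one standard alternative; whether it coincides with the two alternatives evaluated in the paper is an assumption):

```python
import math
import random

def wald_ci(k, n, z=1.96):
    """Standard textbook (Wald) interval for a binomial proportion."""
    p = k / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_ci(k, n, z=1.96):
    """Wilson score interval, a common better-covering alternative."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

random.seed(7)
p_true, n, reps = 0.95, 50, 2000
cover = {"wald": 0, "wilson": 0}
for _ in range(reps):
    k = sum(random.random() < p_true for _ in range(n))
    lo, hi = wald_ci(k, n)
    cover["wald"] += lo <= p_true <= hi
    lo, hi = wilson_ci(k, n)
    cover["wilson"] += lo <= p_true <= hi
wald_cp = cover["wald"] / reps      # noticeably below the nominal 0.95
wilson_cp = cover["wilson"] / reps  # much closer to the nominal level
```

With a true proportion of 0.95 and n = 50, the Wald interval degenerates to a point whenever all 50 units conform, which alone costs it several percent of coverage — the same undercoverage pattern the simulation study reports for the standard method.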
Journal: Journal of Applied Statistics
Pages: 1829-1841
Issue: 8
Volume: 42
Year: 2015
Month: 8
X-DOI: 10.1080/02664763.2015.1010493
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1010493
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:8:p:1829-1841
Template-Type: ReDIF-Article 1.0
Author-Name: Yuan Liu
Author-X-Name-First: Yuan
Author-X-Name-Last: Liu
Author-Name: Hongyun Liu
Author-X-Name-First: Hongyun
Author-X-Name-Last: Liu
Author-Name: Hang Li
Author-X-Name-First: Hang
Author-X-Name-Last: Li
Author-Name: Qian Zhao
Author-X-Name-First: Qian
Author-X-Name-Last: Zhao
Title: The effects of individually varying times of observations on growth parameter estimations in piecewise growth model
Abstract:
When using latent growth modeling (LGM), researchers often restrict the
factor loadings, whereas multilevel modeling (MLM) treats time as a metric
variable. However, when longitudinal studies involve individually varying
times of observation, the use of specified loadings leads to inaccurate
estimation. Based on piecewise growth modeling (PGM), this simulation
study showed that (i) individually varying times of observation with
larger boundaries yielded worse estimates and model fits when LGM was
used; (ii) estimation of the PGM within MLM was robust across all the
simulation conditions, whereas LGM matched MLM's estimates only when the
time boundaries were ±1 month or shorter; and (iii) a larger change of
slope in the piecewise model indicated better estimation.
Journal: Journal of Applied Statistics
Pages: 1843-1860
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014884
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014884
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1843-1860
Template-Type: ReDIF-Article 1.0
Author-Name: Y. Xia
Author-X-Name-First: Y.
Author-X-Name-Last: Xia
Author-Name: N. Lu
Author-X-Name-First: N.
Author-X-Name-Last: Lu
Author-Name: I. Katz
Author-X-Name-First: I.
Author-X-Name-Last: Katz
Author-Name: R. Bossarte
Author-X-Name-First: R.
Author-X-Name-Last: Bossarte
Author-Name: J. Arora
Author-X-Name-First: J.
Author-X-Name-Last: Arora
Author-Name: H. He
Author-X-Name-First: H.
Author-X-Name-Last: He
Author-Name: J.X. Tu
Author-X-Name-First: J.X.
Author-X-Name-Last: Tu
Author-Name: B. Stephens
Author-X-Name-First: B.
Author-X-Name-Last: Stephens
Author-Name: A. Watts
Author-X-Name-First: A.
Author-X-Name-Last: Watts
Author-Name: X.M. Tu
Author-X-Name-First: X.M.
Author-X-Name-Last: Tu
Title: Models for surveillance data under reporting delay: applications to US veteran first-time suicide attempters
Abstract:
Surveillance data provide a vital source of information for assessing the
spread of a health problem or disease of interest and for planning for
future health-care needs. However, the use of surveillance data requires
proper adjustments of the reported caseload due to underreporting caused
by reporting delays within a limited observation period. Although methods
are available to address this classic statistical problem, they are
largely focused on inference for the reporting delay distribution, with
inference about caseload of disease incidence based on estimates for the
delay distribution. This approach limits the complexity of the models
for disease incidence that can provide reliable estimates and
projections of incidence. Also, many of the available methods lack
robustness, since they
require parametric distribution assumptions. We propose a new approach to
overcome such limitations by allowing for separate models for the
incidence and the reporting delay in a distribution-free fashion, but with
joint inference for both modeling components, based on functional response
models. In addition, we discuss inference about projections of future
disease incidence to help identify significant shifts in temporal trends
modeled based on the observed data. This latter issue on detecting 'change
points' is not sufficiently addressed in the literature, despite the fact
that such warning signs of potential outbreak are critically important for
prevention purposes. We illustrate the approach with both simulated and
real data, with the latter involving data on suicide attempts from the
Veterans Health Administration.
Journal: Journal of Applied Statistics
Pages: 1861-1876
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014885
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014885
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1861-1876
Template-Type: ReDIF-Article 1.0
Author-Name: Hea-Jung Kim
Author-X-Name-First: Hea-Jung
Author-X-Name-Last: Kim
Title: Segmented classification analysis with a class of rectangle-screened elliptical populations
Abstract:
In many practical situations, a statistical practitioner often faces a
problem of classifying an object from one of the segmented (or screened)
populations where the segmentation was conducted by a set of screening
variables. This paper addresses this problem, proposing and studying yet
another optimal rule for classification with segmented populations. A
class of q-dimensional rectangle-screened elliptically
contoured (RSEC) distributions is considered for flexibly modeling the
segmented populations. Based on the properties of the RSEC distributions,
a parametric procedure for the segmented classification analysis (SCA) is
proposed. This includes motivation for the SCA as well as some theoretical
propositions regarding its optimal rule and properties. These properties
allow us to establish other important results which include an efficient
estimation of the rule by the Monte Carlo expectation-conditional
maximization algorithm and an optimal variable selection procedure. Two
numerical examples, utilizing a simulation study and a real dataset
application, are also provided to advocate the SCA procedure.
Journal: Journal of Applied Statistics
Pages: 1877-1895
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014886
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014886
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1877-1895
Template-Type: ReDIF-Article 1.0
Author-Name: A. Spagnoli
Author-X-Name-First: A.
Author-X-Name-Last: Spagnoli
Author-Name: J.J. Houwing-Duistermaat
Author-X-Name-First: J.J.
Author-X-Name-Last: Houwing-Duistermaat
Author-Name: M. Alfò
Author-X-Name-First: M.
Author-X-Name-Last: Alfò
Title: Mixed-effect models for longitudinal responses with different types of dropout: an application to the Leiden 85-plus study
Abstract:
Longitudinal studies on cognitive functioning in geriatric populations
usually cover short follow-up times and may be influenced by different
sources of selection: only a portion of the designed sample may agree to
participate in the study, and only some of the participants may complete
the study. Motivated by a real-life data example, we discuss a variance
component model with two peculiar features. First, we account for
differences in individual status when entering the study by defining a
flexible association structure between baseline and subsequent responses,
where individual characteristics influencing entrance and participation in
the follow-up are jointly modelled. Second, since death and
non-participation arguably cannot be treated as equivalent reasons for
dropout, we introduce a pattern-mixture model that takes into account the
information on the time spent in the study and the reasons for dropout.
The model is applied to data on cognitive functioning from the Leiden
85-plus study, and its performance is analysed through a large-scale
simulation study.
Journal: Journal of Applied Statistics
Pages: 1896-1910
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014887
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014887
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1896-1910
Template-Type: ReDIF-Article 1.0
Author-Name: Rabindra Nath Das
Author-X-Name-First: Rabindra Nath
Author-X-Name-Last: Das
Author-Name: Jinseog Kim
Author-X-Name-First: Jinseog
Author-X-Name-Last: Kim
Author-Name: Youngjo Lee
Author-X-Name-First: Youngjo
Author-X-Name-Last: Lee
Title: Robust first-order rotatable lifetime improvement experimental designs
Abstract:
Experimental designs are widely used in predicting the optimal operating
conditions of the process parameters in lifetime improvement experiments.
The most commonly observed lifetime distributions are log-normal,
exponential, gamma and Weibull. In the present article,
invariant robust first-order rotatable designs are
derived for autocorrelated lifetime responses having log-normal,
exponential, gamma and Weibull distributions. In the process, robust
first-order D-optimal and rotatable conditions have been
derived under these situations. For these lifetime distributions with
correlated errors, it is shown that robust first-order
D-optimal designs are always robust rotatable but the
converse is not true. Moreover, it is observed that robust first-order
D-optimal and rotatable designs depend on the respective error
variance-covariance structure but are independent of the considered
lifetime response distributions.
Journal: Journal of Applied Statistics
Pages: 1911-1930
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014888
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014888
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1911-1930
Template-Type: ReDIF-Article 1.0
Author-Name: Amal Saki Malehi
Author-X-Name-First: Amal
Author-X-Name-Last: Saki Malehi
Author-Name: Ebrahim Hajizadeh
Author-X-Name-First: Ebrahim
Author-X-Name-Last: Hajizadeh
Author-Name: Kambiz A. Ahmadi
Author-X-Name-First: Kambiz A.
Author-X-Name-Last: Ahmadi
Author-Name: Parvin Mansouri
Author-X-Name-First: Parvin
Author-X-Name-Last: Mansouri
Title: Joint modelling of longitudinal biomarker and gap time between recurrent events: copula-based dependence
Abstract:
In this paper, we extend the joint model of a longitudinal biomarker and
recurrent events via a copula function to account for the dependence
between the two processes. The general idea is to join the separate
processes by allowing the model-specific random effects to come from
different families of distributions. A main advantage of the proposed
method is that the copula construction does not constrain the choice of
marginal distributions for the random effects. Maximum likelihood
estimation with an importance sampling technique, a simple and easily
understood method, is employed for model inference. To evaluate and
verify the validity of the proposed joint model, a bootstrapping method
based on model-based resampling is developed. The proposed joint model
is applied to pemphigus disease data to assess the effect of the
biomarker trajectory on the risk of recurrence.
Journal: Journal of Applied Statistics
Pages: 1931-1945
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014889
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014889
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1931-1945
Template-Type: ReDIF-Article 1.0
Author-Name: Seokho Lee
Author-X-Name-First: Seokho
Author-X-Name-Last: Lee
Author-Name: Johan Lim
Author-X-Name-First: Johan
Author-X-Name-Last: Lim
Author-Name: Insuk Sohn
Author-X-Name-First: Insuk
Author-X-Name-Last: Sohn
Author-Name: Sin-Ho Jung
Author-X-Name-First: Sin-Ho
Author-X-Name-Last: Jung
Author-Name: Cheol-Keun Park
Author-X-Name-First: Cheol-Keun
Author-X-Name-Last: Park
Title: Two sample test for high-dimensional partially paired data
Abstract:
In this paper, we study the two-sample test for the equality of mean
vectors of high-dimensional partially paired data. Extending the results
of Lim et al. [12], we propose a new type of regularized statistic,
denoted by RT, which is a convex combination of the regularized
Hotelling's t-statistic (HT) for two independent multivariate samples
and that for multivariate paired samples. The proposed RT involves the
shrinkage estimator of the covariance matrix and, depending on the
choice of the shrinkage estimator, two versions of the RT are proposed.
We compute the asymptotic null distribution of one version of the RT for
a fixed tuning parameter of the covariance matrix estimation. A
procedure to estimate the tuning parameter is proposed and discussed.
The power of the proposed test is compared to two existing ad hoc
procedures: the HT based on a few principal components (PCs) from a PC
analysis, and the HT with the generalized inverse of the sample
covariance matrix. It is also compared to the tests using only the two
independent samples or only the paired samples. Finally, we illustrate
the advantage of the RT using a microarray experiment on liver cancer.
Journal: Journal of Applied Statistics
Pages: 1946-1961
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014890
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014890
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1946-1961
Template-Type: ReDIF-Article 1.0
Author-Name: A. Nodehi
Author-X-Name-First: A.
Author-X-Name-Last: Nodehi
Author-Name: M. Golalizadeh
Author-X-Name-First: M.
Author-X-Name-Last: Golalizadeh
Author-Name: A. Heydari
Author-X-Name-First: A.
Author-X-Name-Last: Heydari
Title: Dihedral angles principal geodesic analysis using nonlinear statistics
Abstract:
Statistics, as an applied science, has a great impact on a vast range of
other sciences. The prediction of protein structures, with emphasis on
their geometrical features through dihedral angles, has invoked a branch
of statistics known as directional statistics. One of the available
biological techniques for prediction is molecular dynamics simulation,
which produces high-dimensional molecular structure data. Hence, one
might expect principal component analysis (PCA) to address some of the
related statistical problems, particularly reducing the dimension of the
involved variables. Since the dihedral angles are variables on a
non-Euclidean space (their locus is the torus), direct implementation of
PCA is not expected to provide much information in this case. Principal
geodesic analysis is one of the recent methods for reducing dimension in
the non-Euclidean case. A procedure for utilizing this technique to
reduce the dimension of a set of dihedral angles is highlighted in this
paper. We further propose an extension of this tool, implemented in such
a way that the torus is approximated by the product of two unit circles,
and evaluate its application in studying a real data set. A comparison
of this technique with some previous methods is also undertaken.
Journal: Journal of Applied Statistics
Pages: 1962-1972
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014892
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014892
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1962-1972
Template-Type: ReDIF-Article 1.0
Author-Name: Hadi Alizadeh Noughabi
Author-X-Name-First: Hadi
Author-X-Name-Last: Alizadeh Noughabi
Title: Empirical likelihood ratio-based goodness-of-fit test for the logistic distribution
Abstract:
The logistic distribution has been used to model growth curves in survival
analysis and biological studies. In this article, we propose a
goodness-of-fit test for the logistic distribution based on the empirical
likelihood ratio. The test is constructed based on the methodology
introduced by Vexler and Gurevich [17]. In order to compute the test
statistic, parameters of the distribution are estimated by the method of
maximum likelihood. Power comparisons of the proposed test with some known
competing tests are carried out via simulations. Finally, an illustrative
example is presented and analyzed.
Journal: Journal of Applied Statistics
Pages: 1973-1983
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014893
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014893
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1973-1983
Template-Type: ReDIF-Article 1.0
Author-Name: A.R. Silva
Author-X-Name-First: A.R.
Author-X-Name-Last: Silva
Author-Name: C.T.S. Dias
Author-X-Name-First: C.T.S.
Author-X-Name-Last: Dias
Author-Name: P.R. Cecon
Author-X-Name-First: P.R.
Author-X-Name-Last: Cecon
Author-Name: E.R. Rêgo
Author-X-Name-First: E.R.
Author-X-Name-Last: Rêgo
Title: An alternative procedure for performing a power analysis of Mantel's test
Abstract:
This study proposes a simple way to perform a power analysis of Mantel's
test applied to squared Euclidean distance matrices. The general
statistical aspects of the simple Mantel's test are reviewed. The Monte
Carlo method is used to generate bivariate Gaussian variables in order to
create squared Euclidean distance matrices. The power of the parametric
correlation t-test applied to raw data is also evaluated
and compared with that of Mantel's test. The standard procedure for
calculating pointwise power levels is used for validation. The proposed
procedure allows one to draw the power curve by running the test only
once, dispensing with the time-demanding standard procedure of Monte
Carlo simulations. Unlike the standard procedure, it does not depend on
knowledge of the distribution of the raw data. The simulated
power function has all the properties of the power analysis
theory and is in agreement with the results of the standard procedure.
Journal: Journal of Applied Statistics
Pages: 1984-1992
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014894
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014894
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1984-1992
Template-Type: ReDIF-Article 1.0
Author-Name: Firoozeh Rivaz
Author-X-Name-First: Firoozeh
Author-X-Name-Last: Rivaz
Author-Name: Majid Jafari Khaledi
Author-X-Name-First: Majid Jafari
Author-X-Name-Last: Khaledi
Title: Bayesian spatial prediction of skew and censored data via a hybrid algorithm
Abstract:
Correct detection of areas with excess pollution relies first on
accurate predictions of pollutant concentrations, a task that is usually
complicated by skewed histograms and the presence of censored data. The
unified skew-Gaussian (SUG) random field proposed by Zareifard and Jafari
Khaledi [19] offers a more flexible class of sampling spatial models to
account for skewness. In this paper, we adopt a Bayesian framework to
perform prediction for the SUG model in the presence of censored data.
Owing to the presence of many latent variables with strongly dependent
components in the model, we encounter convergence issues when using
Markov chain Monte Carlo algorithms. To overcome this obstacle, we use a
computationally efficient inverse Bayes formulas sampling procedure to
obtain approximately independent samples from the posterior distribution
of latent variables. Then they are applied to update parameters in a Gibbs
sampler scheme. This hybrid algorithm provides effective samples,
resulting in some computational advantages and precise predictions. The
proposed approach is illustrated with a simulation study and applied to a
spatial data set which contains right censored data.
Journal: Journal of Applied Statistics
Pages: 1993-2009
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1014895
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1014895
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:1993-2009
Template-Type: ReDIF-Article 1.0
Author-Name: Eufrásio de A. Lima Neto
Author-X-Name-First: Eufrásio de A.
Author-X-Name-Last: Lima Neto
Author-Name: Ulisses U. dos Anjos
Author-X-Name-First: Ulisses U.
Author-X-Name-Last: dos Anjos
Title: Regression model for interval-valued variables based on copulas
Abstract:
In real problems, it is usual for the available data to be presented as
intervals. Therefore, different approaches have been proposed to obtain
a regression model for this new type of data. In this paper, we
represent the interval-valued response variable as a bivariate random
vector Z and consider copula theory to propose a general bivariate
distribution for Z, creating a more flexible random component for the
model. Inference techniques and a residual
definition based on deviance are considered, as well as applications to
synthetic and real data sets that demonstrate the usefulness of the
proposed approach. The new method is also compared with other methods
reported in the literature.
Journal: Journal of Applied Statistics
Pages: 2010-2029
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1015114
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1015114
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:2010-2029
Template-Type: ReDIF-Article 1.0
Author-Name: Marcin Kozak
Author-X-Name-First: Marcin
Author-X-Name-Last: Kozak
Author-Name: Wojtek Krzanowski
Author-X-Name-First: Wojtek
Author-X-Name-Last: Krzanowski
Author-Name: Izabela Cichocka
Author-X-Name-First: Izabela
Author-X-Name-Last: Cichocka
Author-Name: James Hartley
Author-X-Name-First: James
Author-X-Name-Last: Hartley
Title: The effects of data input errors on subsequent statistical inference
Abstract:
Data input errors can potentially affect statistical inferences, but
little research has been published to date on this topic. In the present
paper, we report the effect of data input errors on the statistical
inferences drawn about population parameters in an empirical study
involving 280 students from two Polish universities, namely the Warsaw
University of Life Sciences - SGGW and the University of Information
Technology and Management in Rzeszow. We found that 28% of the students
committed at least one data error. While some of these errors were small
and did not have any real effect, a few of them had substantial effects on
the statistical inferences drawn about the population parameters.
Journal: Journal of Applied Statistics
Pages: 2030-2037
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1016410
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1016410
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:2030-2037
Template-Type: ReDIF-Article 1.0
Author-Name: Philip B. Holden
Author-X-Name-First: Philip B.
Author-X-Name-Last: Holden
Author-Name: Neil R. Edwards
Author-X-Name-First: Neil R.
Author-X-Name-Last: Edwards
Author-Name: Paul H. Garthwaite
Author-X-Name-First: Paul H.
Author-X-Name-Last: Garthwaite
Author-Name: Richard D. Wilkinson
Author-X-Name-First: Richard D.
Author-X-Name-Last: Wilkinson
Title: Emulation and interpretation of high-dimensional climate model outputs
Abstract:
Running complex computer models can be expensive in computer time, while
learning about the relationships between input and output variables can be
difficult. An emulator is a fast approximation to a computationally
expensive model that can be used as a surrogate for the model, to quantify
uncertainty or to improve process understanding. Here, we examine
emulators based on singular value decompositions (SVDs) and use them to
emulate global climate and vegetation fields, examining how these fields
are affected by changes in the Earth's orbit. The vegetation field may be
emulated directly from the orbital variables, but an appealing alternative
is to relate it to emulations of the climate fields, which involves
high-dimensional input and output. The SVDs radically reduce the
dimensionality of the input and output spaces and are shown to clarify the
relationships between them. The method could potentially be useful for any
complex process with correlated, high-dimensional inputs and/or outputs.
Journal: Journal of Applied Statistics
Pages: 2038-2055
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1016412
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1016412
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:2038-2055
Template-Type: ReDIF-Article 1.0
Author-Name: Abhik Ghosh
Author-X-Name-First: Abhik
Author-X-Name-Last: Ghosh
Author-Name: Ayanendranath Basu
Author-X-Name-First: Ayanendranath
Author-X-Name-Last: Basu
Title: Robust estimation for non-homogeneous data and the selection of the optimal tuning parameter: the density power divergence approach
Abstract:
The density power divergence (DPD) measure, defined in terms of a single
parameter α, has proved to be a popular tool in the
area of robust estimation [1]. Recently, Ghosh and Basu [5] rigorously
established the asymptotic properties of the minimum DPD estimators
(MDPDEs) in the case of independent non-homogeneous observations. In
this paper, we present an extensive numerical study describing the
performance of the method in the case of linear regression, the most
common setup involving non-homogeneous data. In addition, we extend the
existing methods for the selection of the
optimal robustness tuning parameter from the case of independent and
identically distributed (i.i.d.) data to the case of non-homogeneous
observations. Proper selection of the tuning parameter is critical to the
appropriateness of the resulting analysis. The selection of the optimal
robustness tuning parameter is explored in the context of the linear
regression problem with an extensive numerical study involving real and
simulated data.
Journal: Journal of Applied Statistics
Pages: 2056-2072
Issue: 9
Volume: 42
Year: 2015
Month: 9
X-DOI: 10.1080/02664763.2015.1016901
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1016901
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:9:p:2056-2072
Template-Type: ReDIF-Article 1.0
Author-Name: Myung Geun Kim
Author-X-Name-First: Myung Geun
Author-X-Name-Last: Kim
Title: Geometric aspects of deletion diagnostics in multivariate regression
Abstract:
In multivariate regression, a graphical diagnostic method of detecting
observations that are influential in estimating regression coefficients is
introduced. It is based on the principal components and their variances
obtained from the covariance matrix of the probability distribution for
the change in the estimator of the matrix of unknown regression
coefficients due to a single-case deletion. As a result, each deletion
statistic, obtained in the form of a matrix, is transformed into a
two-dimensional quantity. Its univariate version is also introduced in a
slightly different way. No distributional form is assumed. For illustration,
we provide a numerical example in which the graphical method introduced
here is seen to be effective in getting information about influential
observations.
Journal: Journal of Applied Statistics
Pages: 2073-2079
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1016411
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1016411
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2073-2079
Template-Type: ReDIF-Article 1.0
Author-Name: M.G.M. Khan
Author-X-Name-First: M.G.M.
Author-X-Name-Last: Khan
Author-Name: K.G. Reddy
Author-X-Name-First: K.G.
Author-X-Name-Last: Reddy
Author-Name: D.K. Rao
Author-X-Name-First: D.K.
Author-X-Name-Last: Rao
Title: Designing stratified sampling in economic and business surveys
Abstract:
In most economic and business surveys, the target variables (e.g. turnover
of enterprises, income of households, etc.) commonly resemble skewed
distributions with many small and few large units. In such surveys, if a
stratified sampling technique is used as a method of sampling and
estimation, the convenient way of stratification such as the use of
demographical variables (e.g. gender, socioeconomic class, geographical
region, religion, ethnicity, etc.) or other natural criteria, which is
widely practiced in economic surveys, may fail to form homogeneous
strata and does little to increase the precision of the estimates of the
variables of interest. In this paper, a stratified sampling design for
economic surveys based on auxiliary information is developed, which can
be used to construct an optimum stratification and determine the optimum
sample allocation that maximizes the precision of the estimates.
Journal: Journal of Applied Statistics
Pages: 2080-2099
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1018674
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1018674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2080-2099
Template-Type: ReDIF-Article 1.0
Author-Name: Uttam Bandyopadhyay
Author-X-Name-First: Uttam
Author-X-Name-Last: Bandyopadhyay
Author-Name: Joydeep Basu
Author-X-Name-First: Joydeep
Author-X-Name-Last: Basu
Author-Name: Ganesh Dutta
Author-X-Name-First: Ganesh
Author-X-Name-Last: Dutta
Title: Crossover design in clinical trials for binary response
Abstract:
In this paper, we consider a binary response model for the analysis of
the two-treatment, two-period, four-sequence crossover design. We
introduce an intra-patient drug dependency parameter in the model and
provide two tests for the hypothesis of equality of treatment effects.
We employ Monte Carlo simulation to compare our tests with a test that
works under a parallel design on the basis of type I error rate and
power. We find that our procedures dominate the competitor with respect
to power. Finally, we use a data set to illustrate the applicability of
our procedure.
Journal: Journal of Applied Statistics
Pages: 2100-2114
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1018675
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1018675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2100-2114
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Ding
Author-X-Name-First: Juan
Author-X-Name-Last: Ding
Author-Name: Wenjun Xiong
Author-X-Name-First: Wenjun
Author-X-Name-Last: Xiong
Title: Robust group testing for multiple traits with misclassification
Abstract:
Determining group size is a crucial stage before conducting experiments
using group testing methods. Taking misclassification into account, we
propose a D-criterion and an A-criterion to determine a robust group
size for screening multiple infections simultaneously. Extensive
simulation shows the advantage of the proposed method when the goal is
estimation.
Journal: Journal of Applied Statistics
Pages: 2115-2125
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1019841
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1019841
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2115-2125
Template-Type: ReDIF-Article 1.0
Author-Name: Istem Koymen Keser
Author-X-Name-First: Istem
Author-X-Name-Last: Koymen Keser
Author-Name: Ipek Deveci Kocakoç
Author-X-Name-First: Ipek
Author-X-Name-Last: Deveci Kocakoç
Title: Smoothed functional canonical correlation analysis of humidity and temperature data
Abstract:
This paper focuses on smoothed functional canonical correlation analysis
(SFCCA) to investigate the relationships and changes in large, seasonal
and long-term data sets. The aim of this study is to introduce a guideline
for SFCCA for functional data and to give some insights on the fine tuning
of the methodology for long-term periodical data. The guidelines are
applied to temperature and humidity data for the 11 years between 2000
and 2010, and the results are interpreted. Seasonal changes or periodical
shifts are visually studied by yearly comparisons. The effects of the
'number of basis functions' and the 'selection of smoothing parameter' on
the general variability structure and on correlations between the curves
are examined. It is concluded that the number of time points (knots),
number of basis functions and the time span of evaluation (monthly, daily,
etc.) should all be chosen harmoniously. It is found that changing the
smoothing parameter does not have a significant effect on the structure
of the curves and correlations. The number of basis functions is found
to be the main factor affecting both the individual and the correlation
weight functions.
Journal: Journal of Applied Statistics
Pages: 2126-2140
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1019842
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1019842
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2126-2140
Template-Type: ReDIF-Article 1.0
Author-Name: Lucia Modugno
Author-X-Name-First: Lucia
Author-X-Name-Last: Modugno
Author-Name: Silvia Cagnone
Author-X-Name-First: Silvia
Author-X-Name-Last: Cagnone
Author-Name: Simone Giannerini
Author-X-Name-First: Simone
Author-X-Name-Last: Giannerini
Title: A multilevel model with autoregressive components for the analysis of tribal art prices
Abstract:
In this paper, we introduce a multilevel model specification with
time-series components for the analysis of prices of artworks sold at
auctions. Since auction data do not constitute a panel or a time series
but are composed of repeated cross-sections, they require a specification
with items at the first level nested in time-points. Our approach combines
the flexibility of mixed-effect models with the predictive
performance of time series, as it allows the time dynamics to be modeled
directly. Model estimation is obtained by maximum likelihood
through the expectation-maximization algorithm. The model is motivated by
the analysis of the first database of ethnic artworks sold in the most
important auctions worldwide. The results show that the proposed
specification improves considerably over classical proposals both in terms
of fit and prediction.
Journal: Journal of Applied Statistics
Pages: 2141-2158
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1021304
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1021304
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2141-2158
Template-Type: ReDIF-Article 1.0
Author-Name: Edwin M.M. Ortega
Author-X-Name-First: Edwin M.M.
Author-X-Name-Last: Ortega
Author-Name: Artur J. Lemonte
Author-X-Name-First: Artur J.
Author-X-Name-Last: Lemonte
Author-Name: Giovana O. Silva
Author-X-Name-First: Giovana O.
Author-X-Name-Last: Silva
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Title: New flexible models generated by gamma random variables for lifetime modeling
Abstract:
In this paper we introduce a new three-parameter exponential-type
distribution. The new distribution is quite flexible and can be used
effectively in modeling survival data and reliability problems. It can
have constant, decreasing, increasing, upside-down bathtub and
bathtub-shaped hazard rate functions. It also generalizes some well-known
distributions. We discuss maximum likelihood estimation of the model
parameters for complete and censored samples. Additionally, we
formulate a new cure rate survival model by assuming that the number of
competing causes of the event of interest has the Poisson distribution and
the time to this event follows the proposed distribution. Maximum
likelihood estimation of the model parameters of the new cure rate
survival model is discussed for complete and censored samples. Two
applications to real data are provided to illustrate the flexibility of
the new model in practice.
Journal: Journal of Applied Statistics
Pages: 2159-2179
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1021669
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1021669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2159-2179
Template-Type: ReDIF-Article 1.0
Author-Name: Musie Ghebremichael
Author-X-Name-First: Musie
Author-X-Name-Last: Ghebremichael
Title: Joint modeling of correlated binary outcomes: HIV-1 and HSV-2 co-infection
Abstract:
Herpes Simplex Virus Type 2 (HSV-2) facilitates the sexual acquisition and
transmission of HIV-1 infection and is highly prevalent in most regions
experiencing severe HIV epidemics. In sub-Saharan Africa, where HIV
infection is a public health burden, the prevalence of HSV-2 is
substantially high. The high prevalence of HSV-2 and the association
between HSV-2 infection and HIV-1 acquisition could play a significant
role in the spread of HIV-1 in the region. The objective of our study was
to identify risk factors for HSV-2 and HIV-1 infections among men in
sub-Saharan Africa. We used a joint response model that accommodates the
interdependence between the two infections in assessing their risk
factors. Simulation studies show superiority of the joint response model
compared to the traditional models which ignore the dependence between the
two infections. We found higher odds of HSV-2/HIV-1 infection among older
men and in men who had multiple sexual partners, abused alcohol, or reported
symptoms of sexually transmitted infections. These findings suggest that
interventions that identify and control the risk factors of the two
infections should be part of HIV-1 prevention programs in sub-Saharan
Africa where antiretroviral therapy is not readily available.
Journal: Journal of Applied Statistics
Pages: 2180-2191
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1022138
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1022138
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2180-2191
Template-Type: ReDIF-Article 1.0
Author-Name: Antonello D'Ambra
Author-X-Name-First: Antonello
Author-X-Name-Last: D'Ambra
Author-Name: Anna Crisci
Author-X-Name-First: Anna
Author-X-Name-Last: Crisci
Author-Name: Pasquale Sarnacchiaro
Author-X-Name-First: Pasquale
Author-X-Name-Last: Sarnacchiaro
Title: A generalized analysis of the dependence structure by means of ANOVA
Abstract:
The multiple non-symmetric correspondence analysis (MNSCA) is a useful
technique for analysing the prediction of a categorical variable through
two or more predictor variables placed in a contingency table. In MNSCA
framework, for summarizing the predictability between criterion and
predictor variables, the Multiple-TAU index has been proposed. However, it
cannot be used to test association; to overcome this limitation, a
relationship with the C-statistic has been recommended. The Multiple-TAU index is
an overall measure of association that contains both main effects and
interaction terms. The main effects represent the change in the response
variables due to the change in the level/categories of the predictor
variables, considering the effects of their addition. On the other hand,
the interaction effect represents the combined effect of predictor
variables on the response variable. In this paper, we propose a
decomposition of the Multiple-TAU index in main effects and interaction
terms. To illustrate this decomposition, we consider an empirical case
examining the relationship between the demographic characteristics of
the American people, such as race, gender and location (column
variables), and their propensity to move (row
variable) to a new town to find a job.
Journal: Journal of Applied Statistics
Pages: 2192-2202
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1023269
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023269
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2192-2202
Template-Type: ReDIF-Article 1.0
Author-Name: H. He
Author-X-Name-First: H.
Author-X-Name-Last: He
Author-Name: W. Wang
Author-X-Name-First: W.
Author-X-Name-Last: Wang
Author-Name: J. Hu
Author-X-Name-First: J.
Author-X-Name-Last: Hu
Author-Name: R. Gallop
Author-X-Name-First: R.
Author-X-Name-Last: Gallop
Author-Name: P. Crits-Christoph
Author-X-Name-First: P.
Author-X-Name-Last: Crits-Christoph
Author-Name: Y. Xia
Author-X-Name-First: Y.
Author-X-Name-Last: Xia
Title: Distribution-free inference of zero-inflated binomial data for longitudinal studies
Abstract:
Count responses with structural zeros are very common in medical and
psychosocial research, especially in alcohol and HIV research, and the
zero-inflated Poisson (ZIP) and zero-inflated negative binomial models are
widely used for modeling such outcomes. However, as alcohol drinking
outcomes such as days of drinking are counts within a given period, their
distributions are bounded above by an upper limit (total days in the
period) and thus inherently follow a binomial or zero-inflated binomial
(ZIB) distribution, rather than a Poisson or ZIP distribution, in the
presence of structural zeros. In this paper, we develop a new
semiparametric approach for modeling ZIB-like count responses for
cross-sectional as well as longitudinal data. We illustrate this approach
with both simulated and real study data.
Journal: Journal of Applied Statistics
Pages: 2203-2219
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1023270
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023270
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2203-2219
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: Shou-Jen Chang-Chien
Author-X-Name-First: Shou-Jen
Author-X-Name-Last: Chang-Chien
Author-Name: Miin-Shen Yang
Author-X-Name-First: Miin-Shen
Author-X-Name-Last: Yang
Title: An intuitive clustering algorithm for spherical data with application to extrasolar planets
Abstract:
This paper proposes an intuitive clustering algorithm capable of
automatically self-organizing data groups based on the original data
structure. Comparisons between the proposed algorithm and the EM [1] and
spherical k-means [7] algorithms are given. These
numerical results show the effectiveness of the proposed algorithm, using
the correct classification rate and the adjusted Rand index as evaluation
criteria [5,6]. In 1995, Mayor and Queloz announced the detection of the
first extrasolar planet (exoplanet) around a Sun-like star. Since then,
observational efforts of astronomers have led to the detection of more
than 1000 exoplanets. These discoveries may provide important information
for understanding the formation and evolution of planetary systems. The
proposed clustering algorithm is therefore used to study the data gathered
on exoplanets. Two main implications are also suggested: (1) there are
three major clusters, which correspond to the exoplanets in the regimes of
disc, ongoing tidal and tidal interactions, respectively, and (2) the
stellar metallicity does not play a key role in exoplanet migration.
Journal: Journal of Applied Statistics
Pages: 2220-2232
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1023271
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023271
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2220-2232
Template-Type: ReDIF-Article 1.0
Author-Name: M. Teimourian
Author-X-Name-First: M.
Author-X-Name-Last: Teimourian
Author-Name: T. Baghfalaki
Author-X-Name-First: T.
Author-X-Name-Last: Baghfalaki
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Author-Name: D. Berridge
Author-X-Name-First: D.
Author-X-Name-Last: Berridge
Title: Joint modeling of mixed skewed continuous and ordinal longitudinal responses: a Bayesian approach
Abstract:
In this paper, a joint model for analyzing multivariate mixed ordinal and
continuous responses, where continuous outcomes may be skew, is presented.
For modeling the discrete ordinal responses, a continuous latent variable
approach is considered and for describing continuous responses, a
skew-normal mixed effects model is used. A Bayesian approach using Markov
Chain Monte Carlo (MCMC) is adopted for parameter estimation. Some
simulation studies are performed for illustration of the proposed
approach. The results of the simulation studies show that the use of the
separate models or the normal distributional assumption for shared random
effects and within-subject errors of continuous and ordinal variables,
instead of the joint modeling under a skew-normal distribution, leads to
biased parameter estimates. The approach is used for analyzing a part of
the British Household Panel Survey (BHPS) data set. Annual income and life
satisfaction are considered as the continuous and the ordinal longitudinal
responses, respectively. The annual income variable is severely skewed,
therefore, the use of the normality assumption for the continuous response
does not yield acceptable results. The results of data analysis show that
gender, marital status, educational levels and the amount of money spent
on leisure have a significant effect on annual income, while marital
status has the highest impact on life satisfaction.
Journal: Journal of Applied Statistics
Pages: 2233-2256
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1023557
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023557
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2233-2256
Template-Type: ReDIF-Article 1.0
Author-Name: A.F. Donneau
Author-X-Name-First: A.F.
Author-X-Name-Last: Donneau
Author-Name: M. Mauer
Author-X-Name-First: M.
Author-X-Name-Last: Mauer
Author-Name: P. Lambert
Author-X-Name-First: P.
Author-X-Name-Last: Lambert
Author-Name: E. Lesaffre
Author-X-Name-First: E.
Author-X-Name-Last: Lesaffre
Author-Name: A. Albert
Author-X-Name-First: A.
Author-X-Name-Last: Albert
Title: Testing the proportional odds assumption in multiply imputed ordinal longitudinal data
Abstract:
A popular choice when analyzing ordinal data is to consider the cumulative
proportional odds model to relate the marginal probabilities of the
ordinal outcome to a set of covariates. However, application of this model
relies on the condition of identical cumulative odds ratios across the
cut-offs of the ordinal outcome; the well-known proportional odds
assumption. This paper focuses on the assessment of this assumption while
accounting for repeated and missing data. In this respect, we develop a
statistical method built on multiple imputation (MI) based on
generalized estimating equations that allows testing of the proportionality
assumption under the missing-at-random setting. The performance of the
proposed method is evaluated for two MI algorithms for incomplete
longitudinal ordinal data. The impact of both MI methods is compared with
respect to the type I error rate and the power for situations covering
various numbers of categories of the ordinal outcome, sample sizes, rates
of missingness, well-balanced and skewed data. The comparison of both MI
methods with the complete-case analysis is also provided. We illustrate
the use of the proposed methods on quality-of-life data from a cancer
clinical trial.
Journal: Journal of Applied Statistics
Pages: 2257-2279
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1023704
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023704
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2257-2279
Template-Type: ReDIF-Article 1.0
Author-Name: Gulder Kemalbay
Author-X-Name-First: Gulder
Author-X-Name-Last: Kemalbay
Author-Name: Ismihan Bayramoglu (Bairamov)
Author-X-Name-First: Ismihan
Author-X-Name-Last: Bayramoglu (Bairamov)
Title: Joint distribution of new sample rank of bivariate order statistics
Abstract:
Let (X_1, Y_1), ..., (X_n, Y_n) be independent
copies of a bivariate random vector (X, Y) with joint
cumulative distribution function F(x, y) and probability
density function f(x, y). For 1 <= r, s <= n, the vector of
order statistics of X_1, ..., X_n and
Y_1, ..., Y_n, respectively,
is denoted by (X_{r:n}, Y_{s:n}). Let
(X*_1, Y*_1), ..., (X*_m, Y*_m) be a new
sample from (X, Y), which is
independent of (X_1, Y_1), ..., (X_n, Y_n). Let
R_{r:n} be the rank of the
order statistic X_{r:n} in the new sample
X*_1, ..., X*_m, and let S_{s:n}
be the rank of the
order statistic Y_{s:n} in the new sample
Y*_1, ..., Y*_m. We derive the
joint distribution of the discrete random vector
(R_{r:n}, S_{s:n}), and a general
scheme wherein the distributions of the new and old samples are different is
considered. Numerical examples for well-known distributions are also
provided.
Journal: Journal of Applied Statistics
Pages: 2280-2289
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1023705
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023705
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2280-2289
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter Bastiaan
Author-X-Name-Last: Ober
Title: Sequential analysis: hypothesis testing and changepoint detection
Journal: Journal of Applied Statistics
Pages: 2290-2290
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1015813
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1015813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2290-2290
Template-Type: ReDIF-Article 1.0
Author-Name: Božidar V. Popović
Author-X-Name-First: Božidar V.
Author-X-Name-Last: Popović
Title: Handbook of univariate and multivariate data analysis with IBM SPSS, second edition
Journal: Journal of Applied Statistics
Pages: 2291-2291
Issue: 10
Volume: 42
Year: 2015
Month: 10
X-DOI: 10.1080/02664763.2015.1015811
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1015811
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:10:p:2291-2291
Template-Type: ReDIF-Article 1.0
Author-Name: Sibel Balci
Author-X-Name-First: Sibel
Author-X-Name-Last: Balci
Author-Name: Aysen Dener Akkaya
Author-X-Name-First: Aysen Dener
Author-X-Name-Last: Akkaya
Title: Robust pairwise multiple comparisons under short-tailed symmetric distributions
Abstract:
In one-way ANOVA, most of the pairwise multiple comparison procedures
depend on the normality assumption of errors. In practice, errors frequently
have non-normal distributions. Therefore, it is very important to
develop robust estimators of location and the associated variance under
non-normality. In this paper, we consider the estimation of one-way ANOVA
model parameters to make pairwise multiple comparisons under short-tailed
symmetric (STS) distribution. The classical least squares method is
neither efficient nor robust and maximum likelihood estimation technique
is problematic in this situation. Modified maximum likelihood (MML)
estimation technique gives the opportunity to estimate model parameters in
closed forms under non-normal distributions. Hence, the use of MML
estimators in the test statistic is proposed for pairwise multiple
comparisons under STS distribution. The efficiency and power comparisons
of the test statistic based on sample mean, trimmed mean, wave and MML
estimators are given and the robustness of the test obtained using these
estimators under plausible alternatives and inlier model are examined. It
is demonstrated that the test statistic based on MML estimators is
efficient and robust and the corresponding test is more powerful and
having smallest Type I error.
Journal: Journal of Applied Statistics
Pages: 2293-2306
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1023706
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1023706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2293-2306
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Chen
Author-X-Name-First: Wei
Author-X-Name-Last: Chen
Author-Name: Dehui Wang
Author-X-Name-First: Dehui
Author-X-Name-Last: Wang
Author-Name: Yanfeng Li
Author-X-Name-First: Yanfeng
Author-X-Name-Last: Li
Title: A class of tests of proportional hazards assumption for left-truncated and right-censored data
Abstract:
In this paper, we propose a class of tests of the proportional hazards
assumption for left-truncated and right-censored data based on a pair of
estimators of the hazard ratio constant. Using counting process and
martingale theory, the asymptotically normal distribution of the test
statistic is derived, and a family of consistent estimators of the variance is
also provided. Extensive simulation studies were conducted to evaluate the
performance of the proposed test statistics under finite sample
situations. Two real data sets are analyzed to illustrate our method.
Journal: Journal of Applied Statistics
Pages: 2307-2320
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1027884
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1027884
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2307-2320
Template-Type: ReDIF-Article 1.0
Author-Name: Gift Nyamundanda
Author-X-Name-First: Gift
Author-X-Name-Last: Nyamundanda
Author-Name: Avril Hegarty
Author-X-Name-First: Avril
Author-X-Name-Last: Hegarty
Author-Name: Kevin Hayes
Author-X-Name-First: Kevin
Author-X-Name-Last: Hayes
Title: Product partition latent variable model for multiple change-point detection in multivariate data
Abstract:
The product partition model (PPM) is a well-established efficient
statistical method for detecting multiple change points in time-evolving
univariate data. In this article, we refine the PPM for the purpose of
detecting multiple change points in correlated multivariate time-evolving
data. Our model detects distributional changes in both the mean and
covariance structures of multivariate Gaussian data by exploiting a
smaller dimensional representation of correlated multiple time series. The
utility of the proposed method is demonstrated through experiments on
simulated and real datasets.
Journal: Journal of Applied Statistics
Pages: 2321-2334
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1029444
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1029444
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2321-2334
Template-Type: ReDIF-Article 1.0
Author-Name: Iris Pigeot
Author-X-Name-First: Iris
Author-X-Name-Last: Pigeot
Author-Name: Fabian Sobotka
Author-X-Name-First: Fabian
Author-X-Name-Last: Sobotka
Author-Name: Svend Kreiner
Author-X-Name-First: Svend
Author-X-Name-Last: Kreiner
Author-Name: Ronja Foraita
Author-X-Name-First: Ronja
Author-X-Name-Last: Foraita
Title: The uncertainty of a selected graphical model
Abstract:
Graphical models are useful to detect multivariate association structures
in terms of conditional independencies and to represent these structures
in a graph. When fitting graphical models to multivariate data, the
uncertainty of a selected graphical model cannot be directly assessed. In
this paper, we therefore propose various descriptive measures to assess
the uncertainty of a graphical model based on the nonparametric bootstrap.
We also introduce a so-called mean graphical model. Simulations and one
real data example illustrate the application and interpretation of the
newly proposed measures and demonstrate that the mean graphical model
performs better than a single selected graphical model.
Journal: Journal of Applied Statistics
Pages: 2335-2352
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1030368
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1030368
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2335-2352
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Márcia A.C. Macera
Author-X-Name-First: Márcia A.C.
Author-X-Name-Last: Macera
Author-Name: Vicente G. Cancho
Author-X-Name-First: Vicente G.
Author-X-Name-Last: Cancho
Title: The Poisson-exponential model for recurrent event data: an application to bowel motility data
Abstract:
This paper presents a new parametric model for recurrent events, in which
the time of each recurrence is associated to one or multiple latent causes
and no information is provided about the cause responsible for the event.
This model is characterized by a rate function and it is based on the
Poisson-exponential distribution, namely the distribution of the maximum
among a random number (truncated Poisson distributed) of exponential
times. The time of each recurrence is then given by the maximum lifetime
value among all latent causes. Inference is based on a maximum likelihood
approach. A simulation study is performed in order to observe the
frequentist properties of the estimation procedure for small and moderate
sample sizes. We also investigated likelihood-based test procedures. A
real example from a gastroenterology study concerning small bowel motility
during fasting state is used to illustrate the methodology. Finally, we
apply the proposed model to a real data set and compare it with the
classical Homogeneous Poisson model, which is a particular case.
Journal: Journal of Applied Statistics
Pages: 2353-2366
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1030369
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1030369
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2353-2366
Template-Type: ReDIF-Article 1.0
Author-Name: Julie E. Shortridge
Author-X-Name-First: Julie E.
Author-X-Name-Last: Shortridge
Author-Name: Stefanie M. Falconi
Author-X-Name-First: Stefanie M.
Author-X-Name-Last: Falconi
Author-Name: Benjamin F. Zaitchik
Author-X-Name-First: Benjamin F.
Author-X-Name-Last: Zaitchik
Author-Name: Seth D. Guikema
Author-X-Name-First: Seth D.
Author-X-Name-Last: Guikema
Title: Climate, agriculture, and hunger: statistical prediction of undernourishment using nonlinear regression and data-mining techniques
Abstract:
An estimated 1 billion people suffer from hunger worldwide, and climate
change, urbanization, and globalization have the potential to exacerbate
this situation. Improved models for predicting food security are needed to
understand these impacts and design interventions. However, food
insecurity is the result of complex interactions between physical and
socio-economic factors that can overwhelm linear regression models. More
sophisticated data-mining approaches could provide an effective way to
model these relationships and accurately predict food insecure situations.
In this paper, we compare multiple regression and data-mining methods in
their ability to predict the percent of a country's population that
suffers from undernourishment using widely available predictor variables
related to socio-economic settings, agricultural production and trade, and
climate conditions. Averaging predictions from multiple models results in
the lowest predictive error and provides an accurate method to predict
undernourishment levels. Partial dependence plots are used to evaluate
covariate influence and demonstrate the relationship between food
insecurity and climatic and socio-economic variables. By providing
insights into these relationships and a mechanism for predicting
undernourishment using readily available data, statistical models like
those developed here could be a useful tool for those tasked with
understanding and addressing food insecurity.
Journal: Journal of Applied Statistics
Pages: 2367-2390
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1032216
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1032216
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2367-2390
Template-Type: ReDIF-Article 1.0
Author-Name: Tahir Ekin
Author-X-Name-First: Tahir
Author-X-Name-Last: Ekin
Author-Name: R. Muzaffer Musal
Author-X-Name-First: R. Muzaffer
Author-X-Name-Last: Musal
Author-Name: Lawrence V. Fulton
Author-X-Name-First: Lawrence V.
Author-X-Name-Last: Fulton
Title: Overpayment models for medical audits: multiple scenarios
Abstract:
Comprehensive auditing in Medicare programs is infeasible due to the large
number of claims, therefore, the use of statistical sampling and
estimation methods is crucial. We introduce super-population models to
understand the overpayment phenomena within the claims population. The
zero- and one-inflated mixture-based models can capture various
overpayment patterns including the fully legitimate or fraudulent cases.
We compare them with the existing models for symmetric and mixed payment
populations that have different overpayment patterns. The distributional
fit between the actual and estimated overpayments is assessed. We also
provide comparisons of models with respect to their conformance with
Centers for Medicare and Medicaid Services (CMS) guidelines. In addition
to estimating the dollar amount of recovery, the proposed models can help
the investigators to detect overpayment patterns.
Journal: Journal of Applied Statistics
Pages: 2391-2405
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1034659
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1034659
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2391-2405
Template-Type: ReDIF-Article 1.0
Author-Name: Haiqiang Chen
Author-X-Name-First: Haiqiang
Author-X-Name-Last: Chen
Author-Name: Yanli Zhu
Author-X-Name-First: Yanli
Author-X-Name-Last: Zhu
Title: An empirical study on the threshold cointegration of Chinese A and H cross-listed shares
Abstract:
We investigate the dynamic relationship between the prices of Chinese A
and H market cross-listed shares using the Enders-Siklos threshold
cointegration approach. Our data are the daily closing prices of the Hang
Seng China AH (A) index and the Hang Seng China AH (H) index from 4
January 2006 to 1 November 2013. We find a threshold cointegration between
these two indices, instead of the linear cointegration well established in
the literature. The short-term adjustment to the equilibrium shows an
asymmetric effect according to the price deviation from the equilibrium.
Moreover, using a Granger causality test, we find a bi-directional
causality between these two markets, indicating a close relationship
between them. A pairs trading rule, based on the estimated threshold
cointegration model, demonstrates the usefulness of our results as it
generates a significantly higher return than a naive buy-and-hold trading
rule.
Journal: Journal of Applied Statistics
Pages: 2406-2419
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1034660
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1034660
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2406-2419
Template-Type: ReDIF-Article 1.0
Author-Name: Chen-ju Lin
Author-X-Name-First: Chen-ju
Author-X-Name-Last: Lin
Author-Name: Yi-chun Shu
Author-X-Name-First: Yi-chun
Author-X-Name-Last: Shu
Title: Detecting clusters with increased mean using scan windows with variable radius
Abstract:
Applying spatiotemporal scan statistics is an effective method to detect
the clustering of mean shifts in many application fields. Although several
exponentially weighted moving average (EWMA) based scan statistics have
been proposed, the existing methods generally require a fixed scan window
size or apply the weighting technique across the temporal axis only.
However, the size of shift coverage is often unavailable in practical
problems. Using a mismatched scan radius may misestimate the size of cluster
coverage in space or delay the time to detection. This research proposes
an stEWMA method by applying the weighting technique across both temporal
and spatial axes with variable scan radius. The simulation analysis showed
that the stEWMA method can have a significantly shorter time to detection
than the likelihood ratio-based scan statistic using variable scan radius,
especially when cluster coverage size is small. The application to
detecting the increase of male thyroid cancer in the state of New Mexico also
showed the effectiveness of the proposed method.
Journal: Journal of Applied Statistics
Pages: 2420-2431
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1041013
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1041013
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2420-2431
Template-Type: ReDIF-Article 1.0
Author-Name: Jin-Guan Lin
Author-X-Name-First: Jin-Guan
Author-X-Name-Last: Lin
Author-Name: Yan-Yong Zhao
Author-X-Name-First: Yan-Yong
Author-X-Name-Last: Zhao
Author-Name: Hong-Xia Wang
Author-X-Name-First: Hong-Xia
Author-X-Name-Last: Wang
Title: Heteroscedasticity diagnostics in varying-coefficient partially linear regression models and applications in analyzing Boston housing data
Abstract:
It is important to detect variance heterogeneity in a regression model,
because efficient inference requires that heteroscedasticity be taken into
account if it really exists. For the varying-coefficient partially
linear regression models, however, the problem of detecting
heteroscedasticity has received very little attention. In this paper, we
present two classes of tests of heteroscedasticity for varying-coefficient
partially linear regression models. The first test statistic is
constructed based on the residuals, in which the error term is from a
normal distribution. The second one is motivated by the idea that testing
heteroscedasticity is equivalent to testing pseudo-residuals for a
constant mean. Asymptotic normality is established with different rates
corresponding to the null hypothesis of homoscedasticity and the
alternative. Some Monte Carlo simulations are conducted to investigate the
finite sample performance of the proposed tests. The test methodologies
are illustrated with a real data set example.
Journal: Journal of Applied Statistics
Pages: 2432-2448
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043623
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2432-2448
Template-Type: ReDIF-Article 1.0
Author-Name: Opeoluwa F. Oyedele
Author-X-Name-First: Opeoluwa F.
Author-X-Name-Last: Oyedele
Author-Name: Sugnet Lubbe
Author-X-Name-First: Sugnet
Author-X-Name-Last: Lubbe
Title: The construction of a partial least-squares biplot
Abstract:
Biplots are useful tools to explore the relationship among variables. In
this paper, the specific regression relationship between a set of
predictors X and a set of response variables Y is represented by
means of partial least-squares (PLS) regression. The PLS
biplot provides a single graphical representation of the samples together
with the predictor and response variables, as well as their
interrelationships in terms of the matrix of regression coefficients.
Journal: Journal of Applied Statistics
Pages: 2449-2460
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043858
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043858
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2449-2460
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaobing Zhao
Author-X-Name-First: Xiaobing
Author-X-Name-Last: Zhao
Author-Name: Xian Zhou
Author-X-Name-First: Xian
Author-X-Name-Last: Zhou
Title: Semiparametric models of longitudinal and time-to-event data with applications to HIV viral dynamics and CD4 counts
Abstract:
We propose a semiparametric approach based on proportional hazards and
copula method to jointly model longitudinal outcomes and the
time-to-event. The dependence of the longitudinal outcomes on the
covariates is modeled by a copula-based time series, which allows
non-Gaussian random effects and overcomes the limitation of the parametric
assumptions in existing linear and nonlinear random effects models. A
modified partial likelihood method using estimated covariates at failure
times is employed to draw statistical inference. The proposed model and
method are applied to analyze a set of progression to AIDS data in a study
of the association between the human immunodeficiency virus viral dynamics
and the time trend in the CD4/CD8 ratio with measurement errors.
Simulations are also reported to evaluate the proposed model and method.
Journal: Journal of Applied Statistics
Pages: 2461-2477
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043859
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043859
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2461-2477
Template-Type: ReDIF-Article 1.0
Author-Name: Cini Varghese
Author-X-Name-First: Cini
Author-X-Name-Last: Varghese
Author-Name: Eldho Varghese
Author-X-Name-First: Eldho
Author-X-Name-Last: Varghese
Author-Name: Seema Jaggi
Author-X-Name-First: Seema
Author-X-Name-Last: Jaggi
Author-Name: Arpan Bhowmik
Author-X-Name-First: Arpan
Author-X-Name-Last: Bhowmik
Title: Experimental designs for open pollination in polycross trials
Abstract:
A polycross is the pollination by natural hybridization of a group of
genotypes, generally selected, grown in isolation from other compatible
genotypes in such a way as to promote random open pollination. A particular
practical application of the polycross method occurs in the production of
a synthetic variety resulting from cross-pollinated plants. Laying out
these experiments in appropriate designs, known as polycross designs,
would not only save experimental resources but also gather more
information from the experiment. Different situations may arise in
polycross nurseries where accordingly different polycross designs may be
used. For situations in which some genotypes interfere in the growth or
production of other genotypes, but have to be grown together,
a neighbour-restricted design is a better option. Furthermore, when the
topography of the nursery is such that a known wind system in a certain
direction may prevail, then designs balanced for neighbour effects of
genotypes only in the direction of the wind are appropriate, which may help
in saving experimental resources to a great extent. Also, when genotypes are
planted in a small area without leaving much space between rows, designs
balanced for neighbour effects from all possible eight directions are
useful, giving each genotype an equal chance of pollinating and being
pollinated by every other genotype. Here, polycross designs have been
obtained to match the above-mentioned three situations. SAS Macros have
also been developed to
generate these proposed designs.
Journal: Journal of Applied Statistics
Pages: 2478-2484
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043860
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043860
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2478-2484
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Nasrullah Khan
Author-X-Name-First: Nasrullah
Author-X-Name-Last: Khan
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Title: A new S-super-2 control chart using repetitive sampling
Abstract:
A new S-super-2 control chart is presented for monitoring
the process variance by utilizing a repetitive sampling scheme. Double
control limits, called inner and outer control limits, are proposed, whose
coefficients are determined by considering the average run length (ARL)
and the average sample number when the process is in control. The proposed
control chart is compared with the existing Shewhart
S-super-2 control chart in terms of the ARLs. The result
shows that the proposed control chart is more efficient than the existing
control chart in detecting the process shift.
Journal: Journal of Applied Statistics
Pages: 2485-2496
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043861
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043861
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2485-2496
Template-Type: ReDIF-Article 1.0
Author-Name: Yunlu Jiang
Author-X-Name-First: Yunlu
Author-X-Name-Last: Jiang
Title: Robust estimation in partially linear regression models
Abstract:
A new class of robust estimators via the exponential squared loss function
with a tuning parameter is presented for partially linear regression
models. Under some conditions, we show that our proposed estimators for
the regression parameter can achieve the highest asymptotic breakdown
point of . In addition,
we propose a data-driven procedure to choose the tuning parameter.
Simulation studies are conducted to compare the performances of the
proposed method with the existing methods in terms of the bias, standard
deviation (Sd) as well as the mean-squared errors (MSE). The results show
that our proposed method has smaller Sd and MSE than the existing methods
when there are outliers in the dataset. Finally, we apply the proposed
method to analyze the Ragweed Pollen Level data and the salinity data, and
the results reveal that our method performs better than the existing
methods.
Journal: Journal of Applied Statistics
Pages: 2497-2508
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043862
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043862
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2497-2508
Template-Type: ReDIF-Article 1.0
Author-Name: A. Sani
Author-X-Name-First: A.
Author-X-Name-Last: Sani
Author-Name: B. Abapihi
Author-X-Name-First: B.
Author-X-Name-Last: Abapihi
Author-Name: Mukhsar Mukhsar
Author-X-Name-First: Mukhsar
Author-X-Name-Last: Mukhsar
Author-Name: Kadir Kadir
Author-X-Name-First: Kadir
Author-X-Name-Last: Kadir
Title: Relative risk analysis of dengue cases using convolution extended into spatio-temporal model
Abstract:
Dengue Hemorrhagic Fever (DHF) cases have become a serious problem every
year in tropical countries such as Indonesia. Understanding the dynamic
spread of the disease is essential in order to find an effective strategy
in controlling its spread. In this study, a convolution
(Poisson-lognormal) model that integrates both uncorrelated and correlated
random effects was developed. A spatio-temporal convolution model to
accommodate both spatial and temporal variations of the disease spread
dynamics was considered. The model was applied to the DHF cases in the
city of Kendari, Indonesia. DHF data for 10 districts during the period
2007-2010 were collected from the health services. The data of rainfall
and population density were obtained from the local offices in Kendari.
The numerical experiments indicated that both the rainfall and the
population density played an important role in the increasing DHF cases in
the city of Kendari. The result suggested that DHF cases mostly occurred in
January, the wet season with high rainfall, and in Kadia, the densest
district in the city. As people in the city have high mobility while
dengue mosquitoes tend to stay localized in their area, the best
intervention is in January and in the district of Kadia.
Journal: Journal of Applied Statistics
Pages: 2509-2519
Issue: 11
Volume: 42
Year: 2015
Month: 11
X-DOI: 10.1080/02664763.2015.1043863
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043863
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:11:p:2509-2519
Template-Type: ReDIF-Article 1.0
Author-Name: Zahra Hadidoust
Author-X-Name-First: Zahra
Author-X-Name-Last: Hadidoust
Author-Name: Yaser Samimi
Author-X-Name-First: Yaser
Author-X-Name-Last: Samimi
Author-Name: Hamid Shahriari
Author-X-Name-First: Hamid
Author-X-Name-Last: Shahriari
Title: Monitoring and change-point estimation for spline-modeled non-linear profiles in phase II
Abstract:
In some applications of statistical quality control, quality of a process
or a product is best characterized by a functional relationship between a
response variable and one or more explanatory variables. This relationship
is referred to as a profile. In certain cases, the quality of a process or
a product is better described by a non-linear profile which does not
follow a specific parametric model. In these circumstances, nonparametric
approaches with greater flexibility in modeling the complicated profiles
are adopted. In this research, the spline smoothing method is used to
model a complicated non-linear profile and the Hotelling
T-super-2 control chart based on the spline coefficients
is used to monitor the process. After receiving an out-of-control signal,
a maximum likelihood estimator is employed for change point estimation.
The simulation studies, which include both global and local shifts,
provide appropriate evaluation of the performance of the proposed
estimation and monitoring procedure. The results indicate that the
proposed method detects large global shifts well, while being very
sensitive in detecting local shifts.
Journal: Journal of Applied Statistics
Pages: 2520-2530
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043864
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043864
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2520-2530
Template-Type: ReDIF-Article 1.0
Author-Name: Julia S. Benoit
Author-X-Name-First: Julia S.
Author-X-Name-Last: Benoit
Author-Name: Wenyaw Chan
Author-X-Name-First: Wenyaw
Author-X-Name-Last: Chan
Author-Name: Rachelle S. Doody
Author-X-Name-First: Rachelle S.
Author-X-Name-Last: Doody
Title: Joint coverage probability in a simulation study on continuous-time Markov chain parameter estimation
Abstract:
Parameter dependency within data sets in simulation studies is common,
especially in models such as continuous-time Markov chains (CTMCs).
Additionally, the literature lacks a comprehensive examination of
estimation performance for the likelihood-based general multi-state CTMC.
Among studies attempting to assess the estimation, none have accounted for
dependency among parameter estimates. The purpose of this research is
twofold: (1) to develop a multivariate approach for assessing accuracy and
precision for simulation studies, and (2) to add to the literature a
comprehensive examination of the estimation of a general 3-state CTMC
model. Simulation studies are conducted to analyze longitudinal data with
a trinomial outcome using a CTMC with and without covariates. Measures of
performance including bias, component-wise coverage probabilities, and
joint coverage probabilities are calculated. An application is presented
using Alzheimer's disease caregiver stress levels. Comparisons of joint
and component-wise parameter estimates yield conflicting inferential
results in simulations from models with and without covariates. In
conclusion, caution should be taken when conducting simulation studies
aiming to assess performance and choice of inference should properly
reflect the purpose of the simulation.
Journal: Journal of Applied Statistics
Pages: 2531-2538
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043865
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043865
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2531-2538
Template-Type: ReDIF-Article 1.0
Author-Name: Pablo Martínez-Camblor
Author-X-Name-First: Pablo
Author-X-Name-Last: Martínez-Camblor
Author-Name: Jacobo de Uña-Álvarez
Author-X-Name-First: Jacobo
Author-X-Name-Last: de Uña-Álvarez
Author-Name: Carmen Díaz Corte
Author-X-Name-First: Carmen Díaz
Author-X-Name-Last: Corte
Title: Expanded renal transplantation: a competing risk model approach
Abstract:
Multi-state models (MSMs) are useful to analyze survival data when,
besides the event of main interest, one or more intermediate states of the
individual are identified. These models take the several existing states
and the possible transitions among them into account. At the same time,
covariate effects on each transition intensity may be investigated
separately and, therefore, MSMs are more flexible than the standard Cox
proportional hazards model. In this work, we use MSMs to investigate the
impact of the quality of a transplanted kidney for a group of patients at
the Hospital Universitario Central de Asturias. Specifically, we use an
illness-death model to study the evolution of patients
with kidney disease who received a renal transplant after a dialysis
period. The intermediate state is defined as the failure of the received
organ, while the terminating state is the death of the patient. In order
to increase the potential number of organs available for transplant, the
standards of quality for the transplanted kidneys were relaxed (the new
criteria are labeled expanded criteria), and these
'expanded kidneys' were transplanted in appropriate
candidates (older patients, with higher prevalence of diabetes mellitus).
Results suggest that the expanded kidneys have a minor effect on survival,
while both the kidney mortality and the risk of death increase with the
patient's age and the serum creatinine and serum hemoglobin levels.
Journal: Journal of Applied Statistics
Pages: 2539-2553
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043866
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043866
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2539-2553
Template-Type: ReDIF-Article 1.0
Author-Name: E. Raffinetti
Author-X-Name-First: E.
Author-X-Name-Last: Raffinetti
Author-Name: I. Romeo
Author-X-Name-First: I.
Author-X-Name-Last: Romeo
Title: Dealing with the biased effects issue when handling huge datasets: the case of INVALSI data
Abstract:
The increasing prevalence of huge datasets directs research toward
appropriate statistical methods for addressing the problems caused by
their complexity. On the one hand, several techniques are mentioned in the
literature, especially for the time-consuming computation and variable
reduction issues. On the other, less debate is devoted to the statistical
inference issue. Indeed, a large number of statistical units may lead to
wrongly considering as significant variables that have no actual impact on
the phenomenon under study. This paper suggests a suitable subsampling
procedure for the reduction of the number of statistical units and
provides a novel index for the assessment of the significance effects. The
proposal is validated by comparing results obtained from the analysis on
the original data to those obtained from the proposed subsampling
approach. The illustrative application focuses on the educational dataset
made available by the National Committee for the Evaluation of the Italian
Education Systems (INVALSI). This dataset collects information about the
student features and achievements in Maths within the lower secondary
schools of the Lombardy region (Italy). Due to the hierarchical structure
of the data, a multilevel model is implemented with the purpose of
investigating the effects of both individual and school factors on student
Maths score.
Journal: Journal of Applied Statistics
Pages: 2554-2570
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043867
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043867
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2554-2570
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel F. Linder
Author-X-Name-First: Daniel F.
Author-X-Name-Last: Linder
Author-Name: Hani Samawi
Author-X-Name-First: Hani
Author-X-Name-Last: Samawi
Author-Name: Lili Yu
Author-X-Name-First: Lili
Author-X-Name-Last: Yu
Author-Name: Arpita Chatterjee
Author-X-Name-First: Arpita
Author-X-Name-Last: Chatterjee
Author-Name: Yisong Huang
Author-X-Name-First: Yisong
Author-X-Name-Last: Huang
Author-Name: Robert Vogel
Author-X-Name-First: Robert
Author-X-Name-Last: Vogel
Title: On stratified bivariate ranked set sampling for regression estimators
Abstract:
We investigate the relative performance of stratified bivariate ranked set
sampling (SBVRSS), with respect to stratified simple random sampling
(SSRS) for estimating the population mean with regression methods. The
mean and variance of the proposed estimators are derived with the mean
being shown to be unbiased. We perform a simulation study to compare the
relative efficiency of SBVRSS to SSRS under various data-generating
scenarios. We also compare the two sampling schemes on a real data set
from trauma victims in a hospital setting. The results of our simulation
study and the real data illustration indicate that using SBVRSS for
regression estimation provides more efficiency than SSRS in most cases.
Journal: Journal of Applied Statistics
Pages: 2571-2583
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043868
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043868
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2571-2583
Template-Type: ReDIF-Article 1.0
Author-Name: Ilaria L. Amerise
Author-X-Name-First: Ilaria L.
Author-X-Name-Last: Amerise
Author-Name: Agostino Tarsitano
Author-X-Name-First: Agostino
Author-X-Name-Last: Tarsitano
Title: Correction methods for ties in rank correlations
Abstract:
Equal values are common when rank methods are applied to rounded data or
data consisting solely of small integers. A popular technique for
resolving ties in rank correlation is the mid-rank method: the mean of the
rankings remains unaltered, but the variance is reduced and modified
according to the number and location of ties. Although other methods for
breaking ties were proposed in the literature as early as 1939, no such
procedure has gained such wide acceptance as mid-ranks. This research
analyses various techniques for assigning ranks to tied values, with two
objectives: (1) to enable the computation of rank correlation
coefficients, such as those of Spearman, Kendall and Gini, by using the
usual definition applied in the absence of ties, and (2) to determine
whether it really makes a difference which of the various techniques is
selected and, if so, which technique is most appropriate for a given
application.
Journal: Journal of Applied Statistics
Pages: 2584-2596
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043870
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2584-2596
Template-Type: ReDIF-Article 1.0
Author-Name: Zahra Mansourvar
Author-X-Name-First: Zahra
Author-X-Name-Last: Mansourvar
Author-Name: Torben Martinussen
Author-X-Name-First: Torben
Author-X-Name-Last: Martinussen
Author-Name: Thomas H. Scheike
Author-X-Name-First: Thomas H.
Author-X-Name-Last: Scheike
Title: Semiparametric regression for restricted mean residual life under right censoring
Abstract:
A mean residual life function (MRLF) is the remaining life expectancy of a
subject who has survived to a certain time point. In the presence of
covariates, regression models are needed to study the association between
the MRLFs and covariates. If the survival time tends to be too long or the
tail is not observed, the restricted mean residual life must be
considered. In this paper, we propose the proportional restricted mean
residual life model for fitting survival data under right censoring. For
inference on the model parameters, martingale estimating equations are
developed, and the asymptotic properties of the proposed estimators are
established. In addition, a class of goodness-of-fit tests is presented to
assess the adequacy of the model. The finite sample behavior of the
proposed estimators is evaluated through simulation studies, and the
approach is applied to a set of real life data collected from a randomized
clinical trial.
Journal: Journal of Applied Statistics
Pages: 2597-2613
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043871
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043871
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2597-2613
Template-Type: ReDIF-Article 1.0
Author-Name: Rubing Liang
Author-X-Name-First: Rubing
Author-X-Name-Last: Liang
Author-Name: Cuizhen Niu
Author-X-Name-First: Cuizhen
Author-X-Name-Last: Niu
Author-Name: Qiang Xia
Author-X-Name-First: Qiang
Author-X-Name-Last: Xia
Author-Name: Zhiqiang Zhang
Author-X-Name-First: Zhiqiang
Author-X-Name-Last: Zhang
Title: Nonlinearity testing and modeling for threshold moving average models
Abstract:
In this paper, we suggest a simple test and an easily applicable modeling
procedure for threshold moving average (TMA) models. Firstly, based on the
fitted residuals obtained by maximum likelihood estimation (MLE) of MA
models, we construct a simple statistic, which is obtained by an arranged
linear regression and approximately follows an F-distribution,
to test for threshold nonlinearity and to specify the threshold variables.
Then, we use scatterplots to identify the number and locations of the
potential thresholds. Finally, with the statistic and the Akaike
information criterion, we propose a procedure to build TMA models.
Simulation experiments and an application to a real example demonstrate
both the power of the test statistic and the convenience of the modeling
procedure.
Journal: Journal of Applied Statistics
Pages: 2614-2630
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1043872
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043872
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2614-2630
Template-Type: ReDIF-Article 1.0
Author-Name: Antonio Lucadamo
Author-X-Name-First: Antonio
Author-X-Name-Last: Lucadamo
Author-Name: Pietro Amenta
Author-X-Name-First: Pietro
Author-X-Name-Last: Amenta
Title: A proposal for handling ordinal categorical variables in co-inertia analysis
Abstract:
This paper is about the problem of the treatment of ordinal qualitative
variables in co-inertia analysis. In the literature, there are different
proposals based on the application of known statistical techniques to
quantify ordinal variables. Here we propose to use a new procedure for the
coding considering the empirical distributions of the variables involved
in the analysis. We present an application to a real dataset, comparing
the results obtained with the different kinds of quantification.
Journal: Journal of Applied Statistics
Pages: 2631-2638
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1044426
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1044426
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2631-2638
Template-Type: ReDIF-Article 1.0
Author-Name: Antonella Plaia
Author-X-Name-First: Antonella
Author-X-Name-Last: Plaia
Title: Long-term experiments and strip plot designs
Abstract:
In a long-term experiment, the experimenter usually needs to know whether
the effect of a treatment varies over time. But time usually has both
fixed and random effects on the output, and the difficulty of the
analysis depends on the particular design considered and the availability
of covariates. Indeed, as shown in the paper, the presence of covariates
can be very useful for modeling the random effect of time. In this paper a
model to analyze data from a long-term strip plot design with covariates
is proposed. Its effectiveness will be tested using both simulated and
real data from a crop rotation experiment.
Journal: Journal of Applied Statistics
Pages: 2639-2653
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1046821
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1046821
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2639-2653
Template-Type: ReDIF-Article 1.0
Author-Name: Vahid Nekoukhou
Author-X-Name-First: Vahid
Author-X-Name-Last: Nekoukhou
Author-Name: Hamid Bidram
Author-X-Name-First: Hamid
Author-X-Name-Last: Bidram
Title: A new four-parameter discrete distribution with bathtub and unimodal failure rate
Abstract:
In this paper, a discrete counterpart of the general class of continuous
beta-G distributions is introduced. A discrete analog of
the beta generalized exponential distribution of Barreto-Souza et
al. [2], as an important special case of the proposed class, is
studied. This new distribution contains some previously known discrete
distributions as well as two new models. The hazard rate function of the
new model can be increasing, decreasing, bathtub-shaped and upside-down
bathtub. Some distributional and moment properties of the new distribution
as well as its order statistics are discussed. Estimation of the
parameters is illustrated using the maximum likelihood method and,
finally, the model with a real data set is examined.
Journal: Journal of Applied Statistics
Pages: 2654-2670
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1046822
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1046822
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2654-2670
Template-Type: ReDIF-Article 1.0
Author-Name: Sudesh Pundir
Author-X-Name-First: Sudesh
Author-X-Name-Last: Pundir
Author-Name: R. Amala
Author-X-Name-First: R.
Author-X-Name-Last: Amala
Title: Detecting diagnostic accuracy of two biomarkers through a bivariate log-normal ROC curve
Abstract:
In biomedical research, two or more biomarkers may be available for
diagnosis of a particular disease. Selecting a single biomarker that
ideally discriminates a diseased group from a healthy group is a challenge
in the diagnostic process. Frequently, the accuracy measure, the area
under the receiver operating characteristic (ROC) curve, is used to
choose the best diagnostic marker among the available markers for
diagnosis. Some authors have tried to combine multiple markers by an
optimal linear combination to increase the discriminatory power. In this
paper, we propose an alternative method that combines two continuous
biomarkers by direct bivariate modeling of the ROC curve under
log-normality assumption. The proposed method is applied to a simulated
data set and a prostate cancer diagnostic biomarker data set.
Journal: Journal of Applied Statistics
Pages: 2671-2685
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1046823
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1046823
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2671-2685
Template-Type: ReDIF-Article 1.0
Author-Name: Nikolay K. Vitanov
Author-X-Name-First: Nikolay K.
Author-X-Name-Last: Vitanov
Author-Name: Marcel Ausloos
Author-X-Name-First: Marcel
Author-X-Name-Last: Ausloos
Title: Test of two hypotheses explaining the size of populations in a system of cities
Abstract:
Two classical hypotheses are examined about the population growth in a
system of cities: Hypothesis 1 pertains to Gibrat's and Zipf's theory
which states that the city growth-decay process is size independent;
Hypothesis 2 pertains to the so-called Yule process which states that the
growth of populations in cities happens when (i) the distribution of the
initial city population size obeys a log-normal function, and (ii) the
growth of the settlements follows a stochastic process. The basis for the test is
some official data on Bulgarian cities at various times. This system was
chosen because (i) Bulgaria is a country for which one does not expect
biased theoretical conditions; (ii) the city populations were determined
rather precisely. The present results show that: (i) the population size
growth of the Bulgarian cities is size dependent, whence Hypothesis 1 is
not confirmed for Bulgaria; (ii) the population size growth of Bulgarian
cities can be described by a double Pareto log-normal distribution, whence
Hypothesis 2 is valid for the Bulgarian city system. This detailed study is
expected to shed light on the city systems of other countries that are
usually considered more pertinent.
Journal: Journal of Applied Statistics
Pages: 2686-2693
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1047744
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1047744
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2686-2693
Template-Type: ReDIF-Article 1.0
Author-Name: Aldo M. Garay
Author-X-Name-First: Aldo M.
Author-X-Name-Last: Garay
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Author-Name: Victor H. Lachos
Author-X-Name-First: Victor H.
Author-X-Name-Last: Lachos
Author-Name: Celso R.B. Cabral
Author-X-Name-First: Celso R.B.
Author-X-Name-Last: Cabral
Title: Bayesian analysis of censored linear regression models with scale mixtures of normal distributions
Abstract:
As in many studies, the data collected are limited and an
exact value is recorded only if it falls within an interval range. Hence,
the responses can be left, interval or right censored. Linear (and
nonlinear) regression models are routinely used to analyze these types of
data and are based on normality assumptions for the error terms. However,
such analyses might not provide robust inference when the normality
assumptions are questionable. In this article, we develop a Bayesian
framework for censored linear regression models by replacing the Gaussian
assumptions for the random errors with scale mixtures of normal (SMN)
distributions. The SMN is an attractive class of symmetric heavy-tailed
densities that includes the normal, Student-t, Pearson
type VII, slash and the contaminated normal distributions, as special
cases. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo
algorithm is introduced to carry out posterior inference. A new
hierarchical prior distribution is suggested for the degrees of freedom
parameter in the Student-t distribution. The likelihood
function is utilized to compute not only some Bayesian model selection
measures but also to develop Bayesian case-deletion influence diagnostics
based on the q-divergence measure. The proposed Bayesian
methods are implemented in the R package BayesCR.
The newly developed procedures are illustrated with applications using
real and simulated data.
Journal: Journal of Applied Statistics
Pages: 2694-2714
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1048671
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1048671
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2694-2714
Template-Type: ReDIF-Article 1.0
Author-Name: H. Lu
Author-X-Name-First: H.
Author-X-Name-Last: Lu
Author-Name: P. Yin
Author-X-Name-First: P.
Author-X-Name-Last: Yin
Author-Name: R.X. Yue
Author-X-Name-First: R.X.
Author-X-Name-Last: Yue
Author-Name: J.Q. Shi
Author-X-Name-First: J.Q.
Author-X-Name-Last: Shi
Title: Robust confidence intervals for trend estimation in meta-analysis with publication bias
Abstract:
Confidence interval (CI) is very useful for trend estimation in
meta-analysis. It provides a type of interval estimate of the regression
slope as well as an indicator of the reliability of the estimate. Thus a
precise calculation of the confidence interval at the expected level is
important. It is always difficult to quantify the CIs explicitly when
there is publication bias in meta-analysis. Various CIs have been
proposed, including the most widely used DerSimonian-Laird CI and the
recently proposed Henmi-Copas CI. The latter provides a robust solution
when there are non-ignorable missing data due to publication bias. In this
paper we extended the idea to trend estimation in meta-analysis. We
applied the method in different scenarios and showed that this type of CI
is more robust than the others.
Journal: Journal of Applied Statistics
Pages: 2715-2733
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1048672
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1048672
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2715-2733
Template-Type: ReDIF-Article 1.0
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Author-Name: Gregory E. Wilding
Author-X-Name-First: Gregory E.
Author-X-Name-Last: Wilding
Author-Name: Terry L. Mashtare
Author-X-Name-First: Terry L.
Author-X-Name-Last: Mashtare
Author-Name: Albert Vexler
Author-X-Name-First: Albert
Author-X-Name-Last: Vexler
Title: Measures of biomarker dependence using a copula-based multivariate epsilon-skew-normal family of distributions
Abstract:
In this note we develop a new multivariate copula model based on
epsilon-skew-normal marginal densities for the purpose of examining
biomarker dependency structures. We illustrate the flexibility and utility
of this model via a variety of graphical tools and a data analysis example
pertaining to salivary biomarkers. The multivariate normal model is a
sub-model of the multivariate epsilon-skew-normal distribution.
Journal: Journal of Applied Statistics
Pages: 2734-2753
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1049130
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049130
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2734-2753
Template-Type: ReDIF-Article 1.0
Author-Name: Manisha Chakrabarty
Author-X-Name-First: Manisha
Author-X-Name-Last: Chakrabarty
Author-Name: Amita Majumder
Author-X-Name-First: Amita
Author-X-Name-Last: Majumder
Author-Name: Jeffrey Racine
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Racine
Title: Household budget-share distributions and welfare implications: an application of multivariate distributional statistics
Abstract:
In this paper the consequences of considering the household 'food share'
distribution as a welfare measure, in isolation from the joint
distribution of itemized budget shares, are examined through the
unconditional and conditional distributions of 'food share', both
parametrically and nonparametrically. The parametric framework uses
Dirichlet and Beta distributions, while the nonparametric framework uses
kernel smoothing methods. The analysis, in a three commodity setup
('food', 'durables', 'others'), based on household level rural data for
West Bengal, India, for the year 2009-2010 shows significant
underrepresentation of households by the conventional unconditional 'food
share' distribution in the higher range of food budget shares that
correspond to the lower end of the income profile. This may have serious
consequences for welfare measurement.
Journal: Journal of Applied Statistics
Pages: 2754-2768
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1049132
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2754-2768
Template-Type: ReDIF-Article 1.0
Author-Name: Markus Neuhäuser
Author-X-Name-First: Markus
Author-X-Name-Last: Neuhäuser
Title: Combining the t test and Wilcoxon's rank-sum test
Abstract:
In the two-sample location-shift problem, Student's t
test and Wilcoxon's rank-sum test are commonly applied. The latter test can
be more powerful for non-normal data. Here, we propose to combine the two
tests within a maximum test. We show that the constructed maximum test
controls the type I error rate and has good power characteristics for a
variety of distributions; its power is close to that of the more powerful
of the two tests. Thus, irrespective of the distribution, the maximum test
stabilizes the power. To carry out the maximum test is a more powerful
strategy than selecting one of the single tests. The proposed test is
applied to data of a clinical trial.
Journal: Journal of Applied Statistics
Pages: 2769-2775
Issue: 12
Volume: 42
Year: 2015
Month: 12
X-DOI: 10.1080/02664763.2015.1070809
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070809
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:42:y:2015:i:12:p:2769-2775
Template-Type: ReDIF-Article 1.0
Author-Name: Massimo Attanasio
Author-X-Name-First: Massimo
Author-X-Name-Last: Attanasio
Author-Name: Vincenza Capursi
Author-X-Name-First: Vincenza
Author-X-Name-Last: Capursi
Title: Statistics in Education
Journal: Journal of Applied Statistics
Pages: 1-2
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1104890
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1104890
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:1-2
Template-Type: ReDIF-Article 1.0
Author-Name: Giada Adelfio
Author-X-Name-First: Giada
Author-X-Name-Last: Adelfio
Author-Name: Giovanni Boscaino
Author-X-Name-First: Giovanni
Author-X-Name-Last: Boscaino
Title: Degree course change and student performance: a mixed-effect model approach
Abstract:
This paper focuses on students' credit-earning speed over time and its
determinants, addressing the large percentage of students who do not earn
the degree within the legal duration in the Italian University System. A
new indicator for the performance of the student career is proposed, using
real data concerning a cohort of students enrolled at a Faculty of the
University of Palermo (followed for 7 years). The new indicator exhibits
a typical zero-inflated distribution and suggests investigating the
effect of the degree course (DC) change on the student career. A
mixed-effect model for overdispersed data is considered, with the aim of
also taking into account individual variability, given the longitudinal
nature of the data. Results show a significant positive effect
of the DC change on student performance.
Journal: Journal of Applied Statistics
Pages: 3-15
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1018673
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1018673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:3-15
Template-Type: ReDIF-Article 1.0
Author-Name: F. Crippa
Author-X-Name-First: F.
Author-X-Name-Last: Crippa
Author-Name: M. Mazzoleni
Author-X-Name-First: M.
Author-X-Name-Last: Mazzoleni
Author-Name: M. Zenga
Author-X-Name-First: M.
Author-X-Name-Last: Zenga
Title: Departures from the formal of actual students' university careers: an application of non-homogeneous fuzzy Markov chains
Abstract:
As in most higher education (HE) systems, the Italian university
organisation draws paths of credit progression in formal curricula, which
aim at framing the acquisition of knowledge and competencies within each
specific major. The resulting yearly syllabi therefore develop in a
sequence of examinations that are to be successfully passed, and formal
administrative registration allows access to the following academic year.
In general, there is a divergence between formal and actual career
progression because each university student can proceed at her/his own
pace, sketching her/his own trajectories, free to depart from the formal
progression. Although applied to various HE settings, Markov chain models
do not fit the aforementioned situation. A methodological extension has
been introduced, whereby progression levels are considered as fuzzy
states. Markov chains with fuzzy states identify the latter with specified
academic years and express each student's situation as a relational link
to present and past academic attainments. This link is operationalised by
means of a membership function, which is here discussed with reference to
the Italian HE system.
Journal: Journal of Applied Statistics
Pages: 16-30
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1091446
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1091446
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:16-30
Template-Type: ReDIF-Article 1.0
Author-Name: Valentina Raponi
Author-X-Name-First: Valentina
Author-X-Name-Last: Raponi
Author-Name: Francesca Martella
Author-X-Name-First: Francesca
Author-X-Name-Last: Martella
Author-Name: Antonello Maruotti
Author-X-Name-First: Antonello
Author-X-Name-Last: Maruotti
Title: A biclustering approach to university performances: an Italian case study
Abstract:
University evaluation is a topic of increasing concern in Italy as well as
in other countries. In empirical analysis, university activities and
performances are often measured by means of indicator variables. The
available information is then summarized to respond to different aims. We
argue that the evaluation process is a complex phenomenon that cannot be
addressed by a simple descriptive approach. In this paper, we used a
model-based approach to account for association between indicators and
similarities among the observed universities. We examine faculty-level
data collected from different sources, covering 55 Italian Economics
faculties in the academic year 2009/2010. Making use of a clustering
methodology, we introduce a biclustering model that accounts for both
homogeneity/heterogeneity among faculties and correlations between
indicators. Our results show two substantially different performance
profiles among universities, which can be strictly related to the
nature of the institutions, namely the Private and
Public profiles. Each of the two groups has its own
peculiar features and its own group-specific list of priorities, strengths
and weaknesses. Thus, we suggest that caution should be used in
interpreting standard university rankings as they generally do not account
for the complex structure of the data.
Journal: Journal of Applied Statistics
Pages: 31-45
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1009005
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1009005
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:31-45
Template-Type: ReDIF-Article 1.0
Author-Name: Marco Enea
Author-X-Name-First: Marco
Author-X-Name-Last: Enea
Author-Name: Massimo Attanasio
Author-X-Name-First: Massimo
Author-X-Name-Last: Attanasio
Title: An association model for bivariate data with application to the analysis of university students' success
Abstract:
The academic success of students is a priority for all universities. We
analyze the students' success at university by considering their
performance in terms of both ‘qualitative performance’,
measured by their mean grade, and ‘quantitative
performance’, measured by university credits accumulated. These
data come from an Italian University and concern a cohort of students
enrolled at the Faculty of Economics. To jointly model both the marginal
relationships and the association structure with covariates, we fit a
bivariate ordered logistic model by penalized maximum likelihood
estimation. The penalty term we use allows us to smooth the association
structure and enlarge the range of possible parameterizations beyond that
provided by the usual Dale model. The advantages of our approach are also
in terms of parsimony and parameter interpretation, while preserving the
goodness of fit.
Journal: Journal of Applied Statistics
Pages: 46-57
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2014.998407
File-URL: http://hdl.handle.net/10.1080/02664763.2014.998407
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:46-57
Template-Type: ReDIF-Article 1.0
Author-Name: Hakim-Moulay Dehbi
Author-X-Name-First: Hakim-Moulay
Author-X-Name-Last: Dehbi
Author-Name: Mario Cortina-Borja
Author-X-Name-First: Mario
Author-X-Name-Last: Cortina-Borja
Author-Name: Marco Geraci
Author-X-Name-First: Marco
Author-X-Name-Last: Geraci
Title: Aranda-Ordaz quantile regression for student performance assessment
Abstract:
In education research, normal regression models may not be appropriate due
to the presence of bounded variables, which may exhibit a large variety of
distributional shapes and present floor and ceiling effects. In this
article a class of quantile regression models for bounded response
variables is developed. The one-parameter Aranda-Ordaz symmetric and
asymmetric families of transformations are applied to address modelling
issues that arise when estimating conditional quantiles of a bounded
response variable whose relationship with the covariates is possibly
nonlinear. This approach exploits the equivariance property of quantiles
and aims at achieving linearity of the predictor. This offers a flexible
model-based alternative to nonparametric estimation of the quantile
function. Since the transformation is quantile-specific, the modelling
takes into account the local features of the conditional distribution of
the response variable. Our study is motivated by the analysis of reading
performance in seven-year-old children who are part of the Millennium
Cohort Study.
Journal: Journal of Applied Statistics
Pages: 58-71
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1025724
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1025724
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:58-71
Template-Type: ReDIF-Article 1.0
Author-Name: Rafael Pimentel Maia
Author-X-Name-First: Rafael Pimentel
Author-X-Name-Last: Maia
Author-Name: Hildete Prisco Pinheiro
Author-X-Name-First: Hildete Prisco
Author-X-Name-Last: Pinheiro
Author-Name: Aluísio Pinheiro
Author-X-Name-First: Aluísio
Author-X-Name-Last: Pinheiro
Title: Academic performance of students from entrance to graduation via quasi U-statistics: a study at a Brazilian research university
Abstract:
We present novel methodology to assess undergraduate students'
performance. Emphasis is given to potential dissimilar behaviors due to
high school background and gender. The proposed method is based on
measures of diversity and on the decomposability of quasi
U-statistics to define average distances between and
within groups. One advantage of the new method over the classical analysis
of variance is its robustness to distributional deviation from the
normality. Moreover, compared with other nonparametric methods, it also
includes tests for interaction effects which are not rank transform
procedures. The variance of the test statistic is estimated by jackknife
and p-values are computed using its asymptotic
distribution. A college education performance data set is analyzed. The
data set comprises students who entered the University of Campinas,
Brazil, between 1997 and 2000. Their academic performance was
recorded until graduation or drop-out. The classical ANOVA points to
significant effects of gender, type of high school and working status.
However, the residual analysis indicates a highly significant deviation
from normality. The quasi U-statistics nonparametric
tests proposed here show a significant interaction effect between type
of high school and gender but no significant effect of
working status. The proposed nonparametric method also results in smaller
error variances, illustrating its robustness against model
misspecification.
Journal: Journal of Applied Statistics
Pages: 72-86
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1077939
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:72-86
Template-Type: ReDIF-Article 1.0
Author-Name: Dewi Juliah Ratnaningsih
Author-X-Name-First: Dewi Juliah
Author-X-Name-Last: Ratnaningsih
Author-Name: Imas Sukaesih Sitanggang
Author-X-Name-First: Imas Sukaesih
Author-X-Name-Last: Sitanggang
Title: Comparative analysis of classification methods in determining non-active student characteristics in Indonesia Open University
Abstract:
Classification is a data mining technique that aims to discover a model
from training data that distinguishes records into appropriate classes.
Classification methods can be applied in education, to classify non-active
students in higher education programs based on their characteristics. This
paper presents a comparison of three classification methods: Naïve
Bayes, Bagging, and C4.5. The criteria used to evaluate performance of
three classifiers are stratified cross-validation, confusion matrix, ROC
curve, recall, precision, and F-measure. The data used for this paper are
non-active students in Indonesia Open University (IOU) for the period of
2004--2012. The non-active students were divided into three groups:
non-active students in the first three years, non-active students in the
first five years, and non-active students over five years. Results of the study
show that the Bagging method provided a higher accuracy than Naïve
Bayes and C4.5. The accuracy of bagging classification is 82.99%, while
the Naïve Bayes and C4.5 are 80.04% and 82.74%, respectively. The
classification tree resulted from the Bagging method has a large number of
nodes, so it is quite difficult to use in decision-making. For that reason,
the C4.5 tree is used to classify non-active students in IOU based on their
characteristics.
Journal: Journal of Applied Statistics
Pages: 87-97
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1077940
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077940
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:87-97
Template-Type: ReDIF-Article 1.0
Author-Name: Dalit Contini
Author-X-Name-First: Dalit
Author-X-Name-Last: Contini
Author-Name: Davide Azzolini
Author-X-Name-First: Davide
Author-X-Name-Last: Azzolini
Title: Performance and decisions: immigrant--native gaps in educational transitions in Italy
Abstract:
Following the seminal work of Boudon [5], sociological research has
conceptualized immigrant--native gaps in educational transitions as
deriving from children of immigrants' poorer academic performance (primary
effects) and from different decision models existing between native and
immigrant families (secondary effects). The limited evidence on
immigrant--native gaps in Europe indicates that secondary effects are
generally positive: children of immigrants tend to make more ambitious
educational choices than natives with the same prior performance. In this
paper we review the different decomposition methods employed so far in the
literature to tackle similar research questions, and extend the existing
methodology to allow the inclusion of interaction effects and the control
of explanatory variables. We apply this method to data from a unique
Italian administrative data set. We find that children of immigrants
exhibit a higher likelihood of opting for vocational training over more
generalist and academic programs, even when controlling for socio-economic
background. A large share of the immigrant--native differentials in the
probability of attending the different school programs is explained by the
different prior performance distributions. However, decision models differ
between groups, and, contrary to the evidence on other countries, these
differences contribute to widening the existing gaps. If children of
immigrants had the same social background and prior performance as their
native peers, they would still be more likely to enroll in shorter and
less-demanding school programs. Interestingly, these results hold true
only for boys, while we find no evidence of decision effects for girls.
Journal: Journal of Applied Statistics
Pages: 98-114
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1036845
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1036845
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:98-114
Template-Type: ReDIF-Article 1.0
Author-Name: Maria Prosperina Vitale
Author-X-Name-First: Maria Prosperina
Author-X-Name-Last: Vitale
Author-Name: Giovanni C. Porzio
Author-X-Name-First: Giovanni C.
Author-X-Name-Last: Porzio
Author-Name: Patrick Doreian
Author-X-Name-First: Patrick
Author-X-Name-Last: Doreian
Title: Examining the effect of social influence on student performance through network autocorrelation models
Abstract:
The paper investigates the link between student relations and their
performances at university. A social influence mechanism is hypothesized
as individuals adjusting their own behaviors to those of others with whom
they are connected. This contribution explores the effect of peers on a
real network formed by a cohort of students enrolled at a graduate level
in an Italian University. Specifically, by adopting a network effects
model, the relation between interpersonal networks and university
performance is evaluated assuming that student performance is related to
the performance of the other students belonging to the same group. By
controlling for individual covariates, the network results show informal
contacts, based on mutual interests and goals, are related to performance,
while formal groups formed temporarily by the instructor have no such
effect.
Journal: Journal of Applied Statistics
Pages: 115-127
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1049517
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:115-127
Template-Type: ReDIF-Article 1.0
Author-Name: Simon Monyai
Author-X-Name-First: Simon
Author-X-Name-Last: Monyai
Author-Name: 'Maseka Lesaoana
Author-X-Name-First: 'Maseka
Author-X-Name-Last: Lesaoana
Author-Name: Timotheus Darikwa
Author-X-Name-First: Timotheus
Author-X-Name-Last: Darikwa
Author-Name: Philimon Nyamugure
Author-X-Name-First: Philimon
Author-X-Name-Last: Nyamugure
Title: Application of multinomial logistic regression to educational factors of the 2009 General Household Survey in South Africa
Abstract:
This paper combines factor analysis and multinomial logistic regression
(MLR) in understanding the relationship between extracted factors of
quality of life pertaining to education and variables of five key areas of
the levels of development in the context of the South African 2009 General
Household Survey. MLR was used to analyse the identified educational
factors from factor analysis. It was also used to determine the extent to
which these factors impact on educational level outcomes across South
Africa. The overall classification accuracy rate was 73.0%, which is
greater than the proportion-by-chance accuracy criterion of 57.0%. This
means that the model improves on the proportion-by-chance accuracy rate by
25.0% or more, so the criterion for classification accuracy is
satisfied and the model is adequate. The evidence is that being historically
disadvantaged, absence of parental care, violence in schools and the
perception that fees were too high generally have a negative influence on
educational attainment. The results of this paper compare well with other
household surveys conducted by other researchers.
Journal: Journal of Applied Statistics
Pages: 128-139
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1077941
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077941
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:128-139
Template-Type: ReDIF-Article 1.0
Author-Name: Rosalia Castellano
Author-X-Name-First: Rosalia
Author-X-Name-Last: Castellano
Author-Name: Gennaro Punzo
Author-X-Name-First: Gennaro
Author-X-Name-Last: Punzo
Title: Patterns of earnings differentials across three conservative European welfare regimes with alternative education systems
Abstract:
The aim of this paper is to investigate, from a generational perspective,
the effect of human capital on individual earnings and earnings
differences in Germany, France and Italy, three developed countries in
Western Europe with similar conservative welfare regimes
but with important differences in their education systems. Income
inequalities between and within
education levels are explored using a two-stage probit model with quantile
regressions in the second stage. More precisely, drawing upon 2005 EU-SILC
data, returns on schooling and experience are estimated separately for
employees and self-employed full-time workers by means of Mincerian
earnings equations with sample selection; the sample selection correction
accounts for the potential individual self-selection into the two labour
force types. Although some determinants appear to be relatively similar
across countries, state-specific differentials are drawn in light of the
institutional features of each national context. The study reveals how
each dimension of human capital differently affects individuals’
earnings and earnings inequality and, most of all, how their impacts
differ along the conditional earnings distribution and across countries.
In the comparative perspective, the country's leading position in terms of
the highest rewards on education also depends on which earnings
distribution (employee vs. self-employed) is analysed.
Journal: Journal of Applied Statistics
Pages: 140-168
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1049518
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049518
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:140-168
Template-Type: ReDIF-Article 1.0
Author-Name: Stefania Capecchi
Author-X-Name-First: Stefania
Author-X-Name-Last: Capecchi
Author-Name: Domenico Piccolo
Author-X-Name-First: Domenico
Author-X-Name-Last: Piccolo
Title: Investigating the determinants of job satisfaction of Italian graduates: a model-based approach
Abstract:
The paper explores personal, economic and
time-dependent covariates as determinants of the job satisfaction
expressed by graduate workers. After discussing the main results of the
literature, the work emphasizes a statistical modelling approach able to
effectively estimate and visualize those determinants and their
interactions with subjects' covariates. Interpretation and visualization
of graduates' profiles are shown on the basis of a survey conducted in
Italy; more specifically, the determinants of both satisfaction and
uncertainty of the respondents are explicitly discussed.
Journal: Journal of Applied Statistics
Pages: 169-179
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1036844
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1036844
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:169-179
Template-Type: ReDIF-Article 1.0
Author-Name: S. Fasola
Author-X-Name-First: S.
Author-X-Name-Last: Fasola
Author-Name: O. Giambalvo
Author-X-Name-First: O.
Author-X-Name-Last: Giambalvo
Author-Name: C. Romano
Author-X-Name-First: C.
Author-X-Name-Last: Romano
Title: Flexible latent trait aggregation to analyze employability after the Ph.D. in Italy
Abstract:
The analysis of satisfaction, employability and economic perspectives
after the Ph.D. in Italy has not received adequate attention in the past,
especially in terms of comparison among universities. To analyze these
aspects, in this paper we consider data from the survey ‘Statistica
in TEma di Laureati e LAvoro’ on doctors who achieved the title in
2007, 2008 and 2009 [CILEA, Laureati STELLA, indagine
occupazionale post-dottorato, dottori di ricerca 2007--2008,
Tech. Rep., CILEA, Segrate, 2010; CILEA, Laureati STELLA, indagine
occupazionale post-dottorato, dottori di ricerca 2008--2009,
Tech. Rep., CILEA, Segrate, 2011]. To deal with the complex,
multidimensional nature of the concept, we propose a flexible two-step
procedure for the construction of a composite indicator, and make a first
attempt to rank some Italian universities. In the first step, indicators
for single dimensions are derived from cumulative link models with
proportional odds. In the second step, aggregation through standard, ad
hoc methods is proposed.
Journal: Journal of Applied Statistics
Pages: 180-194
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1077797
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077797
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:180-194
Template-Type: ReDIF-Article 1.0
Author-Name: D. Fouskakis
Author-X-Name-First: D.
Author-X-Name-Last: Fouskakis
Author-Name: G. Petrakos
Author-X-Name-First: G.
Author-X-Name-Last: Petrakos
Author-Name: I. Vavouras
Author-X-Name-First: I.
Author-X-Name-Last: Vavouras
Title: A Bayesian hierarchical model for comparative evaluation of teaching quality indicators in higher education
Abstract:
The problem motivating the paper is the quantification of students'
preferences regarding teaching/coursework quality, under certain numerical
restrictions, in order to build a model for identifying, assessing and
monitoring the major components of the overall teaching quality. We
propose a Bayesian hierarchical beta regression model, with a Dirichlet
prior on the model coefficients. The coefficients of the model can then be
interpreted as weights and thus they measure the relative importance that
students give to the different attributes. This approach not only allows
for the incorporation of an informative prior when available but also
provides user-friendly interfaces and direct probability interpretations
for all quantities. Furthermore, it is a natural way to implement the
usual constraints for the model coefficients. This model is applied to
data collected in 2009 and 2013 from undergraduate students at
Panteion University, Athens, Greece, and besides the construction of an
instrument for the assessment and monitoring of teaching quality, it gave
some input for a preliminary discussion on the association of the
differences in students' preferences between the two time-periods with the
current Greek socioeconomic transformation. Results from the proposed
approach are compared with the ones obtained by two alternative
statistical techniques.
Journal: Journal of Applied Statistics
Pages: 195-211
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1054793
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1054793
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:195-211
Template-Type: ReDIF-Article 1.0
Author-Name: András Telcs
Author-X-Name-First: András
Author-X-Name-Last: Telcs
Author-Name: Zsolt Tibor Kosztyán
Author-X-Name-First: Zsolt Tibor
Author-X-Name-Last: Kosztyán
Author-Name: Ádám Török
Author-X-Name-First: Ádám
Author-X-Name-Last: Török
Title: Unbiased one-dimensional university ranking -- application-based preference ordering
Abstract:
Our main goal is to produce a ranking technique which overcomes
shortcomings of the numerous university rankings published. We propose a
ranking method that provides a one-dimensional preference list of
universities which is solely based on the partial rankings of applicants.
Our ranking is free of subjective weights and incomparable dimensions.
Journal: Journal of Applied Statistics
Pages: 212-228
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2014.998180
File-URL: http://hdl.handle.net/10.1080/02664763.2014.998180
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:212-228
Template-Type: ReDIF-Article 1.0
Author-Name: J. Groß
Author-X-Name-First: J.
Author-X-Name-Last: Groß
Author-Name: A. Robitzsch
Author-X-Name-First: A.
Author-X-Name-Last: Robitzsch
Author-Name: A.C. George
Author-X-Name-First: A.C.
Author-X-Name-Last: George
Title: Cognitive diagnosis models for baseline testing of educational standards in math
Abstract:
Cognitive diagnosis models have received growing attention in the recent
psychometric literature in view of their potential for fine-grained
analysis of examinees’ latent skills. Although different types and
aspects of these models have been investigated in some detail, applications
to real-life data have so far been sparse. This paper aims at addressing
different topics with respect to model building from a practitioner's
perspective. The objective is to draw conclusions about examinees’
performance on the Austrian baseline testing of educational standards in
math 2009. Although there is a variety of models at hand, the focus is set
on the easy-to-interpret deterministic input, noisy ‘and’
gate model. A possible course of action with respect to model fit is
outlined in detail and some conclusions with respect to test results are
discussed.
Journal: Journal of Applied Statistics
Pages: 229-243
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2014.1000841
File-URL: http://hdl.handle.net/10.1080/02664763.2014.1000841
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:229-243
Template-Type: ReDIF-Article 1.0
Author-Name: Oyelola A. Adegboye
Author-X-Name-First: Oyelola A.
Author-X-Name-Last: Adegboye
Author-Name: Asadullah Jawid
Author-X-Name-First: Asadullah
Author-X-Name-Last: Jawid
Title: Multivariate multilevel models for attitudes toward statistics: multi-disciplinary settings in Afghanistan
Abstract:
The present paper focuses on examining students' attitudes and perception
of statistics in Afghanistan universities and the factor structure of the
statistical anxiety rating scale (STARS). In total, 209 undergraduate
students from different disciplines in different universities in
Afghanistan participated in the study. In addition to testing the factor
structure of the STARS, a multivariate multilevel analysis that
incorporates the correlation in the data was carried out on the aggregated
subscales of the STARS scores. Results showed that the original 6-factor
structure did not fit the Afghanistan data well. Exploratory factor
analysis identified a 5-factor structure as best fitting the data, which was
confirmed by the fit indices as well as a likelihood ratio test. Male
students showed more positive attitudes toward statistics and a higher
level of statistics anxiety than their female counterparts. Female
students experienced higher levels of fear of asking for help and less
anxiety in computation. Students who had taken at least a previous
statistics course had lower statistics anxiety than those taking the
course for the first time.
Journal: Journal of Applied Statistics
Pages: 244-261
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1091445
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1091445
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:244-261
Template-Type: ReDIF-Article 1.0
Author-Name: Domenico De Stefano
Author-X-Name-First: Domenico
Author-X-Name-Last: De Stefano
Author-Name: Susanna Zaccarin
Author-X-Name-First: Susanna
Author-X-Name-Last: Zaccarin
Title: Co-authorship networks and scientific performance: an empirical analysis using the generalized extreme value distribution
Abstract:
This paper aims to explore the effects of collaborative behaviour on
scholars’ scientific performance. Individual network measures related to
scholars’ centrality, as well as their attitude to collaborate with
others, are derived from co-authorship networks in a given scientific
community (i.e. Italian academic statisticians). Co-authorship information
has been collected from three data sources of national-based,
discipline-based, and international-based high-impact publications. Both
network and individual
covariates are used to model individual h-index by
generalized extreme value distribution. Results show a positive
association between performance and actors' central position in the
network. Having a large number of co-authors and occupying central
positions are likely to positively affect scientific performance.
Journal: Journal of Applied Statistics
Pages: 262-279
Issue: 1
Volume: 43
Year: 2016
Month: 1
X-DOI: 10.1080/02664763.2015.1017719
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1017719
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:1:p:262-279
Template-Type: ReDIF-Article 1.0
Author-Name: Mohammadreza Meshkani
Author-X-Name-First: Mohammadreza
Author-X-Name-Last: Meshkani
Author-Name: Afshin Fallah
Author-X-Name-First: Afshin
Author-X-Name-Last: Fallah
Author-Name: Amir Kavousi
Author-X-Name-First: Amir
Author-X-Name-Last: Kavousi
Title: Bayesian analysis of covariance under inverse Gaussian model
Abstract:
This paper considers the problem of analysis of covariance (ANCOVA) under
the assumption of inverse Gaussian distribution for response variable from
the Bayesian point of view. We develop a fully Bayesian model for ANCOVA
based on the conjugate prior distributions for parameters contained in the
model. The Bayes estimator of parameters, ANCOVA model and adjusted
effects for both treatments and covariates along with predictive
distribution of future observations are developed. We also provide the
essentials for comparing adjusted treatments effects and adjusted factor
effects. A simulation study and a real world application are also
performed to illustrate and evaluate the proposed Bayesian model.
Journal: Journal of Applied Statistics
Pages: 280-298
Issue: 2
Volume: 43
Year: 2016
Month: 2
X-DOI: 10.1080/02664763.2015.1049131
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:2:p:280-298
Template-Type: ReDIF-Article 1.0
Author-Name: Sonia Ferreira Lopes Toffoli
Author-X-Name-First: Sonia Ferreira Lopes
Author-X-Name-Last: Toffoli
Author-Name: Dalton Francisco de Andrade
Author-X-Name-First: Dalton Francisco
Author-X-Name-Last: de Andrade
Author-Name: Antonio Cezar Bornia
Author-X-Name-First: Antonio Cezar
Author-X-Name-Last: Bornia
Title: Evaluation of open items using the many-facet Rasch model
Abstract:
The goal of this study is to analyze the quality of ratings assigned to
two constructed response questions for evaluating the written ability of
essays in Portuguese language from the perspective of the many-facet Rasch
(MFR [15]) model. The analyzed data set comes from 350 written tests with
two open-item tasks that were developed based on a rating process
independently marked by two rater coordinators and a group of 42 raters.
The MFR model analysis shows the measurement quality related to the
examinees, raters, tasks and items, and classification scale that has been
used for the task rating process. The findings indicate significant
differences amongst the rater severities and show that the raters cannot
be interchanged. The results also suggest that the comparison between the
two task difficulties needs further investigation. An additional study has
been done on the scale structure of the classification used by each rater
for each item. The result suggests that there have been some similarities
amongst the tasks and a need of revision for some criteria of the rating
process. Overall, the scale of evaluation has shown to be efficient for a
classification of the examinees.
Journal: Journal of Applied Statistics
Pages: 299-316
Issue: 2
Volume: 43
Year: 2016
Month: 2
X-DOI: 10.1080/02664763.2015.1049938
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049938
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:2:p:299-316
Template-Type: ReDIF-Article 1.0
Author-Name: H.D. Vinod
Author-X-Name-First: H.D.
Author-X-Name-Last: Vinod
Title: New bootstrap inference for spurious regression problems
Abstract:
Phillips [11] provides asymptotic theory for regressions that relate
nonstationary time series, including those integrated of order 1. A practical
implication of the literature on spurious regression is that one cannot
trust the usual confidence intervals (CIs). In the absence of prior
knowledge that two series are cointegrated, it is therefore recommended
that we abandon the specification in levels and work with differenced or
detrended series. For situations when the specification in levels is
sacrosanct we propose new CIs based on the Maximum Entropy bootstrap
explained in Vinod and López-de-Lacalle (Maximum entropy
bootstrap for time series: The meboot R package, J. Statist.
Softw. 29 (2009), pp. 1--19). An extensive Monte Carlo simulation shows
that our proposal can provide more reliable conservative CIs than
traditional and block bootstrap intervals.
Journal: Journal of Applied Statistics
Pages: 317-335
Issue: 2
Volume: 43
Year: 2016
Month: 2
X-DOI: 10.1080/02664763.2015.1049939
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1049939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:2:p:317-335
Template-Type: ReDIF-Article 1.0
Author-Name: Chandan Saha
Author-X-Name-First: Chandan
Author-X-Name-Last: Saha
Author-Name: Michael P. Jones
Author-X-Name-First: Michael P.
Author-X-Name-Last: Jones
Title: Type I and Type II error rates in the last observation carried forward method under informative dropout
Abstract:
Dropout is a persistent problem for longitudinal studies. We exhibit the
shortcomings of the last observation carried forward method. It produces
biased estimates of change in an outcome from baseline to study endpoint
under informative dropout. We developed a theoretical quantification of
the effect of such bias on type I and type II error rates. We present
results for a setup where a subject either completes the study or drops
out during one particular interval, and also under the setup in which
subjects could drop out at any time during the study. The type I error
rate steadily increases when time to dropout decreases or the common
sample size increases. The inflation in the type I error rate can be
substantial when reasons for dropout in the two groups differ, when
there is a large difference in dropout rates between the control and
treatment groups, and when the common sample size is large, even when
dropout subjects have only one or two fewer observations than the completers.
Similar results are also observed for type II error rates. A study can
have very low power when early recovered patients in the treatment group
and worsening patients in the control group drop out even near the end of
the study.
Journal: Journal of Applied Statistics
Pages: 336-350
Issue: 2
Volume: 43
Year: 2016
Month: 2
X-DOI: 10.1080/02664763.2015.1063112
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1063112
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:2:p:336-350
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Ambach
Author-X-Name-First: Daniel
Author-X-Name-Last: Ambach
Title: Short-term wind speed forecasting in Germany
Abstract:
The expansion of renewable power production is a stated goal of the
energy turnaround. Improvements in short-term wind speed forecasting
might increase the profitability of wind power. This article
compares two novel approaches to model and predict wind speed. Both
approaches incorporate periodic interactions, whereas the first model uses
Fourier series to model the periodicity. The second model takes
generalised
trigonometric functions into consideration. The aforementioned Fourier
series are special cases of the p-generalised
trigonometric function, and therefore model 1 is nested in model 2. The
two models use an autoregressive fractionally integrated moving
average--asymmetric power generalised autoregressive conditional
heteroscedasticity process to cover the autocorrelation and the
heteroscedasticity. A data set consisting of 10-min data
collected at four stations at the German--Polish border from August 2007
to December 2012 is analysed. The most important finding is an enhancement
of the forecasting accuracy up to three hours that is directly related to
our new short-term forecasting model.
Journal: Journal of Applied Statistics
Pages: 351-369
Issue: 2
Volume: 43
Year: 2016
Month: 2
X-DOI: 10.1080/02664763.2015.1063113
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1063113
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:2:p:351-369
Template-Type: ReDIF-Article 1.0
Author-Name: Aslan Deniz Karaoglan
Author-X-Name-First: Aslan Deniz
Author-X-Name-Last: Karaoglan
Author-Name: Nihat Celik
Author-X-Name-First: Nihat
Author-X-Name-Last: Celik
Title: A new painting process for vessel radiators of transformer: wet-on-wet
Abstract:
The painting process of corrugated wall radiators of a distribution
transformer is performed by a flow-down painting technique in the
industrial field. This study has been prepared in accordance with ISO
12944-5. Correspondingly, this work uses epoxy 2-pack paints
(4.3.4.2) to meet the minimum requirements for the C3 atmospheric
corrosivity category (5.1.1). This standard requires the vertical surface
of the transformer vessel to be painted with epoxy paints that contain
anti-corrosive pigments with a minimum of 100 µm dry film
thickness. In the present study, a new production methodology called
wet-on-wet (WOW) painting is developed which has never been used in
industry. In addition, a modified response surface methodology (RSM) is
proposed for designing, modeling, and optimizing the proposed process
under unsteady environmental effects. The results indicate that the WOW
painting can be applied to real industrial systems successfully by the aid
of the proposed new RSM algorithm and provide remarkable time and cost
savings.
Journal: Journal of Applied Statistics
Pages: 370-386
Issue: 2
Volume: 43
Year: 2016
Month: 2
X-DOI: 10.1080/02664763.2015.1063114
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1063114
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:2:p:370-386
Template-Type: ReDIF-Article 1.0
Author-Name: Rose D. Baker
Author-X-Name-First: Rose D.
Author-X-Name-Last: Baker
Author-Name: Ian G. McHale
Author-X-Name-First: Ian G.
Author-X-Name-Last: McHale
Title: An empirical Bayes' procedure for ranking players in Ryder Cup golf
Abstract:
We describe a model to obtain strengths and rankings of players appearing
in golf's Ryder Cup. Obtaining rankings is complicated for two
reasons. First, competitors do not compete on an equal number of
occasions, with some competitors appearing too infrequently for their
ranking to be estimated with any degree of certainty, and second,
different competitors experience different levels of volatility in
results. Our approach is to assume the competitor strengths are drawn from
some common distribution. For small numbers of competitors, as is the case
here, we fit the model using Monte-Carlo integration. Results suggest
there is very little difference between the top performing players, though
Scotland's Colin Montgomerie is estimated as the strongest Ryder Cup
player.
Journal: Journal of Applied Statistics
Pages: 387-395
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1043869
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1043869
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:387-395
Template-Type: ReDIF-Article 1.0
Author-Name: S.T. Boris Choy
Author-X-Name-First: S.T. Boris
Author-X-Name-Last: Choy
Author-Name: Jennifer S.K. Chan
Author-X-Name-First: Jennifer S.K.
Author-X-Name-Last: Chan
Author-Name: Udi E. Makov
Author-X-Name-First: Udi E.
Author-X-Name-Last: Makov
Title: Robust Bayesian analysis of loss reserving data using scale mixtures distributions
Abstract:
It is vital for insurance companies to have appropriate levels of loss
reserving to pay outstanding claims and related settlement costs. With
many uncertainties and time lags inherently involved in the claims
settlement process, loss reserving therefore must be based on estimates.
Existing models and methods cannot cope with irregular and extreme claims
and hence do not offer an accurate prediction of loss reserving. This
paper extends the conventional normal error distribution in loss reserving
modeling to a range of heavy-tailed distributions which are expressed by
certain scale mixtures forms. This extension enables robust analysis and,
in addition, allows an efficient implementation of Bayesian analysis via
Markov chain Monte Carlo simulations. Various models for the mean of the
sampling distributions, including the log-Analysis of Variance (ANOVA),
log-Analysis of Covariance (ANCOVA) and state space models, are considered
and the straightforward implementation of scale mixtures distributions is
demonstrated using OpenBUGS.
Journal: Journal of Applied Statistics
Pages: 396-411
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1063115
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1063115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:396-411
Template-Type: ReDIF-Article 1.0
Author-Name: Hadi Alizadeh Noughabi
Author-X-Name-First: Hadi
Author-X-Name-Last: Alizadeh Noughabi
Author-Name: Narayanaswamy Balakrishnan
Author-X-Name-First: Narayanaswamy
Author-X-Name-Last: Balakrishnan
Title: Tests of goodness of fit based on Phi-divergence
Abstract:
In this paper, we introduce a general goodness of fit test based on
Phi-divergence. Consistency of the proposed test is established. We then
study some special cases of tests for normal, exponential, uniform and
Laplace distributions. Through Monte Carlo simulations, the power values
of the proposed tests are compared with some known competing tests under
various alternatives. Finally, some numerical examples are presented to
illustrate the proposed procedure.
Journal: Journal of Applied Statistics
Pages: 412-429
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1063116
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1063116
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:412-429
Template-Type: ReDIF-Article 1.0
Author-Name: C. Armero
Author-X-Name-First: C.
Author-X-Name-Last: Armero
Author-Name: A. Forte
Author-X-Name-First: A.
Author-X-Name-Last: Forte
Author-Name: H. Perpiñán
Author-X-Name-First: H.
Author-X-Name-Last: Perpiñán
Title: Bayesian longitudinal models for paediatric kidney transplant recipients
Abstract:
Chronic kidney disease is a progressive loss of renal function which
results in the inability of the kidneys to properly filter waste from the
blood. Renal function is usually estimated by the glomerular filtration
rate (eGFR), which decreases with the worsening of the disease. Bayesian
longitudinal models with covariates, random effects, serial correlation
and measurement error are discussed to analyse the progression of eGFR in
first transplanted children taken from a study in València, Spain.
Journal: Journal of Applied Statistics
Pages: 430-440
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1063117
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1063117
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:430-440
Template-Type: ReDIF-Article 1.0
Author-Name: Qing Li
Author-X-Name-First: Qing
Author-X-Name-Last: Li
Title: Indirect membership function assignment based on ordinal regression
Abstract:
In many fuzzy sets applications, fuzzy membership functions are commonly
developed based on empirical or expert knowledge. The equation of a
membership function is usually determined somewhat arbitrarily. This paper
explores a novel membership function design method based on ordinal
regression analysis. The estimated thresholds between ordinal measurement
categories are applied to calculate the intersection points between fuzzy
sets. These intersection points are further applied to determine the
equations of the membership functions. Information distortion due to
empirical guesswork can thus be reduced and more latent information in the
fuzzy responses can therefore be captured. A case study investigating the
relationship between foster mothers’ satisfaction and the foster
time and information provided has been conducted in this research. The
applicability and effectiveness of the proposed membership function
assignment approach have been demonstrated through several case studies.
Journal: Journal of Applied Statistics
Pages: 441-460
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070802
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070802
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:441-460
Template-Type: ReDIF-Article 1.0
Author-Name: Chin-Shang Li
Author-X-Name-First: Chin-Shang
Author-X-Name-Last: Li
Title: A test for the linearity of the nonparametric part of a semiparametric logistic regression model
Abstract:
A semiparametric logistic regression model is proposed in which its
nonparametric component is approximated with fixed-knot cubic
B-splines. To assess the linearity of the nonparametric
component, we construct a penalized likelihood ratio test statistic. When
the number of knots is fixed, the null distribution of the test statistic
is shown to be asymptotically the distribution of a linear combination of
independent chi-squared random variables, each with one degree of freedom.
We set the asymptotic null expectation of this test statistic equal to a
value to determine the smoothing parameter value. Monte Carlo experiments
are conducted to investigate the performance of the proposed test. Its
practical use is illustrated with a real-life example.
Journal: Journal of Applied Statistics
Pages: 461-475
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070803
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070803
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:461-475
Template-Type: ReDIF-Article 1.0
Author-Name: Hadi Emami
Author-X-Name-First: Hadi
Author-X-Name-Last: Emami
Author-Name: Mostafa Emami
Author-X-Name-First: Mostafa
Author-X-Name-Last: Emami
Title: New influence diagnostics in ridge regression
Abstract:
We occasionally find that a small subset of the data exerts a
disproportionate influence on the fitted regression model. We would like
to locate these influential points and assess their impact on the model.
However, the existence of influential data is complicated by the presence
of collinearity (see, e.g. [15]). In this article we develop a new
influence statistic for one or a set of observations in linear regression
dealing with collinearity. We show that this statistic has asymptotically
normal distribution and is able to detect a subset of high ridge leverage
outliers. Using this influence statistic we also show that when ridge
regression is used to mitigate the effects of collinearity, the influence
of some observations can be drastically modified. As an illustrative
example, simulation studies and a real data set are analysed.
Journal: Journal of Applied Statistics
Pages: 476-489
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070804
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070804
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:476-489
Template-Type: ReDIF-Article 1.0
Author-Name: John Tyssedal
Author-X-Name-First: John
Author-X-Name-Last: Tyssedal
Author-Name: Shahrukh Hussain
Author-X-Name-First: Shahrukh
Author-X-Name-Last: Hussain
Title: Factor screening in nonregular two-level designs based on projection-based variable selection
Abstract:
In this paper, we focus on the problem of factor screening in nonregular
two-level designs through gradually reducing the number of possible sets
of active factors. We are particularly concerned with situations when
three or four factors are active. Our proposed method works through
examining fits of projection models, where variable selection techniques
are used to reduce the number of terms. To examine the reliability of the
methods in combination with such techniques, a panel of models consisting
of three or four active factors with data generated from the 12-run and
the 20-run Plackett--Burman (PB) design is used. The dependence of the
procedure on the amount of noise, the number of active factors and the
number of experimental factors is also investigated. For designs with few
runs such as the 12-run PB design, variable selection should be done with
care, and default procedures in computer software may not be reliable, for
which we suggest improvements. A real example is included to show how the
proposed factor screening can be done in practice.
Journal: Journal of Applied Statistics
Pages: 490-508
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070805
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070805
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:490-508
Template-Type: ReDIF-Article 1.0
Author-Name: A.A.M. Nurunnabi
Author-X-Name-First: A.A.M.
Author-X-Name-Last: Nurunnabi
Author-Name: M. Nasser
Author-X-Name-First: M.
Author-X-Name-Last: Nasser
Author-Name: A.H.M.R. Imon
Author-X-Name-First: A.H.M.R.
Author-X-Name-Last: Imon
Title: Identification and classification of multiple outliers, high leverage points and influential observations in linear regression
Abstract:
Detection of multiple unusual observations such as outliers, high leverage
points and influential observations (IOs) in regression is still a
challenging task for statisticians due to the well-known masking and
swamping effects. In this paper we introduce a robust influence distance
that can identify multiple IOs, and propose a sixfold plotting technique
based on the well-known group deletion approach to classify regular
observations, outliers, high leverage points and IOs simultaneously in
linear regression. Experiments through several well-referred data sets and
simulation studies demonstrate that the proposed algorithm performs
successfully in the presence of multiple unusual observations and can
avoid masking and/or swamping effects.
Journal: Journal of Applied Statistics
Pages: 509-525
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070806
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070806
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:509-525
Template-Type: ReDIF-Article 1.0
Author-Name: Xuejun Ma
Author-X-Name-First: Xuejun
Author-X-Name-Last: Ma
Author-Name: Xiaoqun He
Author-X-Name-First: Xiaoqun
Author-X-Name-Last: He
Author-Name: Xiaokang Shi
Author-X-Name-First: Xiaokang
Author-X-Name-Last: Shi
Title: A variant of K nearest neighbor quantile regression
Abstract:
Compared with local polynomial quantile regression, K
nearest neighbor quantile regression (KNNQR) has many advantages, such as
not assuming smoothness of the underlying function. This paper summarizes
the research on KNNQR and carries out further work on the selection of
k, the algorithm and Monte Carlo simulations. The simulated functions are
Blocks, Bumps, HeaviSine and Doppler, which represent jumps, volatility,
abrupt slope changes and high-frequency behavior, respectively. When the
function to be estimated has jump points or catastrophe points, KNNQR is
superior to local linear quantile regression under the mean squared error
and mean absolute error criteria. Notably, this superiority holds even for
the high-frequency function. A real data set is
analyzed as an illustration.
Journal: Journal of Applied Statistics
Pages: 526-537
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070807
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070807
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:526-537
Template-Type: ReDIF-Article 1.0
Author-Name: S. Karagulle
Author-X-Name-First: S.
Author-X-Name-Last: Karagulle
Author-Name: Z. Kalaylioglu
Author-X-Name-First: Z.
Author-X-Name-Last: Kalaylioglu
Title: A test for detecting etiologic heterogeneity in epidemiological studies
Abstract:
Current statistical methods for analyzing epidemiological data with
disease subtype information allow us to acquire knowledge not only of
risk factor-disease subtype associations but also, at a deeper level, of
the heterogeneity in these associations across multiple disease
characteristics (the so-called etiologic heterogeneity of the disease).
Current interest, particularly in cancer epidemiology, lies in obtaining a
valid p-value for testing whether a
particular cancer is etiologically heterogeneous. We consider a
two-stage logistic regression model along with a pseudo-conditional
likelihood estimation method and design a testing strategy based on Rao's
score test.
score test. An extensive Monte Carlo simulation study is carried out,
false discovery rate and statistical power of the suggested test are
investigated. Simulation results indicate that applying the proposed
testing strategy, even a small degree of true etiologic heterogeneity can
be recovered with a large statistical power from the sampled data. The
strategy is then applied on a breast cancer data set to illustrate its use
in practice where there are multiple risk factors and multiple disease
characteristics of simultaneous concern.
Journal: Journal of Applied Statistics
Pages: 538-549
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070808
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070808
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:538-549
Template-Type: ReDIF-Article 1.0
Author-Name: Shu-Man Shih
Author-X-Name-First: Shu-Man
Author-X-Name-Last: Shih
Author-Name: Wei-Hwa Wu
Author-X-Name-First: Wei-Hwa
Author-X-Name-Last: Wu
Author-Name: Hsin-Neng Hsieh
Author-X-Name-First: Hsin-Neng
Author-X-Name-Last: Hsieh
Title: A non-inferiority test for diagnostic accuracy in the absence of the golden standard test based on the paired partial areas under receiver operating characteristic curves
Abstract:
Non-inferiority tests are often used to assess diagnostic accuracy in
medical research. The area under the receiver operating characteristic
(ROC) curve is a familiar measure of overall
diagnostic accuracy. Nevertheless, since it may not differentiate the
diverse shapes of ROC curves with different diagnostic significance,
the partial area under the ROC curve (PAUROC), another summary measure,
has emerged for diagnostic processes that require the false-positive rate
to be within a clinically relevant range. Traditionally, estimating the
PAUROC requires a golden standard (GS) test on the true disease status.
However, the GS test may sometimes be infeasible; in many
research fields, such as epidemiology, the true
disease status of the patients may not be known or available. Under a
normality assumption on the diagnostic test results, we propose a
heuristic method, based on the expectation-maximization algorithm combined
with the bootstrap, to construct a non-inferiority
test for the difference in paired PAUROCs without the GS test. The
simulation study shows that, although the proposed method may be slightly
liberal, on the whole its empirical size is sufficiently controlled at the
significance level, and its empirical power in the absence of the GS is as
good as that of the non-inferiority test in the presence of the GS. The
proposed method is illustrated with published data.
Journal: Journal of Applied Statistics
Pages: 550-562
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070810
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070810
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:550-562
Template-Type: ReDIF-Article 1.0
Author-Name: Sayed Mohammad Reza Alavi
Author-X-Name-First: Sayed Mohammad Reza
Author-X-Name-Last: Alavi
Author-Name: Mahboobeh Tajodini
Author-X-Name-First: Mahboobeh
Author-X-Name-Last: Tajodini
Title: Maximum likelihood estimation of sensitive proportion using repeated randomized response techniques
Abstract:
Randomized response techniques are designed to obtain usable data on
sensitive issues while protecting the privacy of individuals. In this
paper, by repeating the randomized response technique, a new technique
called repeated randomized response is introduced to increase both the
protection of privacy and the efficiency of the estimator of the
proportion of a sensitive attribute. Using this technique, the proportion
of academic cheating is estimated among students of Shahid Chamran
University of Ahvaz, Ahvaz, Iran.
Journal: Journal of Applied Statistics
Pages: 563-571
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1070811
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1070811
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:563-571
Template-Type: ReDIF-Article 1.0
Author-Name: Vicente G. Cancho
Author-X-Name-First: Vicente G.
Author-X-Name-Last: Cancho
Author-Name: Dipak K. Dey
Author-X-Name-First: Dipak K.
Author-X-Name-Last: Dey
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Title: Unified multivariate survival model with a surviving fraction: an application to a Brazilian customer churn data
Abstract:
In this paper we propose a new lifetime model for multivariate survival
data in presence of surviving fractions and examine some of its
properties. Its genesis is based on situations in which there are
m types of unobservable competing causes, where each
cause is related to a time of occurrence of an event of interest. Our
model is a multivariate extension of the univariate survival cure rate
model proposed by Rodrigues et al. [37]. The inferential
approach exploits maximum likelihood tools. We perform a simulation
study to verify the asymptotic properties of the maximum likelihood
estimators; the study also focuses on the size and power of the likelihood
ratio test. The methodology is illustrated on a real customer churn data
set.
Journal: Journal of Applied Statistics
Pages: 572-584
Issue: 3
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1071341
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1071341
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:3:p:572-584
Template-Type: ReDIF-Article 1.0
Author-Name: Cheng Wenren
Author-X-Name-First: Cheng
Author-X-Name-Last: Wenren
Author-Name: Junfeng Shang
Author-X-Name-First: Junfeng
Author-X-Name-Last: Shang
Title: Conditional conceptual predictive statistic for mixed model selection
Abstract:
In linear mixed models, making use of the prediction of the random
effects, we propose a conditional conceptual predictive statistic for
mixed model selection based on a conditional Gauss discrepancy. We define
the conditional Gauss discrepancy for measuring the distance between the
true model and a candidate model under the conditional mean of the
response variables. When the variance components are known, the
conditional statistic serves as an unbiased estimator of the expected
transformed conditional Gauss discrepancy; when the variance components
are unknown, it serves as an asymptotically unbiased estimator. The best
linear unbiased predictor (BLUP) is employed for the estimation of the
random effects. Simulation results demonstrate that when the true model
includes significant fixed effects, the conditional criteria perform
effectively in selecting the most appropriate model. The penalty term
computed by the estimated effective degrees of freedom yields a very good
approximation to the penalty between the target discrepancy and the
goodness-of-fit term.
Journal: Journal of Applied Statistics
Pages: 585-603
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1071342
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1071342
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:585-603
Template-Type: ReDIF-Article 1.0
Author-Name: Edoardo Otranto
Author-X-Name-First: Edoardo
Author-X-Name-Last: Otranto
Author-Name: Massimo Mucciardi
Author-X-Name-First: Massimo
Author-X-Name-Last: Mucciardi
Author-Name: Pietro Bertuccelli
Author-X-Name-First: Pietro
Author-X-Name-Last: Bertuccelli
Title: Spatial effects in dynamic conditional correlations
Abstract:
The recent time series literature has developed many models for the
analysis of dynamic conditional correlations, involving the same
variable observed in different locations; very often, in this framework,
spatial interactions are not considered. We propose to
extend a time-varying conditional correlation model (following an
autoregressive moving average dynamics) to include spatial effects,
with a specification depending on the local spatial interactions. The
spatial part is based on a fixed symmetric weight matrix, called the
Gaussian kernel matrix, but its effect varies over time depending on the
degree of time correlation in a certain period. We present the theoretical
aspects, with the support of simulation experiments, and apply this
methodology to two space--time data sets, in a demographic and a financial
framework, respectively.
Journal: Journal of Applied Statistics
Pages: 604-626
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1071343
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1071343
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:604-626
Template-Type: ReDIF-Article 1.0
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Shuangzhe Liu
Author-X-Name-First: Shuangzhe
Author-X-Name-Last: Liu
Author-Name: Lei Shi
Author-X-Name-First: Lei
Author-X-Name-Last: Shi
Author-Name: Francisco José A. Cysneiros
Author-X-Name-First: Francisco José A.
Author-X-Name-Last: Cysneiros
Title: Diagnostics in elliptical regression models with stochastic restrictions applied to econometrics
Abstract:
We propose an influence diagnostic methodology for linear regression
models with stochastic restrictions and errors following elliptically
contoured distributions. We study how a perturbation may impact the
mixed estimation procedure of parameters in the model. Normal curvatures
and slopes for assessing influence under usual schemes are derived,
including perturbations of case-weight, response variable, and explanatory
variable. Simulations are conducted to evaluate the performance of the
proposed methodology. An example with real-world economy data is presented
as an illustration.
Journal: Journal of Applied Statistics
Pages: 627-642
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1072140
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1072140
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:627-642
Template-Type: ReDIF-Article 1.0
Author-Name: Junying Zhang
Author-X-Name-First: Junying
Author-X-Name-Last: Zhang
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Author-Name: Zhiping Lu
Author-X-Name-First: Zhiping
Author-X-Name-Last: Lu
Title: Quantile-adaptive variable screening in ultra-high dimensional varying coefficient models
Abstract:
The varying-coefficient model is an important nonparametric statistical
model, since it allows appreciable flexibility in the structure of the
fitted model. For ultra-high dimensional heterogeneous data, it is
necessary to examine how the effects of covariates vary with the exposure
variable at different quantile levels of interest. In this paper, we
extend marginal screening methods to examine and select variables by
ranking a measure of the nonparametric marginal contribution of each
covariate given the exposure variable. Spline approximations are employed
to model the marginal effects and to select the set of active variables in
a quantile-adaptive framework. This ensures the sure screening property in
the quantile-adaptive varying-coefficient model. Numerical studies
demonstrate that the proposed procedure works well for heteroscedastic
data.
Journal: Journal of Applied Statistics
Pages: 643-654
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1072141
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1072141
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:643-654
Template-Type: ReDIF-Article 1.0
Author-Name: Boryana Bogdanova
Author-X-Name-First: Boryana
Author-X-Name-Last: Bogdanova
Author-Name: Ivan Ivanov
Author-X-Name-First: Ivan
Author-X-Name-Last: Ivanov
Title: A wavelet-based approach to the analysis and modelling of financial time series exhibiting strong long-range dependence: the case of Southeast Europe
Abstract:
This paper demonstrates the use of wavelet-based tools for the
analysis and prediction of financial time series exhibiting strong
long-range dependence (LRD). Emerging markets' stock returns are commonly
characterized by LRD. We therefore track the evolution of LRD for the
return series of six Southeast European stock indices through the
application of a wavelet-based semi-parametric method. We further employ
the à trous wavelet transform in order to extract deeper knowledge of
the returns' term structure and to use it for prediction purposes. In
particular, a multiscale autoregressive (MAR) model is fitted and its
out-of-sample forecast performance is benchmarked against that of an ARMA
model. Additionally, a data-driven MAR feature selection procedure is
outlined. We find that the wavelet-based method adequately captures LRD
dynamics in both calm and turmoil periods, detecting the presence of
transitional changes. At the same time, the MAR model handles the
complicated autocorrelation structure implied by the LRD in a parsimonious
way, achieving better performance.
Journal: Journal of Applied Statistics
Pages: 655-673
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077370
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077370
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:655-673
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Schepers
Author-X-Name-First: Jan
Author-X-Name-Last: Schepers
Title: On regression modelling with dummy variables versus separate regressions per group: Comment on Holgersson et al.
Abstract:
In a recent issue of this journal, Holgersson et al.
[Dummy variables vs. category-wise models, J. Appl. Stat. 41(2) (2014),
pp. 233--241, doi:10.1080/02664763.2013.838665] compared the use of dummy
coding in regression analysis to the use of category-wise models (i.e.
estimating separate regression models for each group) with respect to
estimating and testing group differences in intercept and in slope. They
presented three objections against the use of dummy variables in a single
regression equation, which could be overcome by the category-wise
approach. In this note, I first comment on each of these three objections
and next draw attention to some other issues in comparing these two
approaches. This commentary further clarifies the differences and
similarities between dummy variable and category-wise approaches.
Journal: Journal of Applied Statistics
Pages: 674-681
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077371
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077371
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:674-681
Template-Type: ReDIF-Article 1.0
Author-Name: S. Zinn
Author-X-Name-First: S.
Author-X-Name-Last: Zinn
Author-Name: A. Würbach
Author-X-Name-First: A.
Author-X-Name-Last: Würbach
Title: A statistical approach to address the problem of heaping in self-reported income data
Abstract:
Self-reported income information particularly suffers from an intentional
coarsening of the data, which is called heaping or rounding. If it does
not occur completely at random -- which is usually the case -- heaping and
rounding have detrimental effects on the results of statistical analysis.
Conventional statistical methods do not consider this kind of reporting
bias, and thus might produce invalid inference. We describe a novel
statistical modeling approach that allows us to deal with self-reported
heaped income data in an adequate and flexible way. We suggest modeling
heaping mechanisms and the true underlying model in combination. To
describe the true net income distribution, we use the zero-inflated
log-normal distribution. Heaping points are identified from the data by
applying a heuristic procedure comparing a hypothetical income
distribution and the empirical one. To determine heaping behavior, we
employ two distinct models: either we assume piecewise constant heaping
probabilities, or heaping probabilities are considered to increase
steadily with proximity to a heaping point. We validate our approach with
several examples. To illustrate the capacity of the proposed method, we
conduct a case study using income data from the German National
Educational Panel Study.
Journal: Journal of Applied Statistics
Pages: 682-703
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077372
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077372
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:682-703
Template-Type: ReDIF-Article 1.0
Author-Name: Yue Zhang
Author-X-Name-First: Yue
Author-X-Name-Last: Zhang
Author-Name: Kiros Berhane
Author-X-Name-First: Kiros
Author-X-Name-Last: Berhane
Title: Dynamic latent trait models with mixed hidden Markov structure for mixed longitudinal outcomes
Abstract:
We propose a general Bayesian joint modeling approach to model mixed
longitudinal outcomes from the exponential family for taking into account
any differential misclassification that may exist among categorical
outcomes. Under this framework, outcomes observed without measurement
error are related to latent trait variables through generalized linear
mixed effect models. The misclassified outcomes are related to the latent
class variables, which represent unobserved real states, using mixed
hidden Markov models (MHMMs). In addition to enabling the estimation of
parameters in prevalence, transition and misclassification probabilities,
MHMMs capture cluster level heterogeneity. A transition modeling structure
allows the latent trait and latent class variables to depend on observed
predictors at the same time period and also on latent trait and latent
class variables at previous time periods for each individual. Simulation
studies are conducted to make comparisons with traditional models in order
to illustrate the gains from the proposed approach. The new approach is
applied to data from the Southern California Children's Health Study to
jointly model questionnaire-based asthma state and multiple lung function
measurements in order to gain better insight about the underlying
biological mechanism that governs the inter-relationship between asthma
state and lung function development.
Journal: Journal of Applied Statistics
Pages: 704-720
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077373
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077373
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:704-720
Template-Type: ReDIF-Article 1.0
Author-Name: J. Andrés Christen
Author-X-Name-First: J. Andrés
Author-X-Name-Last: Christen
Author-Name: Bruno Sansó
Author-X-Name-First: Bruno
Author-X-Name-Last: Sansó
Author-Name: Mario Santana-Cibrian
Author-X-Name-First: Mario
Author-X-Name-Last: Santana-Cibrian
Author-Name: Jorge X. Velasco-Hernández
Author-X-Name-First: Jorge X.
Author-X-Name-Last: Velasco-Hernández
Title: Bayesian deconvolution of oil well test data using Gaussian processes
Abstract:
We use Bayesian methods to infer an unobserved function that is convolved
with a known kernel. Our method is based on the assumption that the
function of interest is a Gaussian process and, assuming a particular
correlation structure, the resulting convolution is also a Gaussian
process. This fact is used to obtain inferences regarding the unobserved
process, effectively providing a deconvolution method. We apply the
methodology to the problem of estimating the parameters of an oil
reservoir from well-test pressure data. Here, the unknown process
describes the structure of the well. Applications to data from Mexican oil
wells show very accurate results.
Journal: Journal of Applied Statistics
Pages: 721-737
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077374
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077374
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:721-737
Template-Type: ReDIF-Article 1.0
Author-Name: Ümit Kuvvetli
Author-X-Name-First: Ümit
Author-X-Name-Last: Kuvvetli
Author-Name: Ali Rıza Firuzan
Author-X-Name-First: Ali Rıza
Author-X-Name-Last: Firuzan
Author-Name: Süleyman Alpaykut
Author-X-Name-First: Süleyman
Author-X-Name-Last: Alpaykut
Author-Name: Atakan Gerger
Author-X-Name-First: Atakan
Author-X-Name-Last: Gerger
Title: Determining Six Sigma success factors in Turkey by using structural equation modeling
Abstract:
Because it combines strong statistical and managerial techniques, Six
Sigma (SS) has succeeded in many countries and different sectors. In
particular, successful SS applications by international companies have
increased the interest of other companies, and the number of SS projects
implemented in various countries has grown. Although successful SS
projects come to mind first, the number of projects that fail for various
reasons is not negligible. Many factors affect the success of SS projects,
and these factors vary across countries. In this study, a survey was
administered to 117 people holding one of the SS belts, in order to
determine the success levels of SS projects in Turkey. Critical success
factors were determined using exploratory factor analysis and structural
equation modeling. According to the results, project selection and scope,
quality culture, and the definition and measurement of metrics were the
top factors affecting the success of SS projects applied in Turkey. The
results of the study were also compared with the results of similar
studies conducted in other countries.
Journal: Journal of Applied Statistics
Pages: 738-753
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077375
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077375
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:738-753
Template-Type: ReDIF-Article 1.0
Author-Name: Andrea Beccarini
Author-X-Name-First: Andrea
Author-X-Name-Last: Beccarini
Title: Bias correction through filtering omitted variables and instruments
Abstract:
This paper proposes a combination of a particle-filter-based method and
the expectation-maximization algorithm (PFEM) to filter
unobservable variables and hence reduce the omitted-variables bias.
Furthermore, I treat as unobservable an exogenous variable that
can be used as an instrument in the instrumental variable (IV)
methodology. The aim is to show that the PFEM is able to eliminate or
reduce both the omitted-variable bias and the simultaneous-equation bias
by filtering the omitted variable and the unobserved instrument,
respectively. In other words, the procedure provides (at least
approximately) consistent estimates without using additional information
embedded in the omitted variable or in the instruments, since these are
filtered from the observable variables. The validity of the procedure is
shown both through simulations and through a comparison with an IV
analysis from an important previous publication; I demonstrate that the
procedure developed in this article yields results similar to those of the
original IV analysis.
Journal: Journal of Applied Statistics
Pages: 754-766
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077376
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077376
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:754-766
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaofei Ma
Author-X-Name-First: Xiaofei
Author-X-Name-Last: Ma
Author-Name: Qiuyan Zhong
Author-X-Name-First: Qiuyan
Author-X-Name-Last: Zhong
Title: Missing value imputation method for disaster decision-making using K nearest neighbor
Abstract:
Due to the destructiveness of natural disasters, the restriction of
disaster scenarios and various human causes, missing data usually occur in
disaster decision-making problems. In order to estimate the missing values
of
values of disaster based on an improved K nearest neighbor imputation
(KNNI) method. Firstly, some definitions of trapezoidal fuzzy numbers
(TFNs) are introduced and three types of attributes (i.e. linguistic term
sets, intervals and real numbers) are converted to TFNs. Then the
correlated degree model is utilized to extract related attributes to form
instances that will be used in the K nearest neighbor algorithm, and a novel
KNNI method merging with correlated degree model is presented. Finally, an
illustrative example is given to verify the proposed method and to
demonstrate its feasibility and effectiveness.
Journal: Journal of Applied Statistics
Pages: 767-781
Issue: 4
Volume: 43
Year: 2016
Month: 3
X-DOI: 10.1080/02664763.2015.1077377
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077377
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:4:p:767-781
Template-Type: ReDIF-Article 1.0
Author-Name: Hsiu-Wen Chen
Author-X-Name-First: Hsiu-Wen
Author-X-Name-Last: Chen
Author-Name: Weng Kee Wong
Author-X-Name-First: Weng Kee
Author-X-Name-Last: Wong
Author-Name: Hongquan Xu
Author-X-Name-First: Hongquan
Author-X-Name-Last: Xu
Title: Data-driven desirability function to measure patients’ disease progression in a longitudinal study
Abstract:
Multiple outcomes are increasingly used to assess chronic disease
progression. We discuss and show how desirability functions can be used to
assess a patient's overall response to a treatment using multiple outcome
measures, each of which may contribute unequally to the final
assessment. Because judgments on disease progression and the relative
contribution of each outcome can be subjective, we propose a data-driven
approach that minimizes these biases by using desirability functions whose
shapes and weights are estimated from a given gold standard. Our method
provides each patient with a meaningful overall progression score that
facilitates comparison and clinical interpretation. We also extend the
methodology in a novel way to monitor patients' disease progression
across multiple time points, and illustrate our method using a
longitudinal data set from a randomized two-arm clinical trial of
scleroderma patients.
Journal: Journal of Applied Statistics
Pages: 783-795
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1077378
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1077378
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:783-795
Template-Type: ReDIF-Article 1.0
Author-Name: Xiuli Wang
Author-X-Name-First: Xiuli
Author-X-Name-Last: Wang
Author-Name: Mingqiu Wang
Author-X-Name-First: Mingqiu
Author-X-Name-Last: Wang
Title: Variable selection for high-dimensional generalized linear models with the weighted elastic-net procedure
Abstract:
High-dimensional data arise frequently in modern applications such as
biology, chemometrics, economics, neuroscience and other scientific
fields. Common features of high-dimensional data are that many of the
predictors may not be significant and that high correlation exists among
the predictors. Generalized linear models, as the generalization of linear
models, also suffer from the collinearity problem. In this paper,
combining a nonconvex penalty with ridge regression, we propose the
weighted elastic-net for variable selection in high-dimensional
generalized linear models and establish the theoretical properties of the
proposed method with a diverging number of parameters. The finite-sample
behavior of the proposed method is illustrated with simulation studies and
a real data example.
Journal: Journal of Applied Statistics
Pages: 796-809
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1078300
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1078300
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:796-809
Template-Type: ReDIF-Article 1.0
Author-Name: M.L. Nores
Author-X-Name-First: M.L.
Author-X-Name-Last: Nores
Author-Name: M.P. Díaz
Author-X-Name-First: M.P.
Author-X-Name-Last: Díaz
Title: Bootstrap hypothesis testing in generalized additive models for comparing curves of treatments in longitudinal studies
Abstract:
The study of the effect of a treatment may involve the evaluation of a
variable at a number of moments. When assuming a smooth curve for the mean
response along time, estimation can be afforded by spline regression, in
the context of generalized additive models. The novelty of our work lies
in the construction of hypothesis tests to compare two curves of
treatments in any interval of time for several types of response
variables. The within-subject correlation is not modeled explicitly but is
taken into account, via the bootstrap, to obtain valid inferences. We propose both
semiparametric and nonparametric bootstrap approaches, based on resampling
vectors of residuals or responses, respectively. Simulation studies
revealed a good performance of the tests, considering, for the outcome,
different distribution functions in the exponential family and varying the
correlation between observations along time. We show that the sizes of
bootstrap tests are close to the nominal value, with tests based on a
standardized statistic having slightly better size properties. The power
increases as the distance between curves increases and decreases when
correlation gets higher. The usefulness of these statistical tools was
confirmed using real data, allowing us to detect changes in fish
behavior when exposed to the toxin microcystin-RR.
Journal: Journal of Applied Statistics
Pages: 810-826
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1078301
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1078301
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:810-826
Template-Type: ReDIF-Article 1.0
Author-Name: Meng-Ning Lyu
Author-X-Name-First: Meng-Ning
Author-X-Name-Last: Lyu
Author-Name: Qing-Shan Yang
Author-X-Name-First: Qing-Shan
Author-X-Name-Last: Yang
Author-Name: Na Yang
Author-X-Name-First: Na
Author-X-Name-Last: Yang
Author-Name: Siu-Seong Law
Author-X-Name-First: Siu-Seong
Author-X-Name-Last: Law
Title: Tourist number prediction of historic buildings by singular spectrum analysis
Abstract:
A wooden historic building located in Tibet, China, experienced structural
damage under the loading induced by visiting tourists. Such ancient
buildings receive large numbers of visitors every day, as heritage sites
never fail to attract tourists. Given the importance of these cultural
relics, a balance must be struck between admitting visitors and
protecting the historic buildings. In this paper, singular spectrum
analysis (SSA) is used to forecast the number of tourists so that the
building management can apply maintenance measures to the structure. The
analyzed results can be used to control the tourist flow and avoid
excessive pedestrian loading on the structure. The relationship between
the acceleration measured on the structure and the tourist number is
first studied. The root-mean-square (RMS) value of the acceleration
measured along the tourists' passage route is selected for
forecasting the future tourist number. The forecasting results from
different methods are compared. SSA is found to slightly outperform the
autoregressive integrated moving average (ARIMA) model, the X-11-ARIMA
model and cubic spline extrapolation in terms of the RMS error, mean
absolute error and mean absolute percentage error for long-term
prediction, whereas the opposite is observed for short-term forecasting.
Journal: Journal of Applied Statistics
Pages: 827-846
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1078302
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1078302
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:827-846
Template-Type: ReDIF-Article 1.0
Author-Name: Leonardo Costa
Author-X-Name-First: Leonardo
Author-X-Name-Last: Costa
Author-Name: Adrian Pizzinga
Author-X-Name-First: Adrian
Author-X-Name-Last: Pizzinga
Author-Name: Rodrigo Atherino
Author-X-Name-First: Rodrigo
Author-X-Name-Last: Atherino
Title: Modeling and predicting IBNR reserve: extended chain ladder and heteroscedastic regression analysis
Abstract:
This work deals with two methodologies for predicting incurred but
not reported (IBNR) actuarial reserves. The first is the
traditional chain ladder, which is extended for dealing with the calendar
year IBNR reserve. The second is based on heteroscedastic regression
models suitable to deal with the tail effect of the runoff triangle -- and
to forecast calendar year IBNR reserves as well. Theoretical results
regarding closed expressions for IBNR predictors and mean squared errors
are established; for the second methodology, a Monte Carlo
study is designed and implemented to assess the finite-sample performance
of feasible mean squared error formulae. Finally, the methods are
applied to two real data sets. The main conclusions are: (i) considering
tail effects entails no theoretical or computational problems; and
(ii) both methodologies are well suited to the design of software for IBNR
reserve prediction.
Journal: Journal of Applied Statistics
Pages: 847-870
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1079305
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1079305
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:847-870
Template-Type: ReDIF-Article 1.0
Author-Name: Stavros Degiannakis
Author-X-Name-First: Stavros
Author-X-Name-Last: Degiannakis
Author-Name: Alexandra Livada
Author-X-Name-First: Alexandra
Author-X-Name-Last: Livada
Title: Evaluation of realized volatility predictions from models with leptokurtically and asymmetrically distributed forecast errors
Abstract:
Accurate volatility forecasting is a key determinant for portfolio
management, risk management and economic policy. The paper provides
evidence that the sum of squared standardized forecast errors is a
reliable measure for model evaluation when the predicted variable is the
intra-day realized volatility. The forecasting evaluation is valid for
standardized forecast errors with leptokurtic distribution as well as with
leptokurtic and asymmetric distributions. Additionally, the widely applied
forecasting evaluation function, the predicted mean-squared error, fails
to select the adequate model in the case of models with residuals that are
leptokurtically and asymmetrically distributed. Hence, the realized
volatility forecasting evaluation should be based on the standardized
forecast errors instead of their unstandardized version.
Journal: Journal of Applied Statistics
Pages: 871-892
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1079306
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1079306
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:871-892
Template-Type: ReDIF-Article 1.0
Author-Name: Rosa Aghdam
Author-X-Name-First: Rosa
Author-X-Name-Last: Aghdam
Author-Name: Mojtaba Ganjali
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Ganjali
Author-Name: Parisa Niloofar
Author-X-Name-First: Parisa
Author-X-Name-Last: Niloofar
Author-Name: Changiz Eslahchi
Author-X-Name-First: Changiz
Author-X-Name-Last: Eslahchi
Title: Inferring gene regulatory networks by an order independent algorithm using incomplete data sets
Abstract:
Analyzing incomplete data for inferring the structure of gene regulatory
networks (GRNs) is a challenging task in bioinformatics. Bayesian networks
can be successfully used in this field. k-nearest
neighbor, singular value decomposition (SVD)-based and multiple imputation
by chained equations are three fundamental imputation methods to deal with
missing values. Path consistency (PC) algorithm based on conditional
mutual information (PCA--CMI) is a well-known algorithm for inferring GRNs.
This algorithm requires the data set to be complete. However, PCA--CMI is
not a stable algorithm: when applied to permuted gene
orders, different networks are obtained. We propose an order-independent
algorithm, PCA--CMI--OI, for inferring GRNs. After imputation of missing
data, the performances of PCA--CMI and PCA--CMI--OI are compared. Results
show that networks constructed from data imputed by the SVD-based method
and PCA--CMI--OI algorithm outperform other imputation methods and
PCA--CMI. PC-based algorithms produce an undirected or partially directed
network. The mutual information test (MIT) score, which can deal
with discrete data, is one of the best-known methods for directing the edges
of the resulting networks. We also propose a new score, ConMIT, which is
appropriate for analyzing continuous data. Results show that the
precision of directing the edges of the skeleton is improved by applying the
ConMIT score.
Journal: Journal of Applied Statistics
Pages: 893-913
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1079307
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1079307
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:893-913
Template-Type: ReDIF-Article 1.0
Author-Name: Ismail Onur Baycan
Author-X-Name-First: Ismail Onur
Author-X-Name-Last: Baycan
Title: The effects of exchange rate regimes on economic growth: evidence from propensity score matching estimates
Abstract:
This is the first study that employs the propensity score matching
framework to examine the average treatment effect of exchange rate regimes
on economic growth. Previous studies examining the effects of different
exchange regimes on growth often apply time series or panel data
techniques and provide mixed results. This study employs a variety of
non-parametric matching methods to address the self-selection problem,
which potentially causes a bias in the traditional linear regressions. We
evaluate the average treatment effect of the floating exchange rate regime
on economic growth in 164 countries. The time period of the quasi-experiment
starts in 1970, capturing the collapse of the Bretton Woods fixed exchange
rate commitment system. Results show that the average treatment effect of
floating exchange rate regimes on economic growth is statistically
insignificant. Verified with Rosenbaum's bounds, our
findings are strong and robust. There is thus no
evidence that adopting a floating exchange rate regime, compared with a
fixed one, leads to higher economic growth for the countries that pursue
this particular policy.
Journal: Journal of Applied Statistics
Pages: 914-924
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1080669
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1080669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:914-924
Template-Type: ReDIF-Article 1.0
Author-Name: Özgür Asar
Author-X-Name-First: Özgür
Author-X-Name-Last: Asar
Author-Name: Ozlem Ilk
Author-X-Name-First: Ozlem
Author-X-Name-Last: Ilk
Title: First-order marginalised transition random effects models with probit link function
Abstract:
Marginalised models, also known as marginally specified models, have
recently become a popular tool for analysis of discrete longitudinal data.
Although a novel statistical methodology, these models involve
complex constraint equations and model-fitting algorithms. Moreover,
there is a lack of publicly available software to fit them.
In this paper, we propose a three-level marginalised model for the analysis
of multivariate longitudinal binary outcomes. The implicit function theorem
is used to solve the marginal constraint equations approximately but
explicitly. The probit link enables direct solutions to the
convolution equations. Parameters are estimated by maximum likelihood via
a Fisher--Scoring algorithm. A simulation study is conducted to examine
the finite-sample properties of the estimator. We illustrate the model
with an application to the data set from the Iowa Youth and Families
Project. An R package, pnmtrem, is provided to fit
the model.
Journal: Journal of Applied Statistics
Pages: 925-942
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1080670
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1080670
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:925-942
Template-Type: ReDIF-Article 1.0
Author-Name: Diwei Zhou
Author-X-Name-First: Diwei
Author-X-Name-Last: Zhou
Author-Name: Ian L. Dryden
Author-X-Name-First: Ian L.
Author-X-Name-Last: Dryden
Author-Name: Alexey A. Koloydenko
Author-X-Name-First: Alexey A.
Author-X-Name-Last: Koloydenko
Author-Name: Koenraad M.R. Audenaert
Author-X-Name-First: Koenraad M.R.
Author-X-Name-Last: Audenaert
Author-Name: Li Bai
Author-X-Name-First: Li
Author-X-Name-Last: Bai
Title: Regularisation, interpolation and visualisation of diffusion tensor images using non-Euclidean statistics
Abstract:
Practical statistical analysis of diffusion tensor images is considered,
and we focus primarily on methods that use metrics based on Euclidean
distances between powers of diffusion tensors. First, we describe a family
of anisotropy measures based on a scale invariant power-Euclidean metric,
which are useful for visualisation. Some properties of the measures are
derived and practical considerations are discussed, with some examples.
Second, we discuss weighted Procrustes methods for diffusion tensor
imaging interpolation and smoothing, and we compare methods based on
different metrics on a set of examples as well as analytically. We
establish a key relationship between the principal-square-root-Euclidean
metric and the size-and-shape Procrustes metric on the space of symmetric
positive semi-definite tensors. We explain, both analytically and by
experiments, why the size-and-shape Procrustes metric may be preferred in
practical tasks of interpolation, extrapolation and smoothing, especially
when observed tensors are degenerate or when a moderate degree of tensor
swelling is desirable. Third, we introduce regularisation methodology,
which is demonstrated to be useful for highlighting features of prior
interest and potentially for segmentation. Finally, we compare several
metrics in a data set of human brain diffusion-weighted magnetic resonance
imaging, and point out similarities between several of the non-Euclidean
metrics but important differences with the commonly used Euclidean metric.
Journal: Journal of Applied Statistics
Pages: 943-978
Issue: 5
Volume: 43
Year: 2016
Month: 4
X-DOI: 10.1080/02664763.2015.1080671
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1080671
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:5:p:943-978
Template-Type: ReDIF-Article 1.0
Author-Name: T. Chen
Author-X-Name-First: T.
Author-X-Name-Last: Chen
Author-Name: K. Knox
Author-X-Name-First: K.
Author-X-Name-Last: Knox
Author-Name: J. Arora
Author-X-Name-First: J.
Author-X-Name-Last: Arora
Author-Name: W. Tang
Author-X-Name-First: W.
Author-X-Name-Last: Tang
Author-Name: J. Kowalski
Author-X-Name-First: J.
Author-X-Name-Last: Kowalski
Author-Name: X.M. Tu
Author-X-Name-First: X.M.
Author-X-Name-Last: Tu
Title: Power analysis for clustered non-continuous responses in multicenter trials
Abstract:
Power analysis for multi-center randomized controlled trials is quite
difficult to perform for non-continuous responses when site differences
are modeled by random effects using the generalized linear mixed-effects
model (GLMM). First, it is not possible to construct power functions
analytically, because of the extreme complexity of the sampling
distribution of parameter estimates. Second, Monte Carlo (MC) simulation,
a popular option for estimating power for complex models, does not work
within the current context because of a lack of methods and software
packages that would provide reliable estimates for fitting such GLMMs. For
example, even statistical packages from software giants like SAS do not
provide reliable estimates at the time of writing. Another major
limitation of MC simulation is its lengthy running time for complex
models such as the GLMM, especially when estimating power for multiple
scenarios of interest. We present a new approach to address these
limitations. The proposed approach defines a marginal model to approximate
the GLMM and estimates power without relying on MC simulation. The
approach is illustrated with both real and simulated data, with the
simulation study demonstrating good performance of the method.
Journal: Journal of Applied Statistics
Pages: 979-995
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1089218
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1089218
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:979-995
Template-Type: ReDIF-Article 1.0
Author-Name: Pao-sheng Shen
Author-X-Name-First: Pao-sheng
Author-X-Name-Last: Shen
Title: Estimation of association parameters in copula models for bivariate left-truncated and right-censored data
Abstract:
We investigate the problem of estimating the association between two
related survival variables when they follow a copula model and bivariate
left-truncated and right-censored data are available. By expressing
truncation probability as the functional of marginal survival functions,
we propose a two-stage estimation procedure for estimating the parameters
of Archimedean copulas. The asymptotic properties of the proposed
estimators are established. Simulation studies are conducted to
investigate the finite sample properties of the proposed estimators. The
proposed method is applied to bivariate RNA data.
Journal: Journal of Applied Statistics
Pages: 996-1010
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1089219
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1089219
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:996-1010
Template-Type: ReDIF-Article 1.0
Author-Name: Rijan Shrestha
Author-X-Name-First: Rijan
Author-X-Name-Last: Shrestha
Author-Name: Tomasz Kozlowski
Author-X-Name-First: Tomasz
Author-X-Name-Last: Kozlowski
Title: Inverse uncertainty quantification of input model parameters for thermal-hydraulics simulations using expectation--maximization under Bayesian framework
Abstract:
Quantification of uncertainties in code responses necessitates knowledge
of input model parameter uncertainties. However, nuclear
thermal-hydraulics codes such as RELAP5 and TRACE do not provide any
information on input model parameter uncertainties. Moreover, the input
model parameters for physical models in these legacy codes were derived
under steady-state flow conditions and hence might not be accurate to use
in the analysis of transients without accounting for uncertainties. We
present a Bayesian framework to estimate the posterior mode of input model
parameters' mean and variance by implementing the iterative
expectation--maximization algorithm. For this, we introduce the idea of a
model parameter multiplier. A log-normal transformation is used to
transform each model parameter multiplier into a pseudo-parameter. Our
analysis is based on two main assumptions about the pseudo-parameters.
First, a first-order linear relationship is assumed between code responses
and pseudo-parameters. Second, the pseudo-parameters are assumed to be
normally distributed. The problem is formulated so as to express the scalar
random variable (the difference between the experimental result and the
base, or nominal, code-calculated value) as a linear combination of the
pseudo-parameters.
Journal: Journal of Applied Statistics
Pages: 1011-1026
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1089220
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1089220
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1011-1026
Template-Type: ReDIF-Article 1.0
Author-Name: Bao Yiqi
Author-X-Name-First: Bao
Author-X-Name-Last: Yiqi
Author-Name: Cibele Maria Russo
Author-X-Name-First: Cibele
Author-X-Name-Last: Maria Russo
Author-Name: Vicente G. Cancho
Author-X-Name-First: Vicente G.
Author-X-Name-Last: Cancho
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Title: Influence diagnostics for the Weibull-Negative-Binomial regression model with cure rate under latent failure causes
Abstract:
In this paper, we propose a flexible cure rate survival model by assuming
that the number of competing causes of the event of interest follows the
Negative Binomial distribution and the time to event follows a Weibull
distribution. Indeed, we introduce the Weibull-Negative-Binomial (WNB)
distribution, which can be used to model survival data when the
hazard rate function is increasing, decreasing or non-monotonic.
Another advantage of the proposed model is that it has some
distributions commonly used in lifetime analysis as particular cases.
Moreover, the proposed model includes as special cases some of the
well-known cure rate models discussed in the literature. We consider a
frequentist analysis for parameter estimation of a WNB model with cure
rate. Then, we derive the appropriate matrices for assessing local
influence on the parameter estimates under different perturbation schemes
and present some ways to perform global influence analysis. Finally, the
methodology is illustrated with medical data.
Journal: Journal of Applied Statistics
Pages: 1027-1060
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1089221
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1089221
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1027-1060
Template-Type: ReDIF-Article 1.0
Author-Name: Sang Eun Lee
Author-X-Name-First: Sang Eun
Author-X-Name-Last: Lee
Author-Name: Key-Il Shin
Author-X-Name-First: Key-Il
Author-X-Name-Last: Shin
Title: The cut-off point based on underlying distribution and cost function
Abstract:
Cut-off sampling has been widely used for business surveys, whose
populations are right-skewed with a long tail. Several methods have been
suggested for obtaining the optimal cut-off point. The LH algorithm
suggested by Lavallee and Hidiroglou [6] is commonly used to obtain the
optimum boundaries by minimizing the total sample size for a given
precision. In this paper, we suggest a new cut-off point determination
method that minimizes a cost function, which leads to a reduction in the
size of the take-all stratum. We also
investigate an optimal cut-off point using a typical parametric estimation
method under the assumptions of underlying distributions. Small
Monte Carlo simulation studies are performed to compare the new
cut-off point method with the LH algorithm. The Korea Transportation Origin
-- Destination data are used for real data analysis.
Journal: Journal of Applied Statistics
Pages: 1061-1073
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1089222
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1089222
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1061-1073
Template-Type: ReDIF-Article 1.0
Author-Name: Guohua Yan
Author-X-Name-First: Guohua
Author-X-Name-Last: Yan
Author-Name: M. Tariqul Hasan
Author-X-Name-First: M. Tariqul
Author-X-Name-Last: Hasan
Author-Name: Renjun Ma
Author-X-Name-First: Renjun
Author-X-Name-Last: Ma
Title: Modeling proportions and marginal counts simultaneously for clustered multinomial data with random cluster sizes
Abstract:
Clustered multinomial data with random cluster sizes commonly appear in
health, environmental and ecological studies. Traditional approaches for
analyzing clustered multinomial data rely on two assumptions: that
cluster sizes are fixed and that cluster sizes are positive. Randomness of
the cluster sizes may
be the determinant of the within-cluster correlation and between-cluster
variation. We propose a baseline-category mixed model for clustered
multinomial data with random cluster sizes based on Poisson mixed models.
Our orthodox best linear unbiased predictor approach to this model depends
only on the moment structure of unobserved distribution-free random
effects. Our approach also consolidates the marginal and conditional
modeling interpretations. Unlike the traditional methods, our approach can
accommodate both random and zero cluster sizes. Two real-life multinomial
data examples, crime data and food contamination data, are used to
illustrate the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 1074-1087
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1089223
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1089223
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1074-1087
Template-Type: ReDIF-Article 1.0
Author-Name: B. Martin-Barragan
Author-X-Name-First: B.
Author-X-Name-Last: Martin-Barragan
Author-Name: R.E. Lillo
Author-X-Name-First: R.E.
Author-X-Name-Last: Lillo
Author-Name: J. Romo
Author-X-Name-First: J.
Author-X-Name-Last: Romo
Title: Functional boxplots based on epigraphs and hypographs
Abstract:
The functional boxplot is an attractive technique for visualizing data that
come from functions. We propose an alternative to the functional boxplot
based on depth measures. Our proposal generalizes the usual construction of
the boxplot in one dimension, related to the down-upward orderings of the
data, by considering two intuitive pre-orders in the functional context.
These orderings are based on the epigraphs and hypographs of the data and
allow a new definition of functional quartiles which is more robust to
shape outliers. Simulated and real examples show that this proposal
provides a convenient visualization technique with great potential for
analyzing functional data, and illustrate its usefulness in detecting
outliers that other procedures do not detect.
Journal: Journal of Applied Statistics
Pages: 1088-1103
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1092108
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092108
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1088-1103
Template-Type: ReDIF-Article 1.0
Author-Name: T. Chen
Author-X-Name-First: T.
Author-X-Name-Last: Chen
Author-Name: N. Lu
Author-X-Name-First: N.
Author-X-Name-Last: Lu
Author-Name: J. Arora
Author-X-Name-First: J.
Author-X-Name-Last: Arora
Author-Name: I. Katz
Author-X-Name-First: I.
Author-X-Name-Last: Katz
Author-Name: R. Bossarte
Author-X-Name-First: R.
Author-X-Name-Last: Bossarte
Author-Name: H. He
Author-X-Name-First: H.
Author-X-Name-Last: He
Author-Name: Y. Xia
Author-X-Name-First: Y.
Author-X-Name-Last: Xia
Author-Name: H. Zhang
Author-X-Name-First: H.
Author-X-Name-Last: Zhang
Author-Name: X.M. Tu
Author-X-Name-First: X.M.
Author-X-Name-Last: Tu
Title: Power analysis for cluster randomized trials with binary outcomes modeled by generalized linear mixed-effects models
Abstract:
Power analysis for cluster randomized controlled trials is difficult to
perform when a binary response is modeled using the generalized linear
mixed-effects model (GLMM). Although methods for clustered binary
responses, such as generalized estimating equations, exist, they do not
apply in the context of the GLMM. Also, because popular statistical packages
such as R and SAS do not provide correct estimates of parameters for the
GLMM for binary responses, Monte Carlo simulation, a popular ad-hoc method
for estimating power when the power function is too complex to evaluate
analytically or numerically, fails to provide correct power estimates
within the current context as well. In this paper, a new approach is
developed to estimate power for cluster randomized control trials when a
binary response is modeled by the GLMM. The approach is easy to implement
and seems to work quite well, as assessed by simulation studies. The
approach is illustrated with a real intervention study to reduce suicide
reattempt rates among US Veterans.
Journal: Journal of Applied Statistics
Pages: 1104-1118
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1092109
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092109
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1104-1118
Template-Type: ReDIF-Article 1.0
Author-Name: Kristofer Månsson
Author-X-Name-First: Kristofer
Author-X-Name-Last: Månsson
Author-Name: B.M. Golam Kibria
Author-X-Name-First: B.M.
Author-X-Name-Last: Golam Kibria
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: A restricted Liu estimator for binary regression models and its application to an applied demand system
Abstract:
In this article, we propose a restricted Liu regression estimator (RLRE)
for estimating the parameter vector, β, in the
presence of multicollinearity, when the dependent variable is binary and
it is suspected that β may belong to a linear
subspace defined by
Rβ = r. First, we investigate the mean squared error (MSE) properties of the
new estimator and compare them with those of the restricted maximum
likelihood estimator (RMLE). Then we suggest some estimators of the
shrinkage parameter, and a simulation study is conducted to compare the
performance of the different estimators. Finally, we show the benefit of
using RLRE instead of RMLE when estimating how changes in price affect
consumer demand for a specific product.
Journal: Journal of Applied Statistics
Pages: 1119-1127
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1092110
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092110
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1119-1127
Template-Type: ReDIF-Article 1.0
Author-Name: Himadri Ghosh
Author-X-Name-First: Himadri
Author-X-Name-Last: Ghosh
Author-Name: S. Chowdhury
Author-X-Name-First: S.
Author-X-Name-Last: Chowdhury
Author-Name: Prajneshu
Author-X-Name-First:
Author-X-Name-Last: Prajneshu
Title: An improved fuzzy time-series method of forecasting based on L--R fuzzy sets and its application
Abstract:
Classical time-series theory assumes the values of the response variable
to be ‘crisp’ or ‘precise’, an assumption that is quite often violated in
reality. However, forecasting of such data can be carried out
through fuzzy time-series analysis. This article presents an improved
method of forecasting based on L--R
fuzzy sets as membership functions. As an illustration, the methodology is
employed for forecasting India's total foodgrain production. For the data
under consideration, the superiority of the proposed method over other
competing methods is demonstrated, with respect to both modelling and
forecasting, on the basis of mean square error and average relative error
criteria. Finally,
out-of-sample forecasts are also obtained.
Journal: Journal of Applied Statistics
Pages: 1128-1139
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1092111
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092111
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1128-1139
Template-Type: ReDIF-Article 1.0
Author-Name: Ayça Çakmak Pehlivanlı
Author-X-Name-First: Ayça Çakmak
Author-X-Name-Last: Pehlivanlı
Title: A novel feature selection scheme for high-dimensional data sets: four-Staged Feature Selection
Abstract:
Classification of high-dimensional data sets is a major challenge for
statistical learning and data mining algorithms. To effectively apply
classification methods to high-dimensional data sets, feature selection is
an indispensable pre-processing step of the learning process. In this
study, we consider the problem of constructing an effective feature
selection and classification scheme for data sets that have a small sample
size and a large number of features. A novel feature selection approach,
named four-Staged Feature Selection, has been proposed to overcome the
high-dimensional data classification problem by selecting informative
features. The proposed method first selects candidate features with a
number of filtering methods based on different metrics, and then applies
semi-wrapper, union and voting stages, respectively, to obtain the final
feature subsets. Several statistical learning and data mining methods have
been carried out to verify the efficiency of the selected features. In
order to test the adequacy of the proposed method, 10 different microarray
data sets are employed due to their high number of features and small
sample size.
Journal: Journal of Applied Statistics
Pages: 1140-1154
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1092112
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092112
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1140-1154
Template-Type: ReDIF-Article 1.0
Author-Name: E.F. Saraiva
Author-X-Name-First: E.F.
Author-X-Name-Last: Saraiva
Author-Name: A.K. Suzuki
Author-X-Name-First: A.K.
Author-X-Name-Last: Suzuki
Author-Name: F. Louzada
Author-X-Name-First: F.
Author-X-Name-Last: Louzada
Author-Name: L.A. Milan
Author-X-Name-First: L.A.
Author-X-Name-Last: Milan
Title: Partitioning gene expression data by data-driven Markov chain Monte Carlo
Abstract:
In this paper we introduce a Bayesian mixture model with an unknown number
of components for partitioning gene expression data. Inferences about all
the unknown parameters involved are made by using the proposed data-driven
Markov chain Monte Carlo. This algorithm is essentially a
Metropolis--Hastings within Gibbs sampling. The Metropolis--Hastings step
changes the number of partitions k within its neighborhood using a pair of
split-merge moves. Our splitting strategy is data-driven: allocation
probabilities are calculated from the marginal likelihood function of the
previously allocated observations. Conditional on k, the partition labels
are updated via Gibbs sampling. The two main advantages of the proposed
algorithm are that it is easy to implement and that the acceptance
probability for split-merge moves depends only on the observed data. We
examine the performance of the
proposed algorithm on simulated data and then analyze two publicly
available gene expression data sets.
Journal: Journal of Applied Statistics
Pages: 1155-1173
Issue: 6
Volume: 43
Year: 2016
Month: 5
X-DOI: 10.1080/02664763.2015.1092113
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092113
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:6:p:1155-1173
Template-Type: ReDIF-Article 1.0
Author-Name: Getachew A. Dagne
Author-X-Name-First: Getachew A.
Author-X-Name-Last: Dagne
Title: A growth mixture Tobit model: application to AIDS studies
Abstract:
This paper presents an alternative analysis approach to modeling data
where a lower detection limit (LOD) and unobserved population
heterogeneity exist in a longitudinal data set. Longitudinal data on viral
loads in HIV/AIDS studies, for instance, show strong positive skewness and
left-censoring. Normalizing such data using a logarithmic transformation
seems to be unsuccessful. An alternative to such a transformation is to
use a finite mixture model which is suitable for analyzing data which have
skewed or multi-modal distributions. Little work has been done to take
these features of longitudinal data into account simultaneously. This
paper develops a growth mixture Tobit model that deals with a LOD and
heterogeneity among growth trajectories. The proposed methods are
illustrated using simulated and real data from an AIDS clinical study.
Journal: Journal of Applied Statistics
Pages: 1174-1185
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1092114
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092114
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1174-1185
Template-Type: ReDIF-Article 1.0
Author-Name: Tobias Voigt
Author-X-Name-First: Tobias
Author-X-Name-Last: Voigt
Author-Name: Roland Fried
Author-X-Name-First: Roland
Author-X-Name-Last: Fried
Author-Name: Wolfgang Rhode
Author-X-Name-First: Wolfgang
Author-X-Name-Last: Rhode
Author-Name: Fabian Temme
Author-X-Name-First: Fabian
Author-X-Name-Last: Temme
Title: Distance-based variable generation with applications to the FACT experiment
Abstract:
We introduce a new way to construct variables for classification in a
setting of astronomy. The newly constructed variables complement the
currently used Hillas parameters and are specifically designed to improve
the classification. They are based on fitting elliptic or skewed bivariate
distributions to images gathered by imaging atmospheric Cherenkov
telescopes and evaluating the distance between the observed and the fitted
distribution. As distance measures we use the Chi-square distance, the
Kullback--Leibler divergence and the Hellinger distance. The new variables
lead to an improved classification in terms of misclassification errors.
Journal: Journal of Applied Statistics
Pages: 1186-1197
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1092115
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1186-1197
Template-Type: ReDIF-Article 1.0
Author-Name: Oscar O. Melo
Author-X-Name-First: Oscar O.
Author-X-Name-Last: Melo
Author-Name: Jorge Mateu
Author-X-Name-First: Jorge
Author-X-Name-Last: Mateu
Author-Name: Carlos E. Melo
Author-X-Name-First: Carlos E.
Author-X-Name-Last: Melo
Title: A generalised linear space--time autoregressive model with space--time autoregressive disturbances
Abstract:
We present a solution to problems where the response variable is a count,
a rate or binary using a generalised linear space--time autoregressive
model with space--time autoregressive disturbances (GLSTARAR). The
possibility to test the fixed effect specification against the random
effect specification of the panel data model is extended to include
space--time error autocorrelation or a space--time lagged dependent
variable. Space--time generalised estimating equations are used to estimate
the spatio-temporal parameters in the model. We also present a measure of
goodness of fit, and show the pseudo-best linear unbiased predictor for
prediction purposes. Additionally, we propose a joint space--time
modelling of mean and dispersion to give a solution when the variance is
not constant. In the application, we use social, economic, geographic and
state presence variables for 32 Colombian departments in order to analyse
the relationship between the number of armed actions (AAs) per
1000 km² committed by the guerrillas of the FARC-EP and ELN during the
years 2003--2009, and a
set of covariates given by attention rate to victims of violence, forced
displacement-households expelled, forced displacement-households received,
total armed confrontations per year, number of AAs by military forces and
percentage of people living in urban area.
Journal: Journal of Applied Statistics
Pages: 1198-1225
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1092506
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092506
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1198-1225
Template-Type: ReDIF-Article 1.0
Author-Name: Chiara Bocci
Author-X-Name-First: Chiara
Author-X-Name-Last: Bocci
Author-Name: Emilia Rocco
Author-X-Name-First: Emilia
Author-X-Name-Last: Rocco
Title: Modelling the location decisions of manufacturing firms with a spatial point process approach
Abstract:
The paper explores how the increasing availability of spatial
micro-data, jointly with the diffusion of GIS software, makes it possible
to exploit micro-econometric methods based on stochastic spatial point
processes in order to understand the factors that may influence the
location decisions of new firms. By using the geographical coordinates of
the newborn firms, their spatial distribution is treated as a realization
of an inhomogeneous marked point process in continuous space, and the
effect of spatially varying factors on the location decisions is evaluated
by parametrically modelling the intensity of the process. The study is
motivated by the real issue of analysing the birth process of small and
medium manufacturing firms in Tuscany, an Italian region, and it shows
that the location choices of the new Tuscan firms are influenced on the
one hand by the availability of infrastructures and the level of
accessibility, and on the other by the presence and the characteristics of
the existing firms. Moreover, the effect of these factors varies with the
size and the level of technology of the new firms. Beyond the specific
Tuscan results, the study shows the potential of the described
micro-econometric approach for the analysis of the spatial dynamics of
firms.
Journal: Journal of Applied Statistics
Pages: 1226-1239
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1093612
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1093612
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1226-1239
Template-Type: ReDIF-Article 1.0
Author-Name: Yuzhu Tian
Author-X-Name-First: Yuzhu
Author-X-Name-Last: Tian
Author-Name: Manlai Tang
Author-X-Name-First: Manlai
Author-X-Name-Last: Tang
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Title: A class of finite mixture of quantile regressions with its applications
Abstract:
Mixtures of linear regression models provide a popular treatment for
modeling nonlinear regression relationships. The traditional estimation of
mixtures of regression models is based on a Gaussian error assumption. It
is well known that such an assumption is sensitive to outliers and extreme
values. To overcome this issue, a new class of finite mixture of quantile
regressions (FMQR) is proposed in this article. Compared with the existing
Gaussian mixture regression models, the proposed FMQR model can provide a
complete specification on the conditional distribution of response
variable for each component. From the likelihood point of view, the FMQR
model is equivalent to the finite mixture of regression models based on
errors following asymmetric Laplace distribution (ALD), which can be
regarded as an extension to the traditional mixture of regression models
with normal error terms. An EM algorithm is proposed to obtain the
parameter estimates of the FMQR model by combining a hierarchical
representation of the ALD. Finally, the iterated weighted least square
estimation for each mixture component of the FMQR model is derived.
Simulation studies are conducted to illustrate the finite sample
performance of the estimation procedure. Analysis of an aphid data set is
used to illustrate our methodologies.
Journal: Journal of Applied Statistics
Pages: 1240-1252
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1094035
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1094035
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1240-1252
Template-Type: ReDIF-Article 1.0
Author-Name: Steven B. Caudill
Author-X-Name-First: Steven B.
Author-X-Name-Last: Caudill
Author-Name: Franklin G. Mixon
Author-X-Name-First: Franklin G.
Author-X-Name-Last: Mixon
Title: Estimating class-specific parametric models using finite mixtures: an application to a hedonic model of wine prices
Abstract:
Hedonic price models are commonly used in the study of markets for various
goods, most notably those for wine, art, and jewelry. These models were
developed to estimate implicit prices of product attributes
within a given product class, where in the case of some
goods, such as wine, substantial product differentiation exists. To
address this issue, recent research on wine prices employs local
polynomial regression clustering (LPRC) for estimating regression models
under class uncertainty. This study demonstrates that a superior empirical
approach -- estimation of a mixture model -- is applicable to a hedonic
model of wine prices, provided only that the dependent variable in the
model is rescaled. The present study also catalogues several advantages of
estimating mixture models over LPRC modeling.
Journal: Journal of Applied Statistics
Pages: 1253-1261
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1094036
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1094036
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1253-1261
Template-Type: ReDIF-Article 1.0
Author-Name: Jimoh Olawale Ajadi
Author-X-Name-First: Jimoh Olawale
Author-X-Name-Last: Ajadi
Author-Name: Muhammad Riaz
Author-X-Name-First: Muhammad
Author-X-Name-Last: Riaz
Author-Name: Khalid Al-Ghamdi
Author-X-Name-First: Khalid
Author-X-Name-Last: Al-Ghamdi
Title: On increasing the sensitivity of mixed EWMA--CUSUM control charts for location parameter
Abstract:
A control chart is an important statistical technique used to
monitor the quality of a process. Shewhart control charts are used to
detect larger disturbances in the process parameters, whereas cumulative
sum (CUSUM) and exponential weighted moving average (EWMA) are meant for
smaller and moderate changes. In this study, we enhanced mixed EWMA--CUSUM
control charts with varying fast initial response (FIR) features and also
with a runs rule of two out of three successive points that fall above the
upper control limit. We investigate their run-length properties. The
proposed control charting schemes are compared with the existing
counterparts including classical CUSUM, classical EWMA, FIR CUSUM, FIR
EWMA, mixed EWMA--CUSUM, 2/3 modified EWMA, and 2/3 CUSUM control charting
schemes. A case study is presented for practical considerations using a
real data set.
Journal: Journal of Applied Statistics
Pages: 1262-1278
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1094453
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1094453
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1262-1278
Template-Type: ReDIF-Article 1.0
Author-Name: Guogen Shan
Author-X-Name-First: Guogen
Author-X-Name-Last: Shan
Title: Exact confidence intervals for randomized response strategies
Abstract:
For surveys with sensitive questions, randomized response sampling
strategies are often used to increase the response rate and encourage
participants to answer the sensitive question truthfully while their
privacy and confidentiality are protected. The proportion of responding
‘yes’ to the sensitive question is the parameter of
interest. Asymptotic confidence intervals for this proportion are
calculated from the limiting distribution of the test statistic, and are
traditionally used in practice for statistical inference. It is well known
that these intervals do not guarantee the coverage probability. For this
reason, we apply the exact approach, adjusting the critical value as in
[10], to construct the exact confidence interval of the proportion based
on the likelihood ratio test and three Wilson-type tests. Two randomized
response sampling strategies are studied: the Warner model and the
unrelated model. The exact interval based on the likelihood ratio test has
shorter average length than others when the probability of the sensitive
question is low. Exact Wilson intervals have good performance in other
cases. A real example from a survey study is utilized to illustrate the
application of these exact intervals.
Journal: Journal of Applied Statistics
Pages: 1279-1290
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1094454
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1094454
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1279-1290
Template-Type: ReDIF-Article 1.0
Author-Name: M. Roth
Author-X-Name-First: M.
Author-X-Name-Last: Roth
Author-Name: G. Jongbloed
Author-X-Name-First: G.
Author-X-Name-Last: Jongbloed
Author-Name: T.A. Buishand
Author-X-Name-First: T.A.
Author-X-Name-Last: Buishand
Title: Threshold selection for regional peaks-over-threshold data
Abstract:
A hurdle in the peaks-over-threshold approach for analyzing extreme values
is the selection of the threshold. A method is developed to reduce this
obstacle in the presence of multiple, similar data samples. This is for
instance the case in many environmental applications. The idea is to
combine threshold selection methods into a regional method. Regionalized
versions of the threshold stability and the mean excess plot are presented
as graphical tools for threshold selection. Moreover, quantitative
approaches based on the bootstrap distribution of the spatially averaged
Kolmogorov--Smirnov and Anderson--Darling test statistics are introduced.
It is demonstrated that the proposed regional method leads to an increased
sensitivity for too low thresholds, compared to methods that do not take
into account the regional information. The approach can be used for a wide
range of univariate threshold selection methods. We test the methods using
simulated data and present an application to rainfall data from the Dutch
water board Vallei en Veluwe.
Journal: Journal of Applied Statistics
Pages: 1291-1309
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1100589
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100589
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1291-1309
Template-Type: ReDIF-Article 1.0
Author-Name: Marcel de Toledo Vieira
Author-X-Name-First: Marcel de Toledo
Author-X-Name-Last: Vieira
Author-Name: Maria de Fátima Salgueiro
Author-X-Name-First: Maria de Fátima
Author-X-Name-Last: Salgueiro
Author-Name: Peter W. F. Smith
Author-X-Name-First: Peter W. F.
Author-X-Name-Last: Smith
Title: Investigating impacts of complex sampling on latent growth curve modelling
Abstract:
We investigate the impacts of complex sampling on point and standard error
estimates in latent growth curve modelling of survey data. Methodological
issues are illustrated with empirical evidence from the analysis of
longitudinal data on life satisfaction trajectories using data from the
British Household Panel Survey, a national representative survey in Great
Britain. A multi-process second-order latent growth curve model with
conditional linear growth is used to study variation in the two perceived
life satisfaction latent factors considered. The benefits of accounting
for the complex survey design are considered, including obtaining unbiased
point and standard error estimates, and therefore correctly specified
confidence intervals and statistical tests. We conclude that, even for the
rather elaborated longitudinal data models that were considered,
estimation procedures are affected by variance-inflating impacts of
complex sampling.
Journal: Journal of Applied Statistics
Pages: 1310-1321
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1100590
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1310-1321
Template-Type: ReDIF-Article 1.0
Author-Name: Husam Awni Bayoud
Author-X-Name-First: Husam Awni
Author-X-Name-Last: Bayoud
Title: Testing the similarity of two normal populations with application to the bioequivalence problem
Abstract:
The problem of testing the similarity of two normal populations is
reconsidered, in this article, from a nonclassical point of view. We
introduce a test statistic based on the maximum likelihood estimate of
Weitzman's overlapping coefficient. Simulated critical points are provided
for the proposed test for various sample sizes and significance levels.
Statistical powers of the proposed test are computed via simulation
studies and compared to those of the existing tests. Furthermore, Type-I
error robustness of the proposed and the existing tests are studied via
simulation studies when the underlying distributions are non-normal. Two
data sets are analyzed for illustration purposes. Finally, the proposed
test has been implemented to assess the bioequivalence of two drug
formulations.
Journal: Journal of Applied Statistics
Pages: 1322-1334
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1100591
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100591
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1322-1334
Template-Type: ReDIF-Article 1.0
Author-Name: Firoozeh Rivaz
Author-X-Name-First: Firoozeh
Author-X-Name-Last: Rivaz
Title: Optimal network design for Bayesian spatial prediction of multivariate non-Gaussian environmental data
Abstract:
This paper deals with the problem of increasing air pollution monitoring
stations in Tehran city for efficient spatial prediction. As the data are
multivariate and skewed, we introduce two multivariate skew models through
developing the univariate skew Gaussian random field proposed by Zareifard
and Jafari Khaledi [21]. These models provide extensions of the linear
model of coregionalization for non-Gaussian data. In the Bayesian
framework, the optimal network design is found based on the maximum
entropy criterion. A Markov chain Monte Carlo algorithm is developed to
implement posterior inference. Finally, the applicability of two proposed
models is demonstrated by analyzing an air pollution data set.
Journal: Journal of Applied Statistics
Pages: 1335-1348
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1100592
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100592
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1335-1348
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Lasek
Author-X-Name-First: Jan
Author-X-Name-Last: Lasek
Author-Name: Zoltán Szlávik
Author-X-Name-First: Zoltán
Author-X-Name-Last: Szlávik
Author-Name: Marek Gagolewski
Author-X-Name-First: Marek
Author-X-Name-Last: Gagolewski
Author-Name: Sandjai Bhulai
Author-X-Name-First: Sandjai
Author-X-Name-Last: Bhulai
Title: How to improve a team's position in the FIFA ranking? A simulation study
Abstract:
In this paper, we study the efficacy of the official ranking for
international football teams compiled by FIFA, the body governing football
competition around the globe. We present strategies for improving a team's
position in the ranking. By combining several statistical techniques, we
derive an objective function in a decision problem of optimal scheduling
of future matches. The presented results display how a team's position can
be improved. Along the way, we compare the official procedure to the
famous Elo rating system. Although it originates from chess, it has been
successfully tailored to ranking football teams as well.
Journal: Journal of Applied Statistics
Pages: 1349-1368
Issue: 7
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1100593
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100593
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:7:p:1349-1368
Template-Type: ReDIF-Article 1.0
Author-Name: Baba B. Alhaji
Author-X-Name-First: Baba B.
Author-X-Name-Last: Alhaji
Author-Name: Hongsheng Dai
Author-X-Name-First: Hongsheng
Author-X-Name-Last: Dai
Author-Name: Yoshiko Hayashi
Author-X-Name-First: Yoshiko
Author-X-Name-Last: Hayashi
Author-Name: Veronica Vinciotti
Author-X-Name-First: Veronica
Author-X-Name-Last: Vinciotti
Author-Name: Andrew Harrison
Author-X-Name-First: Andrew
Author-X-Name-Last: Harrison
Author-Name: Berthold Lausen
Author-X-Name-First: Berthold
Author-X-Name-Last: Lausen
Title: Bayesian analysis for mixtures of discrete distributions with a non-parametric component
Abstract:
Bayesian finite mixture modelling is a flexible parametric modelling
approach for classification and density fitting. Many areas of application
require distinguishing a signal from a
noise component. In practice, it is often difficult to
justify a specific distribution for the signal component;
therefore, the signal distribution is usually further
modelled via a mixture of distributions. However, modelling the
signal as a mixture of distributions is computationally
non-trivial due to the difficulties in justifying the exact number of
components to be used and due to the label switching problem. This paper
proposes the use of a non-parametric distribution to model the
signal component. We consider the case of discrete data
and show how this new methodology leads to more accurate parameter
estimation and smaller false non-discovery rate. Moreover, it does not
incur the label switching problem. We show an application of the method to
data generated by ChIP-sequencing experiments.
Journal: Journal of Applied Statistics
Pages: 1369-1385
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1100594
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100594
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1369-1385
Template-Type: ReDIF-Article 1.0
Author-Name: Fedya Telmoudi
Author-X-Name-First: Fedya
Author-X-Name-Last: Telmoudi
Author-Name: Mohamed EL Ghourabi
Author-X-Name-First: Mohamed
Author-X-Name-Last: EL Ghourabi
Author-Name: Mohamed Limam
Author-X-Name-First: Mohamed
Author-X-Name-Last: Limam
Title: On conditional risk estimation considering model risk
Abstract:
Usually, parametric procedures used for conditional variance modelling are
associated with model risk. Model risk may affect the volatility and
conditional value at risk estimation process either due to estimation or
misspecification risks. Hence, non-parametric artificial intelligence
models can be considered as alternative models given that they do not rely
on an explicit form of the volatility. In this paper, we consider the
least-squares support vector regression (LS-SVR), weighted LS-SVR and
fixed-size LS-SVR models in order to handle the problem of conditional
risk estimation taking into account issues of model risk. A simulation
study and a real application show the performance of proposed volatility
and VaR models.
Journal: Journal of Applied Statistics
Pages: 1386-1399
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1100595
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1100595
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1386-1399
Template-Type: ReDIF-Article 1.0
Author-Name: M.J. Ershadi
Author-X-Name-First: M.J.
Author-X-Name-Last: Ershadi
Author-Name: R. Noorossana
Author-X-Name-First: R.
Author-X-Name-Last: Noorossana
Author-Name: S.T.A Niaki
Author-X-Name-First: S.T.A
Author-X-Name-Last: Niaki
Title: Economic-statistical design of simple linear profiles with variable sampling interval
Abstract:
Control charts are statistical tools to monitor a process or a product.
However, some processes cannot be controlled by monitoring a single
characteristic; instead, they need to be monitored using profiles.
Economic-statistical design of profile monitoring means determining the
parameters of a profile monitoring scheme such that total costs are
minimized while statistical measures maintain proper values. Since varying
the sampling interval usually increases the effectiveness of profile
monitoring, the economic-statistical design of variable sampling interval
(VSI) profile monitoring is investigated in this paper. An extended
Lorenzen--Vance function is used for modeling total costs in the VSI
model, where the average time to signal is employed as the statistical
measure of the obtained profile monitoring scheme. The two sampling
intervals, the number of set points, and the parameters of the control
charts used in profile monitoring are the variables obtained through the
economic-statistical model. A genetic algorithm is employed to optimize
the model and an experimental design approach is used for tuning its
parameters. Sensitivity analysis and numerical results indicate
satisfactory performance for the proposed model.
Journal: Journal of Applied Statistics
Pages: 1400-1418
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1103705
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1103705
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1400-1418
Template-Type: ReDIF-Article 1.0
Author-Name: J. Machalová
Author-X-Name-First: J.
Author-X-Name-Last: Machalová
Author-Name: K. Hron
Author-X-Name-First: K.
Author-X-Name-Last: Hron
Author-Name: G.S. Monti
Author-X-Name-First: G.S.
Author-X-Name-Last: Monti
Title: Preprocessing of centred logratio transformed density functions using smoothing splines
Abstract:
With large-scale database systems, statistical analysis of data, occurring
in the form of probability distributions, becomes an important task in
explorative data analysis. Nevertheless, due to specific properties of
density functions, the proper statistical treatment of these data still
represents a challenging task in functional data analysis. Namely, the
usual L2 metric does not fully account for the relative character of the
information carried by density functions; instead, their geometrical
features are captured by Bayes spaces of measures. The easiest possibility
of expressing density functions in an L2 space is to use the
centred logratio transformation, even though this results in functional
data with a constant integral constraint that needs to be taken into
account in further analysis. While theoretical background for reasonable
analysis of density functions is already provided comprehensively by Bayes
spaces themselves, preprocessing issues still need to be developed. The
aim of this paper is to introduce optimal smoothing splines for centred
logratio transformed density functions that take all their specific
features into account and provide a concise methodology for reasonable
preprocessing of raw (discretized) distributional observations.
Theoretical developments are illustrated with a real-world data set from
official statistics and with a simulation study.
Journal: Journal of Applied Statistics
Pages: 1419-1435
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1103706
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1103706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1419-1435
Template-Type: ReDIF-Article 1.0
Author-Name: Ndéné Ka
Author-X-Name-First: Ndéné
Author-X-Name-Last: Ka
Author-Name: Stéphane Mussard
Author-X-Name-First: Stéphane
Author-X-Name-Last: Mussard
Title: ℓ1 regressions: Gini estimators for fixed effects panel data
Abstract:
Panel data, frequently employed in empirical investigations, yield
estimators that are strongly biased in the presence of atypical observations.
The aim of this work is to propose a Gini regression
for panel data. It is shown that the fixed-effects within-group Gini
estimator is more robust than the ordinary least squares one when the data
are contaminated by outliers. This semi-parametric Gini estimator is
proven to be a U-statistic and is consequently
asymptotically normal.
Journal: Journal of Applied Statistics
Pages: 1436-1446
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1103707
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1103707
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1436-1446
Template-Type: ReDIF-Article 1.0
Author-Name: Miran A. Jaffa
Author-X-Name-First: Miran A.
Author-X-Name-Last: Jaffa
Author-Name: Mulugeta Gebregziabher
Author-X-Name-First: Mulugeta
Author-X-Name-Last: Gebregziabher
Author-Name: Deirdre K. Luttrell
Author-X-Name-First: Deirdre K.
Author-X-Name-Last: Luttrell
Author-Name: Louis M. Luttrell
Author-X-Name-First: Louis M.
Author-X-Name-Last: Luttrell
Author-Name: Ayad A. Jaffa
Author-X-Name-First: Ayad A.
Author-X-Name-Last: Jaffa
Title: Multivariate generalized linear mixed models with random intercepts to analyze cardiovascular risk markers in type-1 diabetic patients
Abstract:
Statistical approaches tailored to analyzing longitudinal data that have
multiple outcomes with different distributions are scarce. This paucity is
due to the non-availability of multivariate distributions that jointly
model outcomes with different distributions other than the multivariate
normal. A plethora of research has been done on the specific combination
of binary-Gaussian bivariate outcomes but a more general approach that
allows other mixtures of distributions for multiple longitudinal outcomes
has not been thoroughly demonstrated and examined. Here, we study a
multivariate generalized linear mixed models approach that jointly models
multiple longitudinal outcomes with different combinations of
distributions and incorporates the correlations between the various
outcomes through separate yet correlated random intercepts. Every outcome
is linked to the set of covariates through a proper link function that
allows the incorporation and joint modeling of different distributions. A
novel application was demonstrated on a cohort study of Type-1 diabetic
patients to jointly model a mix of longitudinal cardiovascular outcomes
and to explore for the first time the effect of glycemic control
treatment, plasma prekallikrein biomarker, gender and age on
cardiovascular risk factors collectively.
Journal: Journal of Applied Statistics
Pages: 1447-1464
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1103708
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1103708
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1447-1464
Template-Type: ReDIF-Article 1.0
Author-Name: Sylvain Robbiano
Author-X-Name-First: Sylvain
Author-X-Name-Last: Robbiano
Author-Name: Matthieu Saumard
Author-X-Name-First: Matthieu
Author-X-Name-Last: Saumard
Author-Name: Michel Curé
Author-X-Name-First: Michel
Author-X-Name-Last: Curé
Title: Improving prediction performance of stellar parameters using functional models
Abstract:
This paper investigates the problem of prediction of stellar parameters,
based on the star's electromagnetic spectrum. Knowledge of these
parameters allows one to infer the evolutionary state of the star. From a
statistical point of view, the spectra of different stars can be
represented as functional data. Therefore, we propose a two-step procedure
that decomposes the spectra in a functional basis and combines this with a
regression method for prediction. We also use a bootstrap methodology to
build prediction intervals for the stellar parameters. A practical
application is also provided to illustrate the numerical performance of
our approach.
Journal: Journal of Applied Statistics
Pages: 1465-1476
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1106448
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1106448
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1465-1476
Template-Type: ReDIF-Article 1.0
Author-Name: Soumya Roy
Author-X-Name-First: Soumya
Author-X-Name-Last: Roy
Author-Name: Chiranjit Mukhopadhyay
Author-X-Name-First: Chiranjit
Author-X-Name-Last: Mukhopadhyay
Title: Bayesian D-optimal Accelerated Life Test plans for series systems with competing exponential causes of failure
Abstract:
This paper provides methods of obtaining Bayesian
D-optimal Accelerated Life Test (ALT) plans for series
systems with independent exponential component lives under the Type-I
censoring scheme. Two different Bayesian D-optimality
design criteria are considered. For both the criteria, first optimal
designs for a given number of experimental points are found by solving a
finite-dimensional constrained optimization problem. Next, the global
optimality of such an ALT plan is ensured by applying the General
Equivalence Theorem. A detailed sensitivity analysis is also carried out
to investigate the effect of different planning inputs on the resulting
optimal ALT plans. Furthermore, these Bayesian optimal plans are also
compared with the corresponding (frequentist) locally
D-optimal ALT plans.
Journal: Journal of Applied Statistics
Pages: 1477-1493
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1106449
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1106449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1477-1493
Template-Type: ReDIF-Article 1.0
Author-Name: Richard J. Cebula
Author-X-Name-First: Richard J.
Author-X-Name-Last: Cebula
Author-Name: Fiorentina Angjellari-Dajci
Author-X-Name-First: Fiorentina
Author-X-Name-Last: Angjellari-Dajci
Author-Name: Russell Kashian
Author-X-Name-First: Russell
Author-X-Name-Last: Kashian
Title: Are there interregional differences in the response of cigarette smoking to state cigarette excise taxes in the USA? Exploratory analysis
Abstract:
Within the context of the period fixed-effects model, this study uses a
2002--2009 state-level panel data set of the USA to investigate the
relative impact of state cigarette excise taxation across the nation in
reducing cigarette smoking. In particular, by focusing upon the state
cigarette excise taxation levels within each of the nine US Census
Divisions, this study investigates whether there are inter-regional
differences in the rate of responsiveness of cigarette consumption to
increased state cigarette taxes. The initial empirical estimates reveal
that although the per capita number of packs of cigarettes smoked annually
is a decreasing function of the state cigarette excise tax in all nine
Census Divisions, the relative response of cigarette smoking to state
cigarette tax increases varies considerably from one division to the next.
Reinforcing this conclusion, in one specification of the model, the response
of the number of packs of cigarettes smoked to a higher state cigarette tax
is statistically significant and negative in only eight of the nine Census
Divisions. Furthermore, when cigarette smoking is measured in terms of the
percentage of the population classified as smokers, interregional
differentials in the response of smokers to higher state cigarette taxes
are much greater. Thus, there is evidence that cigarette excise taxation
exercises rather different impacts on the propensity to smoke across
Census Regions.
Journal: Journal of Applied Statistics
Pages: 1494-1507
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1106451
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1106451
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1494-1507
Template-Type: ReDIF-Article 1.0
Author-Name: Luz Marina Rondon
Author-X-Name-First: Luz Marina
Author-X-Name-Last: Rondon
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: Bayesian analysis of generalized elliptical semi-parametric models
Abstract:
In this paper, we study the statistical inference based on the Bayesian
approach for regression models with the assumption that independent
additive errors follow normal, Student-t, slash,
contaminated normal, Laplace or symmetric hyperbolic distribution, where
both location and dispersion parameters of the response variable
distribution include nonparametric additive components approximated by
B-splines. This class of models provides a rich set of
symmetric distributions for the model error. Some of these distributions
have heavier or lighter tails than the normal as well as different levels
of kurtosis. In order to draw samples of the posterior distribution of the
interest parameters, we propose an efficient Markov Chain Monte Carlo
(MCMC) algorithm, which combines Gibbs sampler and Metropolis--Hastings
algorithms. The performance of the proposed MCMC algorithm is assessed
through simulation experiments. We apply the proposed methodology to a
real data set. The proposed methodology is implemented in the R package
BayesGESM using the function gesm().
Journal: Journal of Applied Statistics
Pages: 1508-1524
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1109070
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1109070
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1508-1524
Template-Type: ReDIF-Article 1.0
Author-Name: Junying Chen
Author-X-Name-First: Junying
Author-X-Name-Last: Chen
Author-Name: Haoyu Zeng
Author-X-Name-First: Haoyu
Author-X-Name-Last: Zeng
Author-Name: Fei Yang
Author-X-Name-First: Fei
Author-X-Name-Last: Yang
Title: Parameter estimation for employee stock ownerships preference experimental design
Abstract:
The experimental design method is a pivotal factor for the reliability of
parameter estimation in the discrete choice model. The traditional
orthogonal design is widely used; several new experimental design methods,
such as the D-efficient and Bayesian D-efficient designs, have been
proposed recently, but insufficient empirical research has been conducted
on their effectiveness. This study finds that ESOS adoption has a
statistically insignificant effect on productivity growth. This study is
motivated by the lack of documented evidence on the effect of Chinese
ESOSs. This study contributes to the body of knowledge by documenting
evidence on the impact of ESOSs on productivity enhancement and earnings
management practices. The existing literature on the productivity effect
and the earnings-management effect of ESOSs falls under two isolated
strands of research; no documented studies have investigated these two
issues simultaneously using the same dataset. As a result, the existing
literature fails to identify which of these two countervailing effects of
ESOSs is more dominant.
Journal: Journal of Applied Statistics
Pages: 1525-1540
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1117583
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117583
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1525-1540
Template-Type: ReDIF-Article 1.0
Author-Name: Philip Pallmann
Author-X-Name-First: Philip
Author-X-Name-Last: Pallmann
Author-Name: Ludwig A. Hothorn
Author-X-Name-First: Ludwig A.
Author-X-Name-Last: Hothorn
Title: Analysis of means: a generalized approach using R
Abstract:
Papers on the analysis of means (ANOM) have been circulating in the
quality control literature for decades, routinely describing it as a
stand-alone statistical concept. However, we clarify that ANOM should
rather be regarded as a special case of a much more universal approach
known as multiple contrast tests (MCTs). Perceiving ANOM as a
grand-mean-type MCT paves the way for implementing it in the open-source
software R. We give a brief tutorial on how to exploit R's versatility and
introduce the R package ANOM for drawing the
familiar decision charts. Beyond that, we illustrate two practical aspects
of data analysis with ANOM: firstly, we compare merits and drawbacks of
ANOM-type MCTs and the ANOVA F-test and assess their
respective statistical powers, and secondly, we show that the benefit of
using critical values from multivariate t-distributions
for ANOM instead of simple Bonferroni quantiles is oftentimes negligible.
Journal: Journal of Applied Statistics
Pages: 1541-1560
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1117584
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117584
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1541-1560
Template-Type: ReDIF-Article 1.0
Author-Name: Silvia Bozza
Author-X-Name-First: Silvia
Author-X-Name-Last: Bozza
Author-Name: Franco Taroni
Author-X-Name-First: Franco
Author-X-Name-Last: Taroni
Title: Posterior likelihood ratios for evaluation of forensic trace evidence given a two-level model on the data by Alberink et al. (2013)
Journal: Journal of Applied Statistics
Pages: 1561-1563
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1106450
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1106450
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1561-1563
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Holgersson
Author-X-Name-First: Thomas
Author-X-Name-Last: Holgersson
Author-Name: Louise Nordström
Author-X-Name-First: Louise
Author-X-Name-Last: Nordström
Author-Name: Özge Öner
Author-X-Name-First: Özge
Author-X-Name-Last: Öner
Title: On regression modelling with dummy variables versus separate regressions per group: comment on Holgersson et al.
Journal: Journal of Applied Statistics
Pages: 1564-1565
Issue: 8
Volume: 43
Year: 2016
Month: 6
X-DOI: 10.1080/02664763.2015.1092711
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1092711
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:8:p:1564-1565
Template-Type: ReDIF-Article 1.0
Author-Name: F.A. Alawadhi
Author-X-Name-First: F.A.
Author-X-Name-Last: Alawadhi
Author-Name: D. Alhulail
Author-X-Name-First: D.
Author-X-Name-Last: Alhulail
Title: Bayesian change points analysis for earthquakes body wave magnitude
Abstract:
Recently, the world has experienced an increased number of major
earthquakes. The Zagros belt is among the most seismically active mountain
ranges in the world. Due to Kuwait's location in the southwest of the
Zagros belt, it is affected by relative tectonic movements in the
neighboring region. It is vital to assess the Zagros seismic risks in
Kuwait using recent data and coordinate with the competent authorities to
reduce those risks. Using the body wave magnitude (Mb) data collected in
Kuwait, we want to assess the recent changes in the magnitude of
earthquakes and their variation in Kuwait's vicinity. We built a
change-point model to detect significant changes in its parameters. This
paper applies a hierarchical Bayesian technique and derives the marginal
posterior density function for the Mb. Our interest lies in identifying a
shift in the mean at a single or multiple change points as well as
changes in the variation. Building upon the model and its parameters for
the 2002--2003 data, we detected three change points. The first, second
and third change points occurred in September 2002, April 2003 and August
2003, respectively.
Journal: Journal of Applied Statistics
Pages: 1567-1582
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117585
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117585
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1567-1582
Template-Type: ReDIF-Article 1.0
Author-Name: Mahmood Ul Hassan
Author-X-Name-First: Mahmood
Author-X-Name-Last: Ul Hassan
Author-Name: Pär Stockhammar
Author-X-Name-First: Pär
Author-X-Name-Last: Stockhammar
Title: Fitting probability distributions to economic growth: a maximum likelihood approach
Abstract:
The growth rate of the gross domestic product (GDP) usually exhibits
heteroscedasticity, asymmetry and fat tails. In this study three important
and significantly heteroscedastic GDP series are examined. A normal, a
normal-mixture (NM), a normal-asymmetric Laplace and a Student's
t-asymmetric Laplace (TAL) mixture distribution are considered for
comparing the distributional fit of the GDP growth series after removing
heteroscedasticity. The parameters of the distributions have been
estimated using the maximum likelihood method. Based on the results of
different accuracy measures, goodness-of-fit tests and plots, we find
that in the case of asymmetric, heteroscedastic and highly leptokurtic
data the TAL distribution fits better than the alternatives. In the case
of asymmetric, heteroscedastic but less leptokurtic data the NM fit is
superior. Furthermore, a simulation study has been carried out to obtain
standard errors for the estimated parameters. The results of this study
might be used in e.g. density forecasting of GDP growth series or to
compare different economies.
Journal: Journal of Applied Statistics
Pages: 1583-1603
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117586
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117586
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1583-1603
Template-Type: ReDIF-Article 1.0
Author-Name: Alexander Ludwig
Author-X-Name-First: Alexander
Author-X-Name-Last: Ludwig
Title: On the usability of the fluctuation test statistic to identify multiple cointegration break points
Abstract:
The fluctuation test suggested by Hansen and Johansen [Some tests
for parameter constancy in cointegrated VAR models, Econometrics
J. 2 (1999), pp. 306--333] intends to distinguish between the presence of
zero and one break in cointegration relations. In this article, we provide
evidence by Monte Carlo simulations that it also serves as a graphical
device to detect even multiple break locations. It suffices to consider a
simplified and easy-to-implement version of the original fluctuation test.
Its break detection performance depends on the sign of change in
cointegration parameters and the break height. The sign issue can be
approached successfully by a backward application of the test statistic.
If breaks are observable, the break locations are detected at the true
location on average. We apply the graphical procedure to assess the
cointegration of bond yields of Spain, Italy and Portugal with German
yields for the period 1995--2013, which is surprisingly supported by the
trace test. However, the recursive cointegration approach shows that a
stable relationship with German yields is only present for sub-periods
between the introduction of the Euro and the global financial crisis,
which is in line with expectations. The statistical robustness of these results
is supported by a forward and backward application of the cointegration
breakdown test by Andrews and Kim [Tests for cointegration
breakdown over a short time period, J. Bus. Econom. Stat. 24
(2006), pp. 379--394].
Journal: Journal of Applied Statistics
Pages: 1604-1624
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117587
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117587
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1604-1624
Template-Type: ReDIF-Article 1.0
Author-Name: Austin L. Hand
Author-X-Name-First: Austin L.
Author-X-Name-Last: Hand
Author-Name: John A. Scott
Author-X-Name-First: John A.
Author-X-Name-Last: Scott
Author-Name: Phil D. Young
Author-X-Name-First: Phil D.
Author-X-Name-Last: Young
Author-Name: James D. Stamey
Author-X-Name-First: James D.
Author-X-Name-Last: Stamey
Author-Name: Dean M. Young
Author-X-Name-First: Dean M.
Author-X-Name-Last: Young
Title: Bayesian adaptive two-stage design for determining person-time in Phase II clinical trials with Poisson data
Abstract:
Adaptive clinical trial designs can often improve drug-study efficiency by
utilizing data obtained during the course of the trial. We present a novel
Bayesian two-stage adaptive design for Phase II clinical trials with
Poisson-distributed outcomes that allows for person-observation-time
adjustments for early termination due to either futility or efficacy. Our
design is motivated by the adaptive trial from [9], which uses binomial
data. Although many frequentist and Bayesian two-stage adaptive designs
for count data have been proposed in the literature, many designs do not
allow for person-time adjustments after the first stage. This restriction
limits flexibility in the study design. However, our proposed design
allows for such flexibility by basing the second-stage person-time on the
first-stage observed-count data. We demonstrate the implementation of our
Bayesian predictive adaptive two-stage design using a hypothetical Phase
II trial of Immune Globulin (Intravenous).
Journal: Journal of Applied Statistics
Pages: 1625-1635
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117588
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117588
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1625-1635
Template-Type: ReDIF-Article 1.0
Author-Name: J. A. Achcar
Author-X-Name-First: J. A.
Author-X-Name-Last: Achcar
Author-Name: N. Davarzani
Author-X-Name-First: N.
Author-X-Name-Last: Davarzani
Author-Name: R. M. Souza
Author-X-Name-First: R. M.
Author-X-Name-Last: Souza
Title: Basu--Dhar bivariate geometric distribution in the presence of covariates and censored data: a Bayesian approach
Abstract:
In this paper, we introduce classical and Bayesian approaches for the
Basu--Dhar bivariate geometric distribution in the presence of covariates
and censored data. This distribution is considered for the analysis of
bivariate lifetime as an alternative to some existing bivariate lifetime
distributions assuming continuous lifetimes as the Block and Basu or
Marshall and Olkin bivariate distributions. Maximum likelihood and
Bayesian estimators are presented. Two examples are considered to
illustrate the proposed methodology: an example with simulated data and an
example with medical bivariate lifetime data.
Journal: Journal of Applied Statistics
Pages: 1636-1648
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117589
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117589
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1636-1648
Template-Type: ReDIF-Article 1.0
Author-Name: Zheng Xu
Author-X-Name-First: Zheng
Author-X-Name-Last: Xu
Title: An alternative circular smoothing method to nonparametric estimation of periodic functions
Abstract:
This article provides alternative circular smoothing methods for
nonparametric estimation of periodic functions. By treating the data as
'circular', we solve the 'boundary issue' that arises in
nonparametric estimation when the data are treated as 'linear'.
By redefining the distance metric and signed distance, we modify many
estimators used in situations involving periodic patterns. From the
perspective of nonparametric estimation of periodic
functions, we present examples in nonparametric estimation of
(1) a periodic function, (2) multiple periodic functions, (3) an evolving
function, (4) a periodically varying-coefficient model and (5) a
generalized linear model with a periodically varying coefficient. From the
perspective of 'circular statistics', we provide alternative
approaches to calculating the weighted average and evaluating
'linear/circular--linear/circular' association and
regression. Simulation studies and an empirical study of electricity price
index have been conducted to illustrate and compare our methods with other
methods in the literature.
Journal: Journal of Applied Statistics
Pages: 1649-1672
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117590
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1649-1672
Template-Type: ReDIF-Article 1.0
Author-Name: Hongjian Zhu
Author-X-Name-First: Hongjian
Author-X-Name-Last: Zhu
Author-Name: Dejian Lai
Author-X-Name-First: Dejian
Author-X-Name-Last: Lai
Author-Name: Nils P. Johnson
Author-X-Name-First: Nils P.
Author-X-Name-Last: Johnson
Title: Agreement between two diagnostic tests when accounting for test--retest variation: application to FFR versus iFR
Abstract:
In medicine, there are often two diagnostic tests that serve the same
purpose. Typically, one of the tests will have a lower diagnostic
performance but be less invasive, easier to perform, or cheaper.
Clinicians must assess the agreement between the tests while accounting
for test--retest variation in both techniques. In this paper, we
investigate a specific example from interventional cardiology, studying
the agreement between the fractional flow reserve and the instantaneous
wave-free ratio. We analyze potential definitions of the agreement
(accuracy) between the two tests and compare five families of statistical
estimators. We contrast their statistical behavior both theoretically and
using numerical simulations. Surprisingly for clinicians, seemingly
natural and equivalent definitions of the concept of agreement can lead to
discordant and even nonsensical estimates.
Journal: Journal of Applied Statistics
Pages: 1673-1689
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117591
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117591
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1673-1689
Template-Type: ReDIF-Article 1.0
Author-Name: Jaehee Kim
Author-X-Name-First: Jaehee
Author-X-Name-Last: Kim
Author-Name: Chulwoo Jeong
Author-X-Name-First: Chulwoo
Author-X-Name-Last: Jeong
Title: A Bayesian multiple structural change regression model with autocorrelated errors
Abstract:
This paper develops a new Bayesian approach to change-point modeling that
allows the number of change-points in the observed autocorrelated time
series to be unknown. The model we develop assumes that the number of
change-points has a truncated Poisson distribution. A genetic algorithm
is used to estimate a change-point model, which allows for structural
changes with autocorrelated errors. We focus considerable attention on the
construction of autocorrelated structure for each regime and for the
parameters that characterize each regime. Our techniques are found to work
well in simulations with a few change-points. An empirical analysis is
provided involving the annual flow of the Nile River and the monthly total
energy production in South Korea, yielding good estimates of the
structural change-points.
Journal: Journal of Applied Statistics
Pages: 1690-1705
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117592
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117592
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1690-1705
Template-Type: ReDIF-Article 1.0
Author-Name: Joshua N. Sampson
Author-X-Name-First: Joshua N.
Author-X-Name-Last: Sampson
Author-Name: Charles E. Matthews
Author-X-Name-First: Charles E.
Author-X-Name-Last: Matthews
Author-Name: Laurence S. Freedman
Author-X-Name-First: Laurence S.
Author-X-Name-Last: Freedman
Author-Name: Raymond J. Carroll
Author-X-Name-First: Raymond J.
Author-X-Name-Last: Carroll
Author-Name: Victor Kipnis
Author-X-Name-First: Victor
Author-X-Name-Last: Kipnis
Title: Methods to assess measurement error in questionnaires of sedentary behavior
Abstract:
Sedentary behavior has already been associated with mortality,
cardiovascular disease, and cancer. Questionnaires are an affordable tool
for measuring sedentary behavior in large epidemiological studies. Here,
we introduce and evaluate two statistical methods for quantifying
measurement error in questionnaires. Accurate estimates are needed for
assessing questionnaire quality. The two methods would be applied to
validation studies that measure a sedentary behavior by both questionnaire
and accelerometer on multiple days. The first method fits a reduced model
by assuming the accelerometer is without error, while the second method
fits a more complete model that allows both measures to have error.
Because accelerometers tend to be highly accurate, we show that ignoring
the accelerometer's measurement error can result in more accurate
estimates of measurement error in some scenarios. In this article, we
derive asymptotic approximations for the mean-squared error of the
estimated parameters from both methods, evaluate their dependence on study
design and behavior characteristics, and offer an R package so
investigators can make an informed choice between the two methods. We
demonstrate the difference between the two methods in a recent validation
study comparing previous day recalls to an accelerometer-based ActivPal.
Journal: Journal of Applied Statistics
Pages: 1706-1721
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117593
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117593
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1706-1721
Template-Type: ReDIF-Article 1.0
Author-Name: Trias Wahyuni Rakhmawati
Author-X-Name-First: Trias Wahyuni
Author-X-Name-Last: Rakhmawati
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Geert Verbeke
Author-X-Name-First: Geert
Author-X-Name-Last: Verbeke
Author-Name: Christel Faes
Author-X-Name-First: Christel
Author-X-Name-Last: Faes
Title: Local influence diagnostics for incomplete overdispersed longitudinal counts
Abstract:
We develop local influence diagnostics to detect influential subjects when
generalized linear mixed models are fitted to incomplete longitudinal
overdispersed count data. The focus is on the influence stemming from the
dropout model specification. In particular, the effect of small
perturbations around an MAR specification is examined. The method is
applied to data from a longitudinal clinical trial in epileptic patients.
The effect on models allowing for overdispersion is contrasted with that
on models that do not.
Journal: Journal of Applied Statistics
Pages: 1722-1737
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117594
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117594
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1722-1737
Template-Type: ReDIF-Article 1.0
Author-Name: M. Álvarez Hernández
Author-X-Name-First: M.
Author-X-Name-Last: Álvarez Hernández
Author-Name: A. Martín Andrés
Author-X-Name-First: A. Martín
Author-X-Name-Last: Andrés
Author-Name: I. Herranz Tejedor
Author-X-Name-First: I.
Author-X-Name-Last: Herranz Tejedor
Title: One-sided asymptotic inferences for a proportion
Abstract:
Two-sided asymptotic confidence intervals for an unknown proportion
p have been the subject of a great deal of literature.
Surprisingly, very few papers are devoted, as this one is, to the
one-tailed case, despite its great importance in practice and the fact
that its behavior usually differs from that of the two-tailed case. This
paper evaluates 47 methods and concludes that (1) the optimal
method is the classic Wilson method with a correction for continuity and
(2) a simpler option, almost as good as the first, is the new adjusted
Wald method (Wald's classic method applied to the data increased in the
values proposed by Borkowf: adding a single imaginary failure or success).
Journal: Journal of Applied Statistics
Pages: 1738-1752
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1117595
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1117595
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1738-1752
Template-Type: ReDIF-Article 1.0
Author-Name: Saima Afzal
Author-X-Name-First: Saima
Author-X-Name-Last: Afzal
Author-Name: Muhammad Mutahir Iqbal
Author-X-Name-First: Muhammad Mutahir
Author-X-Name-Last: Iqbal
Title: A new way to order independent components
Abstract:
Independent component analysis (ICA) is a relatively new computational
technique adopted by statisticians to analyze complex multidimensional
data, with the objective of separating the data into mutually independent
components. Quite often the main interest in conducting ICA is to identify
a small number of significant independent components (ICs) with which to
replace the original complex dimensions. For this, determining the order
of the identified ICs is a prerequisite. The area is not unaddressed, but
it deserves a careful revisiting, which is the subject matter of this
paper: it introduces a new method to order ICs. The proposed method is
based on a regression approach: it compares the magnitudes of the mixing
coefficients with the regression coefficients obtained by regressing the
original series on the ICs, and their compatibility determines the order.
Journal: Journal of Applied Statistics
Pages: 1753-1764
Issue: 9
Volume: 43
Year: 2016
Month: 7
X-DOI: 10.1080/02664763.2015.1120709
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1120709
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:9:p:1753-1764
Template-Type: ReDIF-Article 1.0
Author-Name: M. I. Sánchez-Rodríguez
Author-X-Name-First: M. I.
Author-X-Name-Last: Sánchez-Rodríguez
Author-Name: E. M. Sánchez-López
Author-X-Name-First: E. M.
Author-X-Name-Last: Sánchez-López
Author-Name: A. Marinas
Author-X-Name-First: A.
Author-X-Name-Last: Marinas
Author-Name: J. M. Caridad
Author-X-Name-First: J. M.
Author-X-Name-Last: Caridad
Author-Name: F. J. Urbano
Author-X-Name-First: F. J.
Author-X-Name-Last: Urbano
Author-Name: J. M. Marinas
Author-X-Name-First: J. M.
Author-X-Name-Last: Marinas
Title: Improving the estimations of fatty acids in several Andalusian PDO olive oils from NMR spectral data
Abstract:
The aim of this paper is to determine the fatty acid profile of diverse
Andalusian extra-virgin olive oils from different protected designations
of origin (PDO). The available data for the statistical multivariate
analysis have been obtained from gas chromatography (GC, used as classical
reference analytical technique) and nuclear magnetic resonance (NMR)
spectroscopy: ¹H-NMR and ¹³C-NMR (in the carbonyl, C-16 and aliphatic
carbon regions). The percentages of fatty acids approximated by the
above-mentioned chemical procedures are combined through a statistical
treatment that uses weighted averages to obtain the fatty acid profile
closest to the one provided by the GC reference technique, with weights
inversely proportional to measures of the calibration errors. In addition,
the work shows that the PDO of an olive oil conditions which NMR region
(¹H-NMR, or the carbonyl, C-16 or aliphatic ¹³C-NMR regions)
provides the best estimation of each type of fatty acid. Finally,
procedures of cross-validation are implemented in order to generalize the
previous results.
Journal: Journal of Applied Statistics
Pages: 1765-1793
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1119808
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1119808
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1765-1793
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco J. Rubio
Author-X-Name-First: Francisco J.
Author-X-Name-Last: Rubio
Author-Name: Yili Hong
Author-X-Name-First: Yili
Author-X-Name-Last: Hong
Title: Survival and lifetime data analysis with a flexible class of distributions
Abstract:
We introduce a general class of continuous univariate distributions with
positive support obtained by transforming the class of two-piece
distributions. We show that this class of distributions is very flexible,
easy to implement, and contains members that can capture different tail
behaviours and shapes, while also producing a variety of hazard functions. The
proposed distributions represent a flexible alternative to the classical
choices such as the log-normal, Gamma, and Weibull distributions. We
investigate empirically the inferential properties of the proposed models
through an extensive simulation study. We present some applications using
real data in the contexts of time-to-event and accelerated failure time
models. In the second kind of applications, we explore the use of these
models in the estimation of the distribution of the individual remaining
life.
Journal: Journal of Applied Statistics
Pages: 1794-1813
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1120710
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1120710
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1794-1813
Template-Type: ReDIF-Article 1.0
Author-Name: Cristian L. Bayes
Author-X-Name-First: Cristian L.
Author-X-Name-Last: Bayes
Author-Name: Luis Valdivieso
Author-X-Name-First: Luis
Author-X-Name-Last: Valdivieso
Title: A beta inflated mean regression model for fractional response variables
Abstract:
This article proposes a new regression model for a dependent fractional
random variable on the interval [0, 1] that takes the extreme values 0 or
1 with positive probability. Our model relates the expected value of this
variable to a linear predictor through a special parametrization that
leaves the parameters unconstrained in the parameter space. A
simulation-based study and an application to capital structure choices
were conducted to analyze the performance of the likelihood estimators of
the model. The results show not only accurate estimates and a better fit
than other traditional models, but also a more straightforward and clear
way to estimate the effects of a set of covariates on the mean of a
fractional response.
Journal: Journal of Applied Statistics
Pages: 1814-1830
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1120711
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1120711
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1814-1830
Template-Type: ReDIF-Article 1.0
Author-Name: Roman Salmerón Gómez
Author-X-Name-First: Roman
Author-X-Name-Last: Salmerón Gómez
Author-Name: José García Pérez
Author-X-Name-First: José
Author-X-Name-Last: García Pérez
Author-Name: María Del Mar López Martín
Author-X-Name-First: María Del Mar
Author-X-Name-Last: López Martín
Author-Name: Catalina García García
Author-X-Name-First: Catalina García
Author-X-Name-Last: García
Title: Collinearity diagnostic applied in ridge estimation through the variance inflation factor
Abstract:
The variance inflation factor (VIF) is used to detect the presence of
linear relationships between two or more independent variables (i.e.
collinearity) in the multiple linear regression model. However, the
traditionally used VIF definitions encounter some problems when extended
to the case of the ridge estimation (RE). This paper presents an extension
of the VIF in RE by providing two alternative VIF expressions that
overcome these problems in the general case. Some characteristics of these
expressions are also presented and compared with the traditional
expression. The results are illustrated with an economic example in the
case of three independent variables and with a Monte Carlo simulation for
the general case.
Journal: Journal of Applied Statistics
Pages: 1831-1849
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1120712
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1120712
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1831-1849
Template-Type: ReDIF-Article 1.0
Author-Name: S. Noorian
Author-X-Name-First: S.
Author-X-Name-Last: Noorian
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Author-Name: E. Bahrami Samani
Author-X-Name-First: E.
Author-X-Name-Last: Bahrami Samani
Title: A Bayesian test of homogeneity of association parameter using transition modelling of longitudinal mixed responses
Abstract:
In this paper, a Bayesian framework using a joint transition model for
analysing longitudinal mixed ordinal and continuous responses is
considered. The joint model considers a multivariate mixed model for the
responses in which a transitive cumulative logistic regression model and
an autoregressive regression model are used to model ordinal and
continuous responses, respectively. Also, to take into account the
association between longitudinal ordinal and continuous responses, a
dynamic association parameter is used. A test is conducted to see whether
this parameter is time-invariant and another test is presented to see
whether this parameter is equal to zero or significantly far from zero.
Our approach is applied to longitudinal PIAT (Peabody Individual
Achievement Test) data where the Bayesian estimates of parameters are
obtained.
Journal: Journal of Applied Statistics
Pages: 1850-1863
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125858
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125858
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1850-1863
Template-Type: ReDIF-Article 1.0
Author-Name: Gafar Matanmi Oyeyemi
Author-X-Name-First: Gafar Matanmi
Author-X-Name-Last: Oyeyemi
Author-Name: George Chinanu Mbaeyi
Author-X-Name-First: George Chinanu
Author-X-Name-Last: Mbaeyi
Author-Name: Saheed Ishola Salawu
Author-X-Name-First: Saheed Ishola
Author-X-Name-Last: Salawu
Author-Name: Bernard Olagboyega Muse
Author-X-Name-First: Bernard Olagboyega
Author-X-Name-Last: Muse
Title: On discrimination procedure with mixtures of continuous and categorical variables
Abstract:
A discrimination procedure based on the location model is described and
suggested for use in situations where the discriminating variables are
mixtures of continuous and binary variables. Procedures previously
employed in similar situations, such as Fisher's linear discriminant
function and logistic regression, were compared with this method using the
error rate (ER). Optimal ERs for these procedures are reported using real
and simulated data for varying sample sizes and numbers of continuous and
binary variables, and were used as a measure for assessing the performance
of the various procedures. The suggested procedure performed considerably
better in the cases considered and never produced a result that was poor
compared with the other procedures. Hence, the suggested procedure might
be considered for such situations.
Journal: Journal of Applied Statistics
Pages: 1864-1873
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125859
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125859
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1864-1873
Template-Type: ReDIF-Article 1.0
Author-Name: Ji Hwan Cha
Author-X-Name-First: Ji Hwan
Author-X-Name-Last: Cha
Title: Analysis of reliability characteristics in the acceptance sampling tests
Abstract:
Until now, various acceptance reliability sampling plans have been
developed based on different life tests of items. However, the statistical
effect of the acceptance sampling tests on the reliability characteristic
of the lots accepted in the test has not been appropriately addressed. In
this paper, we deal with an acceptance reliability sampling plan under a
‘general framework’ and discuss the corresponding
statistical effect of the acceptance sampling tests. The lifetime of the
population before the acceptance test and that of the population
‘conditional on the acceptance’ in the sampling test are
stochastically compared. The improvement of the reliability characteristics of
the population conditional on the acceptance in the sampling test is
precisely analyzed.
Journal: Journal of Applied Statistics
Pages: 1874-1891
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125860
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125860
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1874-1891
Template-Type: ReDIF-Article 1.0
Author-Name: Semra Türkan
Author-X-Name-First: Semra
Author-X-Name-Last: Türkan
Author-Name: Gamze Özel
Author-X-Name-First: Gamze
Author-X-Name-Last: Özel
Title: A new modified Jackknifed estimator for the Poisson regression model
Abstract:
Poisson regression is very popular in applied research for analyzing
count data. However, a multicollinearity problem arises in the Poisson
regression model when the independent variables are highly
intercorrelated. Shrinkage estimation is a commonly applied solution to
the general problem caused by multicollinearity. Recently, the ridge
regression (RR) estimators and some methods for estimating the ridge
parameter k in the Poisson regression have been proposed.
It has been found that some estimators are better than the commonly used
maximum-likelihood (ML) estimator and some other RR estimators. In this
study, the modified Jackknifed Poisson ridge regression (MJPR) estimator
is proposed to remedy the multicollinearity. A simulation study and a real
data example are provided to evaluate the performance of estimators. Both
mean-squared error and the percentage relative error are considered as the
performance criteria. The simulation study and the real data example
results show that the proposed MJPR method outperforms the Poisson ridge
regression, Jackknifed Poisson ridge regression and the ML in all of the
different situations evaluated in this paper.
Journal: Journal of Applied Statistics
Pages: 1892-1905
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125861
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125861
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1892-1905
Template-Type: ReDIF-Article 1.0
Author-Name: Didit B. Nugroho
Author-X-Name-First: Didit B.
Author-X-Name-Last: Nugroho
Author-Name: Takayuki Morimoto
Author-X-Name-First: Takayuki
Author-X-Name-Last: Morimoto
Title: Box--Cox realized asymmetric stochastic volatility models with generalized Student's t-error distributions
Abstract:
This study proposes a class of non-linear realized stochastic volatility
(SV) models obtained by applying the Box--Cox (BC) transformation, instead
of the logarithmic transformation, to the realized estimator. Non-Gaussian
distributions such as the Student's t, non-central Student's
t, and generalized hyperbolic skew Student's
t-distributions are applied to accommodate
heavy-tailedness and skewness in returns. The proposed models are fitted
to daily returns and realized kernels of six series: SP500, FTSE100,
Nikkei225, Nasdaq100, DAX, and DJIA, using a Markov chain Monte Carlo
Bayesian method in which the Hamiltonian Monte Carlo (HMC) algorithm
updates the BC parameter and the Riemann manifold HMC algorithm updates
the latent variables and other parameters that cannot be sampled directly.
Empirical studies provide evidence against both the
logarithmic-transformation and raw versions of the realized SV model.
Journal: Journal of Applied Statistics
Pages: 1906-1927
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125862
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125862
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1906-1927
Template-Type: ReDIF-Article 1.0
Author-Name: Melody Denhere
Author-X-Name-First: Melody
Author-X-Name-Last: Denhere
Author-Name: Huybrechts F. Bindele
Author-X-Name-First: Huybrechts F.
Author-X-Name-Last: Bindele
Title: Rank estimation for the functional linear model
Abstract:
This article discusses the estimation of the parameter function for a
functional linear regression model under heavy-tailed error
distributions and in the presence of outliers. Standard approaches to
reducing the high dimensionality, which is inherent in functional data,
are considered. After reducing the functional model to a standard multiple
linear regression model, a weighted rank-based procedure is carried out to
estimate the regression parameters. A Monte Carlo simulation and a
real-world example are used to show the performance of the proposed
estimator, and a comparison is made with the least-squares and least
absolute deviation estimators.
Journal: Journal of Applied Statistics
Pages: 1928-1944
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125863
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125863
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1928-1944
Template-Type: ReDIF-Article 1.0
Author-Name: Housila P. Singh
Author-X-Name-First: Housila P.
Author-X-Name-Last: Singh
Author-Name: Surya K. Pal
Author-X-Name-First: Surya K.
Author-X-Name-Last: Pal
Title: An efficient class of estimators of finite population variance using quartiles
Abstract:
In this paper, we propose a class of estimators of the finite population
variance that uses known values of parameters related to an auxiliary
variable, such as quartiles, and study its properties under simple random
sampling. The suggested class of ratio-type estimators is compared with
the usual unbiased estimator, the ratio estimator, and the classes of
ratio-type estimators due to Singh et al. [Improved
estimation of finite population variance using quartiles,
Istatistik -- J. Turkish Stat. Assoc. 6(3) (2013), pp. 116--121] and
Solanki et al. [Improved ratio-type estimators of
finite population variance using quartiles, Hacettepe J. Math.
Stat. 44(3) (2015), pp. 747--754]. An empirical study is also carried out
to judge the merits of the proposed estimators over the existing
estimators of the population variance using a natural data set. The
proposed class of ratio-type estimators is found to be superior to the
usual unbiased estimator and the estimators of Singh et al. and Solanki et
al. cited above.
Journal: Journal of Applied Statistics
Pages: 1945-1958
Issue: 10
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125865
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125865
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:10:p:1945-1958
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas M. Hawkins
Author-X-Name-First: Douglas M.
Author-X-Name-Last: Hawkins
Author-Name: F. Lombard
Author-X-Name-First: F.
Author-X-Name-Last: Lombard
Title: Cusum control for data following the von Mises distribution
Abstract:
The von Mises distribution is widely used for modeling angular data. When such data are seen in a quality control setting, there may be interest in checking whether the values are in statistical control or have gone out of control. A cumulative sum (cusum) control chart has desirable properties for checking whether the distribution has changed from an in-control to an out-of-control setting. This paper develops cusums for a change in the mean direction and concentration of angular data and illustrates some of their properties.
Journal: Journal of Applied Statistics
Pages: 1319-1332
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1202217
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1202217
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1319-1332
Template-Type: ReDIF-Article 1.0
Author-Name: Rebecca M. Baker
Author-X-Name-First: Rebecca M.
Author-X-Name-Last: Baker
Author-Name: Tahani Coolen-Maturi
Author-X-Name-First: Tahani
Author-X-Name-Last: Coolen-Maturi
Author-Name: Frank P. A. Coolen
Author-X-Name-First: Frank P. A.
Author-X-Name-Last: Coolen
Title: Nonparametric predictive inference for stock returns
Abstract:
In finance, inferences about future asset returns are typically quantified with the use of parametric distributions and single-valued probabilities. It is attractive to use less restrictive inferential methods, including nonparametric methods which do not require distributional assumptions about variables, and imprecise probability methods which generalize the classical concept of probability to set-valued quantities. Main attractions include the flexibility of the inferences to adapt to the available data and that the level of imprecision in inferences can reflect the amount of data on which these are based. This paper introduces nonparametric predictive inference (NPI) for stock returns. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. NPI is presented for inference about future stock returns, as a measure for risk and uncertainty, and for pairwise comparison of two stocks based on their future aggregate returns. The proposed NPI methods are illustrated using historical stock market data.
Journal: Journal of Applied Statistics
Pages: 1333-1349
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1204429
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1204429
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1333-1349
Template-Type: ReDIF-Article 1.0
Author-Name: Zhongxian Men
Author-X-Name-First: Zhongxian
Author-X-Name-Last: Men
Author-Name: Don McLeish
Author-X-Name-First: Don
Author-X-Name-Last: McLeish
Author-Name: Adam W. Kolkiewicz
Author-X-Name-First: Adam W.
Author-X-Name-Last: Kolkiewicz
Author-Name: Tony S. Wirjanto
Author-X-Name-First: Tony S.
Author-X-Name-Last: Wirjanto
Title: Comparison of asymmetric stochastic volatility models under different correlation structures
Abstract:
This paper conducts simulation-based comparison of several stochastic volatility models with leverage effects. Two new variants of asymmetric stochastic volatility models, which are subject to a logarithmic transformation on the squared asset returns, are proposed. The leverage effect is introduced into the model through correlation either between the innovations of the observation equation and the latent process, or between the logarithm of squared asset returns and the latent process. Suitable Markov Chain Monte Carlo algorithms are developed for parameter estimation and model comparison. Simulation results show that our proposed formulation of the leverage effect and the accompanying inference methods give rise to reasonable parameter estimates. Applications to two data sets uncover a negative correlation (which can be interpreted as a leverage effect) between the observed returns and volatilities, and a negative correlation between the logarithm of squared returns and volatilities.
Journal: Journal of Applied Statistics
Pages: 1350-1368
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1204596
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1204596
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1350-1368
Template-Type: ReDIF-Article 1.0
Author-Name: N. Davarzani
Author-X-Name-First: N.
Author-X-Name-Last: Davarzani
Author-Name: L. Golparvar
Author-X-Name-First: L.
Author-X-Name-Last: Golparvar
Author-Name: A. Parsian
Author-X-Name-First: A.
Author-X-Name-Last: Parsian
Author-Name: R. Peeters
Author-X-Name-First: R.
Author-X-Name-Last: Peeters
Title: Estimation on dependent right censoring scheme in an ordinary bivariate geometric distribution
Abstract:
Discrete lifetime data are very common in engineering and medical research. In many cases the lifetime is censored at a random or predetermined time, so the complete survival time is not observed. In many situations the lifetime variable may depend on the censoring time. In this paper we propose a dependent right censoring scheme in the discrete setup, where the lifetime and censoring variables have a bivariate geometric distribution. We obtain the maximum likelihood estimators of the unknown parameters, together with their risks, in closed form. The Bayes estimators, as well as the constrained Bayes estimates, of the unknown parameters under the squared error loss function are also obtained. We also consider an extension to the case where covariates are present along with the data. Finally, we provide a simulation study and an illustrative example with real data.
Journal: Journal of Applied Statistics
Pages: 1369-1384
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1206064
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1206064
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1369-1384
Template-Type: ReDIF-Article 1.0
Author-Name: A. Mahabbati
Author-X-Name-First: A.
Author-X-Name-Last: Mahabbati
Author-Name: A. Izady
Author-X-Name-First: A.
Author-X-Name-Last: Izady
Author-Name: M. Mousavi Baygi
Author-X-Name-First: M.
Author-X-Name-Last: Mousavi Baygi
Author-Name: K. Davary
Author-X-Name-First: K.
Author-X-Name-Last: Davary
Author-Name: S. M. Hasheminia
Author-X-Name-First: S. M.
Author-X-Name-Last: Hasheminia
Title: Daily soil temperature modeling using ‘panel-data’ concept
Abstract:
The purpose of this research was to predict the soil temperature profile using ‘panel-data’ models. Panel-data analysis endows regression analysis with both spatial and temporal dimensions. The spatial dimension pertains to a set of cross-sectional units of observation; the temporal dimension pertains to periodic observations of a set of variables characterizing these cross-sectional units over a particular time-span. This study was conducted in Khorasan-Razavi Province, Iran. Daily mean soil temperatures for 9 years (2001–2009), at 6 different depths (5, 10, 20, 30, 50 and 100 cm) under a bare soil surface at 10 meteorological stations were used. The data were divided into two sub-sets: one for training (parameter estimation) over the period 2001–2008, and one for validation over the year 2009. The panel-data models were developed using the average air temperature and rainfall of the day before ($T_{d-1}$ and $R_{t-1}$, respectively) and the average air temperature of the past 7 days ($T_w$) as inputs in order to predict the average soil temperature of the next day. The results showed that the two-way fixed effects models were superior. The performance indicators (R2 = 0.94 to 0.99, RMSE = 0.46 to 1.29 and MBE = −0.83 to 0.74) revealed the effectiveness of this model. In addition, these results were compared with those of classic linear regression models using a t-test, which showed the superiority of the panel-data models.
Journal: Journal of Applied Statistics
Pages: 1385-1401
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214240
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214240
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1385-1401
Template-Type: ReDIF-Article 1.0
Author-Name: Philip L.H. Yu
Author-X-Name-First: Philip L.H.
Author-X-Name-Last: Yu
Author-Name: Thomas Mathew
Author-X-Name-First: Thomas
Author-X-Name-Last: Mathew
Author-Name: Yuanyuan Zhu
Author-X-Name-First: Yuanyuan
Author-X-Name-Last: Zhu
Title: A generalized pivotal quantity approach to portfolio selection
Abstract:
The major problem of mean–variance portfolio optimization is parameter uncertainty. Many methods have been proposed to tackle this problem, including shrinkage methods, resampling techniques, and imposing constraints on the portfolio weights. This paper suggests a new estimation method for mean–variance portfolio weights based on the concept of the generalized pivotal quantity (GPQ) in the case when asset returns are multivariate normally distributed and serially independent. Both point and interval estimation of the portfolio weights are considered. Compared with Markowitz's mean–variance model and the resampling and shrinkage methods, we find that the proposed GPQ method typically yields the smallest mean-squared error for the point estimate of the portfolio weights and obtains a satisfactory coverage rate for their simultaneous confidence intervals. Finally, we apply the proposed methodology to address a portfolio rebalancing problem.
Journal: Journal of Applied Statistics
Pages: 1402-1420
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214241
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214241
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1402-1420
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaojuan Zhu
Author-X-Name-First: Xiaojuan
Author-X-Name-Last: Zhu
Author-Name: William Seaver
Author-X-Name-First: William
Author-X-Name-Last: Seaver
Author-Name: Rapinder Sawhney
Author-X-Name-First: Rapinder
Author-X-Name-Last: Sawhney
Author-Name: Shuguang Ji
Author-X-Name-First: Shuguang
Author-X-Name-Last: Ji
Author-Name: Bruce Holt
Author-X-Name-First: Bruce
Author-X-Name-Last: Holt
Author-Name: Gurudatt Bhaskar Sanil
Author-X-Name-First: Gurudatt Bhaskar
Author-X-Name-Last: Sanil
Author-Name: Girish Upreti
Author-X-Name-First: Girish
Author-X-Name-Last: Upreti
Title: Employee turnover forecasting for human resource management based on time series analysis
Abstract:
In some organizations, the hiring lead time is often long because responding to human resource requirements involves technical and security constraints. The human resource departments in these organizations are therefore keenly interested in forecasting employee turnover, since a good prediction could help the organizations minimize the costs and the impacts of turnover on operational capabilities and the budget. This study aims to enhance the ability to forecast employee turnover, with or without considering the impact of economic indicators. Various time series modelling techniques were used to identify optimal models for effective employee turnover prediction. More than 11 years of monthly turnover data were used to build and validate the proposed models. Compared with other models, a dynamic regression model with additive trend, seasonality, interventions, and a very important economic indicator effectively predicted the turnover with training R2 = 0.77 and holdout R2 = 0.59. The forecasting performance of the optimal models confirms that the time series modelling approach is able to predict employee turnover for the specific scenario observed in our analysis.
Journal: Journal of Applied Statistics
Pages: 1421-1440
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214242
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214242
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1421-1440
Template-Type: ReDIF-Article 1.0
Author-Name: Safwan A. Altarazi
Author-X-Name-First: Safwan A.
Author-X-Name-Last: Altarazi
Author-Name: Rula M. Allaf
Author-X-Name-First: Rula M.
Author-X-Name-Last: Allaf
Title: Designing and analyzing a mixture experiment to optimize the mixing proportions of polyvinyl chloride composites
Abstract:
Polyvinyl chloride (PVC) products are typically complex composites, whose quality characteristics vary widely depending on the types and proportions of their components, as well as other processing factors. It is often required to optimize PVC production for specific applications at the highest cost efficiency. This study describes the design and analysis of a statistical experiment to investigate the effects of different parameters on the mechanical properties of PVC intended for use in electrical wire insulation. Four commonly used mixture components, namely, virgin PVC, recycled PVC, calcium carbonate, and a plasticizer, and two process variables, type of plasticizer and filler particle size, were examined. Statistical tools were utilized to analyze and optimize the mixture while simultaneously finding the proper process parameters. The mix was optimized to achieve the required strength and ductility, as per ASTM D6096, while minimizing cost. The paper demonstrates how statistical models can help tailor complex polymeric composites in the presence of variations created by process variables.
Journal: Journal of Applied Statistics
Pages: 1441-1465
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214243
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214243
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1441-1465
Template-Type: ReDIF-Article 1.0
Author-Name: Yang Lei
Author-X-Name-First: Yang
Author-X-Name-Last: Lei
Author-Name: Susan Carlson
Author-X-Name-First: Susan
Author-X-Name-Last: Carlson
Author-Name: Lisa N. Yelland
Author-X-Name-First: Lisa N.
Author-X-Name-Last: Yelland
Author-Name: Maria Makrides
Author-X-Name-First: Maria
Author-X-Name-Last: Makrides
Author-Name: Robert Gibson
Author-X-Name-First: Robert
Author-X-Name-Last: Gibson
Author-Name: Byron J. Gajewski
Author-X-Name-First: Byron J.
Author-X-Name-Last: Gajewski
Title: Comparison of dichotomized and distributional approaches in rare event clinical trial design: a fixed Bayesian design
Abstract:
This research was motivated by our goal to design an efficient clinical trial to compare two doses of docosahexaenoic acid supplementation for reducing the rate of earliest preterm births (ePTB) and/or preterm births (PTB). Dichotomizing continuous gestational age (GA) data using a classic binomial distribution will result in a loss of information and reduced power. A distributional approach is an improved strategy to retain statistical power from the continuous distribution. However, appropriate distributions that fit the data properly, particularly in the tails, must be chosen, especially when the data are skewed. A recent study proposed a skew-normal method. We propose a three-component normal mixture model and introduce separate treatment effects at different components of GA. We evaluate the operating characteristics of the mixture, beta-binomial, and skew-normal models through simulation. We also apply these three methods to data from two completed clinical trials from the USA and Australia. Finite mixture models are shown to have favorable properties in PTB analysis but minimal benefit for ePTB analysis. Normal models on log-transformed data have the largest bias. We therefore recommend the finite mixture model for PTB studies; either the finite mixture model or the beta-binomial model is acceptable for ePTB studies.
Journal: Journal of Applied Statistics
Pages: 1466-1478
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214244
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214244
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1466-1478
Template-Type: ReDIF-Article 1.0
Author-Name: Heba S. Mohammed
Author-X-Name-First: Heba S.
Author-X-Name-Last: Mohammed
Author-Name: Saieed F. Ateya
Author-X-Name-First: Saieed F.
Author-X-Name-Last: Ateya
Author-Name: Essam K. AL-Hussaini
Author-X-Name-First: Essam K.
Author-X-Name-Last: AL-Hussaini
Title: Estimation based on progressive first-failure censoring from exponentiated exponential distribution
Abstract:
In this paper, point and interval estimations for the parameters of the exponentiated exponential (EE) distribution are studied based on progressive first-failure-censored data. The Bayes estimates are computed based on squared error and Linex loss functions and using a Markov chain Monte Carlo (MCMC) algorithm. Also, based on this censoring scheme, approximate confidence intervals for the parameters of the EE distribution are developed. A Monte Carlo simulation study is carried out to compare the performances of the different methods by computing the estimated risks (ERs), as well as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) of the estimates. Finally, a real data set is introduced and analyzed using the EE and Weibull distributions. A comparison is carried out between the two models based on the corresponding Kolmogorov–Smirnov (K–S) test statistic to emphasize that the EE model fits the data with the same efficiency as the other model. Point and interval estimation of all parameters is studied based on this real data set as an illustrative example.
Journal: Journal of Applied Statistics
Pages: 1479-1494
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214245
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214245
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1479-1494
Template-Type: ReDIF-Article 1.0
Author-Name: Farag Shuweihdi
Author-X-Name-First: Farag
Author-X-Name-Last: Shuweihdi
Author-Name: Charles C. Taylor
Author-X-Name-First: Charles C.
Author-X-Name-Last: Taylor
Author-Name: Arief Gusnanto
Author-X-Name-First: Arief
Author-X-Name-Last: Gusnanto
Title: Classification of form under heterogeneity and non-isotropic errors
Abstract:
A number of areas related to learning under supervision have not been fully investigated, particularly the possibility of incorporating methods of classification into shape analysis. In this regard, practical ideas conducive to the improvement of form classification are the focus of interest. Our proposal is to employ a hybrid classifier built on Euclidean Distance Matrix Analysis (EDMA) and Procrustes distance, rather than generalised Procrustes analysis (GPA). In empirical terms, it has been demonstrated that there is a notable difference between the estimated form and the true form when EDMA is used as the basis for computation. However, this does not seem to be the case when GPA is employed. With the assumption that no association exists between landmarks, EDMA and GPA are used to calculate the mean form and diagonal weighting matrix to build superimposing classifiers. As our findings indicate, the superimposing classifiers we propose work extremely well with EDMA estimators, as opposed to GPA, on both simulated and real datasets.
Journal: Journal of Applied Statistics
Pages: 1495-1508
Issue: 8
Volume: 44
Year: 2017
Month: 6
X-DOI: 10.1080/02664763.2016.1214246
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214246
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:8:p:1495-1508
Template-Type: ReDIF-Article 1.0
Author-Name: R. Scott Hacker
Author-X-Name-First: R.
Author-X-Name-Last: Scott Hacker
Author-Name: Abdulnasser Hatemi-J
Author-X-Name-First: Abdulnasser
Author-X-Name-Last: Hatemi-J
Title: Optimal lag-length choice in stable and unstable VAR models under situations of homoscedasticity and ARCH
Abstract: The performance of different information criteria – namely Akaike, corrected Akaike (AICC), Schwarz–Bayesian (SBC), and Hannan–Quinn – is investigated so as to choose the optimal lag length in stable and unstable vector autoregressive (VAR) models both when autoregressive conditional heteroscedasticity (ARCH) is present and when it is not. The investigation covers both large and small sample sizes. The Monte Carlo simulation results show that SBC has relatively better performance in lag-choice accuracy in many situations. It is also generally the least sensitive to ARCH regardless of stability or instability of the VAR model, especially in large sample sizes. These appealing properties of SBC make it the optimal criterion for choosing lag length in many situations, especially in the case of financial data, which are usually characterized by occasional periods of high volatility. SBC also has the best forecasting abilities in the majority of situations in which we vary sample size, stability, variance structure (ARCH or not), and forecast horizon (one period or five). Frequently, AICC also has good lag-choosing and forecasting properties. However, when ARCH is present, the five-period forecast performance of all criteria in all situations worsens.
Journal: Journal of Applied Statistics
Pages: 601-615
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801920473
File-URL: http://hdl.handle.net/10.1080/02664760801920473
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:601-615
Template-Type: ReDIF-Article 1.0
Author-Name: J. López-Fidalgo
Author-X-Name-First: J.
Author-X-Name-Last: López-Fidalgo
Author-Name: R. Martín-Martín
Author-X-Name-First: R.
Author-X-Name-Last: Martín-Martín
Author-Name: M. Stehlík
Author-X-Name-First: M.
Author-X-Name-Last: Stehlík
Title: Marginally restricted D-optimal designs for correlated observations
Abstract: Two practical degrees of complexity may arise when designing an experiment for a model of a real life case. First, some explanatory variables may not be under the control of the practitioner. Second, the responses may be correlated. In this paper three real life cases in this situation are considered. Different covariance structures are studied and some designs are computed by adapting the theory of marginally restricted designs for correlated observations. Brimkulov's exchange algorithm is also adapted to marginally restricted D-optimality and is applied to a complex situation.
Journal: Journal of Applied Statistics
Pages: 617-632
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801920556
File-URL: http://hdl.handle.net/10.1080/02664760801920556
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:617-632
Template-Type: ReDIF-Article 1.0
Author-Name: Rahul Mazumder
Author-X-Name-First: Rahul
Author-X-Name-Last: Mazumder
Title: Fluid flow pattern analysis in a trough region: a nonparametric approach
Abstract: This paper aims at identifying statistically different circulation patterns characterising fluid flow in the trough region between two adjacent asymmetric waveforms, using velocity data collected by a 3D acoustic Doppler velocimeter. Statistical clustering has been performed using ideas originating from information theory and scale space theory in computer vision for splitting the trough region into different spatially connected segments (identifying the circulation bubble in the process) on the basis of circulation patterns. The paper attempts to visualise the fluid fluctuations in the trough region, with emphasis on the circulation region, by simulating the directional fluctuations of fluid particles from the kernel density estimates learned from the experimental data. The image representation of the estimate of the spatial turbulent kinetic energy (TKE) function reveals interesting features corresponding to the regions of high TKE, suggesting possibilities for further research in this area along the lines of feature extraction and image analysis.
Journal: Journal of Applied Statistics
Pages: 633-645
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801920671
File-URL: http://hdl.handle.net/10.1080/02664760801920671
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:633-645
Template-Type: ReDIF-Article 1.0
Author-Name: H. Jiang
Author-X-Name-First: H.
Author-X-Name-Last: Jiang
Author-Name: M. Xie
Author-X-Name-First: M.
Author-X-Name-Last: Xie
Author-Name: L.C. Tang
Author-X-Name-First: L.C.
Author-X-Name-Last: Tang
Title: Markov chain Monte Carlo methods for parameter estimation of the modified Weibull distribution
Abstract: In this paper, the Markov chain Monte Carlo (MCMC) method is used to estimate the parameters of a modified Weibull distribution based on a complete sample. While maximum-likelihood estimation (MLE) is the most widely used method for parameter estimation, MCMC has recently emerged as a good alternative. When applied to parameter estimation, MCMC methods have been shown to be easy to implement computationally, the estimates always exist and are statistically consistent, and their probability intervals are convenient to construct. Details of applying MCMC to parameter estimation for the modified Weibull model are elaborated, and a numerical example is presented to illustrate the methods of inference discussed in this paper. To compare MCMC with MLE, a simulation study is provided, and the differences between the estimates obtained by the two algorithms are examined.
Journal: Journal of Applied Statistics
Pages: 647-658
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801920846
File-URL: http://hdl.handle.net/10.1080/02664760801920846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:647-658
Template-Type: ReDIF-Article 1.0
Author-Name: Valentin Rousson
Author-X-Name-First: Valentin
Author-X-Name-Last: Rousson
Title: Monotone fitting for developmental variables
Abstract: In order to study developmental variables, for example, neuromotor development of children and adolescents, monotone fitting is typically needed. Most methods to estimate a monotone regression function non-parametrically, however, are not straightforward to implement, a difficult issue being the choice of smoothing parameters. In this paper, a convenient implementation of the monotone B-spline estimates of Ramsay [Monotone regression splines in action (with discussion), Stat. Sci. 3 (1988), pp. 425–461] and Kelly and Rice [Monotone smoothing with application to dose-response curves and the assessment of synergism, Biometrics 46 (1990), pp. 1071–1085] is proposed and applied to neuromotor data. Knots are selected adaptively using ideas found in Friedman and Silverman [Flexible parsimonious smoothing and additive modelling (with discussion), Technometrics 31 (1989), pp. 3–39], yielding a flexible algorithm to automatically and accurately estimate a monotone regression function. Using splines also simultaneously allows one to include other aspects in the estimation problem, such as modeling a constant difference between two groups or a known jump in the regression function. Finally, an estimate which is not only monotone but also has a 'levelling-off' (i.e. becomes constant after some point) is derived. This is useful when the developmental variable is known to attain a maximum/minimum within the interval of observation.
Journal: Journal of Applied Statistics
Pages: 659-670
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801920960
File-URL: http://hdl.handle.net/10.1080/02664760801920960
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:659-670
Template-Type: ReDIF-Article 1.0
Author-Name: C. Lai
Author-X-Name-First: C.
Author-X-Name-Last: Lai
Author-Name: K. Govindaraju
Author-X-Name-First: K.
Author-X-Name-Last: Govindaraju
Title: Reduction of control-chart signal variability for high-quality processes
Abstract: The design of a control chart is often based on the statistical measure of average run length (ARL). A longer in-control ARL is ensured by the design, but the variance of the run length distribution may also be large for such a design. In practical terms, the variability in false alarms and true signals may be large. If the sample size for plotting a point is not constant, then the focus is on the average number inspected rather than the ARL. This article considers two well-known attribute control chart procedures for monitoring high-quality processes based on the number inspected, and shows how the variability in false alarms and correct signals can be reduced.
Journal: Journal of Applied Statistics
Pages: 671-679
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801921232
File-URL: http://hdl.handle.net/10.1080/02664760801921232
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:671-679
Template-Type: ReDIF-Article 1.0
Author-Name: Tapio Nummi
Author-X-Name-First: Tapio
Author-X-Name-Last: Nummi
Author-Name: Laura Koskela
Author-X-Name-First: Laura
Author-X-Name-Last: Koskela
Title: Analysis of growth curve data by using cubic smoothing splines
Abstract: Longitudinal data frequently arise in various fields of applied science where individuals are measured according to some ordered variable, e.g. time. A common approach used to model such data is based on mixed models for repeated measures. This model provides an eminently flexible approach to modeling a wide range of mean and covariance structures. However, such models are forced into a rigidly defined class of mathematical formulas which may not be well supported by the data within the whole sequence of observations. A possible non-parametric alternative is a cubic smoothing spline, which is highly flexible and has useful smoothing properties. It can be shown that under the normality assumption, the solution of the penalized log-likelihood equation is the cubic smoothing spline, and this solution can be further expressed as a solution of the linear mixed model. It is shown here how cubic smoothing splines can be easily used in the analysis of complete and balanced data. Analysis can be greatly simplified by using the unweighted estimator studied in the paper. It is shown that if the covariance structure of the random errors belongs to a certain class of matrices, the unweighted estimator is the solution to the penalized log-likelihood function. This result is new in the smoothing spline context and is not confined to growth curve settings. The connection to mixed models is used in developing a rough test of group profiles. Numerical examples are presented to illustrate the techniques proposed.
Journal: Journal of Applied Statistics
Pages: 681-691
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801923964
File-URL: http://hdl.handle.net/10.1080/02664760801923964
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:681-691
Template-Type: ReDIF-Article 1.0
Author-Name: Ashis Sengupta
Author-X-Name-First: Ashis
Author-X-Name-Last: Sengupta
Author-Name: Arnab Kumar Laha
Author-X-Name-First: Arnab
Author-X-Name-Last: Kumar Laha
Title: A Bayesian analysis of the change-point problem for directional data
Abstract: In this paper, we discuss a simple fully Bayesian analysis of the change-point problem for directional data in the parametric framework with the von Mises or circular normal distribution as the underlying distribution. We first discuss the problem of detecting change in the mean direction of the circular normal distribution using a latent variable approach when the concentration parameter is unknown. Then, a simpler approach, beginning with proper priors for all the unknown parameters – the sampling importance resampling technique – is used to obtain the posterior marginal distribution of the change-point. The method is illustrated using the wind data [E.P. Weijers, A. Van Delden, H.F. Vugts and A.G.C.A. Meesters, The composite horizontal wind field within convective structures of the atmospheric surface layer, J. Atmos. Sci. 52 (1995), pp. 3866–3878]. The method can be adapted for a variety of situations involving both angular and linear data and can be used with profit in the context of statistical process control, in Phase I of control charting and also in Phase II in conjunction with control charts.
Journal: Journal of Applied Statistics
Pages: 693-700
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801924004
File-URL: http://hdl.handle.net/10.1080/02664760801924004
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:693-700
Template-Type: ReDIF-Article 1.0
Author-Name: Byoung Cheol Jung
Author-X-Name-First: Byoung
Author-X-Name-Last: Cheol Jung
Author-Name: André Khuri
Author-X-Name-First: André
Author-X-Name-Last: Khuri
Author-Name: Juneyoung Lee
Author-X-Name-First: Juneyoung
Author-X-Name-Last: Lee
Title: Comparison of designs for the three-fold nested random model
Abstract: The quality of estimation of variance components depends on the design used as well as on the unknown values of the variance components. In this article, three designs are compared, namely, the balanced, staggered, and inverted nested designs for the three-fold nested random model. The comparison is based on the so-called quantile dispersion graphs using analysis of variance (ANOVA) and maximum likelihood (ML) estimates of the variance components. It is demonstrated that the staggered nested design gives more stable estimates of the variance component for the highest nesting factor than the balanced design. The reverse, however, is true in case of lower nested factors. A comparison between ANOVA and ML estimation of the variance components is also made using each of the aforementioned designs.
Journal: Journal of Applied Statistics
Pages: 701-715
Issue: 6
Volume: 35
Year: 2008
X-DOI: 10.1080/02664760801924079
File-URL: http://hdl.handle.net/10.1080/02664760801924079
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:35:y:2008:i:6:p:701-715
Template-Type: ReDIF-Article 1.0
Author-Name: Rubén Manso
Author-X-Name-First: Rubén
Author-X-Name-Last: Manso
Author-Name: Rafael Calama
Author-X-Name-First: Rafael
Author-X-Name-Last: Calama
Author-Name: Marta Pardos
Author-X-Name-First: Marta
Author-X-Name-Last: Pardos
Author-Name: Mathieu Fortin
Author-X-Name-First: Mathieu
Author-X-Name-Last: Fortin
Title: A maximum likelihood estimator for left-truncated lifetimes based on probabilistic prior information about time of occurrence
Abstract:
In forestry, many processes of interest are binary and they can be modeled using lifetime analysis. However, available data are often incomplete, being interval- and right-censored as well as left-truncated, which may lead to biased parameter estimates. While censoring can be easily considered in lifetime analysis, left truncation is more complicated when individual age at selection is unknown. In this study, we designed and tested a maximum likelihood estimator that deals with left truncation by taking advantage of prior knowledge about the time when the individuals enter the experiment. Whenever a model is available for predicting the time of selection, the distribution of the delayed entries can be obtained using Bayes' theorem. It is then possible to marginalize the likelihood function over the distribution of the delayed entries in the experiment to assess the joint distribution of time of selection and time to event. This estimator was tested with continuous and discrete Gompertz-distributed lifetimes. It was then compared with two other estimators: a standard one in which left truncation was not considered, and a second estimator that implemented an analytical correction. Our new estimator yielded unbiased parameter estimates with empirical coverage of confidence intervals close to their nominal value. The standard estimator led to an overestimation of the long-term probability of survival.
Journal: Journal of Applied Statistics
Pages: 2107-2127
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1410527
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1410527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2107-2127
Template-Type: ReDIF-Article 1.0
Author-Name: Xin Zhao
Author-X-Name-First: Xin
Author-X-Name-Last: Zhao
Author-Name: Zhensheng Huang
Author-X-Name-First: Zhensheng
Author-X-Name-Last: Huang
Title: Varying-coefficient single-index measurement error model
Abstract:
This paper proposes a varying-coefficient single-index measurement error model, in which the index covariates are subject to measurement error. We combine the simulation-extrapolation technique, local linear regression and the weighted least-squares method to estimate the unknowns of the model, and develop the asymptotic properties of the resulting estimators under some conditions. A simulation study is conducted to evaluate the proposed methodology, and a real example is also studied to illustrate it.
Journal: Journal of Applied Statistics
Pages: 2128-2144
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1410528
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1410528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2128-2144
Template-Type: ReDIF-Article 1.0
Author-Name: Sang Gyu Kwak
Author-X-Name-First: Sang Gyu
Author-X-Name-Last: Kwak
Author-Name: Balgobin Nandram
Author-X-Name-First: Balgobin
Author-X-Name-Last: Nandram
Author-Name: Dal Ho Kim
Author-X-Name-First: Dal Ho
Author-X-Name-Last: Kim
Title: Bayesian inference on contingency tables with uncertainty about independence for small areas
Abstract:
A scientist might have vague information about independence/dependence in a two-way table, and a statistician might proceed with estimation conditional on this piece of information. However, one needs to take into account the uncertainty in this information, which can increase variability. We develop a Bayesian method to solve this problem when estimation is needed for the cells of an $ r \times c $ contingency table and there is uncertainty about independence or dependence. In our problem, there are several small areas and an $ r \times c $ table is constructed for each area. We use the hierarchical Dirichlet-multinomial model to analyze the counts from these small areas. The key idea in our method is that the cell probabilities of each area are expressed as a convex combination of the cell probabilities under independence and the cell probabilities under dependence, where each area has its own unknown weight. We show how to fit the model using the Gibbs sampler even though many of the conditional posterior densities are nonstandard. As a by-product of our method, we have produced a test of independence which is competitive with the chi-square test for a single table. To illustrate our method, we use an example on body mass index and bone mineral density data obtained from NHANES III. We show some important differences among the three scenarios (independence, dependence and the convex combination of the two) when Bayesian predictive inference is done for the finite population means corresponding to each cell of the $ r \times c $ table.
Journal: Journal of Applied Statistics
Pages: 2145-2163
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1413074
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1413074
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2145-2163
Template-Type: ReDIF-Article 1.0
Author-Name: Jiajia Chen
Author-X-Name-First: Jiajia
Author-X-Name-Last: Chen
Author-Name: Xiaoqin Zhang
Author-X-Name-First: Xiaoqin
Author-X-Name-Last: Zhang
Author-Name: Shengjia Li
Author-X-Name-First: Shengjia
Author-X-Name-Last: Li
Title: Heteroskedastic linear regression model with compositional response and covariates
Abstract:
Compositional data are complex multidimensional data that carry relative rather than absolute information. There are a variety of models for regression analysis with compositional variables, and, as in traditional regression analysis, heteroskedasticity can still arise in these models. However, existing heteroskedastic regression analysis methods cannot be applied to models with a compositional error term. In this paper, we study the heteroskedastic linear regression model with compositional response and covariates. The parameter estimator is obtained through the weighted least squares method. For hypothesis tests on the parameters, the test statistic is based on the original least squares estimator and the corresponding heteroskedasticity-consistent covariance matrix estimator. When the proposed method is applied to both a simulation and a real example, the original least squares method is used as a comparison throughout. The results demonstrate the model's practicality and effectiveness in regression analysis with heteroskedasticity.
Journal: Journal of Applied Statistics
Pages: 2164-2181
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1413075
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1413075
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2164-2181
Template-Type: ReDIF-Article 1.0
Author-Name: S. Eftekhari Mahabadi
Author-X-Name-First: S.
Author-X-Name-Last: Eftekhari Mahabadi
Author-Name: E. Rahimi Jafari
Author-X-Name-First: E.
Author-X-Name-Last: Rahimi Jafari
Title: Skew-mixed effects model for multivariate longitudinal data with categorical outcomes and missingness
Abstract:
A longitudinal study commonly follows a set of variables, measured for each individual repeatedly over time, and usually suffers from the problem of incomplete data. A common approach for dealing with longitudinal categorical responses is to use the Generalized Linear Mixed Model (GLMM). This model induces the potential relation between response variables over time via a vector of random effects, assumed to be shared parameters in the non-ignorable missing mechanism. Most GLMMs assume that the random-effects parameters follow a normal or symmetric distribution, an assumption that can lead to serious problems in real applications. In this paper, we propose GLMMs for the analysis of incomplete multivariate longitudinal categorical responses with a non-ignorable missing mechanism based on a shared parameter framework with the less restrictive assumption of skew-normality for the random effects. These models may contain incomplete data with monotone and non-monotone missing patterns. The performance of the model is evaluated using simulation studies, and a well-known longitudinal data set extracted from a fluvoxamine trial is analyzed to determine the profile of fluvoxamine in ambulatory clinical psychiatric practice.
Journal: Journal of Applied Statistics
Pages: 2182-2201
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1413076
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1413076
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2182-2201
Template-Type: ReDIF-Article 1.0
Author-Name: Marcel Ausloos
Author-X-Name-First: Marcel
Author-X-Name-Last: Ausloos
Author-Name: Roy Cerqueti
Author-X-Name-First: Roy
Author-X-Name-Last: Cerqueti
Title: Intriguing yet simple skewness: kurtosis relation in economic and demographic data distributions, pointing to preferential attachment processes
Abstract:
In this paper, we propose that relations between high-order moments of data distributions, for example, between the skewness (S) and kurtosis (K), allow one to point to theoretical models with understandable structural parameters. The illustrative data concern two cases: (i) the distribution of income taxes and (ii) that of inhabitants, after aggregation over each city in each province of Italy in 2011. Moreover, from the rank-size relationship, for either S or K, in both cases, it is shown that one obtains the parameters of the underlying (hypothetical) modeling distribution: in the present cases, the 2-parameter Beta function, itself related to the Yule–Simon distribution function, suggesting a growth model based on the preferential attachment process.
Journal: Journal of Applied Statistics
Pages: 2202-2218
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1413077
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1413077
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2202-2218
Template-Type: ReDIF-Article 1.0
Author-Name: Anthony Zullo
Author-X-Name-First: Anthony
Author-X-Name-Last: Zullo
Author-Name: Mathieu Fauvel
Author-X-Name-First: Mathieu
Author-X-Name-Last: Fauvel
Author-Name: Frédéric Ferraty
Author-X-Name-First: Frédéric
Author-X-Name-Last: Ferraty
Title: Experimental comparison of functional and multivariate spectral-based supervised classification methods in hyperspectral image
Abstract:
The aim of this article is to assess and compare several statistical methods for hyperspectral image supervised classification using only the spectral dimension. Since hyperspectral profiles may be viewed either as a random vector or a random curve, we propose to confront various multivariate discriminating procedures with functional alternatives. Eight methods representing three important statistical communities (mixture models, machine learning and functional data analysis) have been applied to three hyperspectral datasets following three protocols studying the influence of the size and composition of the learning sample, with or without noisy labels. Besides this comparative study, this work proposes a functional extension of the multinomial logit model as well as a fast-computing adaptation of nonparametric functional discrimination. As a by-product, this work provides a useful comprehensive bibliography and also supplemental material especially oriented towards practitioners.
Journal: Journal of Applied Statistics
Pages: 2219-2237
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1414162
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1414162
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2219-2237
Template-Type: ReDIF-Article 1.0
Author-Name: Julie Ann Lorah
Author-X-Name-First: Julie Ann
Author-X-Name-Last: Lorah
Title: Estimating individual-level interaction effects in multilevel models: a Monte Carlo simulation study with application
Abstract:
Moderated multiple regression provides a useful framework for understanding moderator variables. These variables can also be examined within multilevel datasets, although the literature is not clear on the best way to assess data for significant moderating effects, particularly within a multilevel modeling framework. This study explores potential ways to test moderation at the individual level (level one) within a 2-level multilevel modeling framework, with varying effect sizes, cluster sizes, and numbers of clusters. The study examines five potential methods for testing interaction effects: the Wald test, F-test, likelihood ratio test, Bayesian information criterion (BIC), and Akaike information criterion (AIC). For each method, the simulation study examines Type I error rates and power. Following the simulation study, an applied study uses real data to assess interaction effects using the same five methods. Results indicate that the Wald test, F-test, and likelihood ratio test all perform similarly in terms of Type I error rates and power. Type I error rates for the AIC are more liberal, and for the BIC typically more conservative. A four-step procedure for applied researchers interested in examining interaction effects in multilevel models is provided.
Journal: Journal of Applied Statistics
Pages: 2238-2255
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1414163
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1414163
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2238-2255
Template-Type: ReDIF-Article 1.0
Author-Name: Ken J. Beath
Author-X-Name-First: Ken J.
Author-X-Name-Last: Beath
Title: A mixture-based approach to robust analysis of generalised linear models
Abstract:
A method for robustness in linear models is to assume that there is a mixture of standard and outlier observations with a different error variance for each class. For generalised linear models (GLMs) the mixture model approach is more difficult, as the error variance for many distributions has a fixed relationship to the mean. This model is extended to GLMs by redefining the classes: the standard class is a standard GLM, and the outlier class is an overdispersed GLM obtained by including a random-effect term in the linear predictor. The advantages of this method are that it can be extended to any model with a linear predictor, and that outlier observations can be easily identified. Using simulation, the model is compared with an M-estimator and found to have improved bias and coverage. The method is demonstrated on three examples.
Journal: Journal of Applied Statistics
Pages: 2256-2268
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1414164
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1414164
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2256-2268
Template-Type: ReDIF-Article 1.0
Author-Name: Achilleas Vassilopoulos
Author-X-Name-First: Achilleas
Author-X-Name-Last: Vassilopoulos
Author-Name: Andreas C. Drichoutis
Author-X-Name-First: Andreas C.
Author-X-Name-Last: Drichoutis
Author-Name: Rodolfo M. Nayga
Author-X-Name-First: Rodolfo M.
Author-X-Name-Last: Nayga
Author-Name: Panagiotis Lazaridis
Author-X-Name-First: Panagiotis
Author-X-Name-Last: Lazaridis
Title: Does the supplemental nutrition assistance program really increase obesity? The importance of accounting for misclassification errors
Abstract:
The prevalence of obesity among US citizens has grown rapidly over the last few decades, especially among low-income individuals. This has led to questions about the effectiveness of nutritional assistance programs such as the Supplemental Nutrition Assistance Program (SNAP). Previous results on the effect of SNAP participation on obesity are mixed. These findings are however based on the assumption that participation status can be accurately observed, despite significant misclassification errors reported in the literature. Using propensity score matching, we conclude that there seems to be a positive effect of SNAP participation on obesity rates for female participants and no such effect for males, a result that is consistent with several previous studies. However, an extensive sensitivity analysis reveals that the positive effect for females is sensitive to misclassification errors and to the conditional independence assumption. Thus analogous findings should also be used with caution unless examined under the prism of classification errors and of other assumptions used for the identification of causal parameters.
Journal: Journal of Applied Statistics
Pages: 2269-2278
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1414165
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1414165
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2269-2278
Template-Type: ReDIF-Article 1.0
Author-Name: Guy Cafri
Author-X-Name-First: Guy
Author-X-Name-Last: Cafri
Author-Name: Luo Li
Author-X-Name-First: Luo
Author-X-Name-Last: Li
Author-Name: Elizabeth W. Paxton
Author-X-Name-First: Elizabeth W.
Author-X-Name-Last: Paxton
Author-Name: Juanjuan Fan
Author-X-Name-First: Juanjuan
Author-X-Name-Last: Fan
Title: Predicting risk for adverse health events using random forest
Abstract:
Estimation of person-specific risk for adverse health events in medicine has been approached almost exclusively using parametric statistical methods. Random forest is a machine learning method based on tree ensembles that is completely nonparametric and, for this reason, may be better suited for risk prediction. An introduction to random forests is provided, with a focus on their application to risk prediction. Using data from a total joint replacement registry, we illustrate risk prediction for the binary outcome of 90-day mortality following implantation, as well as time to device failure for aseptic reasons with the competing risk of mortality. Using the methods described in this paper, the random forest could be applied to risk prediction in a wide variety of medical fields. Issues related to implementation are discussed.
Journal: Journal of Applied Statistics
Pages: 2279-2294
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1414166
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1414166
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2279-2294
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Kirschenmann
Author-X-Name-First: Thomas
Author-X-Name-Last: Kirschenmann
Author-Name: Paul Damien
Author-X-Name-First: Paul
Author-X-Name-Last: Damien
Author-Name: Stephen Walker
Author-X-Name-First: Stephen
Author-X-Name-Last: Walker
Title: Bayesian estimation of the Cox model under different hazard rate shape assumptions via slice sampling
Abstract:
In this paper, we provide a full Bayesian analysis for Cox's proportional hazards model under different hazard rate shape assumptions. To this end, we select the modified Weibull distribution family to model failure rates. A novel Markov chain Monte Carlo method allows one to tackle both exact and right-censored failure time data. Both simulated and real data are used to illustrate the methods.
Journal: Journal of Applied Statistics
Pages: 2295-2306
Issue: 12
Volume: 45
Year: 2018
Month: 9
X-DOI: 10.1080/02664763.2017.1420147
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1420147
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:12:p:2295-2306
Template-Type: ReDIF-Article 1.0
Author-Name: Da Xu
Author-X-Name-First: Da
Author-X-Name-Last: Xu
Author-Name: Shishun Zhao
Author-X-Name-First: Shishun
Author-X-Name-Last: Zhao
Author-Name: Tao Hu
Author-X-Name-First: Tao
Author-X-Name-Last: Hu
Author-Name: Mengzhu Yu
Author-X-Name-First: Mengzhu
Author-X-Name-Last: Yu
Author-Name: Jianguo Sun
Author-X-Name-First: Jianguo
Author-X-Name-Last: Sun
Title: Regression analysis of informative current status data with the semiparametric linear transformation model
Abstract:
Many methods have been developed in the literature for regression analysis of current status data with noninformative censoring, and some approaches have also been proposed for semiparametric regression analysis of current status data with informative censoring. However, the existing approaches for the latter situation mainly focus on specific models such as the proportional hazards model and the additive hazards model. In this paper, we consider a general class of semiparametric linear transformation models and develop a sieve maximum likelihood estimation approach for the inference. In the method, the copula model is employed to describe the informative censoring, or the relationship between the failure time of interest and the censoring time, and Bernstein polynomials are used to approximate the nonparametric functions involved. The asymptotic consistency and normality of the proposed estimators are established, and an extensive simulation study is conducted and indicates that the proposed approach works well for practical situations. In addition, an illustrative example is provided.
Journal: Journal of Applied Statistics
Pages: 187-202
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1466870
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1466870
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:187-202
Template-Type: ReDIF-Article 1.0
Author-Name: Henok Woldu
Author-X-Name-First: Henok
Author-X-Name-Last: Woldu
Author-Name: Timothy G. Heckman
Author-X-Name-First: Timothy G.
Author-X-Name-Last: Heckman
Author-Name: Andreas Handel
Author-X-Name-First: Andreas
Author-X-Name-Last: Handel
Author-Name: Ye Shen
Author-X-Name-First: Ye
Author-X-Name-Last: Shen
Title: Applying functional data analysis to assess tele-interpersonal psychotherapy's efficacy to reduce depression
Abstract:
The use of parametric linear mixed models and generalized linear mixed models to analyze longitudinal data collected during randomized controlled trials (RCTs) is conventional. The application of these methods, however, is restricted due to various assumptions required by these models. When the number of observations per subject is sufficiently large, and individual trajectories are noisy, functional data analysis (FDA) methods serve as an alternative to parametric longitudinal data analysis techniques. However, the use of FDA in RCTs is rare. In this paper, the effectiveness of FDA and linear mixed models (LMMs) was compared by analyzing data from rural persons living with HIV and comorbid depression enrolled in a depression treatment randomized clinical trial. Interactive voice response systems were used for weekly administrations of the 10-item Self-Administered Depression Scale (SADS) over 41 weeks. Functional principal component analysis and functional regression analysis methods detected a statistically significant difference in SADS between telephone-administered interpersonal psychotherapy (tele-IPT) and controls, but the linear mixed effects model did not. Additional simulation studies were conducted to compare FDA and LMMs under a different nonlinear trajectory assumption. In this clinical trial, with a sufficient number of measured outcomes per subject and individual trajectories that are noisy and nonlinear, we found FDA methods to be a better alternative to LMMs.
Journal: Journal of Applied Statistics
Pages: 203-216
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1470231
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1470231
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:203-216
Template-Type: ReDIF-Article 1.0
Author-Name: John Mashford
Author-X-Name-First: John
Author-X-Name-Last: Mashford
Author-Name: Yong Song
Author-X-Name-First: Yong
Author-X-Name-Last: Song
Author-Name: Q. J. Wang
Author-X-Name-First: Q. J.
Author-X-Name-Last: Wang
Author-Name: David Robertson
Author-X-Name-First: David
Author-X-Name-Last: Robertson
Title: A Bayesian hierarchical spatio-temporal rainfall model
Abstract:
A Bayesian hierarchical spatio-temporal rainfall model is presented and analysed. The model has the ability to deal with extensive missing or null values, uses a sophisticated variance stabilising rainfall pre-transformation, incorporates a new elevation model and can provide sub-catchment rainfall estimation and interpolation using a sequential kriging scheme. The model uses a vector autoregressive stochastic process to represent the time dependence of the rainfall field and an exponential covariogram to model the spatial correlation of the rainfall field. The model can be readily generalised to other types of stochastic processes. In this paper, some results of applying the model to a particular rainfall catchment are presented.
Journal: Journal of Applied Statistics
Pages: 217-229
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1473347
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1473347
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:217-229
Template-Type: ReDIF-Article 1.0
Author-Name: Xiao Li
Author-X-Name-First: Xiao
Author-X-Name-Last: Li
Author-Name: Michele Guindani
Author-X-Name-First: Michele
Author-X-Name-Last: Guindani
Author-Name: Chaan S. Ng
Author-X-Name-First: Chaan S.
Author-X-Name-Last: Ng
Author-Name: Brian P. Hobbs
Author-X-Name-First: Brian P.
Author-X-Name-Last: Hobbs
Title: Spatial Bayesian modeling of GLCM with application to malignant lesion characterization
Abstract:
The emerging field of cancer radiomics endeavors to characterize intrinsic patterns of tumor phenotypes and surrogate markers of response by transforming medical images into objects that yield quantifiable summary statistics to which regression and machine learning algorithms may be applied for statistical interrogation. Recent literature has identified clinicopathological association based on textural features deriving from gray-level co-occurrence matrices (GLCM) which facilitate evaluations of gray-level spatial dependence within a delineated region of interest. GLCM-derived features, however, tend to contribute highly redundant information. Moreover, when reporting selected feature sets, investigators often fail to adjust for multiplicities and commonly fail to convey the predictive power of their findings. This article presents a Bayesian probabilistic modeling framework for the GLCM as a multivariate object as well as describes its application within a cancer detection context based on computed tomography. The methodology, which circumvents processing steps and avoids evaluations of reductive and highly correlated feature sets, uses latent Gaussian Markov random field structure to characterize spatial dependencies among GLCM cells and facilitates classification via predictive probability. Correctly predicting the underlying pathology of 81% of the adrenal lesions in our case study, the proposed method outperformed current practices which achieved a maximum accuracy of only 59%. Simulations and theory are presented to further elucidate this comparison as well as ascertain the utility of applying multivariate Gaussian spatial processes to GLCM objects.
Journal: Journal of Applied Statistics
Pages: 230-246
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1473348
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1473348
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:230-246
Template-Type: ReDIF-Article 1.0
Author-Name: Li-Chu Chien
Author-X-Name-First: Li-Chu
Author-X-Name-Last: Chien
Title: A method for combining p-values in meta-analysis by gamma distributions
Abstract:
Combining p-values from statistical tests across different studies is the most commonly used approach in meta-analysis for evolutionary biology. The most commonly used p-value combination methods mainly comprise the z-transform tests (e.g., the un-weighted z-test and the weighted z-test) and the gamma-transform tests (e.g., the CZ method [Z. Chen, W. Yang, Q. Liu, J.Y. Yang, J. Li, and M.Q. Yang, A new statistical approach to combining p-values using gamma distribution and its application to genomewide association study, Bioinformatics 15 (2014), p. S3]). However, among these existing p-value combination methods, no method is uniformly most powerful in all situations [Chen et al. 2014]. In this paper, we propose a meta-analysis method based on the gamma distribution, MAGD, which pools the p-values from independent studies. The newly proposed test, MAGD, flexibly accommodates different levels of heterogeneity of effect sizes across individual studies. The MAGD simultaneously retains all the characteristics of the z-transform tests and the gamma-transform tests. We also propose an easy-to-implement resampling approach for estimating the empirical p-values of MAGD for finite sample sizes. Simulation studies and two data applications show that the proposed method MAGD is essentially as powerful as the z-transform tests (the gamma-transform tests) under the circumstance with homogeneous (heterogeneous) effect sizes across studies.
Journal: Journal of Applied Statistics
Pages: 247-261
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1474857
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1474857
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:247-261
Template-Type: ReDIF-Article 1.0
Author-Name: J. Peng
Author-X-Name-First: J.
Author-X-Name-Last: Peng
Author-Name: W. Liu
Author-X-Name-First: W.
Author-X-Name-Last: Liu
Author-Name: F. Bretz
Author-X-Name-First: F.
Author-X-Name-Last: Bretz
Author-Name: A. J. Hayter
Author-X-Name-First: A. J.
Author-X-Name-Last: Hayter
Title: Counting by weighing: construction of two-sided confidence intervals
Abstract:
Counting by weighing is widely used in industry and is often more efficient than counting manually, which is time-consuming and prone to human error, especially when the number of items is large. Lower confidence bounds on the numbers of items in infinitely many future bags, based on the weights of the bags, have been proposed recently in Liu et al. [Counting by weighing: Know your numbers with confidence, J. Roy. Statist. Soc. Ser. C 65(4) (2016), pp. 641–648]. These confidence bounds are constructed using the data from one calibration experiment and for different parameters (or numbers), but have a frequency interpretation similar to that of a usual confidence set for a single parameter. In this paper, the more challenging problem of constructing two-sided confidence intervals is studied. A simulation-based method for computing the critical constant is proposed. This method is proven to give the required critical constant as the number of simulations goes to infinity, and is shown to be easily implemented on an ordinary computer to compute the critical constant accurately and quickly. The methodology is illustrated with a real data example.
Journal: Journal of Applied Statistics
Pages: 262-271
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1475553
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1475553
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:262-271
Template-Type: ReDIF-Article 1.0
Author-Name: David Gold
Author-X-Name-First: David
Author-X-Name-Last: Gold
Author-Name: Lixin Lang
Author-X-Name-First: Lixin
Author-X-Name-Last: Lang
Author-Name: Kim Zerba
Author-X-Name-First: Kim
Author-X-Name-Last: Zerba
Title: Practical statistical considerations for investigating anti-tumor treatments in mice
Abstract:
Anti-tumor treatment outcomes in mouse experiments can be challenging to interpret and communicate accurately. In reporting these experiments, rigorous statistical considerations are commonly absent, although statistical applications have been proposed. We investigated the practicality and utility of different statistical strategies for the analysis of anti-tumor responses in a longitudinal mouse case study. Each analysis that we performed had different endpoints, investigated different questions, and was based on different assumptions. We found rudimentary visual and risk analysis insufficient without additional considerations, and upon further investigation we found improvements in key anti-tumor parameter estimates associated with a drug combination in our case study. We offer practical statistical considerations for investigating anti-cancer treatments in mice, applying a multi-tier statistical approach.
Journal: Journal of Applied Statistics
Pages: 272-285
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1477925
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1477925
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:272-285
Template-Type: ReDIF-Article 1.0
Author-Name: Brook T. Russell
Author-X-Name-First: Brook T.
Author-X-Name-Last: Russell
Title: Investigating precipitation extremes in South Carolina with focus on the state's October 2015 precipitation event
Abstract:
The October 2015 precipitation event in the Southeastern United States brought large amounts of rainfall to South Carolina, with particularly heavy amounts in Charleston and Columbia. The subsequent flooding resulted in numerous casualties and hundreds of millions of dollars in property damage. Precipitation levels were so severe that media outlets and government agencies labeled this storm as a 1 in 1000-year event in parts of the state. Two points of discussion emerged as a result of this event. The first was related to understanding the degree to which this event was anomalous; the second was related to understanding whether precipitation extremes in South Carolina have changed over recent time. In this work, 50 years of daily precipitation data at 28 locations are used to fit a spatiotemporal hierarchical model, with the ultimate goal of addressing these two points of discussion. Bayesian inference is used to estimate return levels and to perform a severity-area-frequency analysis, and it is determined that precipitation levels related to this event were atypical throughout much of the state, but were particularly unusual in the Columbia area. This analysis also finds marginal evidence in favor of the claim that precipitation extremes in the Carolinas have become more intense over the last 50 years.
Journal: Journal of Applied Statistics
Pages: 286-303
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1477926
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1477926
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:286-303
Template-Type: ReDIF-Article 1.0
Author-Name: Yuvraj Sunecher
Author-X-Name-First: Yuvraj
Author-X-Name-Last: Sunecher
Author-Name: Naushad Mamode Khan
Author-X-Name-First: Naushad
Author-X-Name-Last: Mamode Khan
Author-Name: Vandna Jowaheer
Author-X-Name-First: Vandna
Author-X-Name-Last: Jowaheer
Title: A case study of MCB and SBMH stock transaction using a novel BINMA(1) with non-stationary NB correlated innovations
Abstract:
This paper focuses on the modeling of the intra-day transactions at the Stock Exchange Mauritius (SEM) of the two major banking companies: Mauritius Commercial Bank Group Limited (MCB) and State Bank of Mauritius Holdings Ltd (SBMH) in Mauritius, using a flexible non-stationary bivariate integer-valued moving average of order 1 (BINMA(1)) process with negative binomial (NB) innovations that may cater for different levels of over-dispersion. The generalized quasi-likelihood (GQL) approach is used to estimate the regression, dependence and over-dispersion effects. However, for the over-dispersion parameters, the auto-covariance structure in the GQL is constructed using some higher-order moments. This new model is tested in some Monte-Carlo experiments and is applied to analyze the inter-related intra-day series of volume of stocks for the two banking institutions, using data collected from 3 August to 16 October 2015, in the presence of some time-varying covariates such as the news effect, the Friday effect and the time-of-day effect.
Journal: Journal of Applied Statistics
Pages: 304-323
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1477927
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1477927
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:304-323
Template-Type: ReDIF-Article 1.0
Author-Name: K. Drosou
Author-X-Name-First: K.
Author-X-Name-Last: Drosou
Author-Name: C. Koukouvinos
Author-X-Name-First: C.
Author-X-Name-Last: Koukouvinos
Author-Name: A. Lappa
Author-X-Name-First: A.
Author-X-Name-Last: Lappa
Title: Sure independence screening for real medical Poisson data
Abstract:
The statistical modeling of large databases constitutes one of the most challenging issues today. The issue is even more critical in the case of a complicated correlation structure. Variable selection plays a vital role in the statistical analysis of large databases, and many methods have been proposed so far to deal with the aforementioned problem. One such method is Sure Independence Screening, which was introduced to reduce dimensionality to a relatively smaller scale. This method, though simple, produces remarkable results even under both ultra-high dimensionality and large sample sizes. In this paper we analyze a large real medical data set assuming a Poisson regression model. We support the analysis by conducting simulation experiments that take into account the correlation structure of the design matrix.
Journal: Journal of Applied Statistics
Pages: 324-350
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1480708
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1480708
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:324-350
Template-Type: ReDIF-Article 1.0
Author-Name: R. N. Montgomery
Author-X-Name-First: R. N.
Author-X-Name-Last: Montgomery
Author-Name: A. S. Watts
Author-X-Name-First: A. S.
Author-X-Name-Last: Watts
Author-Name: N. C. Burns
Author-X-Name-First: N. C.
Author-X-Name-Last: Burns
Author-Name: E. D. Vidoni
Author-X-Name-First: E. D.
Author-X-Name-Last: Vidoni
Author-Name: J. D. Mahnken
Author-X-Name-First: J. D.
Author-X-Name-Last: Mahnken
Title: Evaluating paired categorical data when the pairing is lost
Abstract:
We encountered a problem in which a study's experimental design called for the use of paired data, but the pairing between subjects had been lost during the data collection procedure. Thus we were presented with a data set consisting of pre and post responses, but with no way of determining the dependencies between the observed pre and post values. The aim of the study was to assess whether an intervention called Self-Revelatory Performance had an impact on participants' perceptions of Alzheimer's disease. Participants' responses were measured on an Affect Grid before the intervention and on a separate grid after it. To address the underlying question in light of the lost pairing, we utilized a modified bootstrap approach to create a null hypothesized distribution for our test statistic, the distance between the two Affect Grids' centers of mass. Using this approach we were able to reject the null hypothesis and conclude that there was evidence the intervention influenced perceptions about the disease.
Journal: Journal of Applied Statistics
Pages: 351-363
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1485013
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1485013
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:351-363
Template-Type: ReDIF-Article 1.0
Author-Name: Marcus B. Perry
Author-X-Name-First: Marcus B.
Author-X-Name-Last: Perry
Title: On the detection of transitive clusters in undirected networks
Abstract:
A network cluster is defined as a set of nodes with ‘strong’ within group ties and ‘weak’ between group ties. Most clustering methods focus on finding groups of ‘densely connected’ nodes, where the dyad (or tie between two nodes) serves as the building block for forming clusters. However, since the unweighted dyad cannot distinguish strong relationships from weak ones, it then seems reasonable to consider an alternative building block, i.e. one involving more than two nodes. In the simplest case, one can consider the triad (or three nodes), where the fully connected triad represents the basic unit of transitivity in an undirected network. In this effort we propose a clustering framework for finding highly transitive subgraphs in an undirected/unweighted network, where the fully connected triad (or triangle configuration) is used as the building block for forming clusters. We apply our methodology to four real networks with encouraging results. Monte Carlo simulation results suggest that, on average, the proposed method yields good clustering performance on synthetic benchmark graphs, relative to other popular methods.
Journal: Journal of Applied Statistics
Pages: 364-384
Issue: 2
Volume: 46
Year: 2019
Month: 1
X-DOI: 10.1080/02664763.2018.1491535
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1491535
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:2:p:364-384
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Aboagye-Sarfo
Author-X-Name-First: Patrick
Author-X-Name-Last: Aboagye-Sarfo
Author-Name: James Cross
Author-X-Name-First: James
Author-X-Name-Last: Cross
Author-Name: Ute Mueller
Author-X-Name-First: Ute
Author-X-Name-Last: Mueller
Title: Intervention time series analysis of voluntary, counselling and testing on HIV infections in West African sub-region: the case of Ghana
Abstract:
In this paper, intervention time series models were developed to examine the effectiveness of the voluntary counselling and testing (VCT) programme in the northern and southern sectors of Ghana. Pre-intervention data on reported HIV cases in the northern and southern sectors were first modelled as Box–Jenkins univariate time series. Second, the models adopted for the pre-intervention data were extended to include an intervention variable, coded as zero for the pre-intervention period (1 January 1996–31 December 2002) and one for the post-intervention period (1 January 2003–31 December 2007). The models developed were applied to the entire data for the two sectors to estimate the effect of the VCT programme. Our findings indicate that the VCT programme was associated with the detection of 20 and 40 new HIV infections per 100,000 persons per month in the northern and southern sectors, respectively (p < .10). The VCT programme in Ghana, as in most West African nations, has had an insignificant impact. Intervention time series models can be used to reliably examine the impact of the VCT programme. Because the impact of the programme is minimal, we recommend that the National AIDS Control Programme and other stakeholders redouble their efforts to maximise its impact.
Journal: Journal of Applied Statistics
Pages: 571-582
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1177501
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1177501
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:571-582
Template-Type: ReDIF-Article 1.0
Author-Name: Thao Nguyentrang
Author-X-Name-First: Thao
Author-X-Name-Last: Nguyentrang
Author-Name: Tai Vovan
Author-X-Name-First: Tai
Author-X-Name-Last: Vovan
Title: Fuzzy clustering of probability density functions
Abstract:
Based on the L1-distance and the representing element of a cluster, the article proposes three new algorithms for Fuzzy Clustering of probability density Functions (FCF): a hierarchical approach, a non-hierarchical approach, and an algorithm to determine the optimal number of clusters and the initial partition matrix so as to improve the quality of the clusters established by the non-hierarchical approach. With the proposed algorithms, FCF is more advantageous than non-fuzzy clustering of probability density functions. The algorithms are applied to recognizing images from the Texture and Corel databases and to a practical problem concerning the study and training marks of students at a university. Matlab programs are provided for the computations in the proposed algorithms; they not only compute the numerical examples of this article effectively but can also be applied to many different realistic problems.
Journal: Journal of Applied Statistics
Pages: 583-601
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1177502
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1177502
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:583-601
Template-Type: ReDIF-Article 1.0
Author-Name: Michele Rienzner
Author-X-Name-First: Michele
Author-X-Name-Last: Rienzner
Author-Name: Francesca Ieva
Author-X-Name-First: Francesca
Author-X-Name-Last: Ieva
Title: Critical values improvement for the standard normal homogeneity test by combining Monte Carlo and regression approaches
Abstract:
The distribution of the test statistics of homogeneity tests is often unknown, requiring the estimation of the critical values through Monte Carlo (MC) simulations. The computation of the critical values at low α, especially when the distribution of the statistics changes with the series length (sample cardinality), requires a considerable number of simulations to achieve a reasonable precision of the estimates (i.e. 10^6 simulations or more for each series length). If, in addition, the test requires a noteworthy computational effort, the estimation of the critical values may need unacceptably long runtimes. To overcome the problem, the paper proposes a regression-based refinement of an initial MC estimate of the critical values, which also allows the achieved improvement to be approximated. Moreover, the paper presents an application of the method to two tests: SNHT (standard normal homogeneity test, widely used in climatology) and SNH2T (a version of SNHT with squared numerical complexity). For both, the paper reports the critical values for α ranging between 0.1 and 0.0001 (useful for p-value estimation) and series lengths ranging from 10 elements (a widely adopted size in the climatological change-point detection literature) to 70,000 elements (nearly the length of a daily time series 200 years long), estimated with coefficients of variation within 0.22%. For SNHT, a comparison of our results with approximate, theoretically derived critical values is also performed; we suggest adopting those values for series exceeding 70,000 elements.
Journal: Journal of Applied Statistics
Pages: 602-619
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182127
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182127
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:602-619
Template-Type: ReDIF-Article 1.0
Author-Name: Trias Wahyuni Rakhmawati
Author-X-Name-First: Trias Wahyuni
Author-X-Name-Last: Rakhmawati
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Geert Verbeke
Author-X-Name-First: Geert
Author-X-Name-Last: Verbeke
Author-Name: Christel Faes
Author-X-Name-First: Christel
Author-X-Name-Last: Faes
Title: Local influence diagnostics for generalized linear mixed models with overdispersion
Abstract:
Since the seminal paper by Cook and Weisberg [9], local influence, next to case deletion, has gained popularity as a tool for detecting influential subjects and measurements in a variety of statistical models. For the linear mixed model, the approach leads to easily interpretable and computationally convenient expressions, not only highlighting influential subjects, but also indicating which aspect of their profile leads to undue influence on the model's fit [17]. Ouwens et al. [24] applied the method to the Poisson-normal generalized linear mixed model (GLMM). Given the model's nonlinear structure, these authors did not derive interpretable components but rather focused on a graphical depiction of influence. In this paper, we consider GLMMs for binary, count, and time-to-event data, with the additional feature of accommodating overdispersion whenever necessary. For each situation, three approaches are considered, based on: (1) purely numerical derivations; (2) a closed-form expression of the marginal likelihood function; and (3) an integral representation of this likelihood. Unlike when case deletion is used, this leads to interpretable components, allowing one not only to identify influential subjects, but also to study the cause of their influence. The methodology is illustrated in case studies that range over the three data types mentioned.
Journal: Journal of Applied Statistics
Pages: 620-641
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182128
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182128
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:620-641
Template-Type: ReDIF-Article 1.0
Author-Name: Feng Xu
Author-X-Name-First: Feng
Author-X-Name-Last: Xu
Title: Statistical measurement of the inventory shortage cost
Abstract:
In searching for the optimal inventory control policy, the objective is to minimize the expected total related costs, of which the shortage cost is an important element. Owing to the difficulty of calculating the indirect cost of the loss of goodwill resulting from a shortage, practitioners and researchers often simply assume a fixed penalty cost on the inventory shortage, or switch to the alternative of assigning a specific customer service level. The development of an appropriate tool for measuring the shortage cost can help a business control total costs and improve productivity more effectively. This paper proposes probabilistic measurements of the shortage cost, based on the mathematical relationship between the cost and the shortage amount. The derived closed-form estimates of the expected shortage cost can then be applied to support the determination of the optimal inventory control policy.
Journal: Journal of Applied Statistics
Pages: 642-648
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182129
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182129
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:642-648
Template-Type: ReDIF-Article 1.0
Author-Name: M. Afshari
Author-X-Name-First: M.
Author-X-Name-Last: Afshari
Author-Name: F. Lak
Author-X-Name-First: F.
Author-X-Name-Last: Lak
Author-Name: B. Gholizadeh
Author-X-Name-First: B.
Author-X-Name-Last: Gholizadeh
Title: A new Bayesian wavelet thresholding estimator of nonparametric regression
Abstract:
Methods for estimating a nonparametric regression function are quite common in statistical applications. In this paper, a new Bayesian wavelet thresholding estimator is considered. New mixture prior distributions for estimating the nonparametric regression function via the wavelet transformation are investigated, and a reversible jump algorithm is used to obtain the appropriate prior distributions and the thresholding value. The performance of the proposed estimator is assessed on simulated data from well-known test functions, comparing its convergence rate with that of another estimator by evaluating the average mean squared error and standard deviations. Finally, the developed method is applied to estimate the density function of galaxy data.
Journal: Journal of Applied Statistics
Pages: 649-666
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182130
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182130
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:649-666
Template-Type: ReDIF-Article 1.0
Author-Name: Eudoxia Kakarantza
Author-X-Name-First: Eudoxia
Author-X-Name-Last: Kakarantza
Author-Name: Spyridon D. Symeonides
Author-X-Name-First: Spyridon D.
Author-X-Name-Last: Symeonides
Title: Seemingly unrelated systems of econometric equations
Abstract:
A generalization of Zellner's SUR model is derived for sets of seemingly unrelated systems of econometric equations. The resulting structural form – worked out for a set of Cowles Commission-type simultaneous equations systems – is general enough to include any SUR-type or panel-type specification of systems of econometric equations with contemporaneously correlated errors. Maximum estimation efficiency is obtained by treating all the individual subsystems at once rather than in a subsystem-by-subsystem fashion.
Journal: Journal of Applied Statistics
Pages: 667-684
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182131
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:667-684
Template-Type: ReDIF-Article 1.0
Author-Name: Seunghon Ham
Author-X-Name-First: Seunghon
Author-X-Name-Last: Ham
Author-Name: Sunju Kim
Author-X-Name-First: Sunju
Author-X-Name-Last: Kim
Author-Name: Naroo Lee
Author-X-Name-First: Naroo
Author-X-Name-Last: Lee
Author-Name: Pilje Kim
Author-X-Name-First: Pilje
Author-X-Name-Last: Kim
Author-Name: Igchun Eom
Author-X-Name-First: Igchun
Author-X-Name-Last: Eom
Author-Name: Byoungcheun Lee
Author-X-Name-First: Byoungcheun
Author-X-Name-Last: Lee
Author-Name: Perng-Jy Tsai
Author-X-Name-First: Perng-Jy
Author-X-Name-Last: Tsai
Author-Name: Kiyoung Lee
Author-X-Name-First: Kiyoung
Author-X-Name-Last: Lee
Author-Name: Chungsik Yoon
Author-X-Name-First: Chungsik
Author-X-Name-Last: Yoon
Title: Comparison of data analysis procedures for real-time nanoparticle sampling data using classical regression and ARIMA models
Abstract:
Real-time monitoring is necessary for nanoparticle exposure assessment to characterize the exposure profile, but the data produced are autocorrelated. This study was conducted to compare three statistical methods used to analyze data that constitute autocorrelated time series, and to investigate the effect of averaging time on the reduction of autocorrelation, using field data. First-order autoregressive (AR(1)) and autoregressive integrated moving average (ARIMA) models are alternative methods that remove autocorrelation; the classical regression method was compared with AR(1) and ARIMA. Three data sets of scanning mobility particle sizer measurements were used, and we compared the results of regression, AR(1), and ARIMA with averaging times of 1, 5, and 10 min. The AR(1) and ARIMA models had similar capacities to adjust for the autocorrelation of real-time data. Because of the non-stationarity of real-time monitoring data, ARIMA was more appropriate; when using AR(1), transformation into stationary data was necessary. There was no difference with a longer averaging time. This study suggests that the ARIMA model could be used to process real-time monitoring data, especially non-stationary data, and that the averaging time can be set flexibly, depending on the data interval required to capture the effects of processes, for occupational and environmental nano measurements.
Journal: Journal of Applied Statistics
Pages: 685-699
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182132
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:685-699
Template-Type: ReDIF-Article 1.0
Author-Name: Waleed Dhhan
Author-X-Name-First: Waleed
Author-X-Name-Last: Dhhan
Author-Name: Sohel Rana
Author-X-Name-First: Sohel
Author-X-Name-Last: Rana
Author-Name: Habshah Midi
Author-X-Name-First: Habshah
Author-X-Name-Last: Midi
Title: A high breakdown, high efficiency and bounded influence modified GM estimator based on support vector regression
Abstract:
Regression analysis aims to estimate the approximate relationship between the response variable and the explanatory variables. This can be done using classical methods such as ordinary least squares. Unfortunately, these methods are very sensitive to anomalous points, often called outliers, in the data set. The main contribution of this article is to propose a new version of the Generalized M-estimator that provides good resistance against vertical outliers and bad leverage points. The advantage of this method over existing methods is that it does not downweight good leverage points, which increases the efficiency of the estimator. To achieve this goal, the fixed-parameters support vector regression technique is used to identify and downweight outliers and bad leverage points. The effectiveness of the proposed estimator is investigated using real and simulated data sets.
Journal: Journal of Applied Statistics
Pages: 700-714
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182133
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182133
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:700-714
Template-Type: ReDIF-Article 1.0
Author-Name: Maosheng Li
Author-X-Name-First: Maosheng
Author-X-Name-Last: Li
Author-Name: Zhengqiu Liu
Author-X-Name-First: Zhengqiu
Author-X-Name-Last: Liu
Author-Name: Yonghong Zhang
Author-X-Name-First: Yonghong
Author-X-Name-Last: Zhang
Author-Name: Weijun Liu
Author-X-Name-First: Weijun
Author-X-Name-Last: Liu
Author-Name: Feng Shi
Author-X-Name-First: Feng
Author-X-Name-Last: Shi
Title: Distribution analysis of train interval journey time employing the censored model with shifting character
Abstract:
The theoretical framework of limited dependent variable models is extended to accommodate a shifting character and thus fit the distribution of train journey time on sections of an urban rail network. Data on actual train arrival and departure times at each station are used to calculate the journey time of each railway interval for multi-class trains. Among a group of theoretical distributions, the log-normal and normal distributions are respectively the most and second most suitable latent distributions of the train interval journey time in the censored models with shifting character. This modified distribution is described by four parameters, namely the expectation and variance of the latent distribution and the upper and lower bounds of the migration interval. The square root of the least-squares measurement (SRLSM) is taken as the measure of fit, and a traversal search is adopted to determine the above four parameters according to the SRLSM. The average SRLSM of the theoretical train interval journey time distribution obtained by the proposed method over all railway sections is 0.0905. The theoretical framework provides a basis for storing the hidden rules in the data, rather than the past train travel time data themselves, and for optimizing the existing management of rail transit operations.
Journal: Journal of Applied Statistics
Pages: 715-733
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182134
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182134
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:715-733
Template-Type: ReDIF-Article 1.0
Author-Name: M. Templ
Author-X-Name-First: M.
Author-X-Name-Last: Templ
Author-Name: K. Hron
Author-X-Name-First: K.
Author-X-Name-Last: Hron
Author-Name: P. Filzmoser
Author-X-Name-First: P.
Author-X-Name-Last: Filzmoser
Title: Exploratory tools for outlier detection in compositional data with structural zeros
Abstract:
The analysis of compositional data using the log-ratio approach is based on ratios between the compositional parts. Zeros in the parts thus cause serious difficulties for the analysis. This is a particular problem in the case of structural zeros, which cannot simply be replaced by a non-zero value, as is done, e.g. for values below the detection limit or missing values. Instead, the zeros need to be incorporated into further statistical processing. The focus here is on exploratory tools for identifying outliers in compositional data sets with structural zeros. For this purpose, Mahalanobis distances are estimated, computed either directly for the subcompositions determined by the zero patterns, or by using imputation to improve the efficiency of the estimates and then proceeding to the subcompositional and subgroup level. For this approach, new theory is formulated that allows covariances to be estimated for imputed compositional data and the estimates to be applied to subgroups using parts of this covariance matrix. Moreover, the zero pattern structure is analyzed using principal component analysis for binary data to achieve a comprehensive view of the overall multivariate data structure. The proposed tools are applied to larger compositional data sets from official statistics, where the need for an appropriate treatment of zeros is obvious.
Journal: Journal of Applied Statistics
Pages: 734-752
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182135
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182135
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:734-752
Template-Type: ReDIF-Article 1.0
Author-Name: Hani M. Samawi
Author-X-Name-First: Hani M.
Author-X-Name-Last: Samawi
Author-Name: Haresh Rochani
Author-X-Name-First: Haresh
Author-X-Name-Last: Rochani
Author-Name: Daniel Linder
Author-X-Name-First: Daniel
Author-X-Name-Last: Linder
Author-Name: Arpita Chatterjee
Author-X-Name-First: Arpita
Author-X-Name-Last: Chatterjee
Title: More efficient logistic analysis using moving extreme ranked set sampling
Abstract:
Logistic regression is the most popular technique available for modeling dichotomous dependent variables, with intensive application in the social, medical, behavioral and public health sciences. In this paper we propose a more efficient logistic regression analysis based on a moving extreme ranked set sampling (MERSSmin) scheme, with ranking based on an easily available auxiliary variable known to be associated with the variable of interest (the response variable). The paper demonstrates that this approach provides a more powerful testing procedure, as well as more efficient odds ratio and parameter estimation, than using a simple random sample (SRS). Theoretical derivations and simulation studies are provided. Real data from the 2011 Youth Risk Behavior Surveillance System (YRBSS) are used to illustrate the procedures developed in this paper.
Journal: Journal of Applied Statistics
Pages: 753-766
Issue: 4
Volume: 44
Year: 2017
Month: 3
X-DOI: 10.1080/02664763.2016.1182136
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182136
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:4:p:753-766
Template-Type: ReDIF-Article 1.0
Author-Name: Brenda Betancourt
Author-X-Name-First: Brenda
Author-X-Name-Last: Betancourt
Author-Name: Abel Rodríguez
Author-X-Name-First: Abel
Author-X-Name-Last: Rodríguez
Author-Name: Naomi Boyd
Author-X-Name-First: Naomi
Author-X-Name-Last: Boyd
Title: Investigating competition in financial markets: a sparse autologistic model for dynamic network data
Abstract:
We develop a sparse autologistic model for investigating the impact of diversification and disintermediation strategies on the evolution of financial trading networks. In order to induce sparsity in the model estimates and address substantive questions about the underlying processes, the model includes an $L^1$ regularization penalty. This makes implementation feasible for complex dynamic networks in which the number of parameters is considerably greater than the number of observations over time. We use the model to characterize trader behavior in the NYMEX natural gas futures market, where we find that disintermediation, and not diversification or momentum, tends to drive market microstructure.
Journal: Journal of Applied Statistics
Pages: 1157-1172
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1357684
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1357684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1157-1172
Template-Type: ReDIF-Article 1.0
Author-Name: Jairo A. Fúquene Patiño
Author-X-Name-First: Jairo A.
Author-X-Name-Last: Fúquene Patiño
Author-Name: Brenda Betancourt
Author-X-Name-First: Brenda
Author-X-Name-Last: Betancourt
Author-Name: João B. M. Pereira
Author-X-Name-First: João B. M.
Author-X-Name-Last: Pereira
Title: A weakly informative prior for Bayesian dynamic model selection with applications in fMRI
Abstract:
In recent years, Bayesian statistical methods in neuroscience have shown important advances. In particular, the detection of brain signals for studying the complexity of the brain is an active area of research. Functional magnetic resonance imaging (fMRI) is an important tool for determining which parts of the brain are activated by different types of physical behavior. According to recent results, there is evidence that the values of the brain connectivity signal parameters are close to zero, and given the nature of time series fMRI data with high-frequency behavior, Bayesian dynamic models for identifying sparsity are indeed far-reaching. We propose a multivariate Bayesian dynamic approach for model selection and shrinkage estimation of the connectivity parameters. We describe the coupling, or lead-lag, between any pair of regions by using mixture priors for the connectivity parameters, and propose a new weakly informative default prior for the state variances. This framework produces proper one-step-ahead posterior predictive results and induces shrinkage and robustness suitable for fMRI data in the presence of sparsity. To explore the performance of the proposed methodology, we present simulation studies and an application to functional magnetic resonance imaging data.
Journal: Journal of Applied Statistics
Pages: 1173-1192
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1363161
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1363161
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1173-1192
Template-Type: ReDIF-Article 1.0
Author-Name: Pasquale Sarnacchiaro
Author-X-Name-First: Pasquale
Author-X-Name-Last: Sarnacchiaro
Author-Name: Flavio Boccia
Author-X-Name-First: Flavio
Author-X-Name-Last: Boccia
Title: Some remarks on measurement models in the structural equation model: an application for socially responsible food consumption
Abstract:
In the structural equation model (SEM), most research focuses on the structural model rather than on the measurement one. This context raises several problems: construct misspecification, identification and validation. Starting from the most recent articles on these issues, we achieve – and formalize through two tables – a general framework that could help researchers select and assess both formative and reflective measurement models, with special attention to the statistical implications. To illustrate this general framework, we present a survey on customer behaviours for socially responsible food consumption, carried out by administering a questionnaire to a representative sample of 332 families. In order to detect the main aspects impacting consumers' preferences, a factor analysis was performed. The general framework was then used to select and assess the measurement models in the SEM. The SEM was estimated by partial least squares, and the significance of the indicators was tested using the bootstrap. As far as we know, this is the first time that a model for the analysis of consumer behaviour towards social responsibility has been formalized through a SEM.
Journal: Journal of Applied Statistics
Pages: 1193-1208
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1363162
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1363162
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1193-1208
Template-Type: ReDIF-Article 1.0
Author-Name: Hamid Jamalinia
Author-X-Name-First: Hamid
Author-X-Name-Last: Jamalinia
Author-Name: Saber Khalouei
Author-X-Name-First: Saber
Author-X-Name-Last: Khalouei
Author-Name: Vahideh Rezaie
Author-X-Name-First: Vahideh
Author-X-Name-Last: Rezaie
Author-Name: Samad Nejatian
Author-X-Name-First: Samad
Author-X-Name-Last: Nejatian
Author-Name: Karamolah Bagheri-Fard
Author-X-Name-First: Karamolah
Author-X-Name-Last: Bagheri-Fard
Author-Name: Hamid Parvin
Author-X-Name-First: Hamid
Author-X-Name-Last: Parvin
Title: Diverse classifier ensemble creation based on heuristic dataset modification
Abstract:
Bagging and Boosting are two main ensemble approaches that consolidate the decisions of several hypotheses. The diversity of the ensemble members is considered a key element in reducing generalization error. Here, an inventive method called EBAGTS (ensemble-based artificially generated training samples) is proposed to generate ensembles. It manipulates the training examples in three ways in order to build diverse hypotheses directly: drawing a sub-sample from the training set, reducing or raising the number of error-prone training instances, and reducing or raising the number of local instances around error-prone regions. The proposed method is a straightforward, generic framework that can use any base classifier for its ensemble members to assemble a powerful combined classifier. Decision-tree and multilayer-perceptron classifiers are employed as base classifiers in the experiments, which show that the proposed method achieves higher predictive accuracy than meta-learning algorithms such as Boosting and Bagging. Furthermore, EBAGTS outperforms Boosting even more markedly as the training data set grows. The results illustrate that EBAGTS can outperform the state of the art.
Journal: Journal of Applied Statistics
Pages: 1209-1226
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1363163
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1363163
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1209-1226
Template-Type: ReDIF-Article 1.0
Author-Name: Chalani Prematilake
Author-X-Name-First: Chalani
Author-X-Name-Last: Prematilake
Author-Name: Leif Ellingson
Author-X-Name-First: Leif
Author-X-Name-Last: Ellingson
Title: Evaluation and prediction of polygon approximations of planar contours for shape analysis
Abstract:
Contours may be viewed as the 2D outline of the image of an object. This type of data arises in medical imaging as well as in computer vision, can be modeled as data on a manifold, and can be studied using statistical shape analysis. Practically speaking, each observed contour, while theoretically infinite dimensional, must be discretized for computations. As such, the coordinates of each contour are obtained at k sampling times, resulting in the contour being represented as a k-dimensional complex vector. While choosing large values of k results in closer approximations to the original contour, it also results in higher computational costs in the subsequent analysis. The goal of this study is to determine reasonable values for k so as to keep the computational cost low while maintaining accuracy. To do this, we consider two methods for selecting sample points and determine lower bounds for k for obtaining a desired level of approximation error using two different criteria. Because this process is computationally inefficient to perform on a large scale, we then develop models for predicting the lower bounds for k based on simple characteristics of the contours.
Journal: Journal of Applied Statistics
Pages: 1227-1246
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1364716
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1364716
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1227-1246
Template-Type: ReDIF-Article 1.0
Author-Name: Felix Famoye
Author-X-Name-First: Felix
Author-X-Name-Last: Famoye
Author-Name: John S. Preisser
Author-X-Name-First: John S.
Author-X-Name-Last: Preisser
Title: Marginalized zero-inflated generalized Poisson regression
Abstract:
The generalized Poisson (GP) regression model has been used to model count data that exhibit over-dispersion or under-dispersion. The zero-inflated GP (ZIGP) regression model can additionally handle count data characterized by many zeros. However, the parameters of the ZIGP model cannot easily be used for inference on overall exposure effects. To address this problem, a marginalized ZIGP model is proposed to directly model the population marginal mean count. The parameters of the marginalized zero-inflated GP model are estimated by the method of maximum likelihood. The regression model is illustrated with three real-life data sets.
Journal: Journal of Applied Statistics
Pages: 1247-1259
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1364717
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1364717
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1247-1259
Template-Type: ReDIF-Article 1.0
Author-Name: Li-An Lin
Author-X-Name-First: Li-An
Author-X-Name-Last: Lin
Author-Name: Sheng Luo
Author-X-Name-First: Sheng
Author-X-Name-Last: Luo
Author-Name: Barry R. Davis
Author-X-Name-First: Barry R.
Author-X-Name-Last: Davis
Title: Bayesian regression model for recurrent event data with event-varying covariate effects and event effect
Abstract:
In the course of hypertension, cardiovascular disease events (e.g. stroke, heart failure) occur frequently and recurrently. The scientific interest in such studies may lie in the estimation of treatment effects while accounting for the correlation among event times. The correlation among recurrent event times comes from two sources: subject-specific heterogeneity (e.g. varied lifestyles, genetic variations, and other unmeasurable effects) and event dependence (i.e. event incidences may change the risk of future recurrent events). Moreover, event incidences may change the disease progression, so that there may exist event-varying covariate effects (the covariate effects may change after each event) and an event effect (the effect of prior events on future events). In this article, we propose a Bayesian regression model that not only accommodates correlation among recurrent events from both sources, but also explicitly characterizes the event-varying covariate effects and the event effect. This model is especially useful in quantifying how the incidences of events change the effects of covariates and the risk of future events. We compare the proposed model with several commonly used recurrent event models and apply our model to the motivating lipid-lowering trial (LLT) component of the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT) (ALLHAT-LLT).
Journal: Journal of Applied Statistics
Pages: 1260-1276
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1367368
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1367368
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1260-1276
Template-Type: ReDIF-Article 1.0
Author-Name: David M. Zimmer
Author-X-Name-First: David M.
Author-X-Name-Last: Zimmer
Title: The heterogeneous impact of insurance on health care demand among young adults: a panel data analysis
Abstract:
Success of the recently implemented Affordable Care Act hinges on previously uninsured young adults enrolling in coverage. How will increased coverage, in turn, affect health care utilization? This paper applies variable coefficient panel models to estimate the impact of insurance on health care utilization among young adults. The econometric setup, which accommodates nonlinear usage measures, attempts to address the potential endogeneity of insurance status. The main finding is that, for approximately one-fifth of young adults, insurance does not substantially alter health care consumption. On the other hand, another one-fifth of young adults have large moral hazard effects. Among that group, insurance increases the probability of having a routine checkup by 71–120%, relative to mean probabilities, and insurance increases the number of curative-based doctor office visits by 67–181%, relative to the mean number of visits.
Journal: Journal of Applied Statistics
Pages: 1277-1291
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1369497
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1369497
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1277-1291
Template-Type: ReDIF-Article 1.0
Author-Name: Gary Madden
Author-X-Name-First: Gary
Author-X-Name-Last: Madden
Author-Name: Nicholas Apergis
Author-X-Name-First: Nicholas
Author-X-Name-Last: Apergis
Author-Name: Paul Rappoport
Author-X-Name-First: Paul
Author-X-Name-Last: Rappoport
Author-Name: Aniruddha Banerjee
Author-X-Name-First: Aniruddha
Author-X-Name-Last: Banerjee
Title: An application of nonparametric regression to missing data in large market surveys
Abstract:
Non-response (or missing data) is often encountered in large-scale surveys. To enable the behavioural analysis of these data sets, statistical treatments are commonly applied to complete or remove these data. However, the correctness of such procedures critically depends on the nature of the underlying missingness generation process; the efficacy of applying either case deletion or imputation procedures rests on this unknown mechanism. The contribution of this paper is twofold. First, the study proposes a simple sequential method to attempt to identify the form of missingness. Second, the effectiveness of the tests is assessed by generating (experimentally) nine missing data sets, with data removed according to imposed MCAR, MAR and NMAR processes.
Journal: Journal of Applied Statistics
Pages: 1292-1302
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1369498
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1369498
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1292-1302
Template-Type: ReDIF-Article 1.0
Author-Name: Thiago G. Ramires
Author-X-Name-First: Thiago G.
Author-X-Name-Last: Ramires
Author-Name: Edwin M. M. Ortega
Author-X-Name-First: Edwin M. M.
Author-X-Name-Last: Ortega
Author-Name: Niel Hens
Author-X-Name-First: Niel
Author-X-Name-Last: Hens
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Gilberto A. Paula
Author-X-Name-First: Gilberto A.
Author-X-Name-Last: Paula
Title: A flexible semiparametric regression model for bimodal, asymmetric and censored data
Abstract:
In this paper, we propose a new semiparametric heteroscedastic regression model allowing for positive and negative skewness and bimodal shapes, using the B-spline basis for nonlinear effects. The proposed distribution is based on the generalized additive models for location, scale and shape framework, in order to model any or all parameters of the distribution using parametric linear and/or nonparametric smooth functions of explanatory variables. We motivate the new model by means of Monte Carlo simulations, which show that ignoring the skewness and bimodality of the random errors in semiparametric regression models may introduce biases in the parameter estimates and/or in the estimation of the associated variability measures. An iterative estimation process and some diagnostic methods are investigated. Applications to two real data sets are presented and the method is compared to the usual regression methods.
Journal: Journal of Applied Statistics
Pages: 1303-1324
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1369499
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1369499
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1303-1324
Template-Type: ReDIF-Article 1.0
Author-Name: Ao Yuan
Author-X-Name-First: Ao
Author-X-Name-Last: Yuan
Author-Name: Yuan Guo
Author-X-Name-First: Yuan
Author-X-Name-Last: Guo
Author-Name: Nawar M. Shara
Author-X-Name-First: Nawar M.
Author-X-Name-Last: Shara
Author-Name: Barbara V. Howard
Author-X-Name-First: Barbara V.
Author-X-Name-Last: Howard
Author-Name: Ming T. Tan
Author-X-Name-First: Ming T.
Author-X-Name-Last: Tan
Title: An additive Cox model for coronary heart disease study
Abstract:
Existing models for coronary heart disease studies use a set of common risk factors to predict the survival time of the disease via the standard Cox regression model. For complex relationships between the survival time and the risk factors, the linear regression specification in the existing Cox model is not flexible enough to account for such relationships. Moreover, the risk factors are actually risky only when they fall within certain risk ranges. For more flexibility in modelling and to characterize the risk factors more accurately, we study a semi-parametric additive Cox model using basis splines and the LASSO technique. The proposed model is evaluated in simulation studies and is used for the analysis of real data from the Strong Heart Study.
Journal: Journal of Applied Statistics
Pages: 1325-1346
Issue: 7
Volume: 45
Year: 2018
Month: 5
X-DOI: 10.1080/02664763.2017.1369500
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1369500
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:7:p:1325-1346
Template-Type: ReDIF-Article 1.0
Author-Name: Yang Li
Author-X-Name-First: Yang
Author-X-Name-Last: Li
Author-Name: Yichen Qin
Author-X-Name-First: Yichen
Author-X-Name-Last: Qin
Author-Name: Yanming Xie
Author-X-Name-First: Yanming
Author-X-Name-Last: Xie
Author-Name: Feng Tian
Author-X-Name-First: Feng
Author-X-Name-Last: Tian
Title: Grouped penalization estimation of the osteoporosis data in the traditional Chinese medicine
Abstract: Both continuous and categorical covariates are common in traditional Chinese medicine (TCM) research, especially in clinical syndrome identification and in risk prediction research. For groups of dummy variables generated by the same categorical covariate, it is important to penalize them group-wise rather than individually. In this paper, we discuss the group lasso method for a risk prediction analysis in TCM osteoporosis research. This is the first time such a group-wise variable selection method has been applied in this field, and it may lead to new insights into using grouped penalization methods to select appropriate covariates in TCM research. The introduced methodology can select categorical and continuous variables and estimate their parameters simultaneously. In our application to the osteoporosis data, four covariates (both categorical and continuous) are selected out of 52, and the accuracy of the prediction model is excellent. Compared with prediction models using different covariates, the group lasso risk prediction model can significantly decrease the error rate and help TCM doctors identify patients with a high risk of osteoporosis in clinical practice. Simulation results show that the application of the group lasso method is reasonable for the categorical covariate selection model in this TCM osteoporosis research.
Journal: Journal of Applied Statistics
Pages: 699-711
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.724660
File-URL: http://hdl.handle.net/10.1080/02664763.2012.724660
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:699-711
Template-Type: ReDIF-Article 1.0
Author-Name: Kassim Mwitondi
Author-X-Name-First: Kassim
Author-X-Name-Last: Mwitondi
Title: Statistical computing in C++ and R
Journal: Journal of Applied Statistics
Pages: 916-916
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.749033
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749033
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:916-916
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter
Author-X-Name-Last: Bastiaan Ober
Title: A practitioner's guide to resampling for data analysis, data mining, and modeling
Journal: Journal of Applied Statistics
Pages: 917-917
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.749035
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749035
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:917-917
Template-Type: ReDIF-Article 1.0
Author-Name: Pieter Bastiaan Ober
Author-X-Name-First: Pieter
Author-X-Name-Last: Bastiaan Ober
Title: Data driven business decisions
Journal: Journal of Applied Statistics
Pages: 917-918
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.749041
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749041
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:917-918
Template-Type: ReDIF-Article 1.0
Author-Name: Anouar Ben Mabrouk
Author-X-Name-First: Anouar
Author-X-Name-Last: Ben Mabrouk
Title: Time series modelling of neuroscience data
Journal: Journal of Applied Statistics
Pages: 918-919
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.749044
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749044
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:918-919
Template-Type: ReDIF-Article 1.0
Author-Name: Yves Laberge
Author-X-Name-First: Yves
Author-X-Name-Last: Laberge
Title: Simulating nature: a philosophical study of computer-simulation uncertainties and their role in climate science and policy advice
Journal: Journal of Applied Statistics
Pages: 919-920
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.749047
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749047
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:919-920
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano
Author-X-Name-Last: Ruiz Espejo
Title: Sampling
Journal: Journal of Applied Statistics
Pages: 920-921
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.749048
File-URL: http://hdl.handle.net/10.1080/02664763.2012.749048
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:920-921
Template-Type: ReDIF-Article 1.0
Author-Name: Yiannis Kamarianakis
Author-X-Name-First: Yiannis
Author-X-Name-Last: Kamarianakis
Title: Ergodic control of diffusion processes
Journal: Journal of Applied Statistics
Pages: 921-922
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.750440
File-URL: http://hdl.handle.net/10.1080/02664763.2012.750440
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:921-922
Template-Type: ReDIF-Article 1.0
Author-Name: Chris Beeley
Author-X-Name-First: Chris
Author-X-Name-Last: Beeley
Title: Behavioural research data analysis with R
Journal: Journal of Applied Statistics
Pages: 922-922
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.750441
File-URL: http://hdl.handle.net/10.1080/02664763.2012.750441
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:922-922
Template-Type: ReDIF-Article 1.0
Author-Name: A. Mosammam
Author-X-Name-First: A.
Author-X-Name-Last: Mosammam
Title: Geostatistics: modeling spatial uncertainty, second edition
Journal: Journal of Applied Statistics
Pages: 923-923
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.750474
File-URL: http://hdl.handle.net/10.1080/02664763.2012.750474
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:923-923
Template-Type: ReDIF-Article 1.0
Author-Name: Nilgun Fescioglu-Unver
Author-X-Name-First: Nilgun
Author-X-Name-Last: Fescioglu-Unver
Author-Name: Başak Tanyeri
Author-X-Name-First: Başak
Author-X-Name-Last: Tanyeri
Title: A comparison of artificial neural network and multinomial logit models in predicting mergers
Abstract: A merger proposal discloses a bidder firm's desire to purchase the control rights in a target firm. Predicting who will propose (bidder candidacy) and who will receive (target candidacy) merger bids is important to investigate why firms merge and to measure the price impact of mergers. This study investigates the performance of artificial neural networks and multinomial logit models in predicting bidder and target candidacy. We use a comprehensive data set that covers the years 1979–2004 and includes all deals with publicly listed bidders and targets. We find that both models perform similarly while predicting target and non-merger firms. The multinomial logit model performs slightly better in predicting bidder firms.
Journal: Journal of Applied Statistics
Pages: 712-720
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.750717
File-URL: http://hdl.handle.net/10.1080/02664763.2012.750717
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:712-720
Template-Type: ReDIF-Article 1.0
Author-Name: B. Kibria
Author-X-Name-First: B.
Author-X-Name-Last: Kibria
Author-Name: Kristofer Månsson
Author-X-Name-First: Kristofer
Author-X-Name-Last: Månsson
Author-Name: Ghazi Shukur
Author-X-Name-First: Ghazi
Author-X-Name-Last: Shukur
Title: Some ridge regression estimators for the zero-inflated Poisson model
Abstract: The zero-inflated Poisson regression model is commonly used when analyzing economic data that come in the form of non-negative integers since it accounts for excess zeros and overdispersion of the dependent variable. However, a problem often encountered when analyzing economic data that has not been addressed for this model is multicollinearity. This paper proposes ridge regression (RR) estimators and some methods for estimating the ridge parameter k for a non-negative model. A simulation study has been conducted to compare the performance of the estimators. Both mean squared error and mean absolute error are considered as the performance criteria. The simulation study shows that some estimators are better than the commonly used maximum-likelihood estimator and some other RR estimators. Based on the simulation study and an empirical application, some useful estimators are recommended for practitioners.
Journal: Journal of Applied Statistics
Pages: 721-735
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.752448
File-URL: http://hdl.handle.net/10.1080/02664763.2012.752448
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:721-735
Template-Type: ReDIF-Article 1.0
Author-Name: Hui Lin
Author-X-Name-First: Hui
Author-X-Name-Last: Lin
Author-Name: Chong Wang
Author-X-Name-First: Chong
Author-X-Name-Last: Wang
Author-Name: Peng Liu
Author-X-Name-First: Peng
Author-X-Name-Last: Liu
Author-Name: Derald Holtkamp
Author-X-Name-First: Derald
Author-X-Name-Last: Holtkamp
Title: Construction of disease risk scoring systems using logistic group lasso: application to porcine reproductive and respiratory syndrome survey data
Abstract: We propose to utilize the group lasso algorithm for logistic regression to construct a risk scoring system for predicting disease in swine. This work is motivated by the need to develop a risk scoring system from survey data on risk factors for porcine reproductive and respiratory syndrome (PRRS), which is a major health, production and financial problem for swine producers in nearly every country. Group lasso provides an attractive solution to this research question because of its ability to achieve group variable selection and stabilize parameter estimates at the same time. We propose to choose the penalty parameter for group lasso through leave-one-out cross-validation, using the criterion of the area under the receiver operating characteristic curve. Survey data for 896 swine breeding herd sites in the USA and Canada, completed between March 2005 and March 2009, are used to construct the risk scoring system for predicting PRRS outbreaks in swine. We show that our scoring system for PRRS significantly improves on the current scoring system, which is based on expert opinion. We also show that our proposed scoring system is superior in terms of area under the curve to one developed using a multiple logistic regression model selected on the basis of variable significance.
Journal: Journal of Applied Statistics
Pages: 736-746
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.752449
File-URL: http://hdl.handle.net/10.1080/02664763.2012.752449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:736-746
Template-Type: ReDIF-Article 1.0
Author-Name: Marco Marozzi
Author-X-Name-First: Marco
Author-X-Name-Last: Marozzi
Title: Adaptive choice of scale tests in flexible two-stage designs with applications in experimental ecology and clinical trials
Abstract: In this paper, the two-sample scale problem is addressed within the rank framework, which does not require specifying the underlying continuous distribution. However, since the power of a rank test depends on the underlying distribution, it would be very useful for the researcher to have some information on it in order to use the most suitable test. A two-stage adaptive design is used with adaptive tests, where the data from the first stage are used to compute a selector statistic that selects the test statistic for stage 2. More precisely, an adaptive scale test due to Hall and Padmanabhan and its components are considered in one-stage and in several adaptive and non-adaptive two-stage procedures. A simulation study shows that the two-stage test with the adaptive choice in the second stage and with Liptak combination, when it is not more powerful than the corresponding one-stage test, nevertheless shows quite similar power behavior. The test procedures are illustrated using two ecological applications and a clinical trial.
Journal: Journal of Applied Statistics
Pages: 747-762
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.752796
File-URL: http://hdl.handle.net/10.1080/02664763.2012.752796
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:747-762
Template-Type: ReDIF-Article 1.0
Author-Name: T. Górecki
Author-X-Name-First: T.
Author-X-Name-Last: Górecki
Title: Sequential correction of linear classifiers
Abstract: In this article, a sequential correction of two linear methods, linear discriminant analysis (LDA) and the perceptron, is proposed. This correction relies on sequentially adding features on which the classifier is trained. These new features are posterior probabilities determined by a basic classification method such as LDA or the perceptron. In each step, we add the probabilities obtained on a slightly different data set, because the vector of added probabilities varies at each step. We therefore have many classifiers of the same type trained on slightly different data sets. Four different sequential correction methods are presented, based on different combining schemes (e.g. the mean rule and the product rule). Experimental results on different data sets demonstrate that the improvements are efficient and that this approach outperforms classical linear methods, providing a significant reduction in the mean classification error rate.
Journal: Journal of Applied Statistics
Pages: 763-776
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.753041
File-URL: http://hdl.handle.net/10.1080/02664763.2012.753041
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:763-776
Template-Type: ReDIF-Article 1.0
Author-Name: Gavin Shaddick
Author-X-Name-First: Gavin
Author-X-Name-Last: Shaddick
Author-Name: Haojie Yan
Author-X-Name-First: Haojie
Author-X-Name-Last: Yan
Author-Name: Ruth Salway
Author-X-Name-First: Ruth
Author-X-Name-Last: Salway
Author-Name: Danielle Vienneau
Author-X-Name-First: Danielle
Author-X-Name-Last: Vienneau
Author-Name: Daphne Kounali
Author-X-Name-First: Daphne
Author-X-Name-Last: Kounali
Author-Name: David Briggs
Author-X-Name-First: David
Author-X-Name-Last: Briggs
Title: Large-scale Bayesian spatial modelling of air pollution for policy support
Abstract: The potential effects of air pollution are a major concern both in terms of the environment and in relation to human health. In order to support environmental policy, there is a need for accurate measurements of the concentrations of pollutants at high geographical resolution over large regions. However, within such regions, there are likely to be areas where the monitoring information will be sparse and so methods are required to accurately predict concentrations. Set within a Bayesian framework, models are developed which exploit the relationships between pollution and geographical covariate information, such as land use, climate and transport variables together with spatial structure. Candidate models are compared based on their ability to predict a set of validation sites. The chosen model is used to perform large-scale prediction of nitrogen dioxide at a 1×1 km resolution for the entire EU. The models allow probabilistic statements to be made with regard to the levels of air pollution that might be experienced in each area. When combined with population data, such information can be invaluable in informing policy by indicating areas for which improvements may be given priority.
Journal: Journal of Applied Statistics
Pages: 777-794
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.754851
File-URL: http://hdl.handle.net/10.1080/02664763.2012.754851
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:777-794
Template-Type: ReDIF-Article 1.0
Author-Name: Mariantonietta Ruggieri
Author-X-Name-First: Mariantonietta
Author-X-Name-Last: Ruggieri
Author-Name: Antonella Plaia
Author-X-Name-First: Antonella
Author-X-Name-Last: Plaia
Author-Name: Francesca Di Salvo
Author-X-Name-First: Francesca
Author-X-Name-Last: Di Salvo
Author-Name: Gianna Agró
Author-X-Name-First: Gianna
Author-X-Name-Last: Agró
Title: Functional principal component analysis for the explorative analysis of multisite–multivariate air pollution time series with long gaps
Abstract: Knowledge of urban air quality is the first step in facing air pollution issues. For the last decades, many cities have been able to rely on a network of monitoring stations recording concentration values for the main pollutants. This paper focuses on functional principal component analysis (FPCA) to investigate multiple-pollutant datasets measured over time at multiple sites within a given urban area. Our purpose is to extend what has been proposed in the literature to data that are both multisite and multivariate. The approach proves effective in highlighting some relevant statistical features of the time series, making it possible to identify significant pollutants and to follow the evolution of their variability over time. The paper also deals with the missing-value issue. As is known, very long gap sequences often occur in air quality datasets, due to long failures that are not easily remedied or to data coming from a mobile monitoring station. In the considered dataset, large and continuous gaps are imputed by an empirical orthogonal function procedure, after denoising the raw data by functional data analysis and before performing FPCA, in order to further improve the reconstruction.
Journal: Journal of Applied Statistics
Pages: 795-807
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.754852
File-URL: http://hdl.handle.net/10.1080/02664763.2012.754852
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:795-807
Template-Type: ReDIF-Article 1.0
Author-Name: Sandra De Iaco
Author-X-Name-First: Sandra
Author-X-Name-Last: De Iaco
Title: On the use of different metrics for assessing complex pattern reproductions
Abstract: Nowadays, there is an increasing interest in multi-point models and their applications in Earth sciences. However, users not only ask for multi-point methods able to capture the uncertainties of complex structures and to reproduce the properties of a training image, but also need quantitative tools for assessing whether a set of realizations has the required properties. Moreover, it is crucial to study the sensitivity of the realizations to the size of the data template and to analyze how fast realization-based statistics converge on average toward training-based statistics. In this paper, some similarity measures and convergence indexes, based on physically measurable quantities and high-order cumulants, are presented. In the case study, multi-point simulations of the spatial distribution of coarse-grained limestone and calcareous rock, generated using three templates of different sizes, are compared, and convergence toward training-based statistics is analyzed by taking into account increasing numbers of realizations.
Journal: Journal of Applied Statistics
Pages: 808-822
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.754853
File-URL: http://hdl.handle.net/10.1080/02664763.2012.754853
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:808-822
Template-Type: ReDIF-Article 1.0
Author-Name: Hukum Chandra
Author-X-Name-First: Hukum
Author-X-Name-Last: Chandra
Title: Exploring spatial dependence in area-level random effect model for disaggregate-level crop yield estimation
Abstract: This paper describes an application of small area estimation (SAE) techniques under area-level spatial random effect models when only area (or district, i.e. aggregated) level data are available. In particular, the SAE approach is applied to produce district-level model-based estimates of paddy crop yield in the state of Uttar Pradesh in India, using data on crop-cutting experiments supervised under the Improvement of Crop Statistics scheme and secondary data from the Population Census. Diagnostic measures are illustrated to examine the model assumptions as well as the reliability and validity of the generated model-based small area estimates. The results show a considerable gain in precision in the model-based estimates produced by applying SAE. Furthermore, the model-based estimates obtained by exploiting spatial information are more efficient than those obtained by ignoring it, and both are more efficient than the direct survey estimate. In many districts there are no survey data, so direct survey estimates cannot be produced for them; the model-based estimates generated using SAE remain reliable for such districts. These estimates will provide invaluable information to policy analysts and decision-makers.
Journal: Journal of Applied Statistics
Pages: 823-842
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.756858
File-URL: http://hdl.handle.net/10.1080/02664763.2012.756858
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:823-842
Template-Type: ReDIF-Article 1.0
Author-Name: Ole Klungsøyr
Author-X-Name-First: Ole
Author-X-Name-Last: Klungsøyr
Author-Name: Joe Sexton
Author-X-Name-First: Joe
Author-X-Name-Last: Sexton
Author-Name: Inger Sandanger
Author-X-Name-First: Inger
Author-X-Name-Last: Sandanger
Author-Name: Jan Nygård
Author-X-Name-First: Jan
Author-X-Name-Last: Nygård
Title: A time-varying measurement error model for age of onset of a psychiatric diagnosis: applied to first depressive episode diagnosed by the Composite International Diagnostic Interview (CIDI)
Abstract: A substantial degree of uncertainty surrounds the reconstruction of events based on memory recall. This form of measurement error affects the performance of structured interviews such as the Composite International Diagnostic Interview (CIDI), an important tool for assessing mental health in the community. Measurement error probably explains the discrepancy in estimates between longitudinal studies with repeated assessments (the gold standard), which yield approximately constant rates of depression, and cross-sectional studies, which often find increasing rates closer in time to the interview. Repeated assessments of current status (or recent history) are more reliable than a reconstruction of a person's psychiatric history based on a single interview. In this paper, we demonstrate a method for estimating a time-varying measurement error distribution in the age of onset of an initial depressive episode, as diagnosed by the CIDI, based on an assumption regarding age-specific incidence rates. High-dimensional non-parametric estimation is achieved by the EM algorithm with smoothing. The method is applied to data from a Norwegian mental health survey in 2000. The measurement error distribution changes dramatically from 1980 to 2000, with increasing variance and greater bias further away in time from the interview. Some influence of the measurement error on already published results is found.
Journal: Journal of Applied Statistics
Pages: 843-861
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.756859
File-URL: http://hdl.handle.net/10.1080/02664763.2012.756859
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:843-861
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Faisal
Author-X-Name-First: Muhammad
Author-X-Name-Last: Faisal
Author-Name: Andreas Futschik
Author-X-Name-First: Andreas
Author-X-Name-Last: Futschik
Author-Name: Ijaz Hussain
Author-X-Name-First: Ijaz
Author-X-Name-Last: Hussain
Title: A new approach to choose acceptance cutoff for approximate Bayesian computation
Abstract: The approximate Bayesian computation (ABC) algorithm is used to estimate parameters of complicated phenomena where the likelihood is intractable. Here, we report the development of an algorithm for choosing the tolerance level for ABC. We illustrate the performance of the proposed method by simulating the estimation of scaled mutation and recombination rates. The results show that the proposed algorithm performs well.
Journal: Journal of Applied Statistics
Pages: 862-869
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.756860
File-URL: http://hdl.handle.net/10.1080/02664763.2012.756860
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:862-869
Template-Type: ReDIF-Article 1.0
Author-Name: Zaizai Yan
Author-X-Name-First: Zaizai
Author-X-Name-Last: Yan
Author-Name: Miaomiao Li
Author-X-Name-First: Miaomiao
Author-X-Name-Last: Li
Author-Name: Yalu Yan
Author-X-Name-First: Yalu
Author-X-Name-Last: Yan
Title: An efficient non-rejective implementation of the πps sampling designs
Abstract: Poisson sampling is a method for unequal-probability sampling with random sample size. There exist several implementations of the Poisson sampling design with fixed sample size, almost all of which are rejective methods; that is, the sample is not always accepted. Thus, the existing methods can be time-consuming or even infeasible in some situations. In this paper, a fast, non-rejective method that is efficient even for large populations is proposed and studied. The method is a new design for selecting a sample of fixed size with unequal inclusion probabilities. For populations of large size, the proposed design is very close to strict πps sampling, which is similar to the conditional Poisson (CP) sampling design, but its implementation is much more efficient than CP sampling. Moreover, the inclusion probabilities can be calculated recursively.
Journal: Journal of Applied Statistics
Pages: 870-886
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.756861
File-URL: http://hdl.handle.net/10.1080/02664763.2012.756861
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:870-886
Template-Type: ReDIF-Article 1.0
Author-Name: A. Hayter
Author-X-Name-First: A.
Author-X-Name-Last: Hayter
Title: Inferences on the difference between future observations for comparing two treatments
Abstract: The comparison of two treatments with normally distributed data is considered. Inferences are considered based upon the difference between single potential future observations from each of the two treatments, which provides a useful and easily interpretable assessment of the difference between the two treatments. These methodologies combine information from a standard confidence interval analysis of the difference between the two treatment means, with information available from standard prediction intervals of future observations. Win-probabilities, which are the probabilities that a future observation from one treatment will be superior to a future observation from the other treatment, are a special case of these methodologies. The theoretical derivation of these methodologies is based upon inferences about the non-centrality parameter of a non-central t-distribution. Equal and unequal variance situations are addressed, and extensions to groups of future observations from the two treatments are also considered. Some examples and discussions of the methodologies are presented.
Journal: Journal of Applied Statistics
Pages: 887-900
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.758245
File-URL: http://hdl.handle.net/10.1080/02664763.2012.758245
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:887-900
Template-Type: ReDIF-Article 1.0
Author-Name: Edoardo Otranto
Author-X-Name-First: Edoardo
Author-X-Name-Last: Otranto
Title: Volatility clustering in the presence of time-varying model parameters
Abstract: The volatility pattern of financial time series is often characterized by several peaks and abrupt changes, consistent with the time-varying coefficients of the underlying data-generating process. As a consequence, the model-based classification of the volatility of a set of assets could vary over a period of time. We propose a procedure to classify the unconditional volatility obtained from an extended family of Multiplicative Error Models with time-varying coefficients to verify if it changes in correspondence with different regimes or particular dates. The proposed procedure is experimented on 15 stock indices.
Journal: Journal of Applied Statistics
Pages: 901-915
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.759191
File-URL: http://hdl.handle.net/10.1080/02664763.2012.759191
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:901-915
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan Bakouch
Author-X-Name-First: Hassan
Author-X-Name-Last: Bakouch
Title: R for statistics
Journal: Journal of Applied Statistics
Pages: 924-924
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.761325
File-URL: http://hdl.handle.net/10.1080/02664763.2012.761325
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:924-924
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Erratum
Journal: Journal of Applied Statistics
Pages: v-v
Issue: 4
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.765664
File-URL: http://hdl.handle.net/10.1080/02664763.2013.765664
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:4:p:v-v
Template-Type: ReDIF-Article 1.0
Author-Name: I.S. Dhindsa
Author-X-Name-First: I.S.
Author-X-Name-Last: Dhindsa
Author-Name: R. Agarwal
Author-X-Name-First: R.
Author-X-Name-Last: Agarwal
Author-Name: H.S. Ryait
Author-X-Name-First: H.S.
Author-X-Name-Last: Ryait
Title: Principal component analysis-based muscle identification for myoelectric-controlled exoskeleton knee
Abstract:
This paper attempts to identify a set of muscles sufficient to control a myoelectric-controlled exoskeleton knee. A musculoskeletal model of the human body available in the AnyBody Modeling System was scaled to match subject-specific parameters. It was made to perform the task of sitting in a squat position from a standing position. The internal forces developed in 18 muscles of the lower limb during the task were predicted by inverse dynamic analysis. Principal component analysis was then conducted on the predicted force variables. The eigenvector coefficients of the principal components were evaluated. Significant variables were retained and redundant variables were rejected by the method of principal variables. Subjects were asked to perform the same task of sitting in a squat position from a standing position. Surface electromyography (sEMG) signals were recorded from the selected muscles. The forces developed in the subjects' muscles were obtained from the sEMG signals, and the force developed in each selected muscle was compared with the force obtained from the musculoskeletal model. A four-channel system (VastusLateralis, RectusFemoris, Semitendinosus and GluteusMedius) and a five-channel system (VastusLateralis, BicepsFemoris, RectusFemoris, Semitendinosus and GluteusMedius) provide suitable muscle sets for controlling the exoskeleton knee.
Journal: Journal of Applied Statistics
Pages: 1707-1720
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221907
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221907
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1707-1720
Template-Type: ReDIF-Article 1.0
Author-Name: JinXing Che
Author-X-Name-First: JinXing
Author-X-Name-Last: Che
Author-Name: YouLong Yang
Author-X-Name-First: YouLong
Author-X-Name-Last: Yang
Title: Stochastic correlation coefficient ensembles for variable selection
Abstract:
In this paper, we propose a novel Max-Relevance and Min-Common-Redundancy criterion for variable selection in linear models. Considering that the ensemble approach for variable selection has been proven to be quite effective in linear regression models, we construct a variable selection ensemble (VSE) by combining the presented stochastic correlation coefficient algorithm with a stochastic stepwise algorithm. We conduct extensive experimental comparison of our algorithm and other methods using two simulation studies and four real-life data sets. The results confirm that the proposed VSE leads to promising improvement on variable selection and regression accuracy.
Journal: Journal of Applied Statistics
Pages: 1721-1742
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221913
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221913
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1721-1742
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas T. Longford
Author-X-Name-First: Nicholas T.
Author-X-Name-Last: Longford
Title: Inflated assessments of disability
Abstract:
A medical examination provides a key input into decisions about disability pension and other forms of income support or compensation that are justified on medical grounds. The result of examining an individual is often communicated by means of a score, and inflation of such scores is a well-known problem. We estimate the extent of inflation of scores from a set of disability assessments using a model based on the discrete linear distribution. We explore some extensions within the framework of a sensitivity analysis.
Journal: Journal of Applied Statistics
Pages: 1743-1760
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221914
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221914
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1743-1760
Template-Type: ReDIF-Article 1.0
Author-Name: Geoffrey Colin L. Peterson
Author-X-Name-First: Geoffrey Colin L.
Author-X-Name-Last: Peterson
Author-Name: Dong Li
Author-X-Name-First: Dong
Author-X-Name-Last: Li
Author-Name: Brian J. Reich
Author-X-Name-First: Brian J.
Author-X-Name-Last: Reich
Author-Name: Donald Brenner
Author-X-Name-First: Donald
Author-X-Name-Last: Brenner
Title: Spatial prediction of crystalline defects observed in molecular dynamic simulations of plastic damage
Abstract:
Molecular dynamics computer simulation is an essential tool in materials science for studying atomic properties of materials in extreme environments and guiding the development of new materials. We propose a statistical analysis to emulate simulation output, with the ultimate goal of efficiently approximating the computationally intensive simulation. We compare several spatial regression approaches, including conditional autoregression (CAR), the discrete wavelet transform (DWT), and principal components analysis (PCA). The methods are applied to simulations of copper atoms with twin wall and dislocation loop defects, under varying tilt tension angles. We find that CAR and DWT yield accurate results but fail to capture extreme defects, whereas PCA better captures the defect structure.
Journal: Journal of Applied Statistics
Pages: 1761-1784
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221915
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1761-1784
Template-Type: ReDIF-Article 1.0
Author-Name: F. Cugnata
Author-X-Name-First: F.
Author-X-Name-Last: Cugnata
Author-Name: G. Perucca
Author-X-Name-First: G.
Author-X-Name-Last: Perucca
Author-Name: S. Salini
Author-X-Name-First: S.
Author-X-Name-Last: Salini
Title: Bayesian networks and the assessment of universities' value added
Abstract:
A broad literature has focused on the effectiveness of tertiary education. In classical models, a performance indicator is regressed on a set of characteristics of the individuals and fixed effects (FE) at the institution level. The FE coefficients are interpreted as the pure value added of the universities. The innovative contribution of the present paper resides in the use of Bayesian network (BN) analysis to assess the effectiveness of tertiary education. The results of an empirical study of Italian universities are discussed, to present the use of BNs as a decision support tool for policy-making purposes.
Journal: Journal of Applied Statistics
Pages: 1785-1806
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1223839
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1223839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1785-1806
Template-Type: ReDIF-Article 1.0
Author-Name: Jacob Martin
Author-X-Name-First: Jacob
Author-X-Name-Last: Martin
Author-Name: Daniel B. Hall
Author-X-Name-First: Daniel B.
Author-X-Name-Last: Hall
Title: Marginal zero-inflated regression models for count data
Abstract:
Data sets with excess zeroes are frequently analyzed in many disciplines. A common framework used to analyze such data is the zero-inflated (ZI) regression model. It mixes a degenerate distribution with point mass at zero with a non-degenerate distribution. The estimates from ZI models quantify the effects of covariates on the means of latent random variables, which are often not the quantities of primary interest. Recently, marginal zero-inflated Poisson (MZIP; Long et al. [A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 33 (2014), pp. 5151–5165]) and negative binomial (MZINB; Preisser et al., 2016) models have been introduced that model the mean response directly. These models yield covariate effects that have simple interpretations that are, for many applications, more appealing than those available from ZI regression. This paper outlines a general framework for marginal zero-inflated models where the latent distribution is a member of the exponential dispersion family, focusing on common distributions for count data. In particular, our discussion includes the marginal zero-inflated binomial (MZIB) model, which has not been discussed previously. The details of maximum likelihood estimation via the EM algorithm are presented and the properties of the estimators as well as Wald and likelihood ratio-based inference are examined via simulation. Two examples presented illustrate the advantages of MZIP, MZINB, and MZIB models for practical data analysis.
Journal: Journal of Applied Statistics
Pages: 1807-1826
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1225018
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1225018
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1807-1826
Template-Type: ReDIF-Article 1.0
Author-Name: André G. F. C. Costa
Author-X-Name-First: André G. F. C.
Author-X-Name-Last: Costa
Author-Name: Enrico A. Colosimo
Author-X-Name-First: Enrico A.
Author-X-Name-Last: Colosimo
Author-Name: Aline B. M. Vaz
Author-X-Name-First: Aline B. M.
Author-X-Name-Last: Vaz
Author-Name: José Luiz P. Silva
Author-X-Name-First: José Luiz P.
Author-X-Name-Last: Silva
Author-Name: Leila D. Amorim
Author-X-Name-First: Leila D.
Author-X-Name-Last: Amorim
Title: Marginal models for the association structure of hierarchical binary responses
Abstract:
Clustered binary responses are often found in ecological studies. Data analysis may include modeling the marginal probability of response. However, when the association is the main scientific focus, modeling the correlation structure between pairs of responses is the key part of the analysis. Second-order generalized estimating equations (GEE) are established in the literature, some of which are more efficient in computational terms, especially for large clusters. Alternating logistic regression (ALR) and orthogonalized residual (ORTH) GEE methods are presented and compared in this paper. Simulation results show a slight superiority of ALR over ORTH. Marginal probabilities and odds ratios are also estimated and compared in a real ecological study involving three-level hierarchical clustering. ALR and ORTH models are useful for modeling complex association structures with large cluster sizes.
Journal: Journal of Applied Statistics
Pages: 1827-1838
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1238042
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238042
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1827-1838
Template-Type: ReDIF-Article 1.0
Author-Name: Tao Wang
Author-X-Name-First: Tao
Author-X-Name-Last: Wang
Author-Name: Lin Zheng
Author-X-Name-First: Lin
Author-X-Name-Last: Zheng
Author-Name: Zhonghua Li
Author-X-Name-First: Zhonghua
Author-X-Name-Last: Li
Author-Name: Haiyang Liu
Author-X-Name-First: Haiyang
Author-X-Name-Last: Liu
Title: A robust variable screening method for high-dimensional data
Abstract:
In practice, the presence of influential observations may lead to misleading results in variable screening problems. We therefore propose a robust variable screening procedure for high-dimensional data analysis. Our method consists of two steps. The first step is to define a new high-dimensional influence measure and propose a novel influence diagnostic procedure to remove unusual observations. The second step is to utilize the sure independence screening procedure based on distance correlation to select important variables in high-dimensional regression analysis. The new influence measure and diagnostic procedure we develop are model free. To confirm the effectiveness of the proposed method, we conduct simulation studies and a real-life data analysis to illustrate its merits over some competing methods. Both the simulation results and the real-life data analysis demonstrate that the proposed method greatly controls the adverse effect of unusual observations by detecting and removing them, and performs better than the competing methods.
Journal: Journal of Applied Statistics
Pages: 1839-1855
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1238044
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238044
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1839-1855
Template-Type: ReDIF-Article 1.0
Author-Name: P.G. Sankaran
Author-X-Name-First: P.G.
Author-X-Name-Last: Sankaran
Author-Name: N.N. Midhu
Author-X-Name-First: N.N.
Author-X-Name-Last: Midhu
Title: Nonparametric estimation of mean residual quantile function under right censoring
Abstract:
In this paper, we develop non-parametric estimation of the mean residual quantile function based on right-censored data. Two non-parametric estimators, one based on the empirical quantile function and the other using the kernel smoothing method, are proposed. Asymptotic properties of the estimators are discussed. Monte Carlo simulation studies are conducted to compare the two estimators. The method is illustrated with the aid of two real data sets.
Journal: Journal of Applied Statistics
Pages: 1856-1874
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1238046
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238046
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1856-1874
Template-Type: ReDIF-Article 1.0
Author-Name: Fabio Baione
Author-X-Name-First: Fabio
Author-X-Name-Last: Baione
Author-Name: Paolo De Angelis
Author-X-Name-First: Paolo
Author-X-Name-Last: De Angelis
Author-Name: Massimiliano Menzietti
Author-X-Name-First: Massimiliano
Author-X-Name-Last: Menzietti
Author-Name: Agostino Tripodi
Author-X-Name-First: Agostino
Author-X-Name-Last: Tripodi
Title: A comparison of risk transfer strategies for a portfolio of life annuities based on RORAC
Abstract:
This paper aims to compare different reinsurance arrangements for reducing the longevity and financial risk borne by a life insurer managing a portfolio of annuity policies. Linear and nonlinear reinsurance strategies as well as swap-like agreements are evaluated via a discrete-time actuarial risk model. Specifically, longevity dynamics are represented by Lee–Carter type models, while the interest rate is modeled by the Cox–Ingersoll–Ross model. The effectiveness of the reinsurance strategies is evaluated according to the Return on Risk-Adjusted Capital under a ruin probability constraint.
Journal: Journal of Applied Statistics
Pages: 1875-1892
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1238047
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238047
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1875-1892
Template-Type: ReDIF-Article 1.0
Author-Name: Ying-Hsiu Chen
Author-X-Name-First: Ying-Hsiu
Author-X-Name-Last: Chen
Author-Name: Po-Lin Lai
Author-X-Name-First: Po-Lin
Author-X-Name-Last: Lai
Title: Does diversification promote risk reduction and profitability raise? Estimation of dynamic impacts using the pooled mean group model
Abstract:
This paper utilizes the pooled mean group model to explore the dynamic effects of revenue diversification on the operational risks and profitability of banks. The sample consisted of unbalanced panel data of 25 listed Taiwanese banks for the period from 1998 to 2013. The results reveal a divergence in the long- and short-run effects of revenue diversification on credit risk by the banks, and the benefits of diversification on two other operational risks and profitability are deferred. This paper provides dynamic evidence of diversification, which has been typically evaluated in previous studies, to release the aggregate effect and to explain the ambiguity in the results in the current literature.
Journal: Journal of Applied Statistics
Pages: 1893-1901
Issue: 10
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1252729
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1252729
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:10:p:1893-1901
Template-Type: ReDIF-Article 1.0
Author-Name: Jingheng Cai
Author-X-Name-First: Jingheng
Author-X-Name-Last: Cai
Author-Name: Zhibin Liang
Author-X-Name-First: Zhibin
Author-X-Name-Last: Liang
Author-Name: Rongqian Sun
Author-X-Name-First: Rongqian
Author-X-Name-Last: Sun
Author-Name: Chenyi Liang
Author-X-Name-First: Chenyi
Author-X-Name-Last: Liang
Author-Name: Junhao Pan
Author-X-Name-First: Junhao
Author-X-Name-Last: Pan
Title: Bayesian analysis of latent Markov models with non-ignorable missing data
Abstract:
Latent Markov models (LMMs) are widely used in the analysis of heterogeneous longitudinal data. However, most existing LMMs are developed for fully observed data without missing entries. The main objective of this study is to develop a Bayesian approach for analyzing LMMs with non-ignorable missing data. Bayesian methods for estimation and model comparison are discussed. The empirical performance of the proposed methodology is evaluated through simulation studies. An application to a data set derived from the National Longitudinal Survey of Youth 1997 is presented.
Journal: Journal of Applied Statistics
Pages: 2299-2313
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1584162
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1584162
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2299-2313
Template-Type: ReDIF-Article 1.0
Author-Name: P. Mozgunov
Author-X-Name-First: P.
Author-X-Name-Last: Mozgunov
Author-Name: T. Jaki
Author-X-Name-First: T.
Author-X-Name-Last: Jaki
Author-Name: M. Gasparini
Author-X-Name-First: M.
Author-X-Name-Last: Gasparini
Title: Loss functions in restricted parameter spaces and their Bayesian applications
Abstract:
Squared error loss remains the most commonly used loss function for constructing a Bayes estimator of the parameter of interest. However, it can lead to suboptimal solutions when a parameter is defined on a restricted space. It can also be an inappropriate choice in the context when an extreme overestimation and/or underestimation results in severe consequences and a more conservative estimator is preferred. We advocate a class of loss functions for parameters defined on restricted spaces which infinitely penalize boundary decisions like the squared error loss does on the real line. We also recall several properties of loss functions such as symmetry, convexity and invariance. We propose generalizations of the squared error loss function for parameters defined on the positive real line and on an interval. We provide explicit solutions for corresponding Bayes estimators and discuss multivariate extensions. Four well-known Bayesian estimation problems are used to demonstrate inferential benefits the novel Bayes estimators can provide in the context of restricted estimation.
Journal: Journal of Applied Statistics
Pages: 2314-2337
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1586848
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1586848
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2314-2337
Template-Type: ReDIF-Article 1.0
Author-Name: Cord A. Müller
Author-X-Name-First: Cord A.
Author-X-Name-Last: Müller
Title: Optimal acceptance sampling for modules F and F1 of the European Measuring Instruments Directive
Abstract:
Acceptance sampling plans offered by ISO 2859-1 are far from optimal under the conditions for statistical verification in modules F and F1 as prescribed by Annex II of the Measuring Instruments Directive (MID) 2014/32/EU, resulting in sample sizes that are larger than necessary. An optimised single-sampling scheme is derived, both for large lots using the binomial distribution and for finite-sized lots using the exact hypergeometric distribution, resulting in smaller sample sizes that are economically more efficient while offering the full statistical protection required by the MID.
Journal: Journal of Applied Statistics
Pages: 2338-2356
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1588235
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1588235
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2338-2356
Template-Type: ReDIF-Article 1.0
Author-Name: Yi-Ting Hwang
Author-X-Name-First: Yi-Ting
Author-X-Name-Last: Hwang
Author-Name: Chia-Hui Huang
Author-X-Name-First: Chia-Hui
Author-X-Name-Last: Huang
Author-Name: Chun-Chao Wang
Author-X-Name-First: Chun-Chao
Author-X-Name-Last: Wang
Author-Name: Tzu-Yin Lin
Author-X-Name-First: Tzu-Yin
Author-X-Name-Last: Lin
Author-Name: Yi-Kuan Tseng
Author-X-Name-First: Yi-Kuan
Author-X-Name-Last: Tseng
Title: Joint modelling of longitudinal binary data and survival data
Abstract:
The medical costs in an ageing society substantially increase when the incidences of chronic diseases, disabilities and inability to live independently are high. Healthy lifestyles not only affect elderly individuals but also influence the entire community. When assessing treatment efficacy, survival and quality of life should be considered simultaneously. This paper proposes the joint likelihood approach for modelling survival and longitudinal binary covariates simultaneously. Because some unobservable information is present in the model, the Monte Carlo EM algorithm and Metropolis-Hastings algorithm are used to find the estimators. Monte Carlo simulations are performed to evaluate the performance of the proposed model based on the accuracy and precision of the estimates. Real data are used to demonstrate the feasibility of the proposed model.
Journal: Journal of Applied Statistics
Pages: 2357-2371
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1590540
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1590540
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2357-2371
Template-Type: ReDIF-Article 1.0
Author-Name: Marcelo A. da Silva
Author-X-Name-First: Marcelo A.
Author-X-Name-Last: da Silva
Author-Name: Anne C. Huggins-Manley
Author-X-Name-First: Anne C.
Author-X-Name-Last: Huggins-Manley
Author-Name: José A. Mazzon
Author-X-Name-First: José A.
Author-X-Name-Last: Mazzon
Author-Name: Jorge L. Bazán
Author-X-Name-First: Jorge L.
Author-X-Name-Last: Bazán
Title: Bayesian estimation of a flexible bifactor generalized partial credit model to survey data
Abstract:
Item response theory (IRT) models provide an important contribution in the analysis of polytomous items, such as Likert scale items in survey data. We propose a bifactor generalized partial credit model (bifac-GPC model) with flexible link functions - probit, logit and complementary log-log - for use in analysis of ordered polytomous item scale data. In order to estimate the parameters of the proposed model, we use a Bayesian approach through the NUTS algorithm and show the advantages of implementing IRT models through the Stan language. We present an application to marketing scale data. Specifically, we apply the model to a dataset of non-users of a mobile banking service in order to highlight the advantages of this model. The results show important managerial implications resulting from consumer perceptions. We provide a discussion of the methodology for this type of data and extensions. Codes are available for practitioners and researchers to replicate the application.
Journal: Journal of Applied Statistics
Pages: 2372-2387
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1592125
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1592125
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2372-2387
Template-Type: ReDIF-Article 1.0
Author-Name: Chih-Chun Tsai
Author-X-Name-First: Chih-Chun
Author-X-Name-Last: Tsai
Title: Optimal lamination test of ethylene vinyl acetate sheets for solar modules
Abstract:
Solar power is inexhaustible and has become one of the most appreciated alternative energy choices. In the development stage, solar modules are subjected to relevant reliability tests to ensure a long lifetime and optimal power generation efficiency. After the lamination process, the performance of solar modules is closely related to the degree of crosslinking of ethylene vinyl acetate (EVA) sheets. Traditionally, the degree of crosslinking on EVA sheets is obtained using the chemical extraction method to measure the gel content of these sheets. Motivated by lamination data, this study first constructed a statistical model to describe the relationship between the degree of crosslinking on EVA sheets and lamination time. Next, under the specification limits of the gel content of EVA sheets, the optimal lamination time of solar modules was derived, and the optimal allocation for measuring EVA sheets was addressed. The chemical extraction method is time consuming and leads to high pollution. The latest method is differential scanning calorimetric (DSC), which measures the curing degree of EVA sheets as the degree of crosslinking on these sheets. This study determined the specification limits of the curing degree under the DSC method. An example is presented to elucidate the proposed procedure.
Journal: Journal of Applied Statistics
Pages: 2388-2408
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1596230
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1596230
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2388-2408
Template-Type: ReDIF-Article 1.0
Author-Name: A. R. Baghestani
Author-X-Name-First: A. R.
Author-X-Name-Last: Baghestani
Author-Name: F. S. Hosseini-Baharanchi
Author-X-Name-First: F. S.
Author-X-Name-Last: Hosseini-Baharanchi
Title: An improper form of Weibull distribution for competing risks analysis with Bayesian approach
Abstract:
In survival analysis, individuals may fail due to multiple causes of failure, a setting known as competing risks. Common parametric models such as the Weibull model are not improper and thus ignore the possibility of multiple failure times. In this study, a novel extension of the Weibull distribution is proposed which is improper and can therefore be incorporated into the competing risks framework. This model coincides with the original Weibull model before a pre-specified time point and takes an exponential form on the tail of the time axis. A Bayesian approach is used for parameter estimation. A simulation study performed to evaluate the proposed model showed identifiability and appropriate convergence. The proposed model and the 3-parameter Gompertz model, another improper parametric distribution, are fitted to an acute lymphoblastic leukemia dataset.
Journal: Journal of Applied Statistics
Pages: 2409-2417
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1597027
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1597027
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2409-2417
Template-Type: ReDIF-Article 1.0
Author-Name: Eliana Christou
Author-X-Name-First: Eliana
Author-X-Name-Last: Christou
Author-Name: Michael Grabchak
Author-X-Name-First: Michael
Author-X-Name-Last: Grabchak
Title: Estimation of value-at-risk using single index quantile regression
Abstract:
Value-at-Risk (VaR) is one of the best known and most heavily used measures of financial risk. In this paper, we introduce a non-iterative semiparametric model for VaR estimation called the single index quantile regression time series (SIQRTS) model. To test its performance, we give an application to four major US market indices: the S&P 500 Index, the Russell 2000 Index, the Dow Jones Industrial Average, and the NASDAQ Composite Index. Our results suggest that this method has a good finite sample performance and often outperforms a number of commonly used methods.
Journal: Journal of Applied Statistics
Pages: 2418-2433
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1597028
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1597028
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2418-2433
Template-Type: ReDIF-Article 1.0
Author-Name: Vinicius F. Calsavara
Author-X-Name-First: Vinicius F.
Author-X-Name-Last: Calsavara
Author-Name: Agatha S. Rodrigues
Author-X-Name-First: Agatha S.
Author-X-Name-Last: Rodrigues
Author-Name: Ricardo Rocha
Author-X-Name-First: Ricardo
Author-X-Name-Last: Rocha
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Vera Tomazella
Author-X-Name-First: Vera
Author-X-Name-Last: Tomazella
Author-Name: Ana C. R. L. A. Souza
Author-X-Name-First: Ana C. R. L. A.
Author-X-Name-Last: Souza
Author-Name: Rafaela A. Costa
Author-X-Name-First: Rafaela A.
Author-X-Name-Last: Costa
Author-Name: Rossana P. V. Francisco
Author-X-Name-First: Rossana P. V.
Author-X-Name-Last: Francisco
Title: Zero-adjusted defective regression models for modeling lifetime data
Abstract:
In this paper, we introduce a defective regression model for survival data modeling with a proportion of early failures, that is, zero-adjusted data. Our approach enables us to accommodate three types of units: patients with ‘zero’ survival times (early failures), and those who are susceptible or not susceptible to the event of interest. Defective distributions are obtained from standard distributions by changing the domain of the parameters of the latter in such a way that their survival functions are limited to $ p\in (0, 1) $. We consider the Gompertz and inverse Gaussian defective distributions, which allow modeling of data containing a cure fraction. Parameter estimation is performed by maximum likelihood estimation, and Monte Carlo simulation studies are conducted to evaluate the performance of the proposed models. We illustrate the practical relevance of the proposed models on two real data sets. The first is from a study of occlusion of endoscopic stenting in patients with pancreatic cancer performed at A.C.Camargo Cancer Center, and the other is from a study on insulin use in pregnant women diagnosed with gestational diabetes performed at São Paulo University Medical School. Both studies were performed in São Paulo, Brazil.
Journal: Journal of Applied Statistics
Pages: 2434-2459
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1597029
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1597029
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2434-2459
Template-Type: ReDIF-Article 1.0
Author-Name: Meena Badade
Author-X-Name-First: Meena
Author-X-Name-Last: Badade
Author-Name: T. V. Ramanathan
Author-X-Name-First: T. V.
Author-X-Name-Last: Ramanathan
Title: Probabilistic frontier regression models for binary type output data
Abstract:
This paper proposes a probabilistic frontier regression model for binary-type output data in a production process setup. We consider one of the two categories of outputs as the ‘selected’ category, and the reduction in the probability of falling in this category is attributed to a reduction in the technical efficiency (TE) of the decision-making unit. An efficiency measure is proposed to determine the deviations of individual units from the probabilistic frontier. Simulation results show that the average estimated TE component is close to its true value. An application of the proposed method to data on the Indian public sector banking system is provided, where the output variable is an indicator of the level of non-performing assets. Individual TE is obtained for each of the banks under consideration. Among the public sector banks, Andhra Bank is found to be the most efficient, whereas the United Bank of India is the least efficient.
Journal: Journal of Applied Statistics
Pages: 2460-2480
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1597838
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1597838
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2460-2480
Template-Type: ReDIF-Article 1.0
Author-Name: Stavros Kourtzidis
Author-X-Name-First: Stavros
Author-X-Name-Last: Kourtzidis
Author-Name: Panayiotis Tzeremes
Author-X-Name-First: Panayiotis
Author-X-Name-Last: Tzeremes
Author-Name: Nickolaos G. Tzeremes
Author-X-Name-First: Nickolaos G.
Author-X-Name-Last: Tzeremes
Title: Conditional time-dependent nonparametric estimators with an application to healthcare production function
Abstract:
By using the probabilistic framework of production efficiency, the paper develops time-dependent conditional efficiency estimators performing a non-parametric frontier analysis. Specifically, by applying both full and quantile (robust) time-dependent conditional estimators, it models the dynamic effect of health expenditure on countries’ technological change and technological catch-up levels. The results from the application reveal that the effect of per capita health expenditure on countries’ technological change and technological catch-up is nonlinear and is subject to countries’ specific income levels.
Journal: Journal of Applied Statistics
Pages: 2481-2490
Issue: 13
Volume: 46
Year: 2019
Month: 10
X-DOI: 10.1080/02664763.2019.1588234
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1588234
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:13:p:2481-2490
Template-Type: ReDIF-Article 1.0
Author-Name: Eugenio Brentari
Author-X-Name-First: Eugenio
Author-X-Name-Last: Brentari
Author-Name: Livia Dancelli
Author-X-Name-First: Livia
Author-X-Name-Last: Dancelli
Author-Name: Marica Manisera
Author-X-Name-First: Marica
Author-X-Name-Last: Manisera
Title: Clustering ranking data in market segmentation: a case study on the Italian McDonald's customers’ preferences
Abstract:
Cluster analysis is often used for market segmentation. When the inputs to the clustering algorithm are ranking data, the intersubject (dis)similarities must be measured by matching-type measures able to take account of the ordinal nature of the data. Among them, we used a Weighted Spearman's rho, suitably transformed into a (dis)similarity measure, in order to emphasize concordance on the top ranks. This allows creating clusters that group customers who place the same items (products, services, etc.) higher in their rankings. The statistical instruments used to interpret the clusters must also be designed to deal with ordinal data. The median and other location measures are appropriate but not always able to clearly differentiate groups. The so-called bipolar mean, with its related variability measure, may reveal some additional features. A case study on real data from a survey carried out in Italian McDonald's restaurants is presented.
Journal: Journal of Applied Statistics
Pages: 1959-1976
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125864
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125864
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:1959-1976
Template-Type: ReDIF-Article 1.0
Author-Name: Pelin Kasap
Author-X-Name-First: Pelin
Author-X-Name-Last: Kasap
Author-Name: Birdal Senoglu
Author-X-Name-First: Birdal
Author-X-Name-Last: Senoglu
Author-Name: Olcay Arslan
Author-X-Name-First: Olcay
Author-X-Name-Last: Arslan
Title: Stochastic analysis of covariance when the error distribution is long-tailed symmetric
Abstract:
In this study, we consider a stochastic one-way analysis of covariance model in which the distribution of the error terms is long-tailed symmetric. Estimators of the unknown model parameters are obtained using the maximum likelihood (ML) methodology. An iteratively reweighting algorithm is used to compute the ML estimates of the parameters. We also propose a new test statistic based on the ML estimators for testing linear contrasts of the treatment effects. In the simulation study, we compare the efficiencies of the traditional least-squares (LS) estimators of the model parameters with those of the corresponding ML estimators. We also compare the power of the test statistics based on the LS and ML estimators, respectively. A real-life example is given at the end of the study.
Journal: Journal of Applied Statistics
Pages: 1977-1997
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125866
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125866
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:1977-1997
Template-Type: ReDIF-Article 1.0
Author-Name: R. Lombardo
Author-X-Name-First: R.
Author-X-Name-Last: Lombardo
Author-Name: E.J. Beh
Author-X-Name-First: E.J.
Author-X-Name-Last: Beh
Title: The prediction index of aggregate data
Abstract:
The analysis of the association between the two dichotomous variables of a $ 2\times 2 $ table arises as an important statistical issue in a number of diverse settings, such as in biomedical, medical, epidemiological, pharmaceutical or environmental research. When only the aggregate (or marginal) information is available, the analyst may determine the likely strength of the association between the variables. In this paper, we propose a new measure, called the aggregate prediction index, that assesses the likely statistical significance of the association between the rows and columns of a $ 2\times 2 $ table where one variable is treated as a predictor variable and the other as a response variable. Further insight into the predictor's potential strength can be obtained visually by performing an asymmetric version of correspondence analysis and considering a biplot display of the two variables – this issue shall also be explored in light of the new index.
Journal: Journal of Applied Statistics
Pages: 1998-2018
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1125867
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1125867
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:1998-2018
Template-Type: ReDIF-Article 1.0
Author-Name: Hsiao-Hsian Gao
Author-X-Name-First: Hsiao-Hsian
Author-X-Name-Last: Gao
Author-Name: Li-Shan Huang
Author-X-Name-First: Li-Shan
Author-X-Name-Last: Huang
Title: Sample size planning for testing significance of curves
Abstract:
Smoothing methods for curve estimation have received considerable attention in statistics with a wide range of applications. However, to our knowledge, sample size planning for testing significance of curves has not been discussed in the literature. This paper focuses on sample size calculations for nonparametric regression and partially linear models based on local linear estimators. We describe explicit procedures for sample size calculations based on non- and semi-parametric F-tests. Data examples are provided to demonstrate the use of the procedures.
Journal: Journal of Applied Statistics
Pages: 2019-2028
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1126238
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1126238
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2019-2028
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Chia L. Huang
Author-X-Name-First: Chien-Chia L.
Author-X-Name-Last: Huang
Author-Name: Yow-Jen Jou
Author-X-Name-First: Yow-Jen
Author-X-Name-Last: Jou
Author-Name: Hsun-Jung Cho
Author-X-Name-First: Hsun-Jung
Author-X-Name-Last: Cho
Title: A new multicollinearity diagnostic for generalized linear models
Abstract:
We propose a new collinearity diagnostic tool for generalized linear models, termed the weighted variance inflation factor (WVIF), which behaves exactly like the traditional variance inflation factor in regression diagnostics when the data matrix is normalized. Compared to the condition number (CN), the WVIF provides more reliable information on how severe the situation is when data collinearity does exist. An alternative estimator, a by-product of the new diagnostic, outperforms the ridge estimator in the presence of data collinearity with respect to both the WVIF and the CN. Evidence is given through the analysis of various real-world numerical examples.
Journal: Journal of Applied Statistics
Pages: 2029-2043
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1126239
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1126239
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2029-2043
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Lechner
Author-X-Name-First: Michael
Author-X-Name-Last: Lechner
Author-Name: Nuria Rodriguez-Planas
Author-X-Name-First: Nuria
Author-X-Name-Last: Rodriguez-Planas
Author-Name: Daniel Fernández Kranz
Author-X-Name-First: Daniel
Author-X-Name-Last: Fernández Kranz
Title: Difference-in-difference estimation by FE and OLS when there is panel non-response
Abstract:
We show that the ordinary least squares (OLS) and fixed-effects (FE) estimators of the popular difference-in-differences model may deviate when there is time-varying panel non-response. If such non-response does not affect the common-trend assumption, then OLS and FE are consistent, but OLS is more precise. However, if non-response affects the common-trend assumption, then FE estimation may still be consistent while OLS is inconsistent. We provide simulation as well as empirical evidence that this phenomenon occurs. We conclude that, in the case of unbalanced panels, deviating OLS and FE estimates should be considered evidence that non-response is not ignorable for difference-in-differences estimation.
Journal: Journal of Applied Statistics
Pages: 2044-2052
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1126240
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1126240
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2044-2052
Template-Type: ReDIF-Article 1.0
Author-Name: Kuo-Chin Lin
Author-X-Name-First: Kuo-Chin
Author-X-Name-Last: Lin
Author-Name: Yi-Ju Chen
Author-X-Name-First: Yi-Ju
Author-X-Name-Last: Chen
Title: Goodness-of-fit tests of generalized linear mixed models for repeated ordinal responses
Abstract:
Categorical longitudinal data arise frequently in a variety of fields and are commonly fitted by generalized linear mixed models (GLMMs) and generalized estimating equations models. The cumulative logit is a useful link function for problems involving repeated ordinal responses. To check the adequacy of GLMMs with the cumulative logit link function, two goodness-of-fit tests constructed from the unweighted sum of squared model residuals, using numerical integration and a bootstrap resampling technique, are proposed. The empirical type I error rates and powers of the proposed tests are examined by simulation studies. Ordinal longitudinal studies are utilized to illustrate the application of the two proposed tests.
Journal: Journal of Applied Statistics
Pages: 2053-2064
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1126568
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1126568
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2053-2064
Template-Type: ReDIF-Article 1.0
Author-Name: Christophe Corbier
Author-X-Name-First: Christophe
Author-X-Name-Last: Corbier
Title: Huberian function applied to neurodegenerative disorder gait rhythm
Abstract:
A Huberian statistical approach is applied to differentiate the gait rhythms of three neurodegenerative disorders, and a method is presented for reducing the number of parameters of an autoregressive moving average (ARMA) model of the walking signal. Gait rhythm dynamics differ between healthy controls and patients with Parkinson's disease, Huntington's disease and amyotrophic lateral sclerosis. Random variables such as the stride interval and its two sub-phases (i.e. swing and stance) exhibit great variability with natural outliers. The Huberian function, a mixture of the $ L_2 $ and $ L_1 $ norms with a low threshold γ, is used to present new statistical indicators by deducing the corresponding skewness and kurtosis. The choice of γ is discussed to ensure consistency and convergence of a low-order ARMA estimator of the gait rhythm signal. A mathematical point of view is developed and experimental results are presented.
Journal: Journal of Applied Statistics
Pages: 2065-2084
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1126811
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1126811
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2065-2084
Template-Type: ReDIF-Article 1.0
Author-Name: Fernanda B. Rizzato
Author-X-Name-First: Fernanda B.
Author-X-Name-Last: Rizzato
Author-Name: Roseli A. Leandro
Author-X-Name-First: Roseli A.
Author-X-Name-Last: Leandro
Author-Name: Clarice G.B. Demétrio
Author-X-Name-First: Clarice G.B.
Author-X-Name-Last: Demétrio
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Title: A Bayesian approach to analyse overdispersed longitudinal count data
Abstract:
In this paper, we consider a model for repeated count data, with within-subject correlation and/or overdispersion. It extends both the generalized linear mixed model and the negative-binomial model. This model, proposed in a likelihood context [17,18] is placed in a Bayesian inferential framework. An important contribution takes the form of Bayesian model assessment based on pivotal quantities, rather than the often less adequate DIC. By means of a real biological data set, we also discuss some Bayesian model selection aspects, using a pivotal quantity proposed by Johnson [12].
Journal: Journal of Applied Statistics
Pages: 2085-2109
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1126812
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1126812
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2085-2109
Template-Type: ReDIF-Article 1.0
Author-Name: Xun Xiao
Author-X-Name-First: Xun
Author-X-Name-Last: Xiao
Author-Name: Amitava Mukherjee
Author-X-Name-First: Amitava
Author-X-Name-Last: Mukherjee
Author-Name: Min Xie
Author-X-Name-First: Min
Author-X-Name-Last: Xie
Title: Estimation procedures for grouped data – a comparative study
Abstract:
Interval-censored data are very common in reliability and lifetime data analysis. This paper investigates the performance of different estimation procedures for a special type of interval-censored data, i.e. grouped data, from three widely used lifetime distributions. The approaches considered include maximum likelihood estimation, minimum distance estimation based on the chi-square criterion, moment estimation based on an imputation (IM) method, and an ad hoc estimation procedure. Although IM-based techniques have been used extensively in recent years, we show that this method is not always effective. It is found that the ad hoc estimation procedure is equivalent to minimum distance estimation with another distance metric and is more effective in the simulation. The procedures of the different approaches are presented and their performances investigated by Monte Carlo simulation for various combinations of sample sizes and parameter settings. The numerical results provide guidelines for practitioners who need to choose a good estimation approach for analysing grouped data.
Journal: Journal of Applied Statistics
Pages: 2110-2130
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1130801
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1130801
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2110-2130
Template-Type: ReDIF-Article 1.0
Author-Name: Jiguang Shao
Author-X-Name-First: Jiguang
Author-X-Name-Last: Shao
Author-Name: Sheng Fu
Author-X-Name-First: Sheng
Author-X-Name-Last: Fu
Title: On the modes of the negative binomial distribution of order k
Abstract:
In this paper, the modes of the negative binomial distribution of order k are studied. First, the method of transition probability flow graphs is introduced to deal with the probability-generating function of the geometric distribution of order k, which is a special case of the negative binomial distribution of the same order. The general negative binomial distribution of order k is then investigated. By means of the probability distribution function, the mode of the geometric distribution of order k is derived, i.e. $ m_{X_{(k)}}=k $. Based on the Fibonacci sequence and the Poly-nacci sequence, the modes of the negative binomial distribution of order k are obtained in some cases: (1) $ m_{X_{(2,2)}}=6, 7, 8 $ and $ m_{X_{(3,2)}}=16 $ for p=0.5; (2) $ m_{X_{(2,3)}}=13 $ for p=0.5. Finally, an application of the negative binomial distribution of order k to continuous sampling plans is given.
Journal: Journal of Applied Statistics
Pages: 2131-2149
Issue: 11
Volume: 43
Year: 2016
Month: 8
X-DOI: 10.1080/02664763.2015.1130802
File-URL: http://hdl.handle.net/10.1080/02664763.2015.1130802
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:11:p:2131-2149
Template-Type: ReDIF-Article 1.0
Author-Name: Ahmed Ghorbel
Author-X-Name-First: Ahmed
Author-X-Name-Last: Ghorbel
Author-Name: Wajdi Hamma
Author-X-Name-First: Wajdi
Author-X-Name-Last: Hamma
Author-Name: Anis Jarboui
Author-X-Name-First: Anis
Author-X-Name-Last: Jarboui
Title: Dependence between oil and commodities markets using time-varying Archimedean copulas and effectiveness of hedging strategies
Abstract:
The aim of this work is to study, in a first step, the dependence between oil prices and some commodity prices (cotton, rice, wheat, sugar, coffee and silver) using copula theory, and then, in a second step, to determine the optimal hedging strategy for an oil–commodity portfolio against the risk of negative variation in commodity market prices. The model is implemented with an AR-GARCH model with t-distributed innovations for the marginal distributions and an extreme value copula for the joint distribution; parameters and dependence indices are re-estimated each day, which allows nonlinear dependence, tail behavior and their development over time to be taken into account. Various copula functions are used to model the dependence structure between the oil and commodity markets. Empirical results show an increase in dependence during the last 6 years. Volatility in commodity prices reached record levels at the same time as the increase in uncertainty. The optimal hedging ratio varies over time as a consequence of changes in the dependence structure.
Journal: Journal of Applied Statistics
Pages: 1509-1542
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1155107
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1155107
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1509-1542
Template-Type: ReDIF-Article 1.0
Author-Name: Fang Liu
Author-X-Name-First: Fang
Author-X-Name-Last: Liu
Author-Name: Peng Zhang
Author-X-Name-First: Peng
Author-X-Name-Last: Zhang
Author-Name: Ibrahim Erkan
Author-X-Name-First: Ibrahim
Author-X-Name-Last: Erkan
Author-Name: Dylan S. Small
Author-X-Name-First: Dylan S.
Author-X-Name-Last: Small
Title: Bayesian inference for random coefficient dynamic panel data models
Abstract:
We develop a hierarchical Bayesian approach for inference in random coefficient dynamic panel data models. Our approach allows for the initial values of each unit's process to be correlated with the unit-specific coefficients. We impose a stationarity assumption for each unit's process by assuming that the unit-specific autoregressive coefficient is drawn from a logitnormal distribution. Our method is shown to have favorable properties compared to the mean group estimator in a Monte Carlo study. We apply our approach to analyze energy and protein intakes among individuals from the Philippines.
Journal: Journal of Applied Statistics
Pages: 1543-1559
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1214248
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214248
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1543-1559
Template-Type: ReDIF-Article 1.0
Author-Name: Giuseppe Arbia
Author-X-Name-First: Giuseppe
Author-X-Name-Last: Arbia
Author-Name: Giuseppe Espa
Author-X-Name-First: Giuseppe
Author-X-Name-Last: Espa
Author-Name: Diego Giuliani
Author-X-Name-First: Diego
Author-X-Name-Last: Giuliani
Author-Name: Rocco Micciolo
Author-X-Name-First: Rocco
Author-X-Name-Last: Micciolo
Title: A spatial analysis of health and pharmaceutical firm survival
Abstract:
The presence of knowledge spillovers and shared human capital is at the heart of the Marshall–Arrow–Romer externalities hypothesis. Most of the earlier empirical contributions on knowledge externalities, however, considered data aggregated at a regional level, so that conclusions are based on the arbitrary definition of jurisdictional spatial units: this is the essence of the so-called modifiable areal unit problem. A second limitation of these studies is that, somewhat surprisingly, while concentrating on the effects of agglomeration on firm creation and growth, the literature has largely ignored its effects on firm survival. The present paper aims to contribute to the existing literature by answering some of the open methodological questions, reconciling the literature on the Cox proportional hazards model with that on point patterns and thus capturing the true nature of spatial information. We also present some empirical results based on Italian firm demography data collected and managed by the Italian National Institute of Statistics (ISTAT).
Journal: Journal of Applied Statistics
Pages: 1560-1575
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1214249
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214249
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1560-1575
Template-Type: ReDIF-Article 1.0
Author-Name: Essam A. Ahmed
Author-X-Name-First: Essam A.
Author-X-Name-Last: Ahmed
Title: Estimation and prediction for the generalized inverted exponential distribution based on progressively first-failure-censored data with application
Abstract:
In this paper, the estimation of parameters for a generalized inverted exponential distribution based on a progressively first-failure type-II right-censored sample is studied. An expectation–maximization (EM) algorithm is developed to obtain maximum likelihood estimates of the unknown parameters as well as of the reliability and hazard functions. Using the missing value principle, the Fisher information matrix is obtained for constructing asymptotic confidence intervals. An exact interval and an exact confidence region for the parameters are also constructed. Bayesian procedures based on Markov chain Monte Carlo methods are developed to approximate the posterior distribution of the parameters of interest and to deduce the corresponding credible intervals. The performances of the maximum likelihood and Bayes estimators are compared in terms of their mean-squared errors through a simulation study. Furthermore, Bayes two-sample point and interval predictors are obtained when the future sample consists of ordinary order statistics. The squared error, linear-exponential and general entropy loss functions are considered for obtaining the Bayes estimators and predictors. To illustrate the discussed procedures, a set of real data is analyzed.
Journal: Journal of Applied Statistics
Pages: 1576-1608
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1214692
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1214692
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1576-1608
Template-Type: ReDIF-Article 1.0
Author-Name: Eran A. Barnoy
Author-X-Name-First: Eran A.
Author-X-Name-Last: Barnoy
Author-Name: Hyun J. Kim
Author-X-Name-First: Hyun J.
Author-X-Name-Last: Kim
Author-Name: David W. Gjertson
Author-X-Name-First: David W.
Author-X-Name-Last: Gjertson
Title: Complexity in applying spatial analysis to describe heterogeneous air-trapping in thoracic imaging data
Abstract:
In this paper we consider a novel approach to analyzing medical images by applying a concept typically employed in geospatial studies. For certain diseases, such as asthma, there is a relevant distinction between the heterogeneity of constriction in airways for patients compared to healthy individuals. In order to describe such heterogeneities quantitatively, we utilize spatial correlation in the realm of lung computed tomography (CT). Specifically, we apply the approximate profile-likelihood estimator (APLE) to simulated lung air-trapping data selected based on potential interest to pulmonologists, and we explore reference values obtainable through this statistic. Results indicate that APLE values are independent of air-trapping values, and can provide useful insight into spatial patterns of these values within the lungs in situations where other common metrics, such as the coefficient of variation, reveal little. The APLE relies on a neighborhood weights matrix to define spatial relatedness of considered regions, and among the few weight structures explored, a working optimal choice seems to be one based on the inverse squared distance between regions of interest. The application yields a new method to help analyze the degree of heterogeneity in lung CT images, which can be generalized to other medical images as well.
Journal: Journal of Applied Statistics
Pages: 1609-1629
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221901
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221901
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1609-1629
Template-Type: ReDIF-Article 1.0
Author-Name: Rosineide Fernando da Paz
Author-X-Name-First: Rosineide Fernando
Author-X-Name-Last: da Paz
Author-Name: Jorge Luis Bazán
Author-X-Name-First: Jorge
Author-X-Name-Last: Luis Bazán
Author-Name: Luis Aparecido Milan
Author-X-Name-First: Luis
Author-X-Name-Last: Aparecido Milan
Title: Bayesian estimation for a mixture of simplex distributions with an unknown number of components: HDI analysis in Brazil
Abstract:
Variables taking values in $ (0, 1) $, such as rates or proportions, are frequently analyzed by researchers, for instance, in political and social data, as well as the Human Development Index (HDI). However, sometimes this type of data cannot be modeled adequately using a single distribution. In this case, we can use a mixture of distributions, which is a powerful and flexible probabilistic tool. This manuscript deals with a mixture of simplex distributions to model proportional data. A fully Bayesian approach is proposed for inference, which includes a reversible-jump Markov chain Monte Carlo procedure. The usefulness of the proposed approach is confirmed using simulated mixture data from several different scenarios and by applying the methodology to municipal HDI data of cities (or towns) in the Northeast region and São Paulo state in Brazil. The analysis shows that among the cities in the Northeast, some appear to have an HDI similar to that of cities in São Paulo state.
Journal: Journal of Applied Statistics
Pages: 1630-1643
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221903
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221903
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1630-1643
Template-Type: ReDIF-Article 1.0
Author-Name: Ying Liu
Author-X-Name-First: Ying
Author-X-Name-Last: Liu
Author-Name: Fang Luo
Author-X-Name-First: Fang
Author-X-Name-Last: Luo
Author-Name: Danhui Zhang
Author-X-Name-First: Danhui
Author-X-Name-Last: Zhang
Author-Name: Hongyun Liu
Author-X-Name-First: Hongyun
Author-X-Name-Last: Liu
Title: Comparison and robustness of the REML, ML, MIVQUE estimators for multi-level random mediation model
Abstract:
This article concentrates on the multi-level random mediation effects model (1-1-1) and reviews the maximum likelihood (ML), restricted maximum likelihood (REML), and minimum variance quadratic unbiased estimation (MIVQUE) methods provided by the SAS MIXED procedure. The paper uses Monte Carlo simulation to compare the performance of these estimators under a wide variety of conditions. First, REML and ML produced equivalent results, and both outperformed MIVQUE, regardless of whether the normality assumption was satisfied. Second, the results indicated that the distribution of $ {\boldsymbol{e}_{\boldsymbol{Yij}}} $ does not influence the mediation effect. Deviation from the normal distribution of $ {\boldsymbol{b}_{\boldsymbol{j}}} $, or of $ {\boldsymbol{a}_{\boldsymbol{j}}} $ and $ \boldsymbol{b}_{\boldsymbol{j}} $ jointly, affected the mediation effect, particularly when both the magnitude of the deviation and the covariance between these two effects were large. The article ends with implications, suggestions and recommendations for application.
Journal: Journal of Applied Statistics
Pages: 1644-1661
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221904
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221904
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1644-1661
Template-Type: ReDIF-Article 1.0
Author-Name: Eliud Silva
Author-X-Name-First: Eliud
Author-X-Name-Last: Silva
Author-Name: Víctor M. Guerrero
Author-X-Name-First: Víctor M.
Author-X-Name-Last: Guerrero
Title: Penalized least squares smoothing of two-dimensional mortality tables with imposed smoothness
Abstract:
This paper presents a method to estimate mortality trends of two-dimensional mortality tables. Comparability of mortality trends for two or more such tables is enhanced by applying penalized least squares and imposing a desired percentage of smoothness to be attained by the trends. The smoothing procedure is basically determined by the smoothing parameters that are related to the percentage of smoothness. To quantify smoothness, we employ an index defined first for the one-dimensional case and then generalized to the two-dimensional one. The proposed method is applied to data from member countries of the OECD. We take the smoothed mortality surface of one of those countries as a reference and compare it with other mortality surfaces smoothed with the same percentage of two-dimensional smoothness. Our aim is to be able to see whether convergence exists in the mortality trends of the countries under study, in both the year and age dimensions.
Journal: Journal of Applied Statistics
Pages: 1662-1679
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221905
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1662-1679
Template-Type: ReDIF-Article 1.0
Author-Name: Fragkiskos Bersimis
Author-X-Name-First: Fragkiskos
Author-X-Name-Last: Bersimis
Author-Name: Demosthenes Panagiotakos
Author-X-Name-First: Demosthenes
Author-X-Name-Last: Panagiotakos
Author-Name: Malvina Vamvakari
Author-X-Name-First: Malvina
Author-X-Name-Last: Vamvakari
Title: Investigating the sensitivity function's monotony of a health-related index
Abstract:
In this work, it is investigated theoretically whether the length of the support of a continuous variable representing a simple health-related index affects the index's ability to diagnose a binary health outcome. This is attempted by studying the monotony of the index's sensitivity function, which is a measure of its diagnostic ability, in the cases where the index's distribution is either unknown or uniform. The case of a composite health-related index formed as the sum of m component variables is also presented, when the distribution of its component variables is either unknown or uniform. It is proved that a health-related index's sensitivity is a non-decreasing function of the finite length of its components' support, under certain conditions. In addition, similar propositions are presented for the case where a health-related index is normally distributed, according to its distribution parameters.
Journal: Journal of Applied Statistics
Pages: 1680-1706
Issue: 9
Volume: 44
Year: 2017
Month: 7
X-DOI: 10.1080/02664763.2016.1221906
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1221906
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:9:p:1680-1706
Template-Type: ReDIF-Article 1.0
Author-Name: A. Desgagné
Author-X-Name-First: A.
Author-X-Name-Last: Desgagné
Author-Name: P. Lafaye de Micheaux
Author-X-Name-First: P.
Author-X-Name-Last: Lafaye de Micheaux
Title: A powerful and interpretable alternative to the Jarque–Bera test of normality based on 2nd-power skewness and kurtosis, using the Rao's score test on the APD family
Abstract:
We introduce the 2nd-power skewness and kurtosis, which are interesting alternatives to the classical Pearson's skewness and kurtosis, called 3rd-power skewness and 4th-power kurtosis in our terminology. We use the sample 2nd-power skewness and kurtosis to build a powerful test of normality. This test can also be derived as Rao's score test on the asymmetric power distribution, which combines the large range of exponential tail behavior provided by the exponential power distribution family with various levels of asymmetry. We find that our test statistic is asymptotically chi-squared distributed. We also propose a modified test statistic, for which we show numerically that the distribution can be approximated for finite sample sizes with very high precision by a chi-squared distribution. Similarly, we propose a directional test based on the sample 2nd-power kurtosis only, for situations where the true distribution is known to be symmetric. Our tests are very similar in spirit to the famous Jarque–Bera test, and as such are also locally optimal. They offer the same nice interpretation and, in addition, the gold-standard power of the regression and correlation tests. An extensive empirical power analysis is performed, which shows that our tests are among the most powerful normality tests. Our test is implemented in an R package called PoweR.
Journal: Journal of Applied Statistics
Pages: 2307-2327
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1415311
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1415311
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2307-2327
Template-Type: ReDIF-Article 1.0
Author-Name: Raushan Bokusheva
Author-X-Name-First: Raushan
Author-X-Name-Last: Bokusheva
Title: Using copulas for rating weather index insurance contracts
Abstract:
This study develops a methodology for a copula-based weather index insurance design. Because the copula approach is better suited for modeling tail dependence than the standard linear correlation approach, its use may increase the effectiveness of weather insurance contracts designed to provide protection against extreme weather events. In our study, we employ three selected Archimedean copulas to capture the left-tail dependence in the joint distribution of the farm yield and a specific weather index. A hierarchical Bayesian model is applied to obtain consistent estimates of tail dependence using relatively short time series. Our empirical results for 47 large grain-producing farms from Kazakhstan indicate that, given the choice of an appropriate weather index to signal catastrophic events, such as a severe drought, copula-based weather insurance contracts may provide significantly higher risk reductions than regression-based indemnification schemes.
Journal: Journal of Applied Statistics
Pages: 2328-2356
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1420146
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1420146
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2328-2356
Template-Type: ReDIF-Article 1.0
Author-Name: Musie Ghebremichael
Author-X-Name-First: Musie
Author-X-Name-Last: Ghebremichael
Author-Name: Semere Habtemicael
Author-X-Name-First: Semere
Author-X-Name-Last: Habtemicael
Title: Effect of tuberculosis on immune restoration among HIV-infected patients receiving antiretroviral therapy
Abstract:
In this article, time to immune recovery during antiretroviral therapy was estimated and compared between HIV-infected children with and without tuberculosis (TB). CD4 T-cell restoration was used as a criterion for determining immune recovery. The median residual lifetime function, which is more intuitive and robust compared to the frequently used measures of lifetime data, was used to estimate time to CD4 T-cell restoration. The median residual lifetime is not influenced by extreme observations and heavy-tailed distributions which are commonly encountered in clinical studies. Permutation-based methods were used to compare the CD4 T-cell restoration times between the two groups of patients. Our results indicate that children with TB had uniformly higher median residual lifetimes to immune recovery compared to those without TB. Although TB was associated with slower CD4 T-cell restoration, the differences between the restoration times of the two groups were not statistically significant.
Journal: Journal of Applied Statistics
Pages: 2357-2364
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1420758
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1420758
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2357-2364
Template-Type: ReDIF-Article 1.0
Author-Name: Dilli Bhatta
Author-X-Name-First: Dilli
Author-X-Name-Last: Bhatta
Author-Name: Balgobin Nandram
Author-X-Name-First: Balgobin
Author-X-Name-Last: Nandram
Author-Name: Joseph Sedransk
Author-X-Name-First: Joseph
Author-X-Name-Last: Sedransk
Title: Bayesian testing for independence of two categorical variables under two-stage cluster sampling with covariates
Abstract:
We consider Bayesian testing for independence of two categorical variables with covariates for a two-stage cluster sample. This is a difficult problem because we have a complex sample (i.e. a cluster sample), not a simple random sample. Our approach is to convert the cluster sample with covariates into an equivalent simple random sample without covariates, which provides a surrogate of the original sample. Then, this surrogate sample is used to compute the Bayes factor to make an inference about independence. We apply our methodology to data from the Trends in International Mathematics and Science Study [30] for fourth-grade US students to assess the association between the mathematics and science scores represented as categorical variables. We show that if there is strong association between the two categorical variables, there is no significant difference between the tests with and without the covariates. We also performed a simulation study to further understand the effect of the covariates in various situations. We found that for borderline cases (moderate association between the two categorical variables), there are noticeable differences between the tests with and without covariates.
Journal: Journal of Applied Statistics
Pages: 2365-2393
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1421914
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1421914
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2365-2393
Template-Type: ReDIF-Article 1.0
Author-Name: Deepesh Bhati
Author-X-Name-First: Deepesh
Author-X-Name-Last: Bhati
Author-Name: Sreenivasan Ravi
Author-X-Name-First: Sreenivasan
Author-X-Name-Last: Ravi
Title: Diagnostic plots for identifying max domains of attraction under power normalization
Abstract:
Diagnostic plots for determining the max domains of attraction of power normalized partial maxima are proposed. A test to ascertain the veracity of the claim that a data distribution belongs to a max domain of attraction under power normalization is given. The performance of this test is demonstrated using data simulated from many well-known distributions. Furthermore, two real-world datasets are analysed using the proposed procedure.
Journal: Journal of Applied Statistics
Pages: 2394-2410
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1421915
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1421915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2394-2410
Template-Type: ReDIF-Article 1.0
Author-Name: Oliver Cencic
Author-X-Name-First: Oliver
Author-X-Name-Last: Cencic
Author-Name: Rudolf Frühwirth
Author-X-Name-First: Rudolf
Author-X-Name-Last: Frühwirth
Title: Data reconciliation of nonnormal observations with nonlinear constraints
Abstract:
This paper presents a new method for the reconciliation of data described by arbitrary continuous probability distributions, with the focus on nonlinear constraints. The main idea, already applied to linear constraints in a previous paper, is to restrict the joint prior probability distribution of the observed variables with model constraints to get a joint posterior probability distribution. Because in general the posterior probability density function cannot be calculated analytically, it is shown that sampling from the posterior distribution by a Markov chain Monte Carlo (MCMC) method has decisive advantages. From the resulting sample of observed and unobserved variables, various characteristics of the posterior distribution can be estimated, such as the mean, the full covariance matrix, marginal posterior densities, as well as marginal moments, quantiles, and HPD intervals. The procedure is illustrated by examples from material flow analysis and chemical engineering.
Journal: Journal of Applied Statistics
Pages: 2411-2428
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1421916
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1421916
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2411-2428
Template-Type: ReDIF-Article 1.0
Author-Name: Jun Zhang
Author-X-Name-First: Jun
Author-X-Name-Last: Zhang
Author-Name: Jing Zhang
Author-X-Name-First: Jing
Author-X-Name-Last: Zhang
Author-Name: Xuehu Zhu
Author-X-Name-First: Xuehu
Author-X-Name-Last: Zhu
Author-Name: Tao Lu
Author-X-Name-First: Tao
Author-X-Name-Last: Lu
Title: Testing symmetry based on empirical likelihood
Abstract:
In this paper, we propose a general kth correlation coefficient between the density function and distribution function of a continuous variable as a measure of symmetry and asymmetry. We first propose a root-n moment-based estimator of the kth correlation coefficient and present its asymptotic results. Next, we consider statistical inference of the kth correlation coefficient by using the empirical likelihood (EL) method. The EL statistic is shown to be asymptotically a standard chi-squared distribution. Last, we propose a residual-based estimator of the kth correlation coefficient for a parametric regression model to test whether the density function of the true model error is symmetric or not. We present the asymptotic results of the residual-based kth correlation coefficient estimator and also construct its EL-based confidence intervals. Simulation studies are conducted to examine the performance of the proposed estimators, and we also use our proposed estimators to analyze the air quality dataset.
Journal: Journal of Applied Statistics
Pages: 2429-2454
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1421917
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1421917
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2429-2454
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaojie Xu
Author-X-Name-First: Xiaojie
Author-X-Name-Last: Xu
Title: Causal structure among US corn futures and regional cash prices in the time and frequency domain
Abstract:
This study investigates causal structure among daily Chicago Board of Trade corn futures prices and seven regional cash series from Iowa, Illinois, Indiana, Ohio, Minnesota, Nebraska, and Kansas for January 2006–March 2011. Their wavelet transformed series are further analyzed for causal relationships at different time scales. Empirical results indicate no causality among states or between the futures and a cash series for time scales shorter than one month. As scales increase but do not exceed a year, bidirectional causal flows are determined among all prices. The information leadership role of the futures against a cash price is identified for the scale longer than one year and raw series, at which no interstate causality is found.
Journal: Journal of Applied Statistics
Pages: 2455-2480
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2017.1423044
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1423044
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2455-2480
Template-Type: ReDIF-Article 1.0
Author-Name: Piotr Sulewski
Author-X-Name-First: Piotr
Author-X-Name-Last: Sulewski
Title: Power analysis of independence testing for three-way contingency tables of small sizes
Abstract:
The first aim of this paper is to introduce a modular test for three-way contingency tables (TT). The second aim is to describe a procedure for generating TTs using the bar method. The third aim is, on the one hand, to suggest a measure of the untruthfulness of H0 and, on the other hand, to compare the quality of independence tests by using their power. Critical values for the analyzed statistics were determined by Monte Carlo simulation.
Journal: Journal of Applied Statistics
Pages: 2481-2498
Issue: 13
Volume: 45
Year: 2018
Month: 10
X-DOI: 10.1080/02664763.2018.1424122
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1424122
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:13:p:2481-2498
Template-Type: ReDIF-Article 1.0
Author-Name: Caleb Phillips
Author-X-Name-First: Caleb
Author-X-Name-Last: Phillips
Author-Name: Ryan Elmore
Author-X-Name-First: Ryan
Author-X-Name-Last: Elmore
Author-Name: Jenny Melius
Author-X-Name-First: Jenny
Author-X-Name-Last: Melius
Author-Name: Pieter Gagnon
Author-X-Name-First: Pieter
Author-X-Name-Last: Gagnon
Author-Name: Robert Margolis
Author-X-Name-First: Robert
Author-X-Name-Last: Margolis
Title: A data mining approach to estimating rooftop photovoltaic potential in the US
Abstract:
This paper aims to quantify the amount of suitable rooftop area for photovoltaic (PV) energy generation in the continental United States (US). The approach is data-driven, combining Geographic Information Systems analysis of an extensive dataset of Light Detection and Ranging (LiDAR) measurements collected by the Department of Homeland Security with a statistical model trained on these same data. The model developed herein can predict the quantity of suitable roof area where LiDAR data are not available. This analysis focuses on small buildings (1000 to 5000 square feet), which account for more than half of the total available rooftop space in these data (58%) and demonstrate greater variability in suitability compared to larger buildings, which are nearly all suitable for PV installations. This paper presents new results characterizing the size, shape and suitability of US rooftops with respect to PV installations. Overall, 28% of small building roofs in the continental United States appear suitable for rooftop solar. Nationally, small building rooftops could accommodate an expected 731 GW of PV capacity and generate 926 TWh/year of PV energy on 4920 $ {\rm km}^2 $ of suitable rooftop space, which equates to 25% of current US electricity sales.
Journal: Journal of Applied Statistics
Pages: 385-394
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1492525
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1492525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:385-394
Template-Type: ReDIF-Article 1.0
Author-Name: Eliane R. Rodrigues
Author-X-Name-First: Eliane R.
Author-X-Name-Last: Rodrigues
Author-Name: Mario H. Tarumoto
Author-X-Name-First: Mario H.
Author-X-Name-Last: Tarumoto
Author-Name: Guadalupe Tzintzun
Author-X-Name-First: Guadalupe
Author-X-Name-Last: Tzintzun
Title: Application of a non-homogeneous Markov chain with seasonal transition probabilities to ozone data
Abstract:
In this work, we assume that the sequence recording whether or not an ozone exceedance of an environmental threshold has occurred on a given day is ruled by a non-homogeneous Markov chain of order one. In order to account for the possible presence of cycles in the empirical transition probabilities, a parametric form incorporating seasonal components is considered. Results show that even though some covariates (namely, relative humidity and temperature) are not included explicitly in the model, their influence is captured in the behavior of the transition probabilities. Parameters are estimated from the Bayesian point of view via Markov chain Monte Carlo algorithms. The model is applied to ozone data obtained from the monitoring network of Mexico City, Mexico. An analysis of how the methodology could be used as an aid in decision-making is also given.
Journal: Journal of Applied Statistics
Pages: 395-415
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1492527
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1492527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:395-415
Template-Type: ReDIF-Article 1.0
Author-Name: Luiz R. Nakamura
Author-X-Name-First: Luiz R.
Author-X-Name-Last: Nakamura
Author-Name: Pedro H. R. Cerqueira
Author-X-Name-First: Pedro H. R.
Author-X-Name-Last: Cerqueira
Author-Name: Thiago G. Ramires
Author-X-Name-First: Thiago G.
Author-X-Name-Last: Ramires
Author-Name: Rodrigo R. Pescim
Author-X-Name-First: Rodrigo R.
Author-X-Name-Last: Pescim
Author-Name: R. A. Rigby
Author-X-Name-First: R. A.
Author-X-Name-Last: Rigby
Author-Name: Dimitrios M. Stasinopoulos
Author-X-Name-First: Dimitrios M.
Author-X-Name-Last: Stasinopoulos
Title: A new continuous distribution on the unit interval applied to modelling the points ratio of football teams
Abstract:
We introduce a new flexible distribution to deal with variables on the unit interval, based on a transformation of the sinh–arcsinh distribution, which accommodates different degrees of skewness and kurtosis and thus provides an interesting alternative for modelling this type of data. We also include this new distribution in the generalised additive models for location, scale and shape (GAMLSS) framework in order to develop and fit its regression model. For different parameter settings, some simulations are performed to investigate the behaviour of the estimators. The potential of the new regression model is illustrated by means of a real dataset on the points ratio of football teams at the end of a championship in the four most important leagues in the world: Barclays Premier League (England), Bundesliga (Germany), Serie A (Italy) and BBVA league (Spain) during three seasons (2011–2012, 2012–2013 and 2013–2014).
Journal: Journal of Applied Statistics
Pages: 416-431
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1495699
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1495699
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:416-431
Template-Type: ReDIF-Article 1.0
Author-Name: Haiqing Chen
Author-X-Name-First: Haiqing
Author-X-Name-Last: Chen
Author-Name: Weihu Cheng
Author-X-Name-First: Weihu
Author-X-Name-Last: Cheng
Author-Name: Yaohua Rong
Author-X-Name-First: Yaohua
Author-X-Name-Last: Rong
Author-Name: Xu Zhao
Author-X-Name-First: Xu
Author-X-Name-Last: Zhao
Title: Fitting the generalized Pareto distribution to data based on transformations of order statistics
Abstract:
Generalized Pareto distribution (GPD) has been widely used to model exceedances over thresholds. In this article we propose a new method called weighted nonlinear least squares (WNLS) to estimate the parameters of the GPD. The WNLS estimators always exist and are simple to compute. Some asymptotic results of the proposed method are provided. The simulation results indicate that the proposed method performs well compared to existing methods in terms of mean squared error and bias. Its advantages are further illustrated through the analysis of two real data sets.
Journal: Journal of Applied Statistics
Pages: 432-448
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1495700
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1495700
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:432-448
Template-Type: ReDIF-Article 1.0
Author-Name: Ricardo Puziol de Oliveira
Author-X-Name-First: Ricardo Puziol
Author-X-Name-Last: de Oliveira
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Author-Name: Danielle Peralta
Author-X-Name-First: Danielle
Author-X-Name-Last: Peralta
Author-Name: Josmar Mazucheli
Author-X-Name-First: Josmar
Author-X-Name-Last: Mazucheli
Title: Discrete and continuous bivariate lifetime models in presence of cure rate: a comparative study under Bayesian approach
Abstract:
The modeling and analysis of lifetime data in which the main endpoints are the times when an event of interest occurs is of great interest in medical studies. In these studies, it is common that two or more lifetimes are associated with the same unit, such as the times to deterioration levels or the times to reaction to a treatment in pairs of organs like lungs, kidneys, eyes or ears. In medical applications, it is also possible that a cure rate is present and needs to be modeled with lifetime data with long-term survivors. This paper presents a comparative study, under a Bayesian approach, of some existing continuous and discrete bivariate distributions, such as the bivariate exponential and bivariate geometric distributions, in the presence of cure rate, censored data and covariates. In the presence of lifetimes related to cured patients, standard mixture cure rate models are assumed in the data analysis. The posterior summaries of interest are obtained using Markov Chain Monte Carlo methods. To illustrate the proposed methodology, two real medical data sets are considered.
Journal: Journal of Applied Statistics
Pages: 449-467
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1495701
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1495701
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:449-467
Template-Type: ReDIF-Article 1.0
Author-Name: Yihong Zhan
Author-X-Name-First: Yihong
Author-X-Name-Last: Zhan
Author-Name: Yanan Zhang
Author-X-Name-First: Yanan
Author-X-Name-Last: Zhang
Author-Name: Jiajia Zhang
Author-X-Name-First: Jiajia
Author-X-Name-Last: Zhang
Author-Name: Bo Cai
Author-X-Name-First: Bo
Author-X-Name-Last: Cai
Author-Name: James W. Hardin
Author-X-Name-First: James W.
Author-X-Name-Last: Hardin
Title: Sample size calculation for a proportional hazards mixture cure model with nonbinary covariates
Abstract:
Sample size calculation is a critical issue in clinical trials because a small sample size leads to a biased inference and a large sample size increases the cost. With the development of advanced medical technology, some patients can be cured of certain chronic diseases, and the proportional hazards mixture cure model has been developed to handle survival data with potential cure information. Given the needs of survival trials with potential cure proportions, a corresponding sample size formula based on the log-rank test statistic for binary covariates has been proposed by Wang et al. [25]. However, a sample size formula based on continuous variables had not been developed. Herein, we present sample size and power calculations for the mixture cure model with continuous variables based on the log-rank method, and further modify them by Ewell's method. The proposed approaches were evaluated using simulation studies for synthetic data from exponential and Weibull distributions. A program for calculating the necessary sample size for continuous covariates in a mixture cure model was implemented in R.
Journal: Journal of Applied Statistics
Pages: 468-483
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1498463
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1498463
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:468-483
Template-Type: ReDIF-Article 1.0
Author-Name: Juliana Scudilio
Author-X-Name-First: Juliana
Author-X-Name-Last: Scudilio
Author-Name: Vinicius F. Calsavara
Author-X-Name-First: Vinicius F.
Author-X-Name-Last: Calsavara
Author-Name: Ricardo Rocha
Author-X-Name-First: Ricardo
Author-X-Name-Last: Rocha
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Vera Tomazella
Author-X-Name-First: Vera
Author-X-Name-Last: Tomazella
Author-Name: Agatha S. Rodrigues
Author-X-Name-First: Agatha S.
Author-X-Name-Last: Rodrigues
Title: Defective models induced by gamma frailty term for survival data with cured fraction
Abstract:
In this paper, we propose a defective model induced by a frailty term for modeling the proportion of cured individuals. Unlike most cure rate models, defective models have the advantage of modeling the cure rate without adding any extra parameter to the model. The introduction of an unobserved heterogeneity among individuals brings advantages for the estimated model. The influence of unobserved covariates is incorporated using a proportional hazard model. The frailty term, assumed to follow a gamma distribution, is introduced on the hazard rate to control the unobservable heterogeneity of the patients. We assume that the baseline distribution follows either the Gompertz or the inverse Gaussian defective distribution. Thus, we propose and discuss two defective regression models: the gamma-Gompertz and the gamma-inverse Gaussian. Simulation studies are performed to verify the asymptotic properties of the maximum likelihood estimator. Lastly, in order to illustrate the proposed models, we present three applications to real data sets, one of which is analyzed here for the first time and relates to a study on breast cancer at the A.C.Camargo Cancer Center, São Paulo, Brazil.
Journal: Journal of Applied Statistics
Pages: 484-507
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1498464
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1498464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:484-507
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Wang
Author-X-Name-First: Wei
Author-X-Name-Last: Wang
Author-Name: Zhuo Fang
Author-X-Name-First: Zhuo
Author-X-Name-Last: Fang
Title: Linear scalar-on-surface random effects regression models
Abstract:
Many research fields increasingly involve analyzing data of a complex structure. Models investigating the dependence of a response on a predictor have moved beyond the ordinary scalar-on-vector regression. We propose a regression model for a scalar response and a surface (or a bivariate function) predictor. The predictor has a random component, and the regression model falls in the framework of linear random effects models. We estimate the model parameters via maximizing the log-likelihood with the ECME (Expectation/Conditional Maximization Either) algorithm. We use the approach to analyze a data set where the response is the neuroticism score and the predictor is the resting-state brain function image. In our simulations, the proposed approach performs better than two alternatives: functional principal component regression and smooth scalar-on-image regression.
Journal: Journal of Applied Statistics
Pages: 508-521
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1502262
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1502262
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:508-521
Template-Type: ReDIF-Article 1.0
Author-Name: Weiwei Wang
Author-X-Name-First: Weiwei
Author-X-Name-Last: Wang
Author-Name: Xianyi Wu
Author-X-Name-First: Xianyi
Author-X-Name-Last: Wu
Author-Name: Xiaoqi Zhang
Author-X-Name-First: Xiaoqi
Author-X-Name-Last: Zhang
Author-Name: Xiaobing Zhao
Author-X-Name-First: Xiaobing
Author-X-Name-Last: Zhao
Author-Name: Xian Zhou
Author-X-Name-First: Xian
Author-X-Name-Last: Zhou
Title: Partial sufficient dimension reduction on joint model of recurrent and terminal events
Abstract:
Joint modeling of recurrent and terminal events has attracted considerable interest and extensive investigations by many authors. The assumption of low-dimensional covariates has been usually applied in the existing studies, which is however inapplicable in many practical situations. In this paper, we consider a partial sufficient dimension reduction approach for a joint model with high-dimensional covariates. Some simulations as well as three real data applications are presented to confirm and assess the performance of the proposed model and approach.
Journal: Journal of Applied Statistics
Pages: 522-541
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1506019
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1506019
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:522-541
Template-Type: ReDIF-Article 1.0
Author-Name: Fatemeh Hosseini
Author-X-Name-First: Fatemeh
Author-X-Name-Last: Hosseini
Author-Name: Omid Karimi
Author-X-Name-First: Omid
Author-X-Name-Last: Karimi
Title: Approximate composite marginal likelihood inference in spatial generalized linear mixed models
Abstract:
Non-Gaussian spatial responses are usually modeled using a spatial generalized linear mixed model with spatial random effects. The likelihood function of this model cannot usually be given in a closed form, so the maximum likelihood approach is very challenging. There are numerical ways to maximize the likelihood function, such as the Monte Carlo Expectation Maximization and Quadrature Pairwise Expectation Maximization algorithms, but they may in such cases be computationally very slow or even prohibitive. The Gauss–Hermite quadrature approximation is only suitable for low-dimensional latent variables, and its accuracy depends on the number of quadrature points. Here, we propose a new approximate pairwise maximum likelihood method for inference in the spatial generalized linear mixed model. This approximate method is fast and deterministic, using no sampling-based strategies. The performance of the proposed method is illustrated through two simulation examples, and practical aspects are investigated through a case study on a rainfall data set.
Journal: Journal of Applied Statistics
Pages: 542-558
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1506020
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1506020
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:542-558
Template-Type: ReDIF-Article 1.0
Author-Name: Ozgur Danisman
Author-X-Name-First: Ozgur
Author-X-Name-Last: Danisman
Author-Name: Umay Uzunoglu Kocer
Author-X-Name-First: Umay
Author-X-Name-Last: Uzunoglu Kocer
Title: Construction of a semi-Markov model for the performance of a football team in the presence of missing data
Abstract:
Using play-by-play data from the very beginning of the professional football league in Turkey, a semi-Markov model is presented for describing the performance of football teams. The official match results of the selected teams during 55 football seasons are used, and winning, drawing and losing are considered as Markov states. The semi-Markov model is constructed with transition rates inferred from the official match results. The duration between the last match of a season and the very first match of the following season is much longer than any other duration during the season; therefore, these values are treated as missing values and estimated by using the expectation–maximization algorithm. The effect of the sojourn time in a state on the performance of a team is discussed, and mean sojourn times after losing/winning are estimated. The limiting probabilities of winning, drawing and losing are calculated. Some insights about the performance of the selected teams are presented.
Journal: Journal of Applied Statistics
Pages: 559-576
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1508556
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1508556
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:559-576
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Corrigendum
Journal: Journal of Applied Statistics
Pages: 577-579
Issue: 3
Volume: 46
Year: 2019
Month: 2
X-DOI: 10.1080/02664763.2018.1505203
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1505203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:3:p:577-579
Template-Type: ReDIF-Article 1.0
Author-Name: Kuangnan Fang
Author-X-Name-First: Kuangnan
Author-X-Name-Last: Fang
Author-Name: Shuangge Ma
Author-X-Name-First: Shuangge
Author-X-Name-Last: Ma
Title: Three-part model for fractional response variables with application to Chinese household health insurance coverage
Abstract: A survey on health insurance was conducted in July and August of 2011 in three major cities in China. In this study, we analyze the household coverage rate, which is an important index of the quality of health insurance. The coverage rate is restricted to the unit interval [0, 1], and it may differ from other rate data in that the “two corners” are nonzero. That is, there are nonzero probabilities of zero and full coverage. Such data may also be encountered in economics, finance, medicine, and many other areas. The existing approaches may not be able to properly accommodate such data. In this study, we develop a three-part model that properly describes fractional response variables with non-ignorable zeros and ones. We investigate estimation and inference under two proportional constraints on the regression parameters. Such constraints may lead to more lucid interpretations and fewer unknown parameters and hence more accurate estimation. A simulation study is conducted to compare the performance of constrained and unconstrained models and show that estimation under constraint can be more efficient. The analysis of household health insurance coverage data suggests that household size, income, expense, and presence of chronic disease are associated with insurance coverage.
Journal: Journal of Applied Statistics
Pages: 925-940
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.758246
File-URL: http://hdl.handle.net/10.1080/02664763.2012.758246
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:925-940
Template-Type: ReDIF-Article 1.0
Author-Name: Seung-Gu Kim
Author-X-Name-First: Seung-Gu
Author-X-Name-Last: Kim
Author-Name: Jeong-Soo Park
Author-X-Name-First: Jeong-Soo
Author-X-Name-Last: Park
Author-Name: Yung-Seop Lee
Author-X-Name-First: Yung-Seop
Author-X-Name-Last: Lee
Title: Identification of target clusters by using the restricted normal mixture model
Abstract: This paper addresses the problem of identifying groups that satisfy specific conditions for the means of feature variables. In this study, we refer to the identified groups as “target clusters” (TCs). To identify TCs, we propose a method based on the normal mixture model (NMM) restricted by a linear combination of means. We provide an expectation–maximization (EM) algorithm to fit the restricted NMM by using the maximum-likelihood method. The convergence property of the EM algorithm and a reasonable set of initial estimates are presented. We demonstrate the method's usefulness and validity through a simulation study and two well-known data sets. The proposed method provides several types of useful clusters, which would be difficult to achieve with conventional clustering or exploratory data analysis methods based on the ordinary NMM. A simple comparison with another target clustering approach shows that the proposed method is promising for the identification of TCs.
Journal: Journal of Applied Statistics
Pages: 941-960
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.759192
File-URL: http://hdl.handle.net/10.1080/02664763.2012.759192
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:941-960
Template-Type: ReDIF-Article 1.0
Author-Name: B. Houlding
Author-X-Name-First: B.
Author-X-Name-Last: Houlding
Author-Name: J. Haslett
Author-X-Name-First: J.
Author-X-Name-Last: Haslett
Title: Scheduling parallel conference sessions: an application of a novel hybrid clustering algorithm for ensuring constrained cardinality
Abstract: The 2011 World Statistics Congress included approximately 1150 oral and 250 poster presentations over approximately 250 sessions, with as many as 20 sessions running in parallel at any one time. Scheduling a timetable for such a conference is hence a complicated task, as ideally, talks on similar topics should be scheduled in the same session, and similar session topics should not be presented at identical times, allowing participants to easily decide which of the number of sessions to attend. Here, we consider a novel hybrid clustering algorithm that allows a solution under a constraint of fixed cardinality, and which is designed to find clusters of highly dense regions whilst forming others from the remaining outlying regions, hence providing a simple, yet data-generated, solution to such scheduling problems.
Journal: Journal of Applied Statistics
Pages: 961-971
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.760239
File-URL: http://hdl.handle.net/10.1080/02664763.2012.760239
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:961-971
Template-Type: ReDIF-Article 1.0
Author-Name: I-Tang Yu
Author-X-Name-First: I-Tang
Author-X-Name-Last: Yu
Title: A modification of the Box–Meyer method for finding the active factors in screening experiments
Abstract: Screening experiments are conducted to identify a few active factors among a large number of factors. For the objective of identifying active factors, Box and Meyer provided an innovative approach, the Box–Meyer method (BMM). With the use of means models, we propose a modification of the BMM in this paper. Compared with the original BMM, the modified BMM (MBMM) can circumvent the problem that the original BMM runs into, namely that it may fail to identify some active factors due to the ignorance of higher order interactions. Furthermore, the number of explanatory variables in the MBMM is smaller. Therefore, the computational complexity is reduced. Finally, three examples with different types of designs are used to demonstrate the wide applicability of the MBMM.
Journal: Journal of Applied Statistics
Pages: 972-984
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2012.761181
File-URL: http://hdl.handle.net/10.1080/02664763.2012.761181
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:972-984
Template-Type: ReDIF-Article 1.0
Author-Name: Connie Stewart
Author-X-Name-First: Connie
Author-X-Name-Last: Stewart
Title: Zero-inflated beta distribution for modeling the proportions in quantitative fatty acid signature analysis
Abstract: Quantitative fatty acid signature analysis (QFASA) produces diet estimates containing the proportion of each species of prey in a predator's diet. Since the diet estimates are compositional, often contain an abundance of zeros (signifying the absence of a species in the diet), and sample sizes are generally small, inference problems require the use of nonstandard statistical methodology. Recently, a mixture distribution involving the multiplicative logistic normal distribution (and its skew-normal extension) was introduced in relation to QFASA to manage the problematic zeros. In this paper, we examine an alternative mixture distribution, namely, the recently proposed zero-inflated beta (ZIB) distribution. A potential advantage of using the ZIB distribution over the previously considered mixture models is that it does not require transformation of the data. To assess the usefulness of the ZIB distribution in QFASA inference problems, a simulation study is first carried out which compares the small sample properties of the maximum likelihood estimators of the means. The fit of the distributions is then examined using ‘pseudo-predators’ generated from a large real-life prey base. Finally, confidence intervals for the true diet based on the ZIB distribution are compared with earlier results through a simulation study and harbor seal data.
Journal: Journal of Applied Statistics
Pages: 985-992
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.769088
File-URL: http://hdl.handle.net/10.1080/02664763.2013.769088
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:985-992
Template-Type: ReDIF-Article 1.0
Author-Name: Yan Zhou
Author-X-Name-First: Yan
Author-X-Name-Last: Zhou
Author-Name: John Aston
Author-X-Name-First: John
Author-X-Name-Last: Aston
Author-Name: Adam Johansen
Author-X-Name-First: Adam
Author-X-Name-Last: Johansen
Title: Bayesian model comparison for compartmental models with applications in positron emission tomography
Abstract: We develop strategies for Bayesian modelling as well as model comparison, averaging and selection for compartmental models with particular emphasis on those that occur in the analysis of positron emission tomography (PET) data. Both modelling and computational issues are considered. Biophysically inspired informative priors are developed for the problem at hand, and by comparison with default vague priors it is shown that the proposed modelling is not overly sensitive to prior specification. It is also shown that an additive normal error structure does not describe measured PET data well, despite being very widely used, and that within a simple Bayesian framework simultaneous parameter estimation and model comparison can be performed with a more general noise model. The proposed approach is compared with standard techniques using both simulated and real data. In addition to good, robust estimation performance, the proposed technique provides, automatically, a characterisation of the uncertainty in the resulting estimates which can be considerable in applications such as PET.
Journal: Journal of Applied Statistics
Pages: 993-1016
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.772569
File-URL: http://hdl.handle.net/10.1080/02664763.2013.772569
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:993-1016
Template-Type: ReDIF-Article 1.0
Author-Name: Nagatomo Nakamura
Author-X-Name-First: Nagatomo
Author-X-Name-Last: Nakamura
Author-Name: Takahiro Tsuchiya
Author-X-Name-First: Takahiro
Author-X-Name-Last: Tsuchiya
Title: A model of regression lines through a common point: estimation of the focal point in wind-blown sand phenomena
Abstract: This paper discusses a model in which the regression lines pass through a common point. Such a point exists as a focal point in wind-blown sand phenomena, so the model is called ‘the focal point regression model’. The focal point moves according to the conditions of the experiments or the measurement site, so it must be estimated together with the regression coefficients. The existence of the focal point has been mathematically proved in the research field of coastal engineering, but its physical meaning and an exact estimation method have not been established. Considering the experimental and/or measurement conditions, five models are proposed: common or different error variance(s), passing or not passing through the centroid, and a Bayes-like approach. Moreover, formulae for the direct computation of a focal point under some conditions are given for engineering purposes. The models are applied to wind-blown sand data, and the behaviors of the models are verified by numerical experiments.
Journal: Journal of Applied Statistics
Pages: 1017-1031
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.772570
File-URL: http://hdl.handle.net/10.1080/02664763.2013.772570
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1017-1031
Template-Type: ReDIF-Article 1.0
Author-Name: Antonio Gonçalves
Author-X-Name-First: Antonio
Author-X-Name-Last: Gonçalves
Author-Name: Renan Almeida
Author-X-Name-First: Renan
Author-X-Name-Last: Almeida
Author-Name: Marcos Lins
Author-X-Name-First: Marcos
Author-X-Name-Last: Lins
Author-Name: Carlos Samanez
Author-X-Name-First: Carlos
Author-X-Name-Last: Samanez
Title: Canonical correlation analysis in the definition of weight restrictions for data envelopment analysis
Abstract: This work investigates the use of canonical correlation analysis (CCA) in the definition of weight restrictions for data envelopment analysis (DEA). With this purpose, CCA limits are introduced into Wong and Beasley's DEA model. An application of the method is made over data from hospitals in 27 Brazilian cities, producing as outputs average payment (average admission values) and percentage of hospital admissions according to disease groups (International Classification of Diseases, 9th Edition), and having as inputs mortality rates and average stay (length of stay after admission (days)). In this application, performance scores were calculated for both the (CCA) restricted and unrestricted DEA models. It can be concluded that the use of CCA-based weight limits for DEA models increases the consistency of the estimated DEA scores (more homogenous weights) and that these limits do not present mathematical infeasibility problems while avoiding the need for subjectively restricting weight variation in DEA.
Journal: Journal of Applied Statistics
Pages: 1032-1043
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.772571
File-URL: http://hdl.handle.net/10.1080/02664763.2013.772571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1032-1043
Template-Type: ReDIF-Article 1.0
Author-Name: David Zimmer
Author-X-Name-First: David
Author-X-Name-Last: Zimmer
Title: Intertemporal persistence in healthcare spending and utilization: the role of insurance
Abstract: This paper develops a dynamic panel model of healthcare demand, with particular emphasis on the relationship between insurance and intertemporal persistence in demand. The model combines flexible marginal distributions for single-period demand with a tight intertemporal dynamic structure. The dynamic component follows a first-order nonlinear Markov process, which is constructed using copula functions. The model considers different insurance plans, including private plans with gatekeepers, private plans without gatekeepers, and public plans. Results indicate that individuals who lack insurance exhibit higher intertemporal persistence in medical spending compared to those with insurance coverage.
Journal: Journal of Applied Statistics
Pages: 1044-1063
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780155
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780155
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1044-1063
Template-Type: ReDIF-Article 1.0
Author-Name: M. Islam
Author-X-Name-First: M.
Author-X-Name-Last: Islam
Author-Name: Abdulhamid Alzaid
Author-X-Name-First: Abdulhamid
Author-X-Name-Last: Alzaid
Author-Name: Rafiqul Chowdhury
Author-X-Name-First: Rafiqul
Author-X-Name-Last: Chowdhury
Author-Name: Khalaf Sultan
Author-X-Name-First: Khalaf
Author-X-Name-Last: Sultan
Title: A generalized bivariate Bernoulli model with covariate dependence
Abstract: Dependence in outcome variables may pose formidable difficulty in analyzing data in longitudinal studies. In the past, most of the studies made attempts to address this problem using the marginal models. However, using the marginal models alone, it is difficult to specify the measures of dependence in outcomes due to association between outcomes as well as between outcomes and explanatory variables. In this paper, a generalized approach is demonstrated using both the conditional and marginal models. This model uses link functions to test for dependence in outcome variables. The estimation and test procedures are illustrated with an application to the mobility index data from the Health and Retirement Survey and also simulations are performed for correlated binary data generated from the bivariate Bernoulli distributions. The results indicate the usefulness of the proposed method.
Journal: Journal of Applied Statistics
Pages: 1064-1075
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780156
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780156
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1064-1075
Template-Type: ReDIF-Article 1.0
Author-Name: Farzana Noor
Author-X-Name-First: Farzana
Author-X-Name-Last: Noor
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Title: Bayesian inference of the inverse Weibull mixture distribution using type-I censoring
Abstract: A large number of models have been derived from the two-parameter Weibull distribution, including the inverse Weibull (IW) model, which is found suitable for modeling complex failure data. In this paper, we present Bayesian inference for a mixture of two IW models. For this purpose, the Bayes estimates of the parameters of the mixture model, along with their posterior risks, are obtained under informative as well as non-informative priors. These estimates are derived for two cases: (a) when the shape parameter is known and (b) when all parameters are unknown. For the former case, Bayes estimates are obtained under three loss functions, while for the latter case only the squared error loss function is used. A simulation study is carried out to explore the numerical properties of the proposed Bayes estimators. A real-life data set is also analyzed for both cases, and the parameters obtained when the shape parameter is known are assessed through a hypothesis-testing procedure.
Journal: Journal of Applied Statistics
Pages: 1076-1089
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780157
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780157
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1076-1089
Template-Type: ReDIF-Article 1.0
Author-Name: Vahid Nassiri
Author-X-Name-First: Vahid
Author-X-Name-Last: Nassiri
Author-Name: Ignace Loris
Author-X-Name-First: Ignace
Author-X-Name-Last: Loris
Title: A generalized quantile regression model
Abstract: A new class of probability distributions, the so-called connected double truncated gamma distribution, is introduced. We show that using this class as the error distribution of a linear model leads to a generalized quantile regression model that combines desirable properties of both least-squares and quantile regression methods: robustness to outliers and a differentiable loss function.
Journal: Journal of Applied Statistics
Pages: 1090-1105
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780158
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780158
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1090-1105
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Danish
Author-X-Name-First: Muhammad
Author-X-Name-Last: Danish
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Title: Bayesian estimation for randomly censored generalized exponential distribution under asymmetric loss functions
Abstract: This paper deals with Bayesian estimation of the generalized exponential distribution in the proportional hazards model of random censorship under asymmetric loss functions. Since it is well known that continuous conjugate priors for the parameters of two-parameter lifetime distributions do not exist, we assume independent gamma priors for the scale and shape parameters. As closed-form expressions for the Bayes estimators cannot be obtained, we propose Tierney–Kadane's approximation and Gibbs sampling to approximate the Bayes estimates. A Monte Carlo simulation is carried out to observe the behavior of the proposed methods, and a real data analysis is performed for illustration. The Bayesian methods are compared with maximum likelihood, and the Bayes estimators are observed to perform better than the maximum-likelihood estimators in some cases.
Journal: Journal of Applied Statistics
Pages: 1106-1119
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780159
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780159
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1106-1119
Template-Type: ReDIF-Article 1.0
Author-Name: Michael McAssey
Author-X-Name-First: Michael
Author-X-Name-Last: McAssey
Title: An empirical goodness-of-fit test for multivariate distributions
Abstract: An empirical test is presented as a tool for assessing whether a specified multivariate probability model is suitable to describe the underlying distribution of a set of observations. This test is based on the premise that, given any probability distribution, the Mahalanobis distances corresponding to data generated from that distribution will likewise follow a distinct distribution that can be estimated well by means of a large sample. We demonstrate the effectiveness of the test for detecting departures from several multivariate distributions. We then apply the test to a real multivariate data set to confirm that it is consistent with a multivariate beta model.
Journal: Journal of Applied Statistics
Pages: 1120-1131
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780160
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780160
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1120-1131
Template-Type: ReDIF-Article 1.0
Author-Name: Jiong Luo
Author-X-Name-First: Jiong
Author-X-Name-Last: Luo
Author-Name: Zheng Su
Author-X-Name-First: Zheng
Author-X-Name-Last: Su
Title: A note on variance estimation in the Cox proportional hazards model
Abstract: The Cox proportional hazards model is widely used in clinical trials with time-to-event outcomes to compare an experimental treatment with the standard of care. At the design stage of a trial the number of events required to achieve a desired power needs to be determined, which is frequently based on estimating the variance of the maximum partial likelihood estimate of the regression parameter with a function of the number of events. Underestimating the variance at the design stage will lead to insufficiently powered studies, and overestimating the variance will lead to unnecessarily large trials. A simple approach to estimating the variance is introduced, which is compared with two widely adopted approaches in practice. Simulation results show that the proposed approach outperforms the standard ones and gives nearly unbiased estimates of the variance.
Journal: Journal of Applied Statistics
Pages: 1132-1139
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780161
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780161
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1132-1139
Template-Type: ReDIF-Article 1.0
Author-Name: Steven Hua
Author-X-Name-First: Steven
Author-X-Name-Last: Hua
Author-Name: D. Hawkins
Author-X-Name-First: D.
Author-X-Name-Last: Hawkins
Author-Name: Jihao Zhou
Author-X-Name-First: Jihao
Author-X-Name-Last: Zhou
Title: Statistical considerations in bioequivalence of two area under the concentration–time curves obtained from serial sampling data
Abstract: In this paper, we study the bioequivalence (BE) inference problem motivated by pharmacokinetic data that were collected using the serial sampling technique. In serial sampling designs, subjects are independently assigned to one of the two drugs; each subject can be sampled only once, and data are collected at K distinct timepoints from multiple subjects. We consider design and hypothesis testing for the parameter of interest: the area under the concentration–time curve (AUC). Decision rules in demonstrating BE were established using an equivalence test for either the ratio or logarithmic difference of two AUCs. The proposed t-test can deal with cases where two AUCs have unequal variances. To control for the type I error rate, the involved degrees-of-freedom were adjusted using Satterthwaite's approximation. A power formula was derived to allow the determination of necessary sample sizes. Simulation results show that, when the two AUCs have unequal variances, the type I error rate is better controlled by the proposed method compared with a method that only handles equal variances. We also propose an unequal subject allocation method that improves the power relative to that of the equal and symmetric allocation. The methods are illustrated using practical examples.
Journal: Journal of Applied Statistics
Pages: 1140-1154
Issue: 5
Volume: 40
Year: 2013
X-DOI: 10.1080/02664763.2013.780234
File-URL: http://hdl.handle.net/10.1080/02664763.2013.780234
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:40:y:2013:i:5:p:1140-1154
Template-Type: ReDIF-Article 1.0
Author-Name: Himel Mallick
Author-X-Name-First: Himel
Author-X-Name-Last: Mallick
Author-Name: Nengjun Yi
Author-X-Name-First: Nengjun
Author-X-Name-Last: Yi
Title: Bayesian bridge regression
Abstract:
Classical bridge regression is known to possess many desirable statistical properties such as oracle, sparsity, and unbiasedness. One outstanding disadvantage of bridge regularization, however, is that it lacks a systematic approach to inference, reducing its flexibility in practical applications. In this study, we propose bridge regression from a Bayesian perspective. Unlike classical bridge regression that summarizes inference using a single point estimate, the proposed Bayesian method provides uncertainty estimates of the regression parameters, allowing coherent inference through the posterior distribution. Under a sparsity assumption on the high-dimensional parameter, we provide sufficient conditions for strong posterior consistency of the Bayesian bridge prior. On simulated datasets, we show that the proposed method performs well compared to several competing methods across a wide range of scenarios. Application to two real datasets further revealed that the proposed method performs as well as or better than published methods while offering the advantage of posterior inference.
Journal: Journal of Applied Statistics
Pages: 988-1008
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1324565
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1324565
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:988-1008
Template-Type: ReDIF-Article 1.0
Author-Name: Emmanuel O. Ogundimu
Author-X-Name-First: Emmanuel O.
Author-X-Name-Last: Ogundimu
Author-Name: Gary S. Collins
Author-X-Name-First: Gary S.
Author-X-Name-Last: Collins
Title: Predictive performance of penalized beta regression model for continuous bounded outcomes
Abstract:
Prediction models for continuous bounded outcomes are often developed by fitting ordinary least-squares regression. However, predicted values from such a method may lie outside the bounded range of the outcome, whose expectation is nonlinear due to the ceiling and floor effects of the bounds. Regular regression models, such as normal linear or nonlinear models, are therefore inadequate for predicting a bounded response variable, and distributions that can model different shapes are essential. Beta regression, apart from modeling different shapes and constraining predictions to an admissible range, has been shown to be superior to alternative methods for data fitting, but not for prediction purposes. We take data structures into account and compare various penalized beta regression methods on predictive accuracy for bounded outcome variables using optimism-corrected measures. Contrary to results obtained in many regression contexts, the classical maximum likelihood method produced good predictive accuracy in terms of $ R^{2} $ and RMSE. The ridge penalized beta regression performed better in terms of the g-index, a measure of the performance of the methods in external data sets. We restricted attention to prespecified models throughout, and as such variable selection methods are not evaluated.
Journal: Journal of Applied Statistics
Pages: 1030-1040
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1339024
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1339024
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1030-1040
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel Müllensiefen
Author-X-Name-First: Daniel
Author-X-Name-Last: Müllensiefen
Author-Name: Christian Hennig
Author-X-Name-First: Christian
Author-X-Name-Last: Hennig
Author-Name: Hedie Howells
Author-X-Name-First: Hedie
Author-X-Name-Last: Howells
Title: Using clustering of rankings to explain brand preferences with personality and socio-demographic variables
Abstract:
The primary aim of market segmentation is to identify relevant groups of consumers that can be addressed efficiently by marketing or advertising campaigns. This paper addresses whether consumer groups can be identified from background variables that are not brand-related, and how much personality versus socio-demographic variables contribute to the identification of consumer clusters. This is done by clustering aggregated preferences for 25 brands across 5 product categories, and by relating socio-demographic and personality variables to the clusters using logistic regression and random forests over a range of numbers of clusters. Results indicate that some personality variables contribute significantly to the identification of consumer groups in one sample. However, these results were not replicated in a second sample that was more heterogeneous in terms of socio-demographic characteristics and not representative of the brands' target audience.
Journal: Journal of Applied Statistics
Pages: 1009-1029
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1339025
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1339025
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1009-1029
Template-Type: ReDIF-Article 1.0
Author-Name: Nurkhairany Amyra Mokhtar
Author-X-Name-First: Nurkhairany Amyra
Author-X-Name-Last: Mokhtar
Author-Name: Yong Zulina Zubairi
Author-X-Name-First: Yong Zulina
Author-X-Name-Last: Zubairi
Author-Name: Abdul Ghapor Hussin
Author-X-Name-First: Abdul Ghapor
Author-X-Name-Last: Hussin
Title: A clustering approach to detect multiple outliers in linear functional relationship model for circular data
Abstract:
Outlier detection has been used extensively in data analysis to identify anomalous observations. It has important applications, such as fraud detection and robust analysis. In this paper, we propose a method for detecting multiple outliers in a linear functional relationship model for circular variables. Using the residual values of the Caires and Wyatt model, we apply a hierarchical clustering approach and, with the use of a tree diagram, illustrate the detection of outliers graphically. A Monte Carlo simulation study is carried out to verify the accuracy of the proposed method. Low probabilities of masking and swamping effects indicate the validity of the proposed approach. Illustrations with two real data sets are also given to show its practical applicability.
Journal: Journal of Applied Statistics
Pages: 1041-1051
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1342779
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1342779
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1041-1051
Template-Type: ReDIF-Article 1.0
Author-Name: Adam L. Smith
Author-X-Name-First: Adam L.
Author-X-Name-Last: Smith
Author-Name: Sofía S. Villar
Author-X-Name-First: Sofía S.
Author-X-Name-Last: Villar
Title: Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints
Abstract:
Adaptive designs for multi-armed clinical trials have become increasingly popular recently because of their potential to shorten development times and to increase patient response. However, developing response-adaptive designs that offer patient benefit while ensuring that the resulting trial provides a statistically rigorous and unbiased comparison of the different treatments is highly challenging. In this paper, the theory of multi-armed bandit problems is used to define near-optimal adaptive designs in the context of a clinical trial with a normally distributed endpoint with known variance. We report the operating characteristics (type I error, power, bias) and patient benefit of these approaches and of alternative designs using simulation studies based on an ongoing trial. These results are then compared to those recently published in the context of Bernoulli endpoints. Many limitations and advantages are similar in both cases, but there are also important differences, especially with respect to type I error control. This paper proposes a simulation-based testing procedure to correct for the observed type I error inflation that bandit-based and adaptive rules can induce.
Journal: Journal of Applied Statistics
Pages: 1052-1076
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1342780
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1342780
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1052-1076
Template-Type: ReDIF-Article 1.0
Author-Name: Yingzhen Chen
Author-X-Name-First: Yingzhen
Author-X-Name-Last: Chen
Author-Name: Xuejun Ma
Author-X-Name-First: Xuejun
Author-X-Name-Last: Ma
Author-Name: Jingke Zhou
Author-X-Name-First: Jingke
Author-X-Name-Last: Zhou
Title: Variable selection for mode regression
Abstract:
From the prediction viewpoint, mode regression is attractive since it pays attention to the most probable value of the response variable given the regressors. On the other hand, high-dimensional data have become prevalent with advances in the technology for collecting and storing data. Variable selection is an important strategy for dealing with high-dimensional regression problems. This paper proposes a variable selection procedure for high-dimensional mode regression, combining nonparametric kernel estimation with sparsity penalty tactics. We also establish the asymptotic properties under certain technical conditions. The effectiveness and flexibility of the proposed methods are further illustrated by numerical studies and a real data application.
Journal: Journal of Applied Statistics
Pages: 1077-1084
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1342781
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1342781
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1077-1084
Template-Type: ReDIF-Article 1.0
Author-Name: H. Haselimashhadi
Author-X-Name-First: H.
Author-X-Name-Last: Haselimashhadi
Author-Name: V. Vinciotti
Author-X-Name-First: V.
Author-X-Name-Last: Vinciotti
Author-Name: K. Yu
Author-X-Name-First: K.
Author-X-Name-Last: Yu
Title: A novel Bayesian regression model for counts with an application to health data
Abstract:
Discrete data are collected in many application areas and are often characterised by highly skewed distributions. An example of this, which is considered in this paper, is the number of visits to a specialist, often taken as a measure of demand in healthcare. A discrete Weibull regression model was recently proposed for regression problems with a discrete response and was shown to possess desirable properties. In this paper, we propose the first Bayesian implementation of this model. We consider a general parametrization, in which both parameters of the discrete Weibull distribution can be conditioned on the predictors, and show theoretically how, under a uniform non-informative prior, the posterior distribution is proper with finite moments. In addition, we consider closely the case of Laplace priors for parameter shrinkage and variable selection. Parameter estimates and their credible intervals can be readily calculated from the full posterior distribution. A simulation study and the analysis of four real datasets of medical records show promise for the wide applicability of this approach to the analysis of count data. The method is implemented in the R package BDWreg.
Journal: Journal of Applied Statistics
Pages: 1085-1105
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1342782
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1342782
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1085-1105
Template-Type: ReDIF-Article 1.0
Author-Name: Olusola Samuel Makinde
Author-X-Name-First: Olusola Samuel
Author-X-Name-Last: Makinde
Author-Name: Olusoga Akin Fasoranbaku
Author-X-Name-First: Olusoga Akin
Author-X-Name-Last: Fasoranbaku
Title: On maximum depth classifiers: depth distribution approach
Abstract:
In this paper, we consider notions of data depth for ordering multivariate data and propose a classification rule based on the distribution of some depth functions in $ \mathbb{R}^d $. The equivalence of the proposed classification rule to the optimal Bayes rule is discussed under suitable conditions. The performance of the proposed classification method is investigated in low- and high-dimensional settings using real datasets. Its performance is also illustrated in comparison to some other depth-based classifiers using simulated data sets.
Journal: Journal of Applied Statistics
Pages: 1106-1117
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1342783
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1342783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1106-1117
Template-Type: ReDIF-Article 1.0
Author-Name: Jisu Yoon
Author-X-Name-First: Jisu
Author-X-Name-Last: Yoon
Author-Name: Tatyana Krivobokova
Author-X-Name-First: Tatyana
Author-X-Name-Last: Krivobokova
Title: Treatments of non-metric variables in partial least squares and principal component analysis
Abstract:
This paper reviews various treatments of non-metric variables in partial least squares (PLS) and principal component analysis (PCA) algorithms. The performance of the different treatments is compared in an extensive simulation study under several typical data generating processes, and associated recommendations are made. We find that PLS-based methods are to be preferred in practice, since, independent of the data generating process, PLS performs either as well as PCA or significantly outperforms it. As an application of PLS and PCA algorithms with non-metric variables, we consider the construction of a wealth index to predict household expenditures. Consistent with our simulation study, we find that a PLS-based wealth index with dummy coding outperforms PCA-based ones.
Journal: Journal of Applied Statistics
Pages: 971-987
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1346065
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1346065
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:971-987
Template-Type: ReDIF-Article 1.0
Author-Name: Tarald O. Kvålseth
Author-X-Name-First: Tarald O.
Author-X-Name-Last: Kvålseth
Title: Measuring association between nominal categorical variables: an alternative to the Goodman–Kruskal lambda
Abstract:
As a measure of association between two nominal categorical variables, the lambda coefficient, or Goodman–Kruskal's lambda, has become a most popular measure. Its popularity is primarily due to its simple and meaningful definition and interpretation in terms of the proportional reduction in error when predicting a random observation's category for one variable given (versus not knowing) its category for the other variable. It is an asymmetric measure, although a symmetric version is available. The lambda coefficient does, however, have a widely recognized limitation: it can equal zero even when the variables are not independent and all other measures take on positive values. To mitigate this problem, an alternative lambda coefficient is introduced in this paper as a slight modification of the Goodman–Kruskal lambda. The properties of the new measure are discussed and a symmetric form is introduced. A statistical inference procedure is developed and a numerical example is provided.
Journal: Journal of Applied Statistics
Pages: 1118-1132
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1346066
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1346066
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1118-1132
Template-Type: ReDIF-Article 1.0
Author-Name: Hugo Lewi Hammer
Author-X-Name-First: Hugo Lewi
Author-X-Name-Last: Hammer
Title: Statistical models for short- and long-term forecasts of snow depth
Abstract:
Forecasts of future snow depths are useful for many applications, such as road safety, winter sport activities, avalanche risk assessment and hydrology. Motivated by the lack of statistical forecast models for snow depth, in this paper we present a set of models to fill this gap. First, we present a model for short-term forecasts, assuming that reliable weather forecasts of air temperature and precipitation are available. The covariates enter the model nonlinearly, following basic physical principles of snowfall, snow aging and melting. Because of the large number of observations with snow depth equal to zero, we use a zero-inflated gamma regression model, which is commonly applied in similar settings such as precipitation modeling. We also produce long-term forecasts of snow depth, extending much further into the future than traditional weather forecasts of temperature and precipitation. The long-term forecasts are based on fitting models to historic time series of precipitation, temperature and snow depth. We fit the models to data from six locations in Norway with different climatic and vegetation properties. Forecasting five days into the future, the results show that, given reliable weather forecasts of temperature and precipitation, the forecast errors in absolute value were between 3 and 7 cm across the locations. Forecasting three weeks into the future, the forecast errors were between 7 and 16 cm.
Journal: Journal of Applied Statistics
Pages: 1133-1156
Issue: 6
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1357683
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1357683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:6:p:1133-1156
Template-Type: ReDIF-Article 1.0
Author-Name: Mojtaba Alizadeh
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Alizadeh
Author-Name: Seyyed Fazel Bagheri
Author-X-Name-First: Seyyed
Author-X-Name-Last: Fazel Bagheri
Author-Name: Mohammad Alizadeh
Author-X-Name-First: Mohammad
Author-X-Name-Last: Alizadeh
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Title: A new four-parameter lifetime distribution
Abstract:
Generalizing lifetime distributions is always valuable for applied statisticians. In this paper, we introduce a new four-parameter generalization of the exponentiated power Lindley (EPL) distribution, called the exponentiated power Lindley geometric (EPLG) distribution, obtained by compounding the EPL and geometric distributions. The new distribution arises in a latent complementary risks scenario, in which the lifetime associated with a particular risk is not observable; rather, we observe only the maximum lifetime value among all risks. The distribution exhibits decreasing, increasing, unimodal and bathtub-shaped hazard rate functions, depending on its parameters. It contains several lifetime distributions as particular cases: the EPL, new generalized Lindley, generalized Lindley, power Lindley and Lindley geometric distributions. We derive several properties of the new distribution, such as closed-form expressions for the density, cumulative distribution function, survival function, hazard rate function, the rth raw moment, and the moments of order statistics. Moreover, we discuss maximum likelihood estimation and provide formulas for the elements of the Fisher information matrix. Simulation studies are also provided. Finally, two real data applications are given to show the flexibility and potential of the new distribution.
Journal: Journal of Applied Statistics
Pages: 767-797
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1182137
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182137
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:767-797
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco J. Rubio
Author-X-Name-First: Francisco J.
Author-X-Name-Last: Rubio
Author-Name: Keming Yu
Author-X-Name-First: Keming
Author-X-Name-Last: Yu
Title: Flexible objective Bayesian linear regression with applications in survival analysis
Abstract:
We study objective Bayesian inference for linear regression models with residual errors distributed according to the class of two-piece scale mixtures of normal distributions. These models allow for capturing departures from the usual assumption of normality of the errors in terms of heavy tails, asymmetry, and certain types of heteroscedasticity. We propose a general non-informative, scale-invariant, prior structure and provide sufficient conditions for the propriety of the posterior distribution of the model parameters, which cover cases when the response variables are censored. These results allow us to apply the proposed models in the context of survival analysis. This paper represents an extension to the Bayesian framework of the models proposed in [16]. We present a simulation study that shows good frequentist properties of the posterior credible intervals as well as of the point estimators associated with the proposed priors. We illustrate the performance of these models with real data in the context of survival analysis of cancer patients.
Journal: Journal of Applied Statistics
Pages: 798-810
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1182138
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1182138
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:798-810
Template-Type: ReDIF-Article 1.0
Author-Name: Kyeongjun Lee
Author-X-Name-First: Kyeongjun
Author-X-Name-Last: Lee
Author-Name: Youngseuk Cho
Author-X-Name-First: Youngseuk
Author-X-Name-Last: Cho
Title: Bayesian and maximum likelihood estimations of the inverted exponentiated half logistic distribution under progressive Type II censoring
Abstract:
In this paper, the estimation of the parameters, reliability and hazard functions of an inverted exponentiated half logistic distribution (IEHLD) from progressive Type II censored data is considered. The Bayes estimates for the progressive Type II censored IEHLD under asymmetric and symmetric loss functions, such as the squared error, general entropy and linex loss functions, are provided. The Bayes estimates of the parameters, reliability and hazard functions are also obtained under balanced loss functions. Since the Bayes estimates cannot be obtained explicitly, the Lindley approximation method and an importance sampling procedure are used to approximate them. Furthermore, the asymptotic normality of the maximum likelihood estimates is used to obtain approximate confidence intervals. The highest posterior density credible intervals of the parameters, based on the importance sampling procedure, are computed. Simulations are performed to assess the performance of the proposed estimates. For illustrative purposes, two data sets are analyzed.
Journal: Journal of Applied Statistics
Pages: 811-832
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1183602
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1183602
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:811-832
Template-Type: ReDIF-Article 1.0
Author-Name: Nels Johnson
Author-X-Name-First: Nels
Author-X-Name-Last: Johnson
Author-Name: Inyoung Kim
Author-X-Name-First: Inyoung
Author-X-Name-Last: Kim
Title: Generalized linear models with covariate measurement error and unknown link function
Abstract:
Generalized linear models (GLMs) with error-in-covariates are useful in epidemiological research due to the ubiquity of non-normal response variables and inaccurate measurements. The link function in GLMs is chosen by the user depending on the type of response variable, frequently the canonical link function. When covariates are measured with error, incorrect inference can be made, compounded by an incorrect choice of link function. In this article we propose three flexible approaches for handling error-in-covariates and estimating an unknown link function simultaneously. The first approach uses a fully Bayesian (FB) hierarchical framework, treating the unobserved covariate as a latent variable to be integrated over. The second and third are approximate Bayesian approaches which use a Laplace approximation to marginalize the variables measured with error out of the likelihood. Our simulation results suggest that the FB approach is often a better choice than the approximate Bayesian approaches for adjusting for measurement error, particularly when the measurement error distribution is misspecified. These approaches are demonstrated on an application with a binary response.
Journal: Journal of Applied Statistics
Pages: 833-852
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1183603
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1183603
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:833-852
Template-Type: ReDIF-Article 1.0
Author-Name: Taha Alshaybawee
Author-X-Name-First: Taha
Author-X-Name-Last: Alshaybawee
Author-Name: Habshah Midi
Author-X-Name-First: Habshah
Author-X-Name-Last: Midi
Author-Name: Rahim Alhamzawi
Author-X-Name-First: Rahim
Author-X-Name-Last: Alhamzawi
Title: Bayesian elastic net single index quantile regression
Abstract:
Single index conditional quantile regression is proposed in order to overcome the dimensionality problem in nonparametric quantile regression. In the proposed method, the Bayesian elastic net is suggested for single index quantile regression estimation and variable selection. A Gaussian process prior is placed on the unknown link function, and a Gibbs sampler algorithm is adopted for posterior inference. The results of the simulation studies and a numerical example indicate that our proposed method, BENSIQReg, offers substantial improvements over two existing methods, SIQReg and BSIQReg. BENSIQReg consistently shows good convergence and has the smallest median of the mean absolute deviations and the smallest standard deviations compared to the other two methods.
Journal: Journal of Applied Statistics
Pages: 853-871
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189515
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:853-871
Template-Type: ReDIF-Article 1.0
Author-Name: Mian Arif Shams Adnan
Author-X-Name-First: Mian Arif Shams
Author-X-Name-Last: Adnan
Author-Name: Shongkour Roy
Author-X-Name-First: Shongkour
Author-X-Name-Last: Roy
Title: A sequential discrimination procedure for two almost identically shaped wrapped distributions
Abstract:
Investigating a distribution through its known properties can be inadequate when the shapes of two candidate distributions are almost identical. In such circumstances, correctly deciding from which of the two parent distributions a random sample originated is highly ambiguous, even with the existing testing procedures for circular data. A sequential discrimination procedure is therefore suggested which is also invariant to the sample size. The performance of the proposed discrimination procedure has been evaluated by checking its capability of detecting the genesis of known samples from two identically shaped wrapped distributions.
Journal: Journal of Applied Statistics
Pages: 872-881
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189516
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:872-881
Template-Type: ReDIF-Article 1.0
Author-Name: Douglas Moura Miranda
Author-X-Name-First: Douglas Moura
Author-X-Name-Last: Miranda
Author-Name: Samuel Vieira Conceição
Author-X-Name-First: Samuel Vieira
Author-X-Name-Last: Conceição
Title: A practical method to calculate probabilities: illustrative example from the electronic industry business
Abstract:
The real-life environment is probabilistic by nature, and the ability to make decisions based on probabilities is crucial in the business world. It is common to have a data set and the need to calculate the probability of taking a value greater or less than a specific value. It is also common for companies to lack statistical software or a specialized professional in statistics. The purpose of this paper is to present a practical and simple method to calculate probabilities from normally or non-normally distributed data sets and to illustrate it with an application from the electronics industry. The method does not demand statistical knowledge from the user; there is no need for normality assumptions, goodness-of-fit tests or transformations. The proposed method is easy to implement and robust, and the experiments have evidenced its quality. The technique is validated on a large variety of instances and compared with the well-known Johnson system of distributions.
Journal: Journal of Applied Statistics
Pages: 882-896
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189517
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:882-896
Template-Type: ReDIF-Article 1.0
Author-Name: Rabindra Nath Das
Author-X-Name-First: Rabindra
Author-X-Name-Last: Nath Das
Author-Name: Anis Chandra Mukhopadhyay
Author-X-Name-First: Anis Chandra
Author-X-Name-Last: Mukhopadhyay
Title: Correlated random effects regression analysis for a log-normally distributed variable
Abstract:
In regression analysis, it is assumed that the response (or dependent variable) distribution is Normal and that the errors are homoscedastic and uncorrelated. In practice, however, these assumptions are rarely satisfied by a real data set. To stabilize a heteroscedastic response variance, a log-transformation is generally suggested; consequently, the response variable distribution comes closer to the Normal distribution, and the model fit of the data is improved. In practice, a seemingly suitable transformation may not always stabilize the variance, and the response distribution may not reduce to the Normal distribution. The present article assumes that the response distribution is log-normal with compound autocorrelated errors. Under these conditions, estimation and testing of hypotheses regarding the regression parameters are derived. From a set of reduced data, we derive the best linear unbiased estimators of all the regression coefficients, except the intercept, which is often unimportant in practice. Unknown correlation parameters are estimated. In this connection, we derive a test rule for testing any set of linear hypotheses on the unknown regression coefficients. In addition, we develop confidence ellipsoids for any set of estimable functions of the regression coefficients. For the fitted regression equation, an index of fit is proposed. A simulation study illustrates the results derived in this report.
Journal: Journal of Applied Statistics
Pages: 897-915
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189518
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189518
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:897-915
Template-Type: ReDIF-Article 1.0
Author-Name: Sukhdev Singh
Author-X-Name-First: Sukhdev
Author-X-Name-Last: Singh
Author-Name: Yogesh Mani Tripathi
Author-X-Name-First: Yogesh
Author-X-Name-Last: Mani Tripathi
Author-Name: Shuo-Jye Wu
Author-X-Name-First: Shuo-Jye
Author-X-Name-Last: Wu
Title: Bayesian estimation and prediction based on lognormal record values
Abstract:
In this paper we consider the problems of estimation and prediction when observed data from a lognormal distribution are based on lower record values and lower record values with inter-record times. We compute maximum likelihood estimates and asymptotic confidence intervals for the model parameters. We also obtain Bayes estimates and the highest posterior density (HPD) intervals using noninformative and informative priors under squared error and LINEX loss functions. Furthermore, for the problem of Bayesian prediction under the one-sample and two-sample frameworks, we obtain predictive estimates and the associated predictive equal-tail and HPD intervals. Finally, for illustration purposes, a real data set is analyzed and a simulation study is conducted to compare the methods of estimation and prediction.
Journal: Journal of Applied Statistics
Pages: 916-940
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189520
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189520
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:916-940
Template-Type: ReDIF-Article 1.0
Author-Name: Xu Guo
Author-X-Name-First: Xu
Author-X-Name-Last: Guo
Author-Name: Hecheng Wu
Author-X-Name-First: Hecheng
Author-X-Name-Last: Wu
Author-Name: Gaorong Li
Author-X-Name-First: Gaorong
Author-X-Name-Last: Li
Author-Name: Qiuyue Li
Author-X-Name-First: Qiuyue
Author-X-Name-Last: Li
Title: Inference for the common mean of several Birnbaum–Saunders populations
Abstract:
The Birnbaum–Saunders distribution is widely used in reliability applications to model failure times. For several samples from possibly different Birnbaum–Saunders distributions, if their means can be considered the same, it is of importance to make inference for the common mean. This paper presents procedures for interval estimation and hypothesis testing for the common mean of several Birnbaum–Saunders populations. The proposed approaches are hybrids of the generalized inference method and large sample theory. Simulations are conducted to examine the performance of the proposed approaches, and the results indicate that they perform well. Finally, the proposed approaches are applied to a real example on the fatigue life of 6061-T6 aluminum coupons for illustration.
Journal: Journal of Applied Statistics
Pages: 941-954
Issue: 5
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189521
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189521
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:5:p:941-954
Template-Type: ReDIF-Article 1.0
Author-Name: Zahra Sadat Meshkani Farahani
Author-X-Name-First: Zahra Sadat
Author-X-Name-Last: Meshkani Farahani
Author-Name: Esmaile Khorram
Author-X-Name-First: Esmaile
Author-X-Name-Last: Khorram
Author-Name: Mojtaba Ganjali
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Ganjali
Author-Name: Taban Baghfalaki
Author-X-Name-First: Taban
Author-X-Name-Last: Baghfalaki
Title: Longitudinal data analysis in the presence of informative sampling: weighted distribution or joint modelling
Abstract:
Weighted distributions, as an example of informative sampling, work appropriately under the missing at random mechanism, since they neglect missing values and only completely observed subjects are used in the study plan. However, length-biased distributions, as a special case of weighted distributions, deliberately remove subjects with short lengths, which clearly corresponds to the missing not at random mechanism. Accordingly, applying length-biased distributions jeopardizes the results by producing biased estimates. Hence, an alternative method has to be used so that the results are improved by means of valid inferences. We propose methods based on weighted distributions and on a joint modelling procedure, and compare them in analysing longitudinal data. After introducing the three methods in use, a set of simulation studies and the analysis of two real longitudinal datasets affirm our claim.
Journal: Journal of Applied Statistics
Pages: 2111-2127
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1576599
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1576599
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2111-2127
Template-Type: ReDIF-Article 1.0
Author-Name: Minjung Lee
Author-X-Name-First: Minjung
Author-X-Name-Last: Lee
Title: Parametric inference for quantile event times with adjustment for covariates on competing risks data
Abstract:
We propose parametric inferences for quantile event times with adjustment for covariates on competing risks data. We develop parametric quantile inferences using parametric regression modeling of the cumulative incidence function, via both the cause-specific hazard and direct approaches. Maximum likelihood inferences are developed for estimation of the cumulative incidence function and quantiles, and we show how to construct parametric confidence intervals for the quantiles. Simulation studies show that the proposed methods perform well. We illustrate the methods using early stage breast cancer data.
Journal: Journal of Applied Statistics
Pages: 2128-2144
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1577370
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1577370
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2128-2144
Template-Type: ReDIF-Article 1.0
Author-Name: M. Shafiqur Rahman
Author-X-Name-First: M. Shafiqur
Author-X-Name-Last: Rahman
Author-Name: Afrin Sadia Rumana
Author-X-Name-First: Afrin Sadia
Author-X-Name-Last: Rumana
Title: A model-based concordance-type index for evaluating the added predictive ability of novel risk factors and markers in the logistic regression models
Abstract:
The Concordance statistic (C-statistic) is commonly used to assess the predictive performance (discriminatory ability) of logistic regression models. Although there are several approaches to the C-statistic, their performance in quantifying the improvement in predictive accuracy due to the inclusion of novel risk factors or biomarkers in the model has been strongly criticized in the literature. This paper proposes a model-based concordance-type index, CK, for use with logistic regression models. The CK and its asymptotic sampling distribution are derived following Gonen and Heller's approach for the Cox PH model for survival data, with the necessary modifications for use with binary data. Unlike the existing C-statistics for the logistic model, it quantifies the concordance probability by taking the difference in the predicted risks between two subjects in a pair rather than ranking them, and hence is able to quantify the equivalent incremental value from the new risk factor or marker. The simulation study revealed that the CK performs well when the model parameters are correctly estimated for large samples and shows greater improvement in quantifying the additional predictive value from the new risk factor or marker than the existing C-statistics. Furthermore, an illustration using three datasets supports the findings from the simulation study.
Journal: Journal of Applied Statistics
Pages: 2145-2163
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1580253
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1580253
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2145-2163
Template-Type: ReDIF-Article 1.0
Author-Name: Young H. Chun
Author-X-Name-First: Young H.
Author-X-Name-Last: Chun
Title: Generalized run tests for statistical process control
Abstract:
In a sequence of elements, a run is defined as a maximal subsequence of like elements. The number of runs or the length of the longest run has been widely used to test the randomness of an ordered sequence. Based on two different sampling methods and two types of test statistics used, run tests can be classified into one of four cases. Numerous researchers have derived the probability distributions in many different ways, treating each case separately. In the paper, we propose a unified approach which is based on recurrence arguments of two mutually exclusive sub-sequences. We also consider the sequence of nominal data that has more than two classes. Thus, the traditional run tests for a binary sequence are special cases of our generalized run tests. We finally show that the generalized run tests can be applied to many quality management areas, such as testing changes in process variation, developing non-parametric multivariate control charts, and comparing the shapes and locations of more than two process distributions.
Journal: Journal of Applied Statistics
Pages: 2164-2179
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1581147
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1581147
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2164-2179
Template-Type: ReDIF-Article 1.0
Author-Name: O. Theodosiadou
Author-X-Name-First: O.
Author-X-Name-Last: Theodosiadou
Author-Name: V. Polimenis
Author-X-Name-First: V.
Author-X-Name-Last: Polimenis
Author-Name: G. Tsaklidis
Author-X-Name-First: G.
Author-X-Name-Last: Tsaklidis
Title: A semi-parametric method for estimating the beta coefficients of the hidden two-sided asset return jumps
Abstract:
We introduce a new methodology for estimating the parameters of a two-sided jump model, which aims at decomposing the daily stock return evolution into (unobservable) positive and negative jumps as well as Brownian noise. The parameters of interest are the jump beta coefficients, which measure the influence of the market jumps on the stock returns and are latent components. For this purpose, we first use the Variance Gamma (VG) distribution, which is frequently used in modeling financial time series and leads to the revelation of the hidden market jumps' distributions. Our method is then based on the central moments of the stock returns for estimating the parameters of the model. It is proved that the proposed method always provides a solution in terms of the jump beta coefficients. We thus achieve a semi-parametric fit to the empirical data. The methodology itself serves as a criterion to test the fit of any set of parameters to the empirical returns. The analysis is applied to NASDAQ and Google returns during the 2006–2008 period.
Journal: Journal of Applied Statistics
Pages: 2180-2197
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1581734
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1581734
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2180-2197
Template-Type: ReDIF-Article 1.0
Author-Name: Lizbeth Naranjo
Author-X-Name-First: Lizbeth
Author-X-Name-Last: Naranjo
Author-Name: Carlos J. Pérez
Author-X-Name-First: Carlos J.
Author-X-Name-Last: Pérez
Author-Name: Jacinto Martín
Author-X-Name-First: Jacinto
Author-X-Name-Last: Martín
Author-Name: Timothy Mutsvari
Author-X-Name-First: Timothy
Author-X-Name-Last: Mutsvari
Author-Name: Emmanuel Lesaffre
Author-X-Name-First: Emmanuel
Author-X-Name-Last: Lesaffre
Title: A Bayesian approach for misclassified ordinal response data
Abstract:
Motivated by a longitudinal oral health study, the Signal-Tandmobiel® study, a Bayesian approach has been developed to model misclassified ordinal response data. Two regression models have been considered to incorporate misclassification in the categorical response. Specifically, probit and logit models have been developed. The computational difficulties have been avoided by using data augmentation. This idea is exploited to derive efficient Markov chain Monte Carlo methods. Although the method is proposed for ordered categories, it can also be implemented for unordered ones in a simple way. The model performance is shown through a simulation-based example and the analysis of the motivating study.
Journal: Journal of Applied Statistics
Pages: 2198-2215
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1582613
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1582613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2198-2215
Template-Type: ReDIF-Article 1.0
Author-Name: Cheng Ju
Author-X-Name-First: Cheng
Author-X-Name-Last: Ju
Author-Name: Mary Combs
Author-X-Name-First: Mary
Author-X-Name-Last: Combs
Author-Name: Samuel D. Lendle
Author-X-Name-First: Samuel D.
Author-X-Name-Last: Lendle
Author-Name: Jessica M. Franklin
Author-X-Name-First: Jessica M.
Author-X-Name-Last: Franklin
Author-Name: Richard Wyss
Author-X-Name-First: Richard
Author-X-Name-Last: Wyss
Author-Name: Sebastian Schneeweiss
Author-X-Name-First: Sebastian
Author-X-Name-Last: Schneeweiss
Author-Name: Mark J. van der Laan
Author-X-Name-First: Mark J.
Author-X-Name-Last: van der Laan
Title: Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods
Abstract:
The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a ‘library’ of candidate prediction models. While SL has been widely studied in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied and evaluated the performance of SL in its ability to predict the propensity score (PS), the conditional probability of treatment assignment given baseline covariates, using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also proposed a novel strategy for prediction modeling that combines SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.
Journal: Journal of Applied Statistics
Pages: 2216-2236
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1582614
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1582614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2216-2236
Template-Type: ReDIF-Article 1.0
Author-Name: Asma Nani
Author-X-Name-First: Asma
Author-X-Name-Last: Nani
Author-Name: Imed Gamoudi
Author-X-Name-First: Imed
Author-X-Name-Last: Gamoudi
Author-Name: Mohamed El Ghourabi
Author-X-Name-First: Mohamed
Author-X-Name-Last: El Ghourabi
Title: Value-at-risk estimation by LS-SVR and FS-LS-SVR based on GAS model
Abstract:
Conditional risk measuring plays an important role in financial regulation and depends on volatility estimation. A new class of parametric models, the Generalized Autoregressive Score (GAS) models, has been successfully applied for different error densities and for different time series prediction problems, in particular for volatility modeling and VaR estimation. To improve the estimation accuracy of the GAS model, this study proposes semi-parametric methods, LS-SVR and FS-LS-SVR, applied to the GAS model to estimate the conditional VaR. In particular, we fit the GAS(1,1) model to the return series using three different distributions; LS-SVR and FS-LS-SVR then approximate the GAS(1,1) model. An empirical study was performed to illustrate the effectiveness of the proposed method. More precisely, the experimental results from four stock index return series suggest that the hybrid models GAS-LS-SVR and GAS-FS-LS-SVR provide improved performance in VaR estimation.
Journal: Journal of Applied Statistics
Pages: 2237-2253
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1584161
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1584161
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2237-2253
Template-Type: ReDIF-Article 1.0
Author-Name: Yonggang Lu
Author-X-Name-First: Yonggang
Author-X-Name-Last: Lu
Author-Name: Peter Westfall
Author-X-Name-First: Peter
Author-X-Name-Last: Westfall
Title: Simple and flexible Bayesian inferences for standardized regression coefficients
Abstract:
In statistical practice, inferences on standardized regression coefficients are often required, but complicated by the fact that they are nonlinear functions of the parameters, and thus standard textbook results are simply wrong. Within the frequentist domain, asymptotic delta methods can be used to construct confidence intervals of the standardized coefficients with proper coverage probabilities. Alternatively, Bayesian methods solve similar and other inferential problems by simulating data from the posterior distribution of the coefficients. In this paper, we present Bayesian procedures that provide comprehensive solutions for inferences on the standardized coefficients. Simple computing algorithms are developed to generate posterior samples with no autocorrelation and based on both noninformative improper and informative proper prior distributions. Simulation studies show that Bayesian credible intervals constructed by our approaches have comparable and even better statistical properties than their frequentist counterparts, particularly in the presence of collinearity. In addition, our approaches solve some meaningful inferential problems that are difficult if not impossible from the frequentist standpoint, including identifying joint rankings of multiple standardized coefficients and making optimal decisions concerning their sizes and comparisons. We illustrate applications of our approaches through examples and make sample R functions available for implementing our proposed methods.
Journal: Journal of Applied Statistics
Pages: 2254-2288
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1584609
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1584609
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2254-2288
Template-Type: ReDIF-Article 1.0
Author-Name: Nikolay Miller
Author-X-Name-First: Nikolay
Author-X-Name-Last: Miller
Author-Name: Yiming Yang
Author-X-Name-First: Yiming
Author-X-Name-Last: Yang
Author-Name: Bruce Sun
Author-X-Name-First: Bruce
Author-X-Name-Last: Sun
Author-Name: Guoyi Zhang
Author-X-Name-First: Guoyi
Author-X-Name-Last: Zhang
Title: Identification of technical analysis patterns with smoothing splines for bitcoin prices
Abstract:
This research studies an automatic price pattern search procedure for the bitcoin cryptocurrency based on 1-min price data. To achieve this, a search algorithm is proposed based on the nonparametric regression method of smoothing splines. We investigate some well-known technical analysis patterns and construct an algorithmic trading strategy to evaluate the effectiveness of the patterns. We found that the method of smoothing splines is effective for identifying the technical analysis patterns and that strategies based on certain technical analysis patterns yield returns that significantly exceed the results of unconditional trading strategies.
Journal: Journal of Applied Statistics
Pages: 2289-2297
Issue: 12
Volume: 46
Year: 2019
Month: 9
X-DOI: 10.1080/02664763.2019.1580251
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1580251
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:12:p:2289-2297
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaoqi Jiang
Author-X-Name-First: Xiaoqi
Author-X-Name-Last: Jiang
Author-Name: Steven Wink
Author-X-Name-First: Steven
Author-X-Name-Last: Wink
Author-Name: Bob van de Water
Author-X-Name-First: Bob
Author-X-Name-Last: van de Water
Author-Name: Annette Kopp-Schneider
Author-X-Name-First: Annette
Author-X-Name-Last: Kopp-Schneider
Title: Functional analysis of high-content high-throughput imaging data
Abstract:
High-content automated imaging platforms allow the multiplexing of several targets simultaneously to generate multi-parametric single-cell data sets over extended periods of time. Typically, standard simple measures such as mean value of all cells at every time point are calculated to summarize the temporal process, resulting in loss of time dynamics of the single cells. Multiple experiments are performed but observation time points are not necessarily identical, leading to difficulties when integrating summary measures from different experiments. We used functional data analysis to analyze continuous curve data, where the temporal process of a response variable for each single cell can be described using a smooth curve. This allows analyses to be performed on continuous functions, rather than on original discrete data points. Functional regression models were applied to determine common temporal characteristics of a set of single cell curves and random effects were employed in the models to explain variation between experiments. The aim of the multiplexing approach is to simultaneously analyze the effect of a large number of compounds in comparison to control to discriminate between their mode of action. Functional principal component analysis based on T-statistic curves for pairwise comparison to control was used to study time-dependent compound effects.
Journal: Journal of Applied Statistics
Pages: 1903-1919
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238048
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238048
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:1903-1919
Template-Type: ReDIF-Article 1.0
Author-Name: G. Inan
Author-X-Name-First: G.
Author-X-Name-Last: Inan
Author-Name: R. Yucel
Author-X-Name-First: R.
Author-X-Name-Last: Yucel
Title: Joint GEEs for multivariate correlated data with incomplete binary outcomes
Abstract:
This study considers a fully-parametric but uncongenial multiple imputation (MI) inference to jointly analyze incomplete binary response variables observed in correlated data settings. The multiple imputation model is specified as a fully-parametric model based on a multivariate extension of mixed-effects models. Dichotomized imputed datasets are then analyzed using joint GEE models, where covariates are associated with the marginal means of the responses through response-specific regression coefficients, and a Kronecker product accommodates the cluster-specific correlation structure for a given response variable and the correlation structure between multiple response variables. The validity of the proposed MI-based JGEE (MI-JGEE) approach is assessed through a Monte Carlo simulation study under different scenarios. The simulation results, which are evaluated in terms of bias, mean-squared error, and coverage rate, show that MI-JGEE has promising inferential properties even when the underlying multiple imputation model is misspecified. Finally, Adolescent Alcohol Prevention Trial data are used for illustration.
Journal: Journal of Applied Statistics
Pages: 1920-1937
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238049
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238049
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:1920-1937
Template-Type: ReDIF-Article 1.0
Author-Name: Wei Wang
Author-X-Name-First: Wei
Author-X-Name-Last: Wang
Title: Checking identifiability of covariance parameters in linear mixed effects models
Abstract:
To build a linear mixed effects model, one needs to specify the random effects and often the associated parametrized covariance matrix structure. Inappropriate specification of these structures can result in the covariance parameters of the model not being identifiable. Non-identifiability can produce extraordinarily wide confidence intervals and unreliable parameter inference. Software output sometimes, but not always, signals model non-identifiability; in our simulations fitting non-identifiable models, the output did not look abnormal about half of the time. We derive necessary and sufficient conditions for covariance parameter identifiability that do not require any prior model fitting. The results are easy to implement and are applicable to commonly used covariance matrix structures.
Journal: Journal of Applied Statistics
Pages: 1938-1946
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238050
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238050
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:1938-1946
Template-Type: ReDIF-Article 1.0
Author-Name: Rahim Alhamzawi
Author-X-Name-First: Rahim
Author-X-Name-Last: Alhamzawi
Title: Inference with three-level prior distributions in quantile regression problems
Abstract:
In this paper, we propose a three-level hierarchical Bayesian model for variable selection and estimation in quantile regression problems. Specifically, at the first level we consider zero-mean normal priors for the coefficients with unknown variance parameters. At the second level, we specify two different priors for the unknown variance parameters, which introduce two different models producing different levels of sparsity. Then, at the third level we suggest joint improper priors for the unknown hyperparameters, assuming they are independent. Simulations and the Boston Housing data are utilized to compare the performance of our models with six existing models. The results indicate that our models perform well in both the simulations and the Boston Housing data.
Journal: Journal of Applied Statistics
Pages: 1947-1959
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238051
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238051
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:1947-1959
Template-Type: ReDIF-Article 1.0
Author-Name: Hongmei Lin
Author-X-Name-First: Hongmei
Author-X-Name-Last: Lin
Author-Name: Wenchao Xu
Author-X-Name-First: Wenchao
Author-X-Name-Last: Xu
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Author-Name: Jianhong Shi
Author-X-Name-First: Jianhong
Author-X-Name-Last: Shi
Author-Name: Yuedong Wang
Author-X-Name-First: Yuedong
Author-X-Name-Last: Wang
Title: Multiple-index varying-coefficient models for longitudinal data
Abstract:
In haemodialysis patients, vascular access type is of paramount importance. Although recent studies have found that central venous catheter use is often associated with poor outcomes and that switching to an arteriovenous fistula is beneficial, studies have not fully elucidated how the effect of switching access on outcomes changes over time for patients on dialysis, or whether the effect depends on the switching time. In this paper, we characterise the effect of switching access type on outcomes for haemodialysis patients. This is achieved by using a new class of multiple-index varying-coefficient (MIVC) models. We develop a new estimation procedure for MIVC models based on local linear smoothing, the profile least-squares method and the Cholesky decomposition. Monte Carlo simulation studies show excellent finite-sample performance. Finally, we analyse the dialysis data using our method.
Journal: Journal of Applied Statistics
Pages: 1960-1978
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238052
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:1960-1978
Template-Type: ReDIF-Article 1.0
Author-Name: Alan T. K. Wan
Author-X-Name-First: Alan T. K.
Author-X-Name-Last: Wan
Author-Name: Shangyu Xie
Author-X-Name-First: Shangyu
Author-X-Name-Last: Xie
Author-Name: Yong Zhou
Author-X-Name-First: Yong
Author-X-Name-Last: Zhou
Title: A varying coefficient approach to estimating hedonic housing price functions and their quantiles
Abstract:
The varying coefficient (VC) model introduced by Hastie and Tibshirani [26] is arguably one of the most remarkable recent developments in nonparametric regression theory. The VC model is an extension of the ordinary regression model where the coefficients are allowed to vary as smooth functions of an effect modifier possibly different from the regressors. The VC model reduces the modelling bias with its unique structure while also avoiding the ‘curse of dimensionality’ problem. While the VC model has been applied widely in a variety of disciplines, its application in economics has been minimal. The central goal of this paper is to apply VC modelling to the estimation of a hedonic house price function using data from Hong Kong, one of the world's most buoyant real estate markets. We demonstrate the advantages of the VC approach over traditional parametric and semi-parametric regressions in the face of a large number of regressors. We further combine VC modelling with quantile regression to examine the heterogeneity of the marginal effects of attributes across the distribution of housing prices.
Journal: Journal of Applied Statistics
Pages: 1979-1999
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238053
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238053
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:1979-1999
Template-Type: ReDIF-Article 1.0
Author-Name: Guohai Zhou
Author-X-Name-First: Guohai
Author-X-Name-Last: Zhou
Author-Name: Lang Wu
Author-X-Name-First: Lang
Author-X-Name-Last: Wu
Author-Name: Rollin Brant
Author-X-Name-First: Rollin
Author-X-Name-Last: Brant
Author-Name: J. Mark Ansermino
Author-X-Name-First: J. Mark
Author-X-Name-Last: Ansermino
Title: A likelihood-based approach for multivariate one-sided tests with missing data
Abstract:
Inequality-restricted hypothesis testing methods, including multivariate one-sided testing methods, are useful in practice, especially in multiple comparison problems. In practice, multivariate and longitudinal data often contain missing values, since it may be difficult to observe all values for each variable. However, although missing values are common for multivariate data, statistical methods for multivariate one-sided tests with missing values are quite limited. In this article, motivated by a dataset in a recent collaborative project, we develop two likelihood-based methods for multivariate one-sided tests with missing values, where the missing data patterns can be arbitrary and the missing data mechanisms may be non-ignorable. Although non-ignorable missingness is not testable based on observed data, statistical methods addressing this issue can be used for sensitivity analysis and might lead to more reliable results, since ignoring informative missingness may lead to biased analysis. We analyse the real dataset in detail under various possible missing data mechanisms and report interesting findings that were previously unavailable. We also derive some asymptotic results and evaluate our new tests using simulations.
Journal: Journal of Applied Statistics
Pages: 2000-2016
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238054
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238054
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:2000-2016
Template-Type: ReDIF-Article 1.0
Author-Name: Milan Stojković
Author-X-Name-First: Milan
Author-X-Name-Last: Stojković
Author-Name: Stevan Prohaska
Author-X-Name-First: Stevan
Author-X-Name-Last: Prohaska
Author-Name: Nikola Zlatanović
Author-X-Name-First: Nikola
Author-X-Name-Last: Zlatanović
Title: Estimation of flood frequencies from data sets with outliers using mixed distribution functions
Abstract:
In this paper, the estimation of high-return-period quantiles of the flood peak and volume in the Kolubara River basin is carried out. Flood frequencies are estimated from a data set containing high outliers, which are identified by Rosner's test; low outliers are simultaneously identified by the multiple Grubbs–Beck test. The next step involves applying mixed distribution functions to a data set drawn from three populations: floods with low outliers, normal floods and floods with high outliers. The contribution of the data set with low outliers is neglected, since it would underestimate the flood quantiles with large return periods. Finally, the best-fitting mixed distribution among the applied types (EV1, GEV, P3 and LP3) is determined using the minimum standard error of fit.
Journal: Journal of Applied Statistics
Pages: 2017-2035
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238055
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238055
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:2017-2035
Template-Type: ReDIF-Article 1.0
Author-Name: Lucy Kerns
Author-X-Name-First: Lucy
Author-X-Name-Last: Kerns
Author-Name: John T. Chen
Author-X-Name-First: John T.
Author-X-Name-Last: Chen
Title: Simultaneous confidence bands for restricted logistic regression models
Abstract:
The hyperbolic $1-\alpha$ confidence bands for one logistic regression model with restricted predictors have been considered in the statistical literature. At times, one wishes to construct simultaneous confidence bands for comparing several logistic regression models. It seems that Liu's book [Simultaneous Inference in Regression, Chapman & Hall, 2010, Chapter 8] is the only published work that has addressed this problem. Liu suggested simulation-based methods for constructing simultaneous confidence bands for comparing several logistic models, but further research was warranted to assess the conservativeness of the bands. In this paper, we propose a dimension-wise partitioning method to construct a set of simultaneous confidence bands for the comparisons of several logistic regression functions with a pre-specified function in a stepwise fashion. In addition, simulation studies cast new light on the assumption of predetermined testing order for the stepwise procedures presented in this paper and by Hsu and Berger [Stepwise confidence intervals without multiplicity adjustment for dose–response and toxicity studies, J. Amer. Statist. Assoc. 94 (1999), pp. 468–482]. As an illustration, we include an example on the success rate of thrombolysis associated with patient characteristics regarding post-thrombotic syndrome.
Journal: Journal of Applied Statistics
Pages: 2036-2051
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238056
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:2036-2051
Template-Type: ReDIF-Article 1.0
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Author-Name: Caio L. N. Azevedo
Author-X-Name-First: Caio L. N.
Author-X-Name-Last: Azevedo
Author-Name: N. Balakrishnan
Author-X-Name-First: N.
Author-X-Name-Last: Balakrishnan
Title: Bayesian inference for sinh-normal/independent nonlinear regression models
Abstract:
Sinh-normal/independent distributions are a class of symmetric heavy-tailed distributions that include the sinh-normal distribution as a special case, which has been used extensively in Birnbaum–Saunders regression models. Here, we explore the use of Markov chain Monte Carlo methods to develop a Bayesian analysis of nonlinear regression models in which sinh-normal/independent distributions are assumed for the random error term, providing a robust alternative to the sinh-normal nonlinear regression model. Bayesian mechanisms for parameter estimation, residual analysis and influence diagnostics are then developed, which extend the results of Farias and Lemonte [Bayesian inference for the Birnbaum-Saunders nonlinear regression model, Stat. Methods Appl. 20 (2011), pp. 423-438], who used the sinh-normal/independent distributions with known scale parameter. Some special cases, based on the sinh-Student-t (sinh-St), sinh-slash (sinh-SL) and sinh-contaminated normal (sinh-CN) distributions, are discussed in detail. Two real datasets are finally analyzed to illustrate the developed procedures.
Journal: Journal of Applied Statistics
Pages: 2052-2074
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1238058
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1238058
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:2052-2074
Template-Type: ReDIF-Article 1.0
Author-Name: Wenjuan Liang
Author-X-Name-First: Wenjuan
Author-X-Name-Last: Liang
Author-Name: Xiaolong Pu
Author-X-Name-First: Xiaolong
Author-X-Name-Last: Pu
Author-Name: Dongdong Xiang
Author-X-Name-First: Dongdong
Author-X-Name-Last: Xiang
Title: A distribution-free multivariate CUSUM control chart using dynamic control limits
Abstract:
In modern quality control, rapidly evolving data-acquisition technology is making it common to monitor several quality characteristics of a process simultaneously. When the multivariate process distribution is unknown and only a set of in-control data is available, the bootstrap technique can be used to adjust the constant limit of the multivariate cumulative sum (MCUSUM) control chart. To further improve the performance of the control chart, we extend the constant control limit to a sequence of dynamic control limits determined by the conditional distribution of the charting statistics given the sprint length. Simulation results show that the novel control chart with dynamic control limits offers better ARL performance than the traditional MCUSUM control chart. Nevertheless, the proposed control chart is computationally intensive. This leads to the development of a more flexible control chart which uses a continuous function of the sprint length as the control limit sequence. More importantly, this control chart is easy to implement and can reduce the computational time significantly. A white wine dataset illustrates that the novel control chart performs quite well in applications.
Journal: Journal of Applied Statistics
Pages: 2075-2093
Issue: 11
Volume: 44
Year: 2017
Month: 8
X-DOI: 10.1080/02664763.2016.1247784
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247784
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:11:p:2075-2093
Template-Type: ReDIF-Article 1.0
Author-Name: Chun-Xia Zhang
Author-X-Name-First: Chun-Xia
Author-X-Name-Last: Zhang
Author-Name: Jiang-She Zhang
Author-X-Name-First: Jiang-She
Author-X-Name-Last: Zhang
Author-Name: Guan-Wei Wang
Author-X-Name-First: Guan-Wei
Author-X-Name-Last: Wang
Author-Name: Nan-Nan Ji
Author-X-Name-First: Nan-Nan
Author-X-Name-Last: Ji
Title: A novel bagging approach for variable ranking and selection via a mixed importance measure
Abstract:
At present, ensemble learning has exhibited its great power in stabilizing and enhancing the performance of some traditional variable selection methods such as the lasso and genetic algorithms. In this paper, a novel bagging ensemble method called BSSW is developed to implement variable ranking and selection in linear regression models. Its main idea is to execute a stepwise search algorithm on multiple bootstrap samples. In each trial, a mixed importance measure is assigned to each variable according to the order in which it is selected into the final model, as well as the improvement in model fit resulting from its inclusion. Based on the importance measure averaged across the bootstrap trials, all candidate variables are ranked and then classified as important or not. To extend its scope of application, BSSW is extended to generalized linear models. Experiments carried out with simulated and real data indicate that BSSW achieves better performance in most studied cases when compared with several other existing methods.
Journal: Journal of Applied Statistics
Pages: 1734-1755
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1391181
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1391181
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1734-1755
Template-Type: ReDIF-Article 1.0
Author-Name: Brandi N. Falley
Author-X-Name-First: Brandi N.
Author-X-Name-Last: Falley
Author-Name: James D. Stamey
Author-X-Name-First: James D.
Author-X-Name-Last: Stamey
Author-Name: A. Alexander Beaujean
Author-X-Name-First: A. Alexander
Author-X-Name-Last: Beaujean
Title: Bayesian estimation of logistic regression with misclassified covariates and response
Abstract:
Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data, including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable, but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.
Journal: Journal of Applied Statistics
Pages: 1756-1769
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1391182
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1391182
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1756-1769
Template-Type: ReDIF-Article 1.0
Author-Name: Jin-Jian Hsieh
Author-X-Name-First: Jin-Jian
Author-X-Name-Last: Hsieh
Author-Name: Jian-Lin Wang
Author-X-Name-First: Jian-Lin
Author-X-Name-Last: Wang
Title: Quantile residual life regression based on semi-competing risks data
Abstract:
This paper investigates quantile residual life regression based on semi-competing risks data. Because the terminal event time dependently censors the non-terminal event time, inference on the non-terminal event time is not available without an extra assumption. Therefore, we assume that the non-terminal event time and the terminal event time follow an Archimedean copula. We then apply the inverse probability weighting technique to construct an estimating equation for the quantile residual life regression coefficients. However, the estimating equation may not be continuous in the coefficients, so we apply the generalized solution approach to overcome this problem. Since the variance of the proposed estimator is difficult to estimate directly, we use the bootstrap resampling method. Simulations show that the proposed method performs well. Finally, we analyze the Bone Marrow Transplant data for illustration.
Journal: Journal of Applied Statistics
Pages: 1770-1780
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1391183
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1391183
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1770-1780
Template-Type: ReDIF-Article 1.0
Author-Name: Osvaldo Loquiha
Author-X-Name-First: Osvaldo
Author-X-Name-Last: Loquiha
Author-Name: Niel Hens
Author-X-Name-First: Niel
Author-X-Name-Last: Hens
Author-Name: Emilia Martins-Fonteyn
Author-X-Name-First: Emilia
Author-X-Name-Last: Martins-Fonteyn
Author-Name: Herman Meulemans
Author-X-Name-First: Herman
Author-X-Name-Last: Meulemans
Author-Name: Edwin Wouters
Author-X-Name-First: Edwin
Author-X-Name-Last: Wouters
Author-Name: Marleen Temmerman
Author-X-Name-First: Marleen
Author-X-Name-Last: Temmerman
Author-Name: Nafissa Osman
Author-X-Name-First: Nafissa
Author-X-Name-Last: Osman
Author-Name: Marc Aerts
Author-X-Name-First: Marc
Author-X-Name-Last: Aerts
Title: Joint models for mixed categorical outcomes: a study of HIV risk perception and disease status in Mozambique
Abstract:
Two types of bivariate models for categorical response variables are introduced to deal with special categories such as ‘unsure’ or ‘unknown’ in combination with other ordinal categories, while taking additional hierarchical data structures into account. The latter is achieved by the use of different covariance structures for a trivariate random effect. The models are applied to data from the INSIDA survey, where interest goes to the effect of covariates on the association between HIV risk perception (quadrinomial with an ‘unknown risk’ category) and HIV infection status (binary). The final model combines continuation-ratio with cumulative link logits for the risk perception, together with partly correlated and partly shared trivariate random effects for the household level. The results indicate that only age has a significant effect on the association between HIV risk perception and infection status. The proposed models may be useful in various fields of application such as social and biomedical sciences, epidemiology and public health.
Journal: Journal of Applied Statistics
Pages: 1781-1798
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1391184
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1391184
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1781-1798
Template-Type: ReDIF-Article 1.0
Author-Name: Shahedul A. Khan
Author-X-Name-First: Shahedul A.
Author-X-Name-Last: Khan
Author-Name: Setu C. Kar
Author-X-Name-First: Setu C.
Author-X-Name-Last: Kar
Title: Generalized bent-cable methodology for changepoint data: a Bayesian approach
Abstract:
The choice of the model framework in a regression setting depends on the nature of the data. The focus of this study is on changepoint data, exhibiting three phases: incoming and outgoing, both of which are linear, joined by a curved transition. Bent-cable regression is an appealing statistical tool to characterize such trajectories, quantifying the nature of the transition between the two linear phases by modeling the transition as a quadratic phase with unknown width. We demonstrate that a quadratic function may not be appropriate to adequately describe many changepoint data. We then propose a generalization of the bent-cable model by relaxing the assumption of the quadratic bend. The properties of the generalized model are discussed and a Bayesian approach for inference is proposed. The generalized model is demonstrated with applications to three data sets taken from environmental science and economics. We also consider a comparison among the quadratic bent-cable, generalized bent-cable and piecewise linear models in terms of goodness of fit in analyzing both real-world and simulated data. This study suggests that the proposed generalization of the bent-cable model can be valuable in adequately describing changepoint data that exhibit either an abrupt or gradual transition over time.
Journal: Journal of Applied Statistics
Pages: 1799-1812
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1391754
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1391754
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1799-1812
Template-Type: ReDIF-Article 1.0
Author-Name: Airlane P. Alencar
Author-X-Name-First: Airlane P.
Author-X-Name-Last: Alencar
Title: Seasonality of hospitalizations due to respiratory diseases: modelling serial correlation all we need is Poisson
Abstract:
The identification of seasonality and trend patterns in the weekly number of hospitalizations may be useful for planning the structure of health care and the vaccination calendar. A generalized additive model with the negative binomial distribution and a generalized additive model with autoregressive terms (GAMAR) and the Poisson distribution are fitted, including seasonal parameters and a nonlinear trend using splines. The GAMAR includes autoregressive terms to take the serial correlation into account, yielding correct standard errors and reducing overdispersion. For the number of hospitalizations of people older than 60 years due to respiratory diseases in São Paulo city, both models present similar estimates, but the Poisson-GAMAR presents uncorrelated residuals and no overdispersion, and provides smaller confidence intervals for the weekly percentage changes. Forecasts for the next year based on both models are obtained by simulation, and the Poisson-GAMAR presents better performance.
Journal: Journal of Applied Statistics
Pages: 1813-1822
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1396295
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1396295
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1813-1822
Template-Type: ReDIF-Article 1.0
Author-Name: M. S. Paez
Author-X-Name-First: M. S.
Author-X-Name-Last: Paez
Author-Name: S. G. Walker
Author-X-Name-First: S. G.
Author-X-Name-Last: Walker
Title: Modeling with a large class of unimodal multivariate distributions
Abstract:
In this paper we introduce a new class of multivariate unimodal distributions, motivated by Khintchine's representation for unimodal densities on the real line. We start by introducing a new class of unimodal distributions which can then be naturally extended to higher dimensions, using the multivariate Gaussian copula. Under both univariate and multivariate settings, we provide MCMC algorithms to perform inference about the model parameters and predictive densities. The methodology is illustrated with univariate and bivariate examples, and with variables taken from a real data set.
Journal: Journal of Applied Statistics
Pages: 1823-1845
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1396296
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1396296
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1823-1845
Template-Type: ReDIF-Article 1.0
Author-Name: D. S. Gonçalves
Author-X-Name-First: D. S.
Author-X-Name-Last: Gonçalves
Author-Name: C. L. N. Azevedo
Author-X-Name-First: C. L. N.
Author-X-Name-Last: Azevedo
Author-Name: C. Lavor
Author-X-Name-First: C.
Author-X-Name-Last: Lavor
Author-Name: M. A. Gomes-Ruggiero
Author-X-Name-First: M. A.
Author-X-Name-Last: Gomes-Ruggiero
Title: Bayesian inference for quantum state tomography
Abstract:
We present a Bayesian approach to the problem of estimating density matrices in quantum state tomography. A general framework is presented based on a suitable mathematical formulation, where a study of the convergence of the Markov chain Monte Carlo algorithm is given, including a comparison with other estimation methods, such as maximum likelihood estimation and linear inversion. This analysis indicates that our approach not only recovers the underlying parameters quite properly, but also produces physically acceptable point and interval estimates. A prior sensitivity study was conducted, indicating that when useful prior information is available and incorporated, more accurate results are obtained. This general framework, which is based on a reparameterization of the model, allows an easier choice of the prior and proposal distributions for the Metropolis–Hastings algorithm.
Journal: Journal of Applied Statistics
Pages: 1846-1871
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1401049
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1401049
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1846-1871
Template-Type: ReDIF-Article 1.0
Author-Name: Mansi Ghodsi
Author-X-Name-First: Mansi
Author-X-Name-Last: Ghodsi
Author-Name: Hossein Hassani
Author-X-Name-First: Hossein
Author-X-Name-Last: Hassani
Author-Name: Donya Rahmani
Author-X-Name-First: Donya
Author-X-Name-Last: Rahmani
Author-Name: Emmanuel Sirimal Silva
Author-X-Name-First: Emmanuel Sirimal
Author-X-Name-Last: Silva
Title: Vector and recurrent singular spectrum analysis: which is better at forecasting?
Abstract:
Singular spectrum analysis (SSA) is an increasingly popular and widely adopted filtering and forecasting technique which is currently exploited in a variety of fields. Given its increasing application and superior performance in comparison to other methods, it is pertinent to study and distinguish between the two forecasting variations of SSA. These are referred to as Vector SSA (SSA-V) and Recurrent SSA (SSA-R). The general notion is that SSA-V is more robust and provides better forecasts than SSA-R. This is especially true when faced with time series which are non-stationary and asymmetric, or affected by unit root problems, outliers or structural breaks. However, currently there exists no empirical evidence for proving the above notions or suggesting that SSA-V is better than SSA-R. In this paper, we evaluate out-of-sample forecasting capabilities of the optimised SSA-V and SSA-R forecasting algorithms via a simulation study and an application to 100 real data sets with varying structures, to provide a statistically reliable answer to the question of which SSA algorithm is best for forecasting at both short and long run horizons based on several important criteria.
Journal: Journal of Applied Statistics
Pages: 1872-1899
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1401050
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1401050
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1872-1899
Template-Type: ReDIF-Article 1.0
Author-Name: Jiang Du
Author-X-Name-First: Jiang
Author-X-Name-Last: Du
Author-Name: Xiuping Chen
Author-X-Name-First: Xiuping
Author-X-Name-Last: Chen
Author-Name: Eddy Kwessi
Author-X-Name-First: Eddy
Author-X-Name-Last: Kwessi
Author-Name: Zhimeng Sun
Author-X-Name-First: Zhimeng
Author-X-Name-Last: Sun
Title: Model averaging based on rank
Abstract:
In this paper, we investigate model selection and model averaging based on rank regression. Under mild conditions, we propose a focused information criterion and a frequentist model averaging estimator for the focused parameters in the rank regression model. Compared to the least squares method, the new method is not only highly efficient but also robust. The large sample properties of the proposed procedure are established. The finite sample properties are investigated via an extensive Monte Carlo simulation study. Finally, we use the Boston Housing Price Dataset to illustrate the use of the proposed rank methods.
Journal: Journal of Applied Statistics
Pages: 1900-1919
Issue: 10
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1401051
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1401051
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:10:p:1900-1919
Template-Type: ReDIF-Article 1.0
Author-Name: Qi Zhou
Author-X-Name-First: Qi
Author-X-Name-Last: Zhou
Author-Name: Yoo-Mi Chin
Author-X-Name-First: Yoo-Mi
Author-X-Name-Last: Chin
Author-Name: James D. Stamey
Author-X-Name-First: James D.
Author-X-Name-Last: Stamey
Author-Name: Joon Jin Song
Author-X-Name-First: Joon Jin
Author-X-Name-Last: Song
Title: Bayesian misclassification and propensity score methods for clustered observational studies
Abstract:
Bayesian propensity score regression analysis with misclassified binary responses is proposed to analyse clustered observational data. This approach utilizes multilevel models and corrects for misclassification in the responses. Using the deviance information criterion (DIC), the performance of the approach is compared with approaches without correcting for misclassification, multilevel structure specification, or both in the study of the impact of female employment on the likelihood of physical violence. The smallest DIC confirms that our proposed model best fits the data. We conclude that female employment has an insignificant impact on the likelihood of physical spousal violence towards women. In addition, a simulation study confirms that the proposed approach performed best in terms of bias and coverage rate. Ignoring misclassification in response or multilevel structure of data would yield biased estimation of the exposure effect.
Journal: Journal of Applied Statistics
Pages: 1547-1560
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1380786
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1380786
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1547-1560
Template-Type: ReDIF-Article 1.0
Author-Name: Gunther Schauberger
Author-X-Name-First: Gunther
Author-X-Name-Last: Schauberger
Author-Name: Andreas Groll
Author-X-Name-First: Andreas
Author-X-Name-Last: Groll
Author-Name: Gerhard Tutz
Author-X-Name-First: Gerhard
Author-X-Name-Last: Tutz
Title: Analysis of the importance of on-field covariates in the German Bundesliga
Abstract:
In modern football, various variables, such as the distance a team runs or its percentage of ball possession, are collected throughout a match. However, there is a lack of methods to make use of these on-field variables simultaneously and to connect them with the final result of the match. This paper considers data from the German Bundesliga season 2015/2016. The objective is to identify the on-field variables that are connected to the sporting success or failure of the individual teams. An extended Bradley–Terry model for football matches is proposed that is able to take into account on-field covariates. Penalty terms are used to reduce the complexity of the model and to find clusters of teams with equal covariate effects. The model identifies the running distance to be the on-field covariate that is most strongly connected to the match outcome.
Journal: Journal of Applied Statistics
Pages: 1561-1578
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1383370
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1383370
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1561-1578
Template-Type: ReDIF-Article 1.0
Author-Name: Samer A. Kharroubi
Author-X-Name-First: Samer A.
Author-X-Name-Last: Kharroubi
Title: Valuations of EQ-5D health states: could United Kingdom results be used as informative priors for the United States
Abstract:
Valuations of health state descriptors such as the generic EuroQol five-dimensional (EQ-5D) or the six-dimensional short form (SF-6D) have been conducted in different countries. There is scope to use the results from one country as informative priors for the analysis of a study in another country, enabling better estimation there than analysing its data separately. This article analyses data from two EQ-5D valuation studies where, using similar time trade-off protocols, values for 42 common health states were elicited from representative samples of the US and UK general adult populations. We apply a nonparametric Bayesian method to improve the accuracy of predictions of the US population utility function, with the UK results used as informative priors. The results suggest that drawing extra information from the UK data produces a better estimate of the US population utility than analysing the US data separately. The implications of these results are particularly important for countries where valuation studies are limited.
Journal: Journal of Applied Statistics
Pages: 1579-1594
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1386770
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1386770
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1579-1594
Template-Type: ReDIF-Article 1.0
Author-Name: Luigi Spezia
Author-X-Name-First: Luigi
Author-X-Name-Last: Spezia
Author-Name: Nial Friel
Author-X-Name-First: Nial
Author-X-Name-Last: Friel
Author-Name: Alessandro Gimona
Author-X-Name-First: Alessandro
Author-X-Name-Last: Gimona
Title: Spatial hidden Markov models and species distributions
Abstract:
A spatial hidden Markov model (SHMM) is introduced to analyse the distribution of a species on an atlas, taking into account that false observations and false non-detections of the species can occur during the survey, blurring the true map of presence and absence of the species. The reconstruction of the true map is tackled as the restoration of a degraded pixel image, where the true map is an autologistic model, hidden behind the observed map, whose normalizing constant is efficiently computed by simulating an auxiliary map. The distribution of the species is explained under the Bayesian paradigm and Markov chain Monte Carlo (MCMC) algorithms are developed. We are interested in the spatial distribution of the bird species Greywing Francolin in the south of Africa. Many climatic and land-use explanatory variables are also available: they are included in the SHMM and a subset of them is selected by the mutation operators within the MCMC algorithm.
Journal: Journal of Applied Statistics
Pages: 1595-1615
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1386771
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1386771
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1595-1615
Template-Type: ReDIF-Article 1.0
Author-Name: Akash Malhotra
Author-X-Name-First: Akash
Author-X-Name-Last: Malhotra
Author-Name: Shailesh Krishna
Author-X-Name-First: Shailesh
Author-X-Name-Last: Krishna
Title: Release velocities and bowler performance in cricket
Abstract:
There is a widespread notion in the cricketing world that a bowler's performance improves with increasing pace. Additionally, many cricket experts believe faster bowlers to be more effective against lower order batters than bowlers who bowl at slower speeds. The present study puts these two notions to the test by statistically analysing the differences in performance of bowlers from three subpopulations based on average release velocities. Results from one-way ANOVA (and its modified versions), for international test matches, reveal faster bowlers to perform better in terms of Average and Strike-rate, but show no significant differences in Economy rate and Dynamic Bowling rate. Faster bowlers were found to be more effective in taking the wickets of lower and middle order batters than bowlers with less pace. However, there was no statistically significant difference in the performance of Fast and Fast-Medium bowlers against top-order batters.
Journal: Journal of Applied Statistics
Pages: 1616-1627
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1386772
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1386772
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1616-1627
Template-Type: ReDIF-Article 1.0
Author-Name: Kelley M. Kidwell
Author-X-Name-First: Kelley M.
Author-X-Name-Last: Kidwell
Author-Name: Nicholas J. Seewald
Author-X-Name-First: Nicholas J.
Author-X-Name-Last: Seewald
Author-Name: Qui Tran
Author-X-Name-First: Qui
Author-X-Name-Last: Tran
Author-Name: Connie Kasari
Author-X-Name-First: Connie
Author-X-Name-Last: Kasari
Author-Name: Daniel Almirall
Author-X-Name-First: Daniel
Author-X-Name-Last: Almirall
Title: Design and analysis considerations for comparing dynamic treatment regimens with binary outcomes from sequential multiple assignment randomized trials
Abstract:
In behavioral, educational and medical practice, interventions are often personalized over time using strategies that are based on individual behaviors and characteristics and on changes in symptoms, severity, or adherence that result from one's treatment. Such strategies, which more closely mimic real practice, are known as dynamic treatment regimens (DTRs). A sequential multiple assignment randomized trial (SMART) is a multi-stage trial design that can be used to construct effective DTRs. This article reviews a simple-to-use 'weighted and replicated' estimation technique for comparing DTRs embedded in a SMART design using logistic regression for a binary, end-of-study outcome variable. Based on a Wald test that compares two embedded DTRs of interest from the 'weighted and replicated' regression model, a sample size calculation is presented with a corresponding user-friendly applet to aid in the process of designing a SMART. The analytic models and sample size calculations are presented for three of the more commonly used two-stage SMART designs. Simulations for the sample size calculation show that the empirical power reaches expected levels. A data analysis example with corresponding code is presented in the appendix using data from a SMART developing an effective DTR in autism.
Journal: Journal of Applied Statistics
Pages: 1628-1651
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1386773
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1386773
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1628-1651
Template-Type: ReDIF-Article 1.0
Author-Name: Yajie Zou
Author-X-Name-First: Yajie
Author-X-Name-Last: Zou
Author-Name: John E. Ash
Author-X-Name-First: John E.
Author-X-Name-Last: Ash
Author-Name: Byung-Jung Park
Author-X-Name-First: Byung-Jung
Author-X-Name-Last: Park
Author-Name: Dominique Lord
Author-X-Name-First: Dominique
Author-X-Name-Last: Lord
Author-Name: Lingtao Wu
Author-X-Name-First: Lingtao
Author-X-Name-Last: Wu
Title: Empirical Bayes estimates of finite mixture of negative binomial regression models and its application to highway safety
Abstract:
The empirical Bayes (EB) method is commonly used by transportation safety analysts for conducting different types of safety analyses, such as before–after studies and hotspot analyses. To date, most implementations of the EB method have been applied using a negative binomial (NB) model, as it can easily accommodate the overdispersion commonly observed in crash data. Recent studies have shown that a generalized finite mixture of NB models with K mixture components (GFMNB-K) can also be used to model crash data subject to overdispersion and generally offers better statistical performance than the traditional NB model. So far, however, no one has shown how the EB method can be used with finite mixtures of NB models. The main objective of this study is therefore to use a GFMNB-K model in the calculation of EB estimates. Specifically, GFMNB-K models with varying weight parameters are developed to analyze crash data from Indiana and Texas. The main finding is that the rankings produced by the NB and GFMNB-2 models for hotspot identification are often quite different, which was especially noticeable with the Texas dataset. Finally, a simulation study designed to examine which model formulation can better identify hotspots is recommended as future research.
Journal: Journal of Applied Statistics
Pages: 1652-1669
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1389863
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1389863
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1652-1669
Template-Type: ReDIF-Article 1.0
Author-Name: Joanna Morais
Author-X-Name-First: Joanna
Author-X-Name-Last: Morais
Author-Name: Christine Thomas-Agnan
Author-X-Name-First: Christine
Author-X-Name-Last: Thomas-Agnan
Author-Name: Michel Simioni
Author-X-Name-First: Michel
Author-X-Name-Last: Simioni
Title: Using compositional and Dirichlet models for market share regression
Abstract:
When the aim is to model market shares, the marketing literature proposes some regression models which can be qualified as attraction models. They are generally derived from an aggregated version of the multinomial logit model. But aggregated multinomial logit models (MNL) and the so-called generalized multiplicative competitive interaction models (GMCI) present some limitations: in their simpler versions they do not specify brand-specific and cross-effect parameters. In this paper, we consider alternative models: the Dirichlet model (DIR) and the compositional model (CODA). DIR allows the introduction of brand-specific parameters, and CODA additionally allows cross-effect parameters. We show that these two models can be written in a similar fashion, called the attraction form, as the MNL and GMCI models. As market share models are usually interpreted in terms of elasticities, we also use this notion to interpret the DIR and CODA models. We compare the properties of the models in order to explain why the CODA and DIR models can outperform traditional market share models. An application to the automobile market is presented in which we model brands' market shares as a function of media investments, controlling for brand price and scrapping incentives. We compare the quality of the models using measures adapted to shares.
Journal: Journal of Applied Statistics
Pages: 1670-1689
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1389864
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1389864
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1670-1689
Template-Type: ReDIF-Article 1.0
Author-Name: Bogumił Kamiński
Author-X-Name-First: Bogumił
Author-X-Name-Last: Kamiński
Author-Name: Przemysław Szufel
Author-X-Name-First: Przemysław
Author-X-Name-Last: Szufel
Title: On parallel policies for ranking and selection problems
Abstract:
In this paper we develop and test experimental methodologies for selecting the best alternative among a discrete number of available treatments. We consider a scenario where a researcher sequentially decides which treatments are assigned to experimental units. This problem is particularly challenging if a single measurement of the response to a treatment is time-consuming and there is a limited time for experimentation. This time can be decreased if it is possible to perform measurements in parallel. In this work we propose and discuss asynchronous extensions of two well-known Ranking & Selection policies, namely, the Optimal Computing Budget Allocation (OCBA) and Knowledge Gradient (KG) policies. Our extensions (Asynchronous Optimal Computing Budget Allocation (AOCBA) and Asynchronous Knowledge Gradient (AKG), respectively) allow for parallel asynchronous allocation of measurements. Additionally, since the standard KG method is sequential (it can only allocate one experiment at a time), we propose a parallel synchronous extension of the KG policy – Synchronous Knowledge Gradient (SKG). Computer simulations of our algorithms indicate that our parallel KG-based policies (AKG, SKG) outperform the standard OCBA method as well as AOCBA if the number of evaluated alternatives is small or the computing/experimental budget is limited. For experiments with large budgets and big sets of alternatives, both the OCBA and AOCBA policies are more efficient.
Journal: Journal of Applied Statistics
Pages: 1690-1713
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1390555
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1390555
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1690-1713
Template-Type: ReDIF-Article 1.0
Author-Name: W. Tang
Author-X-Name-First: W.
Author-X-Name-Last: Tang
Author-Name: H. He
Author-X-Name-First: H.
Author-X-Name-Last: He
Author-Name: W.J. Wang
Author-X-Name-First: W.J.
Author-X-Name-Last: Wang
Author-Name: D.G. Chen
Author-X-Name-First: D.G.
Author-X-Name-Last: Chen
Title: Untangle the structural and random zeros in statistical modelings
Abstract:
Count data with structural zeros are common in public health applications. There is considerable research focusing on zero-inflated models, such as the zero-inflated Poisson (ZIP) and zero-inflated Negative Binomial (ZINB) models, for such zero-inflated count data when used as a response variable. However, when such variables are used as predictors, the difference between structural and random zeros is often ignored, which may result in biased estimates. One remedy is to include an indicator of the structural zero in the model as a predictor, if observed. However, structural zeros are often not observed in practice, in which case no statistical method is available to address the bias issue. This paper aims to fill this methodological gap by developing parametric methods to model zero-inflated count data when used as predictors, based on the maximum likelihood approach. The response variable can be any type of data, including continuous, binary, count or even zero-inflated count responses. Simulation studies are performed to assess the numerical performance of this new approach when the sample size is small to moderate. A real data example is also used to demonstrate the application of this method.
Journal: Journal of Applied Statistics
Pages: 1714-1733
Issue: 9
Volume: 45
Year: 2018
Month: 7
X-DOI: 10.1080/02664763.2017.1391180
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1391180
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:9:p:1714-1733
Template-Type: ReDIF-Article 1.0
Author-Name: D. Kurz
Author-X-Name-First: D.
Author-X-Name-Last: Kurz
Author-Name: H. Lewitschnig
Author-X-Name-First: H.
Author-X-Name-Last: Lewitschnig
Author-Name: J. Pilz
Author-X-Name-First: J.
Author-X-Name-Last: Pilz
Title: Failure probability estimation under additional subsystem information with application to semiconductor burn-in
Abstract:
In the classical approach to qualitative reliability demonstration, system failure probabilities are estimated based on a binomial sample drawn from the running production. In this paper, we show how to take account of additional available sampling information for some or even all subsystems of a current system under test with serial reliability structure. In that connection, we present two approaches, a frequentist and a Bayesian one, for assessing an upper bound for the failure probability of serial systems under binomial subsystem data. In the frequentist approach, we introduce (i) a new way of deriving the probability distribution for the number of system failures, which might be randomly assembled from the failed subsystems and (ii) a more accurate estimator for the Clopper–Pearson upper bound using a beta mixture distribution. In the Bayesian approach, however, we infer the posterior distribution for the system failure probability on the basis of the system/subsystem testing results and a prior distribution for the subsystem failure probabilities. We propose three different prior distributions and compare their performances in the context of high reliability testing. Finally, we apply the proposed methods to reduce the efforts of semiconductor burn-in studies by considering synergies such as comparable chip layers, among different chip technologies.
Journal: Journal of Applied Statistics
Pages: 955-967
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189522
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:955-967
Template-Type: ReDIF-Article 1.0
Author-Name: Yunlu Jiang
Author-X-Name-First: Yunlu
Author-X-Name-Last: Jiang
Title: S-estimator in partially linear regression models
Abstract:
In this paper, a robust estimator is proposed for partially linear regression models. We first estimate the nonparametric component using a penalized regression spline, then construct an estimator of the parametric component using a robust S-estimator. We propose an iterative algorithm to solve the resulting optimization problem, and introduce a robust generalized cross-validation criterion to select the penalty parameter. Simulation studies and a real data analysis illustrate that our proposed method is robust against outliers in the dataset and against errors with heavy tails.
Journal: Journal of Applied Statistics
Pages: 968-977
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1189523
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1189523
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:968-977
Template-Type: ReDIF-Article 1.0
Author-Name: Wen-Liang Hung
Author-X-Name-First: Wen-Liang
Author-X-Name-Last: Hung
Author-Name: Shou-Jen Chang-Chien
Author-X-Name-First: Shou-Jen
Author-X-Name-Last: Chang-Chien
Title: Learning-based EM algorithm for normal-inverse Gaussian mixture model with application to extrasolar planets
Abstract:
Karlis and Santourian [14] proposed a model-based clustering algorithm, the expectation–maximization (EM) algorithm, to fit the mixture of multivariate normal-inverse Gaussian (NIG) distributions. However, the EM algorithm for the mixture of multivariate NIG requires a set of initial values to begin the iterative process, and the number of components has to be given a priori. In this paper, we present a learning-based EM algorithm whose aim is to overcome these weaknesses of Karlis and Santourian's EM algorithm [14]. The proposed learning-based EM algorithm was inspired by Yang et al. [24], simulating the process by which their method performs self-clustering. Numerical experiments showed promising results compared to Karlis and Santourian's EM algorithm. Moreover, the methodology is applicable to the analysis of extrasolar planets. Our analysis provides an understanding of the clustering results in the ln P−ln M and ln P−e spaces, where M is the planetary mass, P is the orbital period and e is the orbital eccentricity. Our identified groups suggest two phenomena: (1) the characteristics of two clusters in ln P−ln M space might be related to the tidal and disc interactions (see [9]); and (2) there are two clusters in ln P−e space.
Journal: Journal of Applied Statistics
Pages: 978-999
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1190322
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1190322
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:978-999
Template-Type: ReDIF-Article 1.0
Author-Name: M. P. Gadre
Author-X-Name-First: M. P.
Author-X-Name-Last: Gadre
Author-Name: S. B. Adnaik
Author-X-Name-First: S. B.
Author-X-Name-Last: Adnaik
Author-Name: R.N. Rattihalli
Author-X-Name-First: R.N.
Author-X-Name-Last: Rattihalli
Title: Continuous single attribute control chart for Markov-dependent processes
Abstract:
In many cases, the observations related to the quality characteristic of a process are not independent. In such cases, control charts based on the assumption of independence of the observations are not appropriate. When the characteristic under study is qualitative, a Markov model serves as a simple model to account for the dependency of the observations. For this purpose, we develop an attribute control chart under 100% inspection for a Markov-dependent process by controlling the error probabilities. This chart consists of two sub-charts. For a given sample, depending upon the state of the last observation of the previous sample (if any), one of these two will be used. Optimal values of the design parameters of the control chart are obtained. The chart's performance is studied through its capability (probability) of detecting a shift in the process parameters.
Journal: Journal of Applied Statistics
Pages: 1000-1012
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1191621
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1191621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1000-1012
Template-Type: ReDIF-Article 1.0
Author-Name: S. Wang
Author-X-Name-First: S.
Author-X-Name-Last: Wang
Author-Name: N. G. Cadigan
Author-X-Name-First: N. G.
Author-X-Name-Last: Cadigan
Author-Name: H. P. Benoît
Author-X-Name-First: H. P.
Author-X-Name-Last: Benoît
Title: Inference about regression parameters using highly stratified survey count data with over-dispersion and repeated measurements
Abstract:
We study methods to estimate regression and variance parameters for over-dispersed and correlated count data from highly stratified surveys. Our application involves counts of fish catches from stratified research surveys and we propose a novel model in fisheries science to address changes in survey protocols. A challenge with this model is the large number of nuisance parameters which leads to computational issues and biased statistical inferences. We use a computationally efficient profile generalized estimating equation method and compare it to marginal maximum likelihood (MLE) and restricted MLE (REML) methods. We use REML to address bias and inaccurate confidence intervals because of many nuisance parameters. The marginal MLE and REML approaches involve intractable integrals and we used a new R package that is designed for estimating complex nonlinear models that may include random effects. We conclude from simulation analyses that the REML method provides more reliable statistical inferences among the three methods we investigated.
Journal: Journal of Applied Statistics
Pages: 1013-1030
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1191622
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1191622
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1013-1030
Template-Type: ReDIF-Article 1.0
Author-Name: I. A. R. de Lara
Author-X-Name-First: I. A. R.
Author-X-Name-Last: de Lara
Author-Name: J. P. Hinde
Author-X-Name-First: J. P.
Author-X-Name-Last: Hinde
Author-Name: A. C. de Castro
Author-X-Name-First: A. C.
Author-X-Name-Last: de Castro
Author-Name: I. J. O. da Silva
Author-X-Name-First: I. J. O.
Author-X-Name-Last: da Silva
Title: A proportional odds transition model for ordinal responses with an application to pig behaviour
Abstract:
Categorical data are quite common in many fields of science including in behaviour studies in animal science. In this article, the data concern the degree of lesions in pigs, related to the behaviour of these animals. The experimental design corresponded to two levels of environmental enrichment and four levels of genetic lineages in a completely randomized $ 2 \times 4 $ factorial with data collected longitudinally over four time occasions. The transition models used for the data analysis are based on stochastic processes and Generalized Linear Models. In general, these are not used for analysis of longitudinal data but they are useful in many situations as in this study. We present some aspects of this class of models for the stationary case. The proportional odds transition model is used to construct the matrix of transition probabilities and a function was developed in the R system to fit this model. The likelihood ratio test was used to verify the assumption of odds ratio proportionality and to select the structure of the linear predictor. The methodology used allowed for the choice of a model that can be used to explain the relationship between the severity of lesions in pigs and the use of the environmental enrichment.
Journal: Journal of Applied Statistics
Pages: 1031-1046
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1191623
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1191623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1031-1046
Template-Type: ReDIF-Article 1.0
Author-Name: I. R. C. Oliveira
Author-X-Name-First: I. R. C.
Author-X-Name-Last: Oliveira
Author-Name: G. Molenberghs
Author-X-Name-First: G.
Author-X-Name-Last: Molenberghs
Author-Name: G. Verbeke
Author-X-Name-First: G.
Author-X-Name-Last: Verbeke
Author-Name: C. G. B. Demétrio
Author-X-Name-First: C. G. B.
Author-X-Name-Last: Demétrio
Author-Name: C. T. S. Dias
Author-X-Name-First: C. T. S.
Author-X-Name-Last: Dias
Title: Negative variance components for non-negative hierarchical data with correlation, over-, and/or underdispersion
Abstract:
The concept of negative variance components in linear mixed-effects models, while confusing at first sight, has received considerable attention in the literature, for well over half a century, following the early work of Chernoff [7] and Nelder [21]. Broadly, negative variance components in linear mixed models are allowable if inferences are restricted to the implied marginal model. When a hierarchical view-point is adopted, in the sense that outcomes are specified conditionally upon random effects, the variance–covariance matrix of the random effects must be positive-definite (positive-semi-definite is also possible, but raises issues of degenerate distributions). Many contemporary software packages allow for this distinction. Less work has been done for generalized linear mixed models. Here, we study such models, with extension to allow for overdispersion, for non-negative outcomes (counts). Using a study of trichomes counts on tomato plants, it is illustrated how such negative variance components play a natural role in modeling both the correlation between repeated measures on the same experimental unit and over- or underdispersion.
Journal: Journal of Applied Statistics
Pages: 1047-1063
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1191624
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1191624
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1047-1063
Template-Type: ReDIF-Article 1.0
Author-Name: C. E. Pertsinidou
Author-X-Name-First: C. E.
Author-X-Name-Last: Pertsinidou
Author-Name: G. Tsaklidis
Author-X-Name-First: G.
Author-X-Name-Last: Tsaklidis
Author-Name: E. Papadimitriou
Author-X-Name-First: E.
Author-X-Name-Last: Papadimitriou
Author-Name: N. Limnios
Author-X-Name-First: N.
Author-X-Name-Last: Limnios
Title: Application of hidden semi-Markov models for the seismic hazard assessment of the North and South Aegean Sea, Greece
Abstract:
The real stress field in an area associated with earthquake generation cannot be directly observed. For that purpose we apply hidden semi-Markov models (HSMMs) for strong (M≥5.5) earthquake occurrence in the areas of the North and South Aegean Sea, considering that the stress field constitutes the hidden process. The advantage of HSMMs compared to hidden Markov models (HMMs) is that they allow any arbitrary distribution for the sojourn times. Poisson, Logarithmic and Negative Binomial distributions, as well as different model dimensions, are tested. The parameter estimation is achieved via the EM algorithm. For the decoding procedure, a new Viterbi algorithm with a simple form is applied, detecting precursory phases (hidden stress variations) and warning of anticipated earthquake occurrences. The optimal HSMM provides an alarm period for 70 out of 88 events. HMMs are also studied, presenting poor results compared to those obtained via HSMMs. Bootstrap standard errors and confidence intervals for the parameters are evaluated, and the forecasting ability of the Poisson models is examined.
Journal: Journal of Applied Statistics
Pages: 1064-1085
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1193724
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1193724
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1064-1085
Template-Type: ReDIF-Article 1.0
Author-Name: Byron C. Jaeger
Author-X-Name-First: Byron C.
Author-X-Name-Last: Jaeger
Author-Name: Lloyd J. Edwards
Author-X-Name-First: Lloyd J.
Author-X-Name-Last: Edwards
Author-Name: Kalyan Das
Author-X-Name-First: Kalyan
Author-X-Name-Last: Das
Author-Name: Pranab K. Sen
Author-X-Name-First: Pranab K.
Author-X-Name-Last: Sen
Title: An $R^2$ statistic for fixed effects in the generalized linear mixed model
Abstract:
Measuring the proportion of variance explained ($R^2$) by a statistical model and the relative importance of specific predictors (semi-partial $R^2$) can be essential considerations when building a parsimonious statistical model. The $R^2$ statistic is a familiar summary of goodness-of-fit for normal linear models and has been extended in various ways to more general models. In particular, the generalized linear mixed model (GLMM) extends the normal linear model and is used to analyze correlated (hierarchical), non-normal data structures. Although various $R^2$ statistics have been proposed, there is no consensus in the statistical literature on the most sensible definition of $R^2$ in this context. This research aims to build upon existing knowledge and definitions of $R^2$ and to concisely define the statistic for the GLMM. Here, we derive a model and semi-partial $R^2$ statistic for fixed (population) effects in the GLMM by utilizing the penalized quasi-likelihood estimation method based on linearization. We show that our proposed $R^2$ statistic generalizes the widely used marginal $R^2$ statistic introduced by Nakagawa and Schielzeth, demonstrate our statistic's capability in model selection, show the utility of semi-partial $R^2$ statistics in longitudinal data analysis, and provide software that computes the proposed $R^2$ statistic along with semi-partial $R^2$ for individual fixed effects. The software provided is adapted for both the SAS and R programming languages.
Journal: Journal of Applied Statistics
Pages: 1086-1105
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1193725
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1193725
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1086-1105
Template-Type: ReDIF-Article 1.0
Author-Name: M. Mahdizadeh
Author-X-Name-First: M.
Author-X-Name-Last: Mahdizadeh
Author-Name: Ehsan Zamanzade
Author-X-Name-First: Ehsan
Author-X-Name-Last: Zamanzade
Title: New goodness of fit tests for the Cauchy distribution
Abstract:
Some goodness-of-fit procedures for the Cauchy distribution are presented. The power comparisons indicate that the new tests possess good performances among the competitors, especially against symmetric alternatives. A financial data set is analyzed for illustration.
Journal: Journal of Applied Statistics
Pages: 1106-1121
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1193726
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1193726
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1106-1121
Template-Type: ReDIF-Article 1.0
Author-Name: Rui Fang
Author-X-Name-First: Rui
Author-X-Name-Last: Fang
Author-Name: Xiaohu Li
Author-X-Name-First: Xiaohu
Author-X-Name-Last: Li
Title: Nonparametric tests for strictly increasing virtual valuations
Abstract:
This paper shows that, for absolutely continuous valuation distributions, an increasing virtual valuation is equivalent to an increasing odds rate. Based on this new characterization, we develop two nonparametric tests for strictly increasing virtual valuations by using the generalized total time on test transform. The empirical type I error rate and power performance of the two tests are examined through Monte Carlo simulations. As illustrations, the two tests are also applied to two real data sets collected from eBay.
Journal: Journal of Applied Statistics
Pages: 1122-1136
Issue: 6
Volume: 44
Year: 2017
Month: 4
X-DOI: 10.1080/02664763.2016.1193727
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1193727
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:6:p:1122-1136
Template-Type: ReDIF-Article 1.0
Author-Name: Anthony C. Atkinson
Author-X-Name-First: Anthony C.
Author-X-Name-Last: Atkinson
Author-Name: Marco Riani
Author-X-Name-First: Marco
Author-X-Name-Last: Riani
Author-Name: Andrea Cerioli
Author-X-Name-First: Andrea
Author-X-Name-Last: Cerioli
Title: Cluster detection and clustering with random start forward searches
Abstract:
The forward search is a method of robust data analysis in which outlier-free subsets of the data of increasing size are used in model fitting; the data are then ordered by closeness to the model. Here the forward search, with many random starts, is used to cluster multivariate data. These random starts lead to the diagnostic identification of tentative clusters. Application of the forward search to the proposed individual clusters leads to the establishment of cluster membership through the identification of non-cluster members as outlying. The method requires no prior information on the number of clusters and does not seek to classify all observations. These properties are illustrated by the analysis of 200 six-dimensional observations on Swiss banknotes. The importance of linked plots and brushing in elucidating data structures is illustrated. We also provide an automatic method for determining cluster centres and compare the behaviour of our method with model-based clustering. In a simulated example with eight clusters, our method provides more stable and accurate solutions than model-based clustering. We consider the computational requirements of both procedures.
Journal: Journal of Applied Statistics
Pages: 777-798
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1310806
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1310806
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:777-798
Template-Type: ReDIF-Article 1.0
Author-Name: Hyoyoung Choo-Wosoba
Author-X-Name-First: Hyoyoung
Author-X-Name-Last: Choo-Wosoba
Author-Name: Somnath Datta
Author-X-Name-First: Somnath
Author-X-Name-Last: Datta
Title: Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution
Abstract:
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are the Poisson and the negative binomial. However, a Poisson distribution can only handle equidispersed data, and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike the Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite-sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.
Journal: Journal of Applied Statistics
Pages: 799-814
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1312299
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1312299
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:799-814
Template-Type: ReDIF-Article 1.0
Author-Name: Rahim Alhamzawi
Author-X-Name-First: Rahim
Author-X-Name-Last: Alhamzawi
Author-Name: Haithem Taha Mohammad Ali
Author-X-Name-First: Haithem Taha Mohammad
Author-X-Name-Last: Ali
Title: Bayesian quantile regression for ordinal longitudinal data
Abstract:
Since the pioneering work of Koenker and Bassett [27], quantile regression models and their applications have become increasingly popular and important for research in many areas. In this paper, a random-effects ordinal quantile regression model is proposed for the analysis of longitudinal data with an ordinal outcome of interest. An efficient Gibbs sampling algorithm is derived for fitting the model to the data, based on a location-scale mixture representation of the skewed double-exponential distribution. The proposed approach is illustrated using simulated data and a real data example. This is the first work to discuss quantile regression for the analysis of longitudinal data with an ordinal outcome.
Journal: Journal of Applied Statistics
Pages: 815-828
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1315059
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1315059
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:815-828
Template-Type: ReDIF-Article 1.0
Author-Name: R. J. Waken
Author-X-Name-First: R. J.
Author-X-Name-Last: Waken
Author-Name: Joon Jin Song
Author-X-Name-First: Joon Jin
Author-X-Name-Last: Song
Author-Name: Soohyun Kwon
Author-X-Name-First: Soohyun
Author-X-Name-Last: Kwon
Author-Name: Ki-Hong Min
Author-X-Name-First: Ki-Hong
Author-X-Name-Last: Min
Author-Name: GyuWon Lee
Author-X-Name-First: GyuWon
Author-X-Name-Last: Lee
Title: A flexible and efficient spatial interpolator for radar rainfall estimation
Abstract:
A key challenge in rainfall estimation is spatio-temporal variability. Weather radars are used to estimate precipitation with high spatial and temporal resolution. Due to the inherent errors in radar estimates, spatial interpolation has often been employed to calibrate the estimates. Kriging is a simple and popular spatial interpolation method, but it has several shortcomings. In particular, the prediction is quite unstable and often fails to be performed when the sample size is small. In this paper, we propose a flexible and efficient spatial interpolator for radar rainfall estimation, with several advantages over kriging. The method is illustrated using a real-world data set.
Journal: Journal of Applied Statistics
Pages: 829-844
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1317723
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1317723
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:829-844
Template-Type: ReDIF-Article 1.0
Author-Name: Georg Man
Author-X-Name-First: Georg
Author-X-Name-Last: Man
Title: Critical appraisal of jointness concepts in Bayesian model averaging: evidence from life sciences, sociology, and other scientific fields
Abstract:
Jointness is a Bayesian approach to capturing dependence among regressors in multivariate data. It addresses the general issue of whether explanatory factors for a given empirical phenomenon are complements or substitutes. I ask a number of questions about existing jointness concepts: Are the patterns revealed stable across datasets? Are results robust to prior choice, and do data characteristics affect results? And importantly: What do the answers imply from a practical standpoint? The present study takes an applied, interdisciplinary and comparative perspective, validating jointness concepts on datasets across scientific fields, with a focus on life sciences (Parkinson's disease) and sociology. Simulations complement the study of real-world data. My findings suggest that results depend on which jointness concept is used: Some concepts deliver jointness patterns remarkably uniform across datasets, while all concepts are fairly robust to the choice of prior structure. This can be interpreted as a critique of jointness from a practical perspective, given that the patterns revealed are at times very different and no concept emerges as overall advantageous. The composite indicators approach to combining information across jointness concepts is also explored, suggesting an avenue to facilitate the application of the concepts in future research.
Journal: Journal of Applied Statistics
Pages: 845-867
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1318839
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1318839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:845-867
Template-Type: ReDIF-Article 1.0
Author-Name: Ramón Flores
Author-X-Name-First: Ramón
Author-X-Name-Last: Flores
Author-Name: Rosa Lillo
Author-X-Name-First: Rosa
Author-X-Name-Last: Lillo
Author-Name: Juan Romo
Author-X-Name-First: Juan
Author-X-Name-Last: Romo
Title: Homogeneity test for functional data
Abstract:
In the context of functional data analysis, we propose new two-sample tests for homogeneity. Based on some well-known depth measures, we construct four different statistics in order to measure the distance between the two samples. A simulation study is performed to check the efficiency of the tests when confronted with shape and magnitude perturbations. Finally, we apply these tools to measure homogeneity in some samples of real data, and we obtain good results using this new method.
Journal: Journal of Applied Statistics
Pages: 868-883
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1319470
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1319470
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:868-883
Template-Type: ReDIF-Article 1.0
Author-Name: Christian Pierdzioch
Author-X-Name-First: Christian
Author-X-Name-Last: Pierdzioch
Author-Name: Monique B. Reid
Author-X-Name-First: Monique B.
Author-X-Name-Last: Reid
Author-Name: Rangan Gupta
Author-X-Name-First: Rangan
Author-X-Name-Last: Gupta
Title: On the directional accuracy of inflation forecasts: evidence from South African survey data
Abstract:
We study the information content of South African inflation survey data by determining the directional accuracy of both short-term and long-term forecasts. We use relative operating characteristic (ROC) curves, which have been applied in a variety of fields including weather forecasting and radiology, to ascertain the directional accuracy of the forecasts. A ROC curve summarizes the directional accuracy of forecasts by comparing the rate of true signals (sensitivity) with the rate of false signals (one minus specificity). A ROC curve goes beyond the market-timing tests widely studied in earlier research, as this comparison is carried out for many alternative values of a decision criterion that discriminates between signals (of a rising inflation rate) and nonsignals (of an unchanged or a falling inflation rate). We find consistent evidence that forecasts contain information with respect to the subsequent direction of change of the inflation rate.
Journal: Journal of Applied Statistics
Pages: 884-900
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1322556
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1322556
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:884-900
Template-Type: ReDIF-Article 1.0
Author-Name: Jun Ye
Author-X-Name-First: Jun
Author-X-Name-Last: Ye
Author-Name: Juan Xi
Author-X-Name-First: Juan
Author-X-Name-Last: Xi
Author-Name: Richard L. Einsporn
Author-X-Name-First: Richard L.
Author-X-Name-Last: Einsporn
Title: Functional principal component analysis in age–period–cohort analysis of body mass index data by gender and ethnicity
Abstract:
In this paper, we propose a two-stage functional principal component analysis method in age–period–cohort (APC) analysis. The first stage of the method considers the age–period effect, with the fitted values treated as an offset; the second stage considers the residual age–cohort effect conditional on the already estimated age–period effect. An APC version of the model in functional data analysis provides an improved fit to the data, especially when the data are sparse and irregularly spaced. We demonstrate the effectiveness of the proposed method using body mass index data stratified by gender and ethnicity.
Journal: Journal of Applied Statistics
Pages: 901-917
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1322557
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1322557
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:901-917
Template-Type: ReDIF-Article 1.0
Author-Name: Harvey Goldstein
Author-X-Name-First: Harvey
Author-X-Name-Last: Goldstein
Author-Name: William J. Browne
Author-X-Name-First: William J.
Author-X-Name-Last: Browne
Author-Name: Christopher Charlton
Author-X-Name-First: Christopher
Author-X-Name-Last: Charlton
Title: A Bayesian model for measurement and misclassification errors alongside missing data, with an application to higher education participation in Australia
Abstract:
In this paper we consider the impact of both missing data and measurement errors on a longitudinal analysis of participation in higher education in Australia. We develop a general method for handling both discrete and continuous measurement errors that also allows for the incorporation of missing values and random effects in both binary and continuous response multilevel models. Measurement errors are allowed to be mutually dependent and their distribution may depend on further covariates. We show that our methodology works via two simple simulation studies. We then consider the impact of our measurement error assumptions on the analysis of the real data set.
Journal: Journal of Applied Statistics
Pages: 918-931
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1322558
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1322558
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:918-931
Template-Type: ReDIF-Article 1.0
Author-Name: Mário F. Desousa
Author-X-Name-First: Mário F.
Author-X-Name-Last: Desousa
Author-Name: Helton Saulo
Author-X-Name-First: Helton
Author-X-Name-Last: Saulo
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Paulo Scalco
Author-X-Name-First: Paulo
Author-X-Name-Last: Scalco
Title: On a tobit–Birnbaum–Saunders model with an application to medical data
Abstract:
The tobit model allows a censored response variable to be described by covariates. Its applications cover different areas such as economics, engineering, environment and medicine. A strong assumption of the standard tobit model is that its errors follow a normal distribution. However, not all applications are well modeled by this distribution. Some efforts have relaxed the normality assumption by considering more flexible distributions. Nevertheless, the presence of asymmetry could not be well described by these flexible distributions. A real-world data application of measles vaccine in Haiti is explored, which confirms this asymmetry. We propose a tobit model with errors following a Birnbaum–Saunders (BS) distribution, which is asymmetrical and has been shown to be a good alternative for describing medical data. Inference based on the maximum likelihood method and a type of residual are derived for the tobit–BS model. We perform global and local influence diagnostics to assess the sensitivity of the maximum likelihood estimators to atypical cases. A Monte Carlo simulation study is carried out to empirically evaluate the performance of these estimators. We conduct a data analysis for the mentioned application of measles vaccine based on the proposed model with the help of the R software. The results show the good performance of the tobit–BS model.
Journal: Journal of Applied Statistics
Pages: 932-955
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1322559
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1322559
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:932-955
Template-Type: ReDIF-Article 1.0
Author-Name: Wei-Ya Wu
Author-X-Name-First: Wei-Ya
Author-X-Name-Last: Wu
Author-Name: Wei-Hwa Wu
Author-X-Name-First: Wei-Hwa
Author-X-Name-Last: Wu
Author-Name: Hsin-Neng Hsieh
Author-X-Name-First: Hsin-Neng
Author-X-Name-Last: Hsieh
Author-Name: Meng-Chih Lee
Author-X-Name-First: Meng-Chih
Author-X-Name-Last: Lee
Title: The generalized inference on the sign testing problem about the normal variances
Abstract:
For the sign testing problem about normal variances, we develop a heuristic testing procedure based on the concepts of the generalized test variable and the generalized p-value. A detailed simulation study is conducted to empirically investigate the performance of the proposed method. In the simulation study, especially with small sample sizes, the proposed test not only adequately controls the empirical size at the nominal level, but is also uniformly more powerful than the likelihood ratio test, Gutmann's test, Li and Sinha's test and Liu and Chan's test, showing that the proposed method can be recommended in practice. The proposed method is illustrated with published data.
Journal: Journal of Applied Statistics
Pages: 956-970
Issue: 5
Volume: 45
Year: 2018
Month: 4
X-DOI: 10.1080/02664763.2017.1325857
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1325857
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:5:p:956-970
Template-Type: ReDIF-Article 1.0
Author-Name: Luisa Rivas
Author-X-Name-First: Luisa
Author-X-Name-Last: Rivas
Author-Name: Manuel Galea
Author-X-Name-First: Manuel
Author-X-Name-Last: Galea
Title: Influence analysis for the generalized Waring regression model
Abstract:
In this paper, we consider a regression model under the generalized Waring distribution for modeling count data. We develop and implement local influence diagnostic techniques based on likelihood displacement, and we also develop case-deletion methods. The generalized Waring regression model is presented as a mixture of the Negative Binomial and the Beta II distributions, and it is compared to the Negative Binomial and Waring regression models. Estimation is performed by the maximum likelihood method. The influence measures developed in this paper are applied to a Spanish football league data set. Empirical results show that the generalized Waring regression model performs better when compared to the Negative Binomial and Waring regression models. Technical details are presented in the Appendix.
Journal: Journal of Applied Statistics
Pages: 1-27
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1670148
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1670148
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:1-27
Template-Type: ReDIF-Article 1.0
Author-Name: Mengmeng Guo
Author-X-Name-First: Mengmeng
Author-X-Name-Last: Guo
Author-Name: Jingyong Su
Author-X-Name-First: Jingyong
Author-X-Name-Last: Su
Author-Name: Li Sun
Author-X-Name-First: Li
Author-X-Name-Last: Sun
Author-Name: Guofeng Cao
Author-X-Name-First: Guofeng
Author-X-Name-Last: Cao
Title: Statistical regression analysis of functional and shape data
Abstract:
We develop a multivariate regression model for responses or predictors that lie on nonlinear manifolds rather than in Euclidean spaces. The nonlinear constraint makes the problem challenging and needs to be studied carefully. By performing principal component analysis (PCA) on the tangent space of the manifold, we instead use the principal directions in the model. Then, ordinary regression tools can be utilized. We apply the framework to both shape data (ozone hole contours) and functional data (spectra of absorbance of meat in the Tecator dataset). Specifically, we adopt the square-root velocity function representation and a parametrization-invariant metric. Experimental results show that we can not only perform powerful regression analysis on non-Euclidean data but also achieve high prediction accuracy with the constructed model.
Journal: Journal of Applied Statistics
Pages: 28-44
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1669541
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1669541
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:28-44
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaowen Dai
Author-X-Name-First: Xiaowen
Author-X-Name-Last: Dai
Author-Name: Zhen Yan
Author-X-Name-First: Zhen
Author-X-Name-Last: Yan
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Author-Name: ManLai Tang
Author-X-Name-First: ManLai
Author-X-Name-Last: Tang
Title: Quantile regression for general spatial panel data models with fixed effects
Abstract:
This paper considers the quantile regression model with both individual fixed effects and time-period effects for general spatial panel data. Fixed-effects quantile regression estimators based on the instrumental variable method are proposed, and their asymptotic properties are developed. Simulations are conducted to study the performance of the proposed method. We illustrate our methodology using a cigarette demand data set.
Journal: Journal of Applied Statistics
Pages: 45-60
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1628190
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1628190
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:45-60
Template-Type: ReDIF-Article 1.0
Author-Name: Caiyun Fan
Author-X-Name-First: Caiyun
Author-X-Name-Last: Fan
Author-Name: Gang Ding
Author-X-Name-First: Gang
Author-X-Name-Last: Ding
Author-Name: Feipeng Zhang
Author-X-Name-First: Feipeng
Author-X-Name-Last: Zhang
Title: A kernel nonparametric quantile estimator for right-censored competing risks data
Abstract:
In medical and epidemiological studies, it is often of interest to study time-to-event distributions under competing risks that involve two or more failure types. Nonparametric analysis of competing risks typically focuses on the cumulative incidence function or the nonparametric quantile function. However, the existing estimators may be very unstable due to their unsmoothness. In this paper, we propose a kernel nonparametric quantile estimator for right-censored competing risks data, which is a smoothed version of Peng and Fine's nonparametric quantile estimator. We establish the Bahadur representation of the proposed estimator. The convergence rate of the remainder term of the proposed estimator is substantially faster than that of Peng and Fine's quantile estimator. Pointwise confidence intervals and simultaneous confidence bands for the quantile functions are also derived. Simulation studies illustrate the good performance of the proposed estimator. The methodology is demonstrated with two applications to the Supreme Court Judge data and the AIDSSI data.
Journal: Journal of Applied Statistics
Pages: 61-75
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1631267
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1631267
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:61-75
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaohui Liu
Author-X-Name-First: Xiaohui
Author-X-Name-Last: Liu
Author-Name: Yang He
Author-X-Name-First: Yang
Author-X-Name-Last: He
Title: RR-plot: a descriptive tool for regression observations
Abstract:
In this paper, we propose a regression depth versus regression depth plot, hereafter RR-plot, for regression observations, based on the halfspace regression depth. Areas of application of this tool include the visualization of hypothesis tests about regression coefficients and the comparison of regression observations from different models. Some characterization theorems are also provided to address the rationale of the RR-plot.
Journal: Journal of Applied Statistics
Pages: 76-90
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1631268
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1631268
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:76-90
Template-Type: ReDIF-Article 1.0
Author-Name: Sanying Feng
Author-X-Name-First: Sanying
Author-X-Name-Last: Feng
Author-Name: Gaorong Li
Author-X-Name-First: Gaorong
Author-X-Name-Last: Li
Author-Name: Tiejun Tong
Author-X-Name-First: Tiejun
Author-X-Name-Last: Tong
Author-Name: Shuanghua Luo
Author-X-Name-First: Shuanghua
Author-X-Name-Last: Luo
Title: Testing for heteroskedasticity in two-way fixed effects panel data models
Abstract:
In this paper, we propose a new method for testing heteroskedasticity in two-way fixed effects panel data models under two important scenarios in which the cross-sectional dimension is large and the temporal dimension is either large or fixed. Specifically, we develop test statistics for both cases under the conditional moment framework, and derive their asymptotic distributions under both the null and alternative hypotheses. The proposed tests are distribution free and can easily be implemented using simple auxiliary regressions. Simulation studies and two real data analyses demonstrate that our proposed tests perform well in practice and may have the potential for wide application in econometric models with panel data.
Journal: Journal of Applied Statistics
Pages: 91-116
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1634682
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1634682
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:91-116
Template-Type: ReDIF-Article 1.0
Author-Name: Yuzhu Tian
Author-X-Name-First: Yuzhu
Author-X-Name-Last: Tian
Author-Name: Liyong Wang
Author-X-Name-First: Liyong
Author-X-Name-Last: Wang
Author-Name: Manlai Tang
Author-X-Name-First: Manlai
Author-X-Name-Last: Tang
Author-Name: Yanchao Zang
Author-X-Name-First: Yanchao
Author-X-Name-Last: Zang
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Title: Likelihood-based quantile autoregressive distributed lag models and its applications
Abstract:
Time-lag effects exist widely in economic processes. Some economic variables are affected not only by various factors in the current period but also by factors in past periods and even by their own past values. As a class of dynamic models, autoregressive distributed lag (ARDL) models are frequently used to conduct dynamic regression analysis. In this paper, we are interested in quantile regression (QR) modeling of the ARDL model in a dynamic framework. By combining the working likelihood of the asymmetric Laplace distribution (ALD) with the expectation–maximization (EM) algorithm in the considered ARDL model, iterative weighted least squares estimators (IWLSE) are derived. Monte Carlo simulations are implemented to evaluate the performance of the proposed estimation method. A dataset on residential electricity consumption is analyzed to illustrate the application.
Journal: Journal of Applied Statistics
Pages: 117-131
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1633285
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1633285
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:117-131
Template-Type: ReDIF-Article 1.0
Author-Name: Abdelghani Hamaz
Author-X-Name-First: Abdelghani
Author-X-Name-Last: Hamaz
Author-Name: Ouerdia Arezki
Author-X-Name-First: Ouerdia
Author-X-Name-Last: Arezki
Author-Name: Farida Achemine
Author-X-Name-First: Farida
Author-X-Name-Last: Achemine
Title: Impact of missing data on the prediction of random fields
Abstract:
The purpose of this paper is to treat prediction problems in which a number of observations are missing from the quarter-plane past of a stationary random field. Our aim is to quantify the influence of missing values on the prediction by giving simple bounds for the prediction error variance. These bounds allow us to characterize the random fields for which the missing observations do not affect the prediction. Simulation experiments and an application to real data are presented.
Journal: Journal of Applied Statistics
Pages: 132-149
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1633286
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1633286
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:132-149
Template-Type: ReDIF-Article 1.0
Author-Name: Joonsung Kang
Author-X-Name-First: Joonsung
Author-X-Name-Last: Kang
Title: Robust estimation for longitudinal data based upon minimum Hellinger distance
Abstract:
Generalized linear mixed models have been widely used in the analysis of correlated data in many research areas. The linear mixed model with normal errors has been a popular model for the analysis of repeated measures and longitudinal data. Outliers, however, can severely distort the linear mixed model, and the aforementioned model does not fully take such severe outliers into consideration. Among the popular robust estimation methods, the M-estimator attains robustness at the expense of first-order or second-order efficiency, whereas the minimum Hellinger distance estimator is both efficient and robust. In this paper, we propose a more robust Bayesian version of parameter estimation via a pseudo posterior distribution based on the minimum Hellinger distance. It accommodates an appropriate nonparametric kernel density estimate for longitudinal data, as required by the proposed cross-validation estimator. We conduct a simulation study and real data analyses with the orthodontic study data and the Alzheimer's Disease (AD) study data. In the simulation study, the proposed method shows smaller biases, mean squared errors, and standard errors than the (residual) maximum likelihood method (REML) in the presence of outliers or missing values. In the real data analyses, standard errors and variance-covariance components for the proposed method in both data sets are shown to be lower than those for the REML method.
Journal: Journal of Applied Statistics
Pages: 150-159
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1635573
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1635573
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:150-159
Template-Type: ReDIF-Article 1.0
Author-Name: K. Krishnamoorthy
Author-X-Name-First: K.
Author-X-Name-Last: Krishnamoorthy
Author-Name: Dustin Waguespack
Author-X-Name-First: Dustin
Author-X-Name-Last: Waguespack
Author-Name: Ngan Hoang-Nguyen-Thuy
Author-X-Name-First: Ngan
Author-X-Name-Last: Hoang-Nguyen-Thuy
Title: Confidence interval, prediction interval and tolerance limits for a two-parameter Rayleigh distribution
Abstract:
The problems of interval estimation for the parameters and the mean of a two-parameter Rayleigh distribution are considered. We propose pivotal-based methods for constructing confidence intervals for the mean, quantiles and survival probability, and for constructing prediction intervals for the mean of a future sample. Pivotal quantities based on the maximum likelihood estimates (MLEs), moment estimates (MEs) and L-moment estimates (L-MEs) are proposed. Interval estimates based on them are compared via Monte Carlo simulation. The comparison studies indicate that the results based on the MEs and the L-MEs are very similar, and that the results based on the MLEs are slightly better than those based on the MEs and the L-MEs for small to moderate sample sizes. The methods are illustrated using an example involving lifetime data.
Journal: Journal of Applied Statistics
Pages: 160-175
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1634681
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1634681
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:160-175
Template-Type: ReDIF-Article 1.0
Author-Name: Suchismita Goswami
Author-X-Name-First: Suchismita
Author-X-Name-Last: Goswami
Author-Name: Edward J. Wegman
Author-X-Name-First: Edward J.
Author-X-Name-Last: Wegman
Title: Detection of excessive activities in time series of graphs
Abstract:
Considerable efforts have been made to apply scan statistics in detecting fraudulent or excessive activities in dynamic email networks. However, previous studies are mostly based on fixed and disjoint windows and on the assumption of short-term stationarity of the series, which might result in loss of information and error in detecting excessive activities. Here we devise scan statistics with variable and overlapping windows on stationary time series of organizational emails using a two-step process, and use the likelihood function to rank the clusters. We initially estimate the log-likelihood ratio to obtain a primary cluster of communications using the Poisson model on the email count series, and then extract neighborhood ego subnetworks around the observed primary cluster to obtain a more refined cluster, invoking the graph invariant betweenness as the locality statistic under the binomial model. The results are then compared with the nonparametric maximum likelihood estimation method and with residual analysis of an ARMA model fitted to the time series of graph edit distances. We demonstrate that scan statistics with the two-step process are effective in detecting excessive activity in large dynamic social networks.
Journal: Journal of Applied Statistics
Pages: 176-200
Issue: 1
Volume: 47
Year: 2020
Month: 1
X-DOI: 10.1080/02664763.2019.1634680
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1634680
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:1:p:176-200
Template-Type: ReDIF-Article 1.0
Author-Name: M. Cannas
Author-X-Name-First: M.
Author-X-Name-Last: Cannas
Author-Name: C. Conversano
Author-X-Name-First: C.
Author-X-Name-Last: Conversano
Author-Name: F. Mola
Author-X-Name-First: F.
Author-X-Name-Last: Mola
Author-Name: E. Sironi
Author-X-Name-First: E.
Author-X-Name-Last: Sironi
Title: Variation in caesarean delivery rates across hospitals: a Bayesian semi-parametric approach
Abstract:
This article presents a Bayesian semi-parametric approach for modeling the occurrence of caesarean sections using a sample of women delivering in 20 hospitals of Sardinia (Italy). A multilevel logistic regression has been fitted to the data using a Dirichlet process prior for modeling the random-effects distribution of the unobserved factors at the hospital level. Using the estimated random effects at the hospital level, a partition of the hospitals in terms of similar medical practice has been obtained that identifies different profiles of hospitals in terms of caesarean section risk. The limited number of clusters may be useful for suggesting policy implications that help to reduce the heterogeneity of caesarean delivery risks.
Journal: Journal of Applied Statistics
Pages: 2095-2107
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247785
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2095-2107
Template-Type: ReDIF-Article 1.0
Author-Name: Z. Naji
Author-X-Name-First: Z.
Author-X-Name-Last: Naji
Author-Name: A. Rasekh
Author-X-Name-First: A.
Author-X-Name-Last: Rasekh
Author-Name: E. L. Boone
Author-X-Name-First: E. L.
Author-X-Name-Last: Boone
Title: Local influence in seemingly unrelated regression model with ridge estimate
Abstract:
Local influence is a well-known method for identifying the influential observations in a dataset and is commonly needed in statistical analysis. In this paper, we study the local influence on the parameters of interest in the seemingly unrelated regression model with ridge estimation, when collinearity exists among the explanatory variables. We examine two types of perturbation schemes to identify influential observations: the perturbation of variance and the perturbation of individual explanatory variables. Finally, the efficacy of our proposed method is illustrated by analyzing the productivity dataset of [13].
Journal: Journal of Applied Statistics
Pages: 2108-2124
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247787
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2108-2124
Template-Type: ReDIF-Article 1.0
Author-Name: Clécio S. Ferreira
Author-X-Name-First: Clécio S.
Author-X-Name-Last: Ferreira
Author-Name: Camila B. Zeller
Author-X-Name-First: Camila B.
Author-X-Name-Last: Zeller
Author-Name: Aparecida M. S. Mimura
Author-X-Name-First: Aparecida M. S.
Author-X-Name-Last: Mimura
Author-Name: Júlio C. J. Silva
Author-X-Name-First: Júlio C. J.
Author-X-Name-Last: Silva
Title: Partially linear models and their applications to change point detection of chemical process data
Abstract:
In many chemical data sets, the amount of radiation absorbed (absorbance) is related to the concentration of the element in the sample by Lambert–Beer's law. However, this relation changes abruptly when the variable concentration reaches an unknown threshold level, the so-called change point. In the context of analytical chemistry, there are many methods that describe the relationship between absorbance and concentration, but none of them provide inferential procedures to detect change points. In this paper, we propose partially linear models with a change point separating the parametric and nonparametric components. The Schwarz information criterion is used to locate a change point. A back-fitting algorithm is presented to obtain parameter estimates and the penalized Fisher information matrix is obtained to calculate the standard errors of the parameter estimates. To examine the proposed method, we present a simulation study. Finally, we apply the method to data sets from the chemistry area. The partially linear models with a change point developed in this paper are useful supplements to other methods of absorbance–concentration analysis in chemical studies, for example, and in many other practical applications.
Journal: Journal of Applied Statistics
Pages: 2125-2141
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247788
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2125-2141
Template-Type: ReDIF-Article 1.0
Author-Name: Giorgio Calzolari
Author-X-Name-First: Giorgio
Author-X-Name-Last: Calzolari
Author-Name: Antonino Di Pino
Author-X-Name-First: Antonino
Author-X-Name-Last: Di Pino
Title: Self-selection and direct estimation of across-regime correlation parameter
Abstract:
A direct maximum likelihood (ML) procedure to estimate the 'generally unidentified' across-regime correlation parameter in a two-regime endogenous switching model is provided here. The results of a Monte Carlo experiment confirm the consistency of our direct ML procedure and its relative efficiency over widely applied models and methods. As an empirical application, we estimate a two-regime simultaneous equation model of the domestic work of Italian married women in which the two regimes are given by their working status (employed or unemployed).
Journal: Journal of Applied Statistics
Pages: 2142-2160
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247789
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247789
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2142-2160
Template-Type: ReDIF-Article 1.0
Author-Name: Chien-Chia L. Huang
Author-X-Name-First: Chien-Chia L.
Author-X-Name-Last: Huang
Author-Name: Yow-Jen Jou
Author-X-Name-First: Yow-Jen
Author-X-Name-Last: Jou
Author-Name: Hsun-Jung Cho
Author-X-Name-First: Hsun-Jung
Author-X-Name-Last: Cho
Title: Difference-based matrix perturbation method for semi-parametric regression with multicollinearity
Abstract:
This paper addresses collinearity problems in semi-parametric linear models. Under the difference-based setting, we introduce a new diagnostic, the difference-based variance inflation factor (DVIF), for detecting the presence of multicollinearity in semi-parametric models. The DVIF is then used to devise a difference-based matrix perturbation method for solving the problem. The electricity distribution data set is analyzed, and numerical evidence validates the effectiveness of the proposed method.
Journal: Journal of Applied Statistics
Pages: 2161-2171
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247790
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247790
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2161-2171
Template-Type: ReDIF-Article 1.0
Author-Name: T. Górecki
Author-X-Name-First: T.
Author-X-Name-Last: Górecki
Author-Name: Ł. Smaga
Author-X-Name-First: Ł.
Author-X-Name-Last: Smaga
Title: Multivariate analysis of variance for functional data
Abstract:
Functional data are observed frequently in many scientific fields, and therefore most standard statistical methods are being adapted to functional data. The multivariate analysis of variance (MANOVA) problem for functional data is considered; it is of practical interest, much as one-way analysis of variance is for such data. For the MANOVA problem for multivariate functional data, we propose permutation tests based on a basis function representation and tests based on random projections. Their performance is examined in comprehensive simulation studies, which provide an idea of the size control and power of the tests and identify differences between them. The simulation experiments are based on artificial data and on real labeled multivariate time series data found in the literature. The results suggest that the studied testing procedures can detect small differences between vectors of curves even with small sample sizes. Illustrative real data examples of the use of the proposed testing procedures in practice are also presented.
Journal: Journal of Applied Statistics
Pages: 2172-2189
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247791
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247791
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2172-2189
Template-Type: ReDIF-Article 1.0
Author-Name: Nicholas T. Longford
Author-X-Name-First: Nicholas T.
Author-X-Name-Last: Longford
Author-Name: José Rafael Tovar Cuevas
Author-X-Name-First: José Rafael
Author-X-Name-Last: Tovar Cuevas
Author-Name: Carlos Alvear
Author-X-Name-First: Carlos
Author-X-Name-Last: Alvear
Title: Analysis of a marker for cancer of the thyroid with a limit of detection
Abstract:
Limit of detection (LoD) is a common problem in the analysis of data generated by instruments that cannot detect very small concentrations or other quantities, resulting in left-censored measurements. Methods intended for data that are not subject to this problem are often difficult to modify for censoring. We adapt the simulation-extrapolation method, devised originally for fitting models with measurement error, to dealing with LoD in conjunction with a mixture analysis. The application relates the levels of thyroglobulin in individuals with cancer of the thyroid before and after treatment with radioactive iodine I–131. We conclude that the fitted mixture components correspond to levels of effectiveness of the treatment.
Journal: Journal of Applied Statistics
Pages: 2190-2203
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247792
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2190-2203
Template-Type: ReDIF-Article 1.0
Author-Name: Liang Yan
Author-X-Name-First: Liang
Author-X-Name-Last: Yan
Author-Name: Rui Wang
Author-X-Name-First: Rui
Author-X-Name-Last: Wang
Author-Name: Xingzhong Xu
Author-X-Name-First: Xingzhong
Author-X-Name-Last: Xu
Title: A new confidence interval in errors-in-variables model with known error variance
Abstract:
This paper considers constructing a new confidence interval for the slope parameter in the structural errors-in-variables model with known error variance associated with the regressors. Existing confidence intervals are so severely affected by the Gleser–Hwang effect that they tend to have poor empirical coverage probabilities and unsatisfactory lengths. Moreover, these problems worsen as the reliability ratio decreases, which also results in the more frequent nonexistence of some existing intervals. To ease these issues, this paper presents a fiducial generalized confidence interval which maintains the correct asymptotic coverage. Simulation results show that this fiducial interval is slightly conservative while often having average length comparable to or shorter than those of the other methods. Finally, we illustrate these confidence intervals with two real data examples; in the second example some existing intervals do not exist.
Journal: Journal of Applied Statistics
Pages: 2204-2221
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1247793
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1247793
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2204-2221
Template-Type: ReDIF-Article 1.0
Author-Name: Jung In Seo
Author-X-Name-First: Jung In
Author-X-Name-Last: Seo
Author-Name: Yongku Kim
Author-X-Name-First: Yongku
Author-X-Name-Last: Kim
Title: Objective Bayesian analysis based on upper record values from two-parameter Rayleigh distribution with partial information
Abstract:
In life testing, predicting failure times beyond the largest observed failure time is an important issue. Although the Rayleigh distribution is a suitable model for analyzing the lifetime of components that age rapidly over time, because its failure rate function is an increasing linear function of time, inference for a two-parameter Rayleigh distribution based on upper record values has not been addressed from the Bayesian perspective. This paper provides Bayesian analysis methods by proposing a noninformative prior distribution to analyze survival data, using a two-parameter Rayleigh distribution based on record values. In addition, we provide a pivotal quantity and an algorithm based on it to predict the behavior of future survival records. We show through Monte Carlo simulations that the proposed method is superior to the frequentist counterpart in terms of mean squared error and bias. For illustrative purposes, survival data on lung cancer patients are analyzed, and it is shown that the proposed model can be a good alternative when prior information is not available.
Journal: Journal of Applied Statistics
Pages: 2222-2237
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1251886
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1251886
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2222-2237
Template-Type: ReDIF-Article 1.0
Author-Name: Yeşim Güney
Author-X-Name-First: Yeşim
Author-X-Name-Last: Güney
Author-Name: Yetkin Tuaç
Author-X-Name-First: Yetkin
Author-X-Name-Last: Tuaç
Author-Name: Olcay Arslan
Author-X-Name-First: Olcay
Author-X-Name-Last: Arslan
Title: Marshall–Olkin distribution: parameter estimation and application to cancer data
Abstract:
In this study, as alternatives to the maximum likelihood (ML) and frequency estimators, we propose robust estimators for the parameters of the Zipf and Marshall–Olkin Zipf distributions. A small simulation study is given to illustrate the performance of the proposed estimators. We apply the proposed estimators to a real data set from cancer research to illustrate their performance relative to the ML, moment and frequency estimators. We observe that the robust estimators are superior to the frequency estimators based on the classical sample mean.
Journal: Journal of Applied Statistics
Pages: 2238-2250
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1252730
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1252730
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2238-2250
Template-Type: ReDIF-Article 1.0
Author-Name: M. Revan Özkale
Author-X-Name-First: M. Revan
Author-X-Name-Last: Özkale
Author-Name: Funda Can
Author-X-Name-First: Funda
Author-X-Name-Last: Can
Title: An evaluation of ridge estimator in linear mixed models: an example from kidney failure data
Abstract:
This paper is concerned with the ridge estimation of fixed and random effects in the context of Henderson's mixed model equations in the linear mixed model. For this purpose, a penalized likelihood method is proposed. A linear combination of ridge estimators for fixed and random effects is compared to a linear combination of best linear unbiased estimators for fixed and random effects under the mean squared error (MSE) matrix criterion. Additionally, a method for choosing the biasing parameter based on the MSE of the ridge estimator is given. A real data analysis is provided to illustrate the theoretical results, and a simulation study is conducted to characterize the performance of the ridge and best linear unbiased estimator approaches in the linear mixed model.
Journal: Journal of Applied Statistics
Pages: 2251-2269
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1252732
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1252732
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2251-2269
Template-Type: ReDIF-Article 1.0
Author-Name: Jiajia Chen
Author-X-Name-First: Jiajia
Author-X-Name-Last: Chen
Author-Name: Xiaoqin Zhang
Author-X-Name-First: Xiaoqin
Author-X-Name-Last: Zhang
Author-Name: Shengjia Li
Author-X-Name-First: Shengjia
Author-X-Name-Last: Li
Title: Multiple linear regression with compositional response and covariates
Abstract:
The standard regression model designed for real space is not suitable for compositional variables; whether the response and/or covariates are of a compositional nature should be considered. There are usually three types of multiple regression model with compositional variables: Type 1 refers to the case where all the covariates are compositional and the response is real; Type 2 is the opposite of Type 1; Type 3 is the model with compositional response and covariates. Models exist for all three types. In this paper, we focus on Type 3 and propose multiple linear regression models, including a model in the simplex and a model in isometric log-ratio (ilr) coordinates. The model in the simplex is based on a matrix product, which can project a $D_{1}$-part composition to another $D_{2}$-part composition and can deal with compositional variables with different numbers of parts. Some theorems are given that point out the relationships between the parameters of the proposed models. Moreover, inference for the parameters of the proposed models is also given. A real example is studied to verify the validity and usefulness of the proposed models.
Journal: Journal of Applied Statistics
Pages: 2270-2285
Issue: 12
Volume: 44
Year: 2017
Month: 9
X-DOI: 10.1080/02664763.2016.1157145
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1157145
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:12:p:2270-2285
Template-Type: ReDIF-Article 1.0
Author-Name: Zheng Wei
Author-X-Name-First: Zheng
Author-X-Name-Last: Wei
Author-Name: Erin M. Conlon
Author-X-Name-First: Erin M.
Author-X-Name-Last: Conlon
Title: Parallel Markov chain Monte Carlo for Bayesian hierarchical models with big data, in two stages
Abstract:
Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common to the full data set, with most parameters being group specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that the results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method over existing parallel MCMC computing methods are also described.
Journal: Journal of Applied Statistics
Pages: 1917-1936
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1572723
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1572723
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:1917-1936
Template-Type: ReDIF-Article 1.0
Author-Name: Dileep Kumar M.
Author-X-Name-First: Dileep
Author-X-Name-Last: Kumar M.
Author-Name: Sankaran P.G.
Author-X-Name-First: Sankaran
Author-X-Name-Last: P.G.
Author-Name: Unnikrishnan Nair N.
Author-X-Name-First: Unnikrishnan
Author-X-Name-Last: Nair N.
Title: Proportional odds model – a quantile approach
Abstract:
The paper discusses a quantile-based definition for the well-known proportional odds model. We present various reliability properties of the model using quantile functions. Different ageing properties are derived. A generalization for the class of distributions with bilinear hazard quantile function is established and the practical application of this model is illustrated with a real-life data set.
Journal: Journal of Applied Statistics
Pages: 1937-1955
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1572724
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1572724
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:1937-1955
Template-Type: ReDIF-Article 1.0
Author-Name: Alka Sabharwal
Author-X-Name-First: Alka
Author-X-Name-Last: Sabharwal
Author-Name: Gurprit Grover
Author-X-Name-First: Gurprit
Author-X-Name-Last: Grover
Author-Name: Sakshi Kaushik
Author-X-Name-First: Sakshi
Author-X-Name-Last: Kaushik
Title: Testing the difference between bipolar disorder and schizophrenia on the basis of the severity of symptoms with C(α) test
Abstract:
Bipolar disorder and schizophrenia share some key symptoms, which leads to misdiagnosis, especially on initial presentation. In this study, we consider two categories of patients with schizophrenia and bipolar disorder: (i) total duration of illness (TDI) less than or equal to 2 years and (ii) TDI greater than 2 years. We statistically test the difference between the severity of symptoms of the two groups, as measured by their respective psychiatric rating scales, using $C(\alpha)$ (or score) tests, likelihood ratio tests and permutation tests for both categories of patients. The unknown parameters are estimated using maximum likelihood, Cran's method of moments and Bayesian estimation. Based on real and simulated data, a significant difference between the two disorders is observed for patients in the second category. Further, the performance of the $C(\alpha)$ statistic is compared with that of the other two methods on the basis of p-value and power. A new weight suggested in this paper is found to be as efficient as the previous weight, based on a simulation study. The retrospective data of 108 patients diagnosed with schizophrenia and bipolar disorder were collected from Lady Hardinge Medical College & Smt. S.K. Hospital, New Delhi, India for the calendar year 2013–2014.
Journal: Journal of Applied Statistics
Pages: 2101-2110
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1573882
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1573882
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:2101-2110
Template-Type: ReDIF-Article 1.0
Author-Name: Fengqing Zhang
Author-X-Name-First: Fengqing
Author-X-Name-Last: Zhang
Author-Name: Jiangtao Gou
Author-X-Name-First: Jiangtao
Author-X-Name-Last: Gou
Title: Control of false positive rates in clusterwise fMRI inferences
Abstract:
Random field theory (RFT) provided a theoretical foundation for cluster-extent-based thresholding, the most widely used method for multiple comparison correction of statistical maps in neuroimaging research. However, several studies questioned the validity of the standard clusterwise inference in fMRI analyses and observed inflated false positive rates. In particular, Eklund et al. [Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates, Proc. Natl. Acad. Sci. 113 (2016), pp. 7900–7905. Available at http://www.pnas.org/content/113/28/7900.abstract] used resting-state fMRI as null data and found false positive rates of up to 70%, which immediately led to many discussions. In this study, we summarize the assumptions in RFT clusterwise inference and propose new parametric ways to approximate the distribution of the cluster size by properly combining the limiting distribution of the cluster size given by Nosko [Local structure of Gaussian random fields in the vicinity of high-level shines, Sov. Math. Dokl. 10 (1969), pp. 1481–1484] and the expected value of the cluster size provided by Friston et al. [Assessing the significance of focal activations using their spatial extent, Hum. Brain Mapp. 1 (1994), pp. 210–220. Available at http://dx.doi.org/10.1002/hbm.460010306]. We evaluated our proposed method using four different classic simulation settings in published papers. Results show that our method produces a more stringent estimation of cluster extent size, which leads to a better control of false positive rates.
Journal: Journal of Applied Statistics
Pages: 1956-1972
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1573883
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1573883
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:1956-1972
Template-Type: ReDIF-Article 1.0
Author-Name: Yunfei Wei
Author-X-Name-First: Yunfei
Author-X-Name-Last: Wei
Author-Name: Shifeng Xiong
Author-X-Name-First: Shifeng
Author-X-Name-Last: Xiong
Title: Bayesian integrative analysis for multi-fidelity computer experiments
Abstract:
This paper proposes a Bayesian integrative analysis method for linking multi-fidelity computer experiments. Instead of assuming covariance structures of multivariate Gaussian process models, we handle the outputs from different levels of accuracy as independent processes and link them via a penalization method that controls the distance between their overall trends. Based on the priors induced by the penalty, we build Bayesian prediction models for the output at the highest accuracy. Simulated and real examples show that the proposed method is better than existing methods in terms of prediction accuracy for many cases.
Journal: Journal of Applied Statistics
Pages: 1973-1987
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1575340
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1575340
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:1973-1987
Template-Type: ReDIF-Article 1.0
Author-Name: Yue Shi
Author-X-Name-First: Yue
Author-X-Name-Last: Shi
Author-Name: Chi Tim Ng
Author-X-Name-First: Chi Tim
Author-X-Name-Last: Ng
Author-Name: Zhiguo Feng
Author-X-Name-First: Zhiguo
Author-X-Name-Last: Feng
Author-Name: Ka-Fai Cedric Yiu
Author-X-Name-First: Ka-Fai Cedric
Author-X-Name-Last: Yiu
Title: A descent algorithm for constrained LAD-Lasso estimation with applications in portfolio selection
Abstract:
To improve the out-of-sample performance of the portfolio, Lasso regularization is incorporated into the Mean Absolute Deviation (MAD)-based portfolio selection method. It is shown that such a portfolio selection problem can be reformulated as a constrained Least Absolute Deviation problem with linear equality constraints. Moreover, we propose a new descent algorithm based on the ideas of ‘nonsmooth optimality conditions’ and ‘basis descent direction set’. The resulting MAD-Lasso method enjoys at least two advantages. First, it does not involve the estimation of the covariance matrix, which is particularly difficult in high-dimensional settings. Second, sparsity is encouraged: assets with weights close to zero in the Markowitz portfolio are driven to zero automatically, which reduces the management cost of the portfolio. Extensive simulation and real data examples indicate that when Lasso regularization is incorporated, the MAD portfolio selection method is consistently improved in terms of out-of-sample performance, as measured by the Sharpe ratio and sparsity. Moreover, simulation results suggest that the proposed descent algorithm is more time-efficient than the interior point method and the ADMM algorithm.
Journal: Journal of Applied Statistics
Pages: 1988-2009
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1575952
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1575952
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:1988-2009
Template-Type: ReDIF-Article 1.0
Author-Name: A. Hajrajabi
Author-X-Name-First: A.
Author-X-Name-Last: Hajrajabi
Author-Name: M. Maleki
Author-X-Name-First: M.
Author-X-Name-Last: Maleki
Title: Nonlinear semiparametric autoregressive model with finite mixtures of scale mixtures of skew normal innovations
Abstract:
We propose data generating structures which can be represented as nonlinear autoregressive models with single and finite mixtures of scale mixtures of skew normal innovations. This class of models covers symmetric/asymmetric and light/heavy-tailed distributions, and thus provides a useful generalization of the symmetrical nonlinear autoregressive models. As semiparametric and nonparametric curve estimation are the standard approaches for exploring the structure of a nonlinear time series data set, in this article a semiparametric estimator of the nonlinear function of the model is investigated, based on the conditional least squares method and the nonparametric kernel approach. An Expectation–Maximization-type algorithm is also proposed to perform maximum likelihood (ML) inference on the unknown parameters of the model. Furthermore, strong and weak consistency results for the semiparametric estimator in this class of models are presented. Finally, to illustrate the usefulness of the proposed model, some simulation studies and an application to a real data set are considered.
Journal: Journal of Applied Statistics
Pages: 2010-2029
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1575953
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1575953
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:2010-2029
Template-Type: ReDIF-Article 1.0
Author-Name: Tom Fong
Author-X-Name-First: Tom
Author-X-Name-Last: Fong
Author-Name: Ceara Hui
Author-X-Name-First: Ceara
Author-X-Name-Last: Hui
Author-Name: Alfred Y.-T. Wong
Author-X-Name-First: Alfred Y.-T.
Author-X-Name-Last: Wong
Title: How might sovereign bond yields in Asia Pacific react to US monetary normalisation under turbulent market conditions?
Abstract:
This paper examines the potential impact of US monetary normalisation on sovereign bond yields in Asia Pacific. We apply the quantile vector autoregressive model with principal component analysis to the assessment of tail risk of sovereign debt, which may not be detectable using traditional OLS-based analysis. Our empirical evidence suggests that US Treasury bond yields can have a significant impact on sovereign bond yields in the region, an important channel through which monetary normalisation by the Fed can affect Asia-Pacific economies. Increases in sovereign bond yields will not only compromise the ability of the sovereigns in the region to service their debt but also translate into higher costs of borrowing for the rest of the economy. The results show how large the impact could potentially be if US monetary normalisation turns out to be much more disorderly than expected.
Journal: Journal of Applied Statistics
Pages: 2030-2055
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1579305
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1579305
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:2030-2055
Template-Type: ReDIF-Article 1.0
Author-Name: Yong Wang
Author-X-Name-First: Yong
Author-X-Name-Last: Wang
Author-Name: Xuxu Wang
Author-X-Name-First: Xuxu
Author-X-Name-Last: Wang
Title: Classification using semiparametric mixtures
Abstract:
A new density-based classification method that uses semiparametric mixtures is proposed. Like other density-based classifiers, it first estimates the probability density function for the observations in each class, with a semiparametric mixture, and then classifies a new observation by the highest posterior probability. By making proper use of a recently developed multivariate nonparametric density estimator, it is able to produce adaptively smooth and complicated decision boundaries in a high-dimensional space and can thus work well in such cases. Issues specific to classification are studied and discussed. Numerical studies using simulated and real-world data show that the new classifier performs very well compared with other commonly used classification methods.
Journal: Journal of Applied Statistics
Pages: 2056-2074
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1579306
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1579306
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:2056-2074
Template-Type: ReDIF-Article 1.0
Author-Name: Leila Amiri
Author-X-Name-First: Leila
Author-X-Name-Last: Amiri
Author-Name: Mojtaba Khazaei
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Khazaei
Author-Name: Mojtaba Ganjali
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Ganjali
Title: Mixtures of general location model with factor analyzer covariance structure for clustering mixed type data
Abstract:
Cluster analysis is one of the most widely used methods in statistical analyses, in which homogeneous subgroups are identified in a heterogeneous population. Because mixed continuous and discrete data arise in many applications, ordinary clustering methods such as hierarchical methods, k-means and model-based methods have been extended to the analysis of mixed data. However, in the available model-based clustering methods, as the number of continuous variables increases, the number of parameters increases, and identifying as well as fitting an appropriate model may be difficult. In this paper, to reduce the number of parameters, a set of parsimonious models is introduced for model-based clustering of mixed continuous (normal) and nominal data. Models in this set are extended, using the general location model approach, for modeling the distribution of mixed variables, with a factor analyzer structure applied to the covariance matrices. The ECM algorithm is used for estimating the parameters of these models. The performance of the proposed models for clustering is illustrated with results from simulation studies and analyses of two real data sets.
Journal: Journal of Applied Statistics
Pages: 2075-2100
Issue: 11
Volume: 46
Year: 2019
Month: 8
X-DOI: 10.1080/02664763.2019.1579307
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1579307
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:11:p:2075-2100
Template-Type: ReDIF-Article 1.0
Author-Name: Robert G. Aykroyd
Author-X-Name-First: Robert G.
Author-X-Name-Last: Aykroyd
Author-Name: M. Rosario González-Rodríguez
Author-X-Name-First: M. Rosario
Author-X-Name-Last: González-Rodríguez
Author-Name: Biagio Simonetti
Author-X-Name-First: Biagio
Author-X-Name-Last: Simonetti
Author-Name: Massimo Squillante
Author-X-Name-First: Massimo
Author-X-Name-Last: Squillante
Title: Editorial
Journal: Journal of Applied Statistics
Pages: 2347-2347
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1213003
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1213003
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2347-2347
Template-Type: ReDIF-Article 1.0
Author-Name: M. A. Di Palma
Author-X-Name-First: M. A.
Author-X-Name-Last: Di Palma
Author-Name: M. Gallo
Author-X-Name-First: M.
Author-X-Name-Last: Gallo
Title: A co-median approach to detect compositional outliers
Abstract:
Compositional data consist of vectors of positive values summing to a unit or to some fixed constant. They find application in chemometrics, geology, economics, psychometrics and many other fields of study. In statistical analysis, many theoretical efforts have been dedicated to identifying procedures able to accommodate outliers in the estimation of the model, including for compositional data. The principal purpose of this work is to introduce an alternative robust procedure, called COMCoDa, capable of coping with compositional outliers and based on the median absolute deviation (MAD) and the correlation median. The new method is first evaluated in a simulation study and then on real data sets. The algorithm requires considerably less computational time than other procedures already existing in the literature, and it works well for huge compositional data sets at any level of contamination.
Journal: Journal of Applied Statistics
Pages: 2348-2362
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1163525
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1163525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2348-2362
Template-Type: ReDIF-Article 1.0
Author-Name: A.A. Romano
Author-X-Name-First: A.A.
Author-X-Name-Last: Romano
Author-Name: G. Scandurra
Author-X-Name-First: G.
Author-X-Name-Last: Scandurra
Title: Divergences in the determinants of investments in renewable energy sources: hydroelectric vs. other renewable sources
Abstract:
In this paper, we analyze the drivers promoting investments in renewable energy sources (RES) and the divergences among generation sources (hydroelectric and other renewable sources). To address these issues, a dynamic panel analysis of renewable investments in a sample of 32 countries (Organisation for Economic Co-operation and Development members and Brazil, Russia, India, China and South Africa) with distinct economic and social structures, in the years between 2000 and 2008, is proposed. Results confirm that the key factors promoting investments in RES vary according to the generation sources considered. Investments in hydroelectric sources contribute to improving environmental conditions, while the other sources are not significant. Policies are useful to support investments in renewable energy. Results also show that the shares of nuclear and thermal electricity generation depress investments in renewables.
Journal: Journal of Applied Statistics
Pages: 2363-2376
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1163526
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1163526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2363-2376
Template-Type: ReDIF-Article 1.0
Author-Name: Md Hasinur Rahaman Khan
Author-X-Name-First: Md Hasinur
Author-X-Name-Last: Rahaman Khan
Author-Name: A. M. Azharul Islam
Author-X-Name-First: A. M. Azharul
Author-X-Name-Last: Islam
Author-Name: Faisal Ababneh
Author-X-Name-First: Faisal
Author-X-Name-Last: Ababneh
Title: Substantial gender gap reduction in Bangladesh explained by the proximity measure of literacy and life expectancy
Abstract:
The Human Development Index (HDI) is an indicator that substantially captures the overall country-level status of human welfare based on issues of equity, poverty, and gender. This study uses a proximity measure of the simultaneous effect of literacy and life expectancy, called literate life expectancy (LLE), as a measure of human quality. The study discusses the distribution of LLE along with detailed gender and spatial differentials. With the proximity indicator we quantify the gender gap between the years 1981 and 2008. Over these 27 years, substantially greater improvement in LLE is found among women than among men, at both the national and residence levels. We also find that, measured over time, the indicator allows statements about the rate of change and not just static differences. The LLE is useful because the index could be used to project future social development by adopting different mortality and educational scenarios, covering health treatment facilities, nutritious food, easy access to clean drinking water, air pollution, greenhouse emissions, psychological stress, and, most importantly, poverty, which can be associated with specific policy assumptions.
Journal: Journal of Applied Statistics
Pages: 2377-2395
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1163527
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1163527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2377-2395
Template-Type: ReDIF-Article 1.0
Author-Name: M. Rosario González-Rodríguez
Author-X-Name-First: M. Rosario
Author-X-Name-Last: González-Rodríguez
Author-Name: M. Carmen Díaz Fernández
Author-X-Name-First: M. Carmen
Author-X-Name-Last: Díaz Fernández
Author-Name: Biagio Simonetti
Author-X-Name-First: Biagio
Author-X-Name-Last: Simonetti
Title: Corporate Social Responsibility perception versus human values: a structural equation modeling approach
Abstract:
In the business world, increasing importance is being given to Corporate Social Responsibility (CSR). Consumer perception of CSR is a determinant of the success of CSR practices, and this perception is directly influenced by individual value structures. Despite research efforts and the continued preoccupation with the role of CSR in business and society, few studies to date have jointly analyzed CSR perception and the value structure. As a result, the paper brings new knowledge of the relationship between basic human values and CSR perception under a particular social initiative carried out by a company. To reach our purpose, a Hierarchical Component Model which includes the variable gender to control for heterogeneity is adopted. The model not only analyzes the effects of human values on CSR but also the influence of values by gender on CSR perception. This approach to studying the relationship of CSR versus values, considering Schwartz's higher-order values and the moderating role of gender, constitutes a new perspective. The main results of this study reveal the influence of values on CSR, the strength of those relationships and the importance of analyzing the moderator effect to control for heterogeneity.
Journal: Journal of Applied Statistics
Pages: 2396-2415
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1163528
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1163528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2396-2415
Template-Type: ReDIF-Article 1.0
Author-Name: Amir T. Payandeh Najafabadi
Author-X-Name-First: Amir T.
Author-X-Name-Last: Payandeh Najafabadi
Author-Name: Maryam Omidi Najafabadi
Author-X-Name-First: Maryam Omidi
Author-X-Name-Last: Najafabadi
Title: On the Bayesian estimation for Cronbach's alpha
Abstract:
This article considers the problem of estimating Cronbach's alpha under a Bayesian framework. The Bayes estimator is obtained by approximating the distribution of the maximum likelihood estimator for Cronbach's alpha by an F distribution. Then, employing a noninformative prior distribution, Bayes estimators under squared-error and LINEX loss functions are evaluated. Simulation studies suggest that the Bayes estimator under the LINEX loss function reduces the bias of the ordinary maximum likelihood estimator. Moreover, the LINEX Bayes estimator is not sensitive to the choice of hyperparameters of the prior distribution. R code for readers to calculate the Bayesian Cronbach's alpha is provided.
Journal: Journal of Applied Statistics
Pages: 2416-2441
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1163529
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1163529
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2416-2441
Template-Type: ReDIF-Article 1.0
Author-Name: Sergio Scippacercola
Author-X-Name-First: Sergio
Author-X-Name-Last: Scippacercola
Author-Name: Enrica Sepe
Author-X-Name-First: Enrica
Author-X-Name-Last: Sepe
Title: Ordinal principal component analysis for a common ranking of stochastic frontiers
Abstract:
Stochastic Frontier Analysis (SFA) is a model for evaluating the Technical Efficiency (TE) of Production Units (PU). When SFA is applied to different output variables with the same input, the analysis estimates different TEs for each PU. We refer to these TEs as the Multiple Technical Efficiency (MTE) of the PU. In this work, we present a method to unify the MTE in one ranking, in order to compute a synthetic index of the TE based on a parametric model. Our approach transforms the measures of efficiency into values on an ordinal scale. Then, using Ordinal Principal Component Analysis and a genetic algorithm, we merge the multiple rankings.
Journal: Journal of Applied Statistics
Pages: 2442-2451
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1163530
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1163530
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2442-2451
Template-Type: ReDIF-Article 1.0
Author-Name: Alejandra Figliola
Author-X-Name-First: Alejandra
Author-X-Name-Last: Figliola
Author-Name: Lucas Catalano
Author-X-Name-First: Lucas
Author-X-Name-Last: Catalano
Title: Evolution of multifractal cross-correlations between the Argentina MERVAL Index and international commodities prices
Abstract:
We compute the auto-correlations and cross-correlations of the volatility time series of the Argentina MERVAL Index (the Buenos Aires Stock Exchange main index) and three agricultural commodities, in a multifractal context using the Detrended Cross-Correlation Analysis [12]. We observe a clear increase of the cross-correlations between the MERVAL series and the grain quotations, which can be ascribed to a stronger coupling between the agricultural sector and the rest of the Argentinian economy. We connect this to fiscal decisions implemented since 2004 and reinforced after 2009.
Journal: Journal of Applied Statistics
Pages: 2452-2461
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1181725
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1181725
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2452-2461
Template-Type: ReDIF-Article 1.0
Author-Name: C.K. Chandrasekhar
Author-X-Name-First: C.K.
Author-X-Name-Last: Chandrasekhar
Author-Name: H. Bagyalakshmi
Author-X-Name-First: H.
Author-X-Name-Last: Bagyalakshmi
Author-Name: M.R. Srinivasan
Author-X-Name-First: M.R.
Author-X-Name-Last: Srinivasan
Author-Name: M. Gallo
Author-X-Name-First: M.
Author-X-Name-Last: Gallo
Title: Partial ridge regression under multicollinearity
Abstract:
In multiple linear regression analysis, linear dependencies among the regressor variables lead to ill-conditioning known as multicollinearity. Multicollinearity inflates the variance of the estimates and can change the signs of the coefficient estimates, leading to unreliable and often erroneous inference. Principal components regression and the ridge or shrinkage approach have not provided completely satisfactory results in dealing with multicollinearity. A host of issues in ridge regression, such as choosing the biasing constant k and the stability or consistency of the variances, remain unresolved. In this paper, a partial ridge regression estimator is proposed, which selectively adjusts the ridge constants associated with highly collinear variables to control instability in the variances of the coefficient estimates. Results based on synthetic data from simulations, and a real-world data set from the manufacturing industry, show that the proposed method outperforms existing solutions in terms of bias, mean square error, and relative efficiency of the estimated parameters.
Journal: Journal of Applied Statistics
Pages: 2462-2473
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1181726
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1181726
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2462-2473
Template-Type: ReDIF-Article 1.0
Author-Name: Mehmet Ozer Demir
Author-X-Name-First: Mehmet Ozer
Author-X-Name-Last: Demir
Author-Name: Murat Alper Basaran
Author-X-Name-First: Murat Alper
Author-X-Name-Last: Basaran
Author-Name: Biagio Simonetti
Author-X-Name-First: Biagio
Author-X-Name-Last: Simonetti
Title: Determining factors affecting healthcare service satisfaction utilizing fuzzy rule-based systems
Abstract:
Health communication, which is a multi-attribute concept, is a generic title describing clinical practice. The literature shows that the relation between health communication and healthcare service satisfaction (HSS) is significant. The main objective of pursuing better health communication is to achieve the best outcome and the patient satisfaction that healthcare systems are supposed to deliver. However, health communication is a complex process, and measuring patients’ satisfaction is not an easy task, since satisfaction is a complex notion with several factors. In this study, questions in a questionnaire directed to patients are factor-analyzed in order to obtain components which are used as independent attributes, modeled by fuzzy rule-based systems (FRBS), to explain HSS. Utilizing FRBS brings two advantages: one is the use of mathematical functions called membership functions for linguistically expressed responses; the second is the ability to observe the transition among the linguistic values expressed by patients. The four independent variables, namely doctor–patient communication (DPC), information seeking behavior (ISB), equal behavior and tolerance to cultural differences (TCD), and the dependent variable HSS are employed in the modeling. Although both DPC and ISB have positive effects on HSS, TCD has none. One interesting finding about DPC is that scores falling below its average value do not further decrease HSS; that is, a patient who already expects poor communication from the doctor does not lower his or her evaluation of HSS on account of this attribute.
Journal: Journal of Applied Statistics
Pages: 2474-2489
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1181727
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1181727
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2474-2489
Template-Type: ReDIF-Article 1.0
Author-Name: Pasquale Sarnacchiaro
Author-X-Name-First: Pasquale
Author-X-Name-Last: Sarnacchiaro
Author-Name: Antonello D’Ambra
Author-X-Name-First: Antonello
Author-X-Name-Last: D’Ambra
Author-Name: Luigi D’Ambra
Author-X-Name-First: Luigi
Author-X-Name-Last: D’Ambra
Title: CATANOVA for ordinal variables using orthogonal polynomials with different scoring methods
Abstract:
In the context of categorical data analysis, the CATegorical ANalysis Of Variance (CATANOVA) has been proposed to analyse the variable-factor scheme, for both nominal and ordinal variables. This method is based on the C statistic and makes it possible to test the statistical significance of the tau index using its relationship with the C statistic. Through Emerson orthogonal polynomials (EOP), a useful decomposition of the C statistic into bivariate moments (location, dispersion and higher-order components) has been developed. In the construction of EOP, the categories are replaced by scores, typically natural scores. In this paper, we provide an overview of the main scoring schemes, focusing on their advantages and statistical properties; we pay special attention to the impact of the chosen scores on the C statistic of CATANOVA and on the graphical representations of doubly ordered non-symmetrical correspondence analysis. Through a real data example, we show the impact of the scoring schemes, and we use the RV coefficient and multidimensional scaling as tools to measure similarity among the results achieved with each method.
Journal: Journal of Applied Statistics
Pages: 2490-2502
Issue: 13
Volume: 43
Year: 2016
Month: 10
X-DOI: 10.1080/02664763.2016.1184627
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1184627
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:43:y:2016:i:13:p:2490-2502
Template-Type: ReDIF-Article 1.0
Author-Name: Puyu Wang
Author-X-Name-First: Puyu
Author-X-Name-Last: Wang
Author-Name: Hai Zhang
Author-X-Name-First: Hai
Author-X-Name-Last: Zhang
Author-Name: Yong Liang
Author-X-Name-First: Yong
Author-X-Name-Last: Liang
Title: Model selection with distributed SCAD penalty
Abstract:
In this paper, we focus on feature extraction and variable selection for massive data that are divided and stored across different linked computers. Specifically, we study distributed model selection with the Smoothly Clipped Absolute Deviation (SCAD) penalty. Based on the Alternating Direction Method of Multipliers (ADMM) algorithm, we propose a distributed SCAD algorithm and prove its convergence. The variable selection results of the distributed approach are the same as those of the non-distributed approach. Numerical studies show that our method is both effective and efficient, performing well in distributed data analysis.
Journal: Journal of Applied Statistics
Pages: 1938-1955
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1401052
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1401052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:1938-1955
Template-Type: ReDIF-Article 1.0
Author-Name: D. M. Swanson
Author-X-Name-First: D. M.
Author-X-Name-Last: Swanson
Author-Name: C. D. Anderson
Author-X-Name-First: C. D.
Author-X-Name-Last: Anderson
Author-Name: R. A. Betensky
Author-X-Name-First: R. A.
Author-X-Name-Last: Betensky
Title: Hypothesis Tests for Neyman's Bias in Case–Control Studies
Abstract:
Survival bias is a long recognized problem in case–control studies, and many varieties of bias can come under this umbrella term. We focus on one of them, termed Neyman's bias or ‘prevalence–incidence bias’. It occurs in case–control studies when exposure affects both disease and disease-induced mortality, and we give a formula for the observed, biased odds ratio under such conditions. We compare our result with previous investigations into this phenomenon and consider models under which this bias may or may not be important. Finally, we propose three hypothesis tests to identify when Neyman's bias may be present in case–control studies. We apply these tests to three data sets, one of stroke mortality, another of brain tumors, and the last of atrial fibrillation, and find some evidence of Neyman's bias in the former two cases, but not the last case.
Journal: Journal of Applied Statistics
Pages: 1956-1977
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1401053
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1401053
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:1956-1977
Template-Type: ReDIF-Article 1.0
Author-Name: Wen Su
Author-X-Name-First: Wen
Author-X-Name-Last: Su
Author-Name: Hangjin Jiang
Author-X-Name-First: Hangjin
Author-X-Name-Last: Jiang
Title: Semiparametric analysis of longitudinal data with informative observation times and censoring times
Abstract:
We focus on regression analysis of irregularly observed longitudinal data, which often occur in medical follow-up studies and observational investigations. The model for such data involves two processes: a longitudinal response process of interest and an observation process controlling the observation times. Previous works imposed restrictive models and questionable assumptions, such as the Poisson assumption and the independent censoring time assumption, for analysing longitudinal data. In this paper, we propose a more general model together with a robust estimation approach for longitudinal data with informative observation times and censoring times, and the asymptotic normality of the proposed estimators is established. Both simulation studies and a real data application indicate that the proposed method is promising.
Journal: Journal of Applied Statistics
Pages: 1978-1993
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1403574
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1403574
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:1978-1993
Template-Type: ReDIF-Article 1.0
Author-Name: Yunlu Jiang
Author-X-Name-First: Yunlu
Author-X-Name-Last: Jiang
Author-Name: Yu Conglian
Author-X-Name-First: Yu
Author-X-Name-Last: Conglian
Author-Name: Ji Qinghua
Author-X-Name-First: Ji
Author-X-Name-Last: Qinghua
Title: Model selection for the localized mixture of experts models
Abstract:
In this paper, we propose a penalized likelihood method to simultaneously select covariates and mixing components and to obtain parameter estimates in the localized mixture of experts models. We develop an expectation-maximization algorithm to solve the proposed penalized likelihood procedure, and introduce a data-driven procedure to select the tuning parameters. Extensive numerical studies are carried out to compare the finite-sample performance of our proposed method with that of other existing methods. Finally, we apply the proposed methodology to analyze the Boston housing price data set and the baseball salaries data set.
Journal: Journal of Applied Statistics
Pages: 1994-2006
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1405914
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1405914
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:1994-2006
Template-Type: ReDIF-Article 1.0
Author-Name: Feng-shou Ko
Author-X-Name-First: Feng-shou
Author-X-Name-Last: Ko
Title: Discussion on the issue of sample size determination for a targeted to an untargeted and to a mixed effect model-based clinical trial design
Abstract:
More and more studies have shown that genetic determinants may mediate variability among persons in the response to a drug. In other words, some therapeutics benefit only a subset of treated patients. Genomic technologies – such as DNA sequencing, mRNA transcript profiling, and comparative genomic hybridization – are providing biomarkers that can be used to predict which patients are most likely to respond to a given drug. In this paper, sample size determination is conducted for a targeted clinical trial, an untargeted clinical trial and a random effect model design. The treatment effects for responder and non-responder patients, the assay specificity and sensitivity, and the proportion of non-responders in the population can all affect the sample size determination of the experimental design.
Journal: Journal of Applied Statistics
Pages: 2007-2019
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1405915
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1405915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:2007-2019
Template-Type: ReDIF-Article 1.0
Author-Name: Cristina Coscia
Author-X-Name-First: Cristina
Author-X-Name-Last: Coscia
Author-Name: Roberto Fontana
Author-X-Name-First: Roberto
Author-X-Name-Last: Fontana
Author-Name: Patrizia Semeraro
Author-X-Name-First: Patrizia
Author-X-Name-Last: Semeraro
Title: Graphical models for complex networks: an application to Italian museums
Abstract:
This paper applies probabilistic graphical models in a new framework to study association rules driven by consumer choices in a network of Italian museums. The network consists of the museums participating in the programme of Abbonamento Musei Torino Piemonte, which is a yearly subscription managed by Associazione Torino Città Capitale Europea. It is available to people living in the Piemonte region, Italy. Consumers are card-holders, who are allowed entry to all the museums in the network for one year. We employ graphical models to highlight associations between the museums driven by card-holder visiting behaviour. We use both simple undirected graphs and more complex directed graphs, and we do not make any hypothesis on the models but rather learn their structures directly from the data. We also use methodologies and tools for robust network identification and principal component analysis to complete the analysis of the phenomenon.
Journal: Journal of Applied Statistics
Pages: 2020-2038
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1406901
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1406901
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:2020-2038
Template-Type: ReDIF-Article 1.0
Author-Name: Gislaine V. Duarte
Author-X-Name-First: Gislaine V.
Author-X-Name-Last: Duarte
Author-Name: Altemir Braga
Author-X-Name-First: Altemir
Author-X-Name-Last: Braga
Author-Name: Daniel L. Miquelluti
Author-X-Name-First: Daniel L.
Author-X-Name-Last: Miquelluti
Author-Name: Vitor A. Ozaki
Author-X-Name-First: Vitor A.
Author-X-Name-Last: Ozaki
Title: Modeling of soybean yield using symmetric, asymmetric and bimodal distributions: implications for crop insurance
Abstract:
Over the years, many papers have used parametric distributions to model crop yields, such as the normal (N), Beta, log-normal and skew-normal (SN). These models are well defined, both mathematically and computationally, but they do not incorporate bimodality. It is therefore necessary to study distributions that are more flexible in modeling, since most crop yield data in Brazil present evidence of asymmetry or bimodality. Thus, the aim of this study was to model and forecast soybean yields for municipalities in the State of Paraná, in the period from 1980 to 2014, using the odd log normal logistic (OLLN) distribution for the bimodal data and the Beta, SN and skew-t distributions for the symmetrical and asymmetrical series. The OLLN model was the one that best fit the data. The results are discussed in the context of crop insurance pricing.
Journal: Journal of Applied Statistics
Pages: 1920-1937
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1406902
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1406902
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:1920-1937
Template-Type: ReDIF-Article 1.0
Author-Name: Thalita do Bem Mattos
Author-X-Name-First: Thalita do Bem
Author-X-Name-Last: Mattos
Author-Name: Aldo M. Garay
Author-X-Name-First: Aldo M.
Author-X-Name-Last: Garay
Author-Name: Victor H. Lachos
Author-X-Name-First: Victor H.
Author-X-Name-Last: Lachos
Title: Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions
Abstract:
In many studies, the data collected are subject to upper and lower detection limits. Hence, the responses are either left or right censored. A complication arises when these continuous measures present heavy tails and asymmetrical behavior simultaneously. For such data structures, we propose a robust censored linear model based on the scale mixtures of skew-normal (SMSN) distributions. The SMSN is an attractive class of asymmetrical heavy-tailed densities that includes the skew-normal, skew-t, skew-slash, skew-contaminated normal and the entire family of scale mixtures of normal (SMN) distributions as special cases. We propose a fast estimation procedure to obtain the maximum likelihood (ML) estimates of the parameters, using a stochastic approximation of the EM (SAEM) algorithm. This approach allows us to estimate the parameters of interest easily and quickly, obtaining as by-products the standard errors, predictions of unobservable values of the response, and the log-likelihood function. The proposed methods are illustrated through real data applications and several simulation studies.
Journal: Journal of Applied Statistics
Pages: 2039-2066
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1408788
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1408788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:2039-2066
Template-Type: ReDIF-Article 1.0
Author-Name: Jiajia Chen
Author-X-Name-First: Jiajia
Author-X-Name-Last: Chen
Author-Name: Xiaoqin Zhang
Author-X-Name-First: Xiaoqin
Author-X-Name-Last: Zhang
Author-Name: Karel Hron
Author-X-Name-First: Karel
Author-X-Name-Last: Hron
Author-Name: Matthias Templ
Author-X-Name-First: Matthias
Author-X-Name-Last: Templ
Author-Name: Shengjia Li
Author-X-Name-First: Shengjia
Author-X-Name-Last: Li
Title: Regression imputation with Q-mode clustering for rounded zero replacement in high-dimensional compositional data
Abstract:
The logratio methodology is not applicable when rounded zeros occur in compositional data. There are many methods to deal with rounded zeros; however, some are not suitable for analyzing data sets with high dimensionality. Recently, related methods have been developed, but they cannot balance calculation time and accuracy. For further improvement, we propose a method based on regression imputation with Q-mode clustering. This method forms groups of parts and builds partial least squares regressions with these groups using centered logratio coordinates. We also prove that using centered logratio coordinates or isometric logratio coordinates in the response of the partial least squares regression yields equivalent results for the replacement of rounded zeros. A simulation study and a real example are conducted to analyze the performance of the proposed method. The results show that the proposed method can reduce the calculation time in higher dimensions and improve the quality of the results.
Journal: Journal of Applied Statistics
Pages: 2067-2080
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1410524
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1410524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:2067-2080
Template-Type: ReDIF-Article 1.0
Author-Name: Emílio A. Coelho-Barros
Author-X-Name-First: Emílio A.
Author-X-Name-Last: Coelho-Barros
Author-Name: Josmar Mazucheli
Author-X-Name-First: Josmar
Author-X-Name-Last: Mazucheli
Author-Name: Jorge A. Achcar
Author-X-Name-First: Jorge A.
Author-X-Name-Last: Achcar
Author-Name: Kelly Vanessa Parede Barco
Author-X-Name-First: Kelly Vanessa Parede
Author-X-Name-Last: Barco
Author-Name: José Rafael Tovar Cuevas
Author-X-Name-First: José Rafael
Author-X-Name-Last: Tovar Cuevas
Title: The inverse power Lindley distribution in the presence of left-censored data
Abstract:
In this study, classical and Bayesian inference methods are introduced to analyze lifetime data sets in the presence of left censoring, considering two generalizations of the Lindley distribution: a first generalization proposed by Ghitany et al. [Power Lindley distribution and associated inference, Comput. Statist. Data Anal. 64 (2013), pp. 20–33], denoted as a power Lindley distribution, and a second generalization proposed by Sharma et al. [The inverse Lindley distribution: A stress–strength reliability model with application to head and neck cancer data, J. Ind. Prod. Eng. 32 (2015), pp. 162–173], denoted as an inverse Lindley distribution. In our approach, we use a distribution obtained from these two generalizations, denoted as an inverse power Lindley distribution. A numerical illustration is presented considering a dataset of thyroglobulin levels measured in a group of individuals with differentiated thyroid cancer.
Journal: Journal of Applied Statistics
Pages: 2081-2094
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1410525
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1410525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:2081-2094
Template-Type: ReDIF-Article 1.0
Author-Name: Aliakbar Mastani Shirazi
Author-X-Name-First: Aliakbar Mastani
Author-X-Name-Last: Shirazi
Author-Name: Aluisio Pinheiro
Author-X-Name-First: Aluisio
Author-X-Name-Last: Pinheiro
Title: A proportional hazard cure model for ordinal responses by self-modeling regression
Abstract:
In a medical study, patients have various stages of illness. After treatment, a patient will either be cured or the stage of illness will change. Since there is suitable evidence of a susceptible population with several levels, we combine a self-modeling ordinal model for the probability of occurrence of an event with a Cox regression for the time of occurrence of the event. We propose the use of self-modeling ordinal longitudinal regression, in which the conditional cumulative probabilities for a category of the outcome are related through a shape-invariant model. A simulation study is carried out to justify the methodology. Schizophrenia illness data are analyzed with our model to see whether the treatment affects the illness.
Journal: Journal of Applied Statistics
Pages: 2095-2106
Issue: 11
Volume: 45
Year: 2018
Month: 8
X-DOI: 10.1080/02664763.2017.1410526
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1410526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:11:p:2095-2106
Template-Type: ReDIF-Article 1.0
Author-Name: Jooyong Shim
Author-X-Name-First: Jooyong
Author-X-Name-Last: Shim
Author-Name: Changha Hwang
Author-X-Name-First: Changha
Author-X-Name-Last: Hwang
Author-Name: Sunjoo Jeong
Author-X-Name-First: Sunjoo
Author-X-Name-Last: Jeong
Author-Name: Insuk Sohn
Author-X-Name-First: Insuk
Author-X-Name-Last: Sohn
Title: Semivarying coefficient least-squares support vector regression for analyzing high-dimensional gene-environmental data
Abstract:
In the context of genetics and genomic medicine, gene-environment (G×E) interactions have a great impact on the risk of human diseases. Some existing methods for identifying G×E interactions are limited, since they analyze only one or a few G factors at a time, assume linear effects of E factors, and use inefficient selection methods. In this paper, we propose a new method to identify significant main effects and G×E interactions. It is based on a semivarying coefficient least-squares support vector regression (LS-SVR) technique, devised by utilizing a flexible semiparametric LS-SVR approach for censored survival data. The semivarying coefficient model is used to deal with the nonlinear effects of E factors. We also derive a generalized cross-validation (GCV) function for determining the optimal values of the hyperparameters of the proposed method. This GCV function is also used to identify significant main effects and G×E interactions. The proposed method is evaluated through numerical studies.
Journal: Journal of Applied Statistics
Pages: 1370-1381
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1371676
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1371676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1370-1381
Template-Type: ReDIF-Article 1.0
Author-Name: Flavio Mignone
Author-X-Name-First: Flavio
Author-X-Name-Last: Mignone
Author-Name: Fabio Rapallo
Author-X-Name-First: Fabio
Author-X-Name-Last: Rapallo
Title: Detection of outlying proportions
Abstract:
In this paper we introduce a new method for detecting outliers in a set of proportions. It is based on the construction of a suitable two-way contingency table and on the application of an algorithm for the detection of outlying cells in such a table. We exploit the special structure of the relevant contingency table to increase the efficiency of the method. The main properties of our algorithm, together with a guide to the choice of the parameters, are investigated through simulations, and in simple cases some theoretical justifications are provided. Several examples on synthetic data and an example based on pseudo-real data from biological experiments demonstrate the good performance of our algorithm.
Journal: Journal of Applied Statistics
Pages: 1382-1395
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1371677
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1371677
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1382-1395
Template-Type: ReDIF-Article 1.0
Author-Name: Philippe Casin
Author-X-Name-First: Philippe
Author-X-Name-Last: Casin
Title: Categorical multiblock linear discriminant analysis
Abstract:
Techniques of credit scoring have been developed in recent years to reduce the risk taken by banks and financial institutions on the loans they grant. Credit scoring is the problem of classifying individuals into one of two groups: defaulting borrowers or non-defaulting borrowers. The aim of this paper is to propose a new method of discrimination for the case where the dependent variable is categorical and a large number of categorical explanatory variables are retained. This method, Categorical Multiblock Linear Discriminant Analysis, computes components that take into account both the relationships between the explanatory categorical variables and the canonical correlation between each explanatory categorical variable and the dependent variable. A comparison with three other techniques and an application to credit scoring data are provided.
Journal: Journal of Applied Statistics
Pages: 1396-1409
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1371678
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1371678
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1396-1409
Template-Type: ReDIF-Article 1.0
Author-Name: Xi Shen
Author-X-Name-First: Xi
Author-X-Name-Last: Shen
Author-Name: Chang-Xing Ma
Author-X-Name-First: Chang-Xing
Author-X-Name-Last: Ma
Title: Testing homogeneity of difference of two proportions for stratified correlated paired binary data
Abstract:
In ophthalmologic or otolaryngologic studies, each subject may contribute paired organ measurements to the analysis. A number of statistical methods have been proposed for bilateral correlated data. In practice, it is important to detect confounder-by-treatment interaction, since ignoring a confounding effect may lead to unreliable conclusions. Stratified data analysis can therefore be considered to adjust for the effect of the confounder on statistical inference. In this article, we investigate and derive three test procedures for testing the homogeneity of the difference of two proportions for stratified correlated paired binary data on the basis of an equal correlation model assumption. The performance of the proposed test procedures is examined through Monte Carlo simulation. The simulation results show that the score test is usually robust in type I error control with high power, and it is therefore recommended among the three methods. An example from an otolaryngologic study is given to illustrate the three test procedures.
Journal: Journal of Applied Statistics
Pages: 1410-1425
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1371679
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1371679
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1410-1425
Template-Type: ReDIF-Article 1.0
Author-Name: Kalanka P. Jayalath
Author-X-Name-First: Kalanka P.
Author-X-Name-Last: Jayalath
Author-Name: Hon Keung Tony Ng
Author-X-Name-First: Hon Keung Tony
Author-X-Name-Last: Ng
Title: Analysis of means approach for random factor analysis
Abstract:
Analysis of means (ANOM) is a powerful tool for comparing means and variances in fixed-effects models. The graphical exhibit of ANOM is considered as a great advantage because of its interpretability and its ability to evaluate the practical significance of the mean effects. However, the presence of random factors may be problematic for the ANOM method. In this paper, we propose an ANOM approach that can be applied to test random effects in many different balanced statistical models including fixed-, random- and mixed-effects models. The proposed approach utilizes the range of the treatment averages for identifying the dispersions of the underlying populations. The power performance of the proposed procedure is compared to the analysis of variance (ANOVA) approach in a wide range of situations via a Monte Carlo simulation study. Illustrative examples are used to demonstrate the usefulness of the proposed approach and its graphical exhibits, provide meaningful interpretations, and discuss the statistical and practical significance of factor effects.
Journal: Journal of Applied Statistics
Pages: 1426-1446
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1375083
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1375083
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1426-1446
Template-Type: ReDIF-Article 1.0
Author-Name: Stijn Luca
Author-X-Name-First: Stijn
Author-X-Name-Last: Luca
Title: Modified chain sampling plans for lot inspection by variables and attributes
Abstract:
The purpose of acceptance sampling is to develop decision rules to accept or reject production lots based on sample data. When testing is destructive or expensive, dependent sampling procedures cumulate results from several preceding lots. This chaining of past lot results reduces the required size of the samples. A large part of these procedures only chain past lot results when defects are found in the current sample. However, such selective use of past lot results only achieves a limited reduction of sample sizes. In this article, a modified approach for chaining past lot results is proposed that is less selective in its use of quality history and, as a result, requires a smaller sample size than the one required for commonly used dependent sampling procedures, such as multiple dependent sampling plans and chain sampling plans of Dodge. The proposed plans are applicable for inspection by attributes and inspection by variables. Several properties of their operating characteristic-curves are derived, and search procedures are given to select such modified chain sampling plans by using the two-point method.
Journal: Journal of Applied Statistics
Pages: 1447-1464
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1375084
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1375084
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1447-1464
Template-Type: ReDIF-Article 1.0
Author-Name: James O. Chipperfield
Author-X-Name-First: James O.
Author-X-Name-Last: Chipperfield
Author-Name: Margo L. Barr
Author-X-Name-First: Margo L.
Author-X-Name-Last: Barr
Author-Name: David. G. Steel
Author-X-Name-First: David. G.
Author-X-Name-Last: Steel
Title: Split Questionnaire Designs: collecting only the data that you need through MCAR and MAR designs
Abstract:
We call a sample design that allows for different patterns, or sets, of data items to be collected from different sample units a Split Questionnaire Design (SQD). SQDs can be thought of as incorporating missing data into survey design. This paper examines the situation where data that are not collected by an SQD can be treated as Missing Completely At Random or Missing At Random, targets are regression coefficients in a generalised linear model fitted to binary variables, and targets are estimated using Maximum Likelihood. A key finding is that it can be easy to measure the relative contribution of a respondent to the accuracy of estimated model parameters before collecting all the respondent's model covariates. We show empirically and theoretically that we could achieve a significant reduction in respondent burden with a negligible impact on the accuracy of estimates by not collecting model covariates from respondents who we identify as contributing little to the accuracy of estimates. We discuss the general implications for SQDs.
Journal: Journal of Applied Statistics
Pages: 1465-1475
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1375085
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1375085
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1465-1475
Template-Type: ReDIF-Article 1.0
Author-Name: S. Reza H. Shojaei
Author-X-Name-First: S. Reza H.
Author-X-Name-Last: Shojaei
Author-Name: Yadollah Waghei
Author-X-Name-First: Yadollah
Author-X-Name-Last: Waghei
Author-Name: Mohsen Mohammadzadeh
Author-X-Name-First: Mohsen
Author-X-Name-Last: Mohammadzadeh
Title: Geostatistical analysis of disease data: a case study of tuberculosis incidence in Iran
Abstract:
The main objective of this study is to introduce two advanced statistical methods and to examine the geographical distribution of tuberculosis incidence in Iran. The study was motivated by the knowledge that environmental and climatic conditions in each region affect the incidence and spread of the disease. The disease incidences in different counties are realizations of spatial data; therefore, we apply Poisson kriging and ordinary kriging to map predicted tuberculosis incidence rates in Iran and to identify high-risk areas from the statistical disease map. Our results show that tuberculosis incidence is not uniformly distributed across the country and that the estimated risk is high in the eastern parts. Assessing the geographical distribution of a disease is essential for health officials to recognize high-risk areas and to improve case management and resource allocation.
Journal: Journal of Applied Statistics
Pages: 1476-1483
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1375468
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1375468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1476-1483
Template-Type: ReDIF-Article 1.0
Author-Name: David P. M. Scollnik
Author-X-Name-First: David P. M.
Author-X-Name-Last: Scollnik
Title: Bayesian analysis of a quarantine inspection model
Abstract:
In this paper, we propose a quarantine inspection model and examine its analysis from a Bayesian point of view. This model is a generalization of the one appearing in Decrouez and Robinson [Aust. N. Z. J. Stat., 54 (2012), pp. 281–299]. The context has to do with items approaching a border, some of which are randomly selected and inspected for contamination. A random sample of the items that pass this first inspection is submitted to a second inspection that is assumed to detect all contamination. Inference is sought with respect to the model parameters and also especially the proportion of items that pass through the border that are still contaminated. A hierarchical quarantine inspection model is also introduced and discussed. Three illustrative examples are given.
Journal: Journal of Applied Statistics
Pages: 1484-1496
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1380785
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1380785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1484-1496
Template-Type: ReDIF-Article 1.0
Author-Name: Erindi Allaj
Author-X-Name-First: Erindi
Author-X-Name-Last: Allaj
Title: Two simple measures of variability for categorical data
Abstract:
This paper proposes two new variability measures for categorical data. The first variability measure is obtained as one minus the square root of the sum of the squares of the relative frequencies of the different categories. The second measure is obtained by standardizing the first measure. The measures proposed are functions of the variability measure proposed by Gini [Variabilitá e Mutuabilitá Contributo allo Studio delle Distribuzioni e delle Relazioni Statistiche, C. Cuppini, Bologna, 1912] and approximate the coefficient of nominal variation introduced by Kvålseth [Coefficients of variation for nominal and ordinal categorical data, Percept. Motor Skills 80 (1995), pp. 843–847] when the number of categories increases. Different mathematical properties of the proposed variability measures are studied and analyzed. Several examples illustrate how the variability measures can be interpreted and used in practice.
Journal: Journal of Applied Statistics
Pages: 1497-1516
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1380787
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1380787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1497-1516
Template-Type: ReDIF-Article 1.0
Author-Name: Kung-Jong Lui
Author-X-Name-First: Kung-Jong
Author-X-Name-Last: Lui
Title: Sample size determination for testing equality in frequency data under an incomplete block crossover design
Abstract:
When there are more than two treatments under comparison, we may consider the use of the incomplete block crossover design (IBCD) to reduce the number of patients needed for a parallel-groups design and shorten the duration of a crossover trial. We develop an asymptotic procedure for simultaneously testing the equality of two treatments versus a control treatment (or placebo) in frequency data under the IBCD with two periods. We derive a sample size calculation procedure for the desired power of detecting the given treatment effects at a nominal level and suggest a simple ad hoc adjustment procedure to improve the accuracy of the sample size determination when the resulting minimum required number of patients is not large. We employ Monte Carlo simulation to evaluate the finite-sample performance of the proposed test, the accuracy of the sample size calculation procedure, and that of the procedure with the simple ad hoc adjustment suggested here. We use data taken from a crossover trial comparing the number of exacerbations between using salbutamol or salmeterol and a placebo in asthma patients to illustrate the sample size calculation procedure.
Journal: Journal of Applied Statistics
Pages: 1517-1529
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1380788
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1380788
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1517-1529
Template-Type: ReDIF-Article 1.0
Author-Name: M. A. Di Palma
Author-X-Name-First: M. A.
Author-X-Name-Last: Di Palma
Author-Name: P. Filzmoser
Author-X-Name-First: P.
Author-X-Name-Last: Filzmoser
Author-Name: M. Gallo
Author-X-Name-First: M.
Author-X-Name-Last: Gallo
Author-Name: K. Hron
Author-X-Name-First: K.
Author-X-Name-Last: Hron
Title: A robust Parafac model for compositional data
Abstract:
Compositional data are characterized by values containing relative information, and thus the ratios between the data values are of interest for the analysis. Due to specific features of compositional data, standard statistical methods should be applied to compositions expressed in a proper coordinate system with respect to an orthonormal basis. It is discussed how three-way compositional data can be analyzed with the Parafac model. When data are contaminated by outliers, robust estimates for the Parafac model parameters should be employed. It is demonstrated how robust estimation can be done in the context of compositional data and how the results can be interpreted. A real data example from macroeconomics underlines the usefulness of this approach.
Journal: Journal of Applied Statistics
Pages: 1347-1369
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1381669
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1381669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1347-1369
Template-Type: ReDIF-Article 1.0
Author-Name: Mahesh Pandit
Author-X-Name-First: Mahesh
Author-X-Name-Last: Pandit
Author-Name: Krishna P. Paudel
Author-X-Name-First: Krishna P.
Author-X-Name-Last: Paudel
Author-Name: Roger Hinson
Author-X-Name-First: Roger
Author-X-Name-Last: Hinson
Title: Market channel selections by US nursery plant producers: a multivariate nonparametric fractional regression analysis
Abstract:
Availability of market channel alternatives has helped the growth of ornamental plant sales in the United States. To identify the factors affecting the choice and allocation of outputs to different market channels by nursery producers, we first use a mixture-of-experts model to select clusters of homogeneous subpopulations of US nursery producers based on a 2009 National Nursery Survey. The impact of growers’ business characteristics on shares of sales to these channels was estimated using multivariate parametric and nonparametric fractional regression models. Specification tests indicated that a nonparametric model was superior to a parametric model in some clusters. Important explanatory variables affecting the sales volume to different channels were sales of plant groups, kinds of contract sales, promotional expenses, and farm size. Results indicated the existence of clear market segmentation of nursery producers in the United States.
Journal: Journal of Applied Statistics
Pages: 1530-1546
Issue: 8
Volume: 45
Year: 2018
Month: 6
X-DOI: 10.1080/02664763.2017.1381670
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1381670
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:8:p:1530-1546
Template-Type: ReDIF-Article 1.0
Author-Name: Yan-Ting Xiao
Author-X-Name-First: Yan-Ting
Author-X-Name-Last: Xiao
Author-Name: Zhan-Shou Chen
Author-X-Name-First: Zhan-Shou
Author-X-Name-Last: Chen
Title: Bias-corrected estimations in varying-coefficient partially nonlinear models with measurement error in the nonparametric part
Abstract:
In this paper, we consider statistical inference for the varying-coefficient partially nonlinear model with additive measurement errors in the nonparametric part. A local bias-corrected profile nonlinear least-squares estimation procedure for the parameter in the nonlinear function and for the nonparametric function is proposed. Then, the asymptotic normality properties of the resulting estimators are established. With the empirical likelihood method, a local bias-corrected empirical log-likelihood ratio statistic for the unknown parameter, and a corrected and residual-adjusted empirical log-likelihood ratio for the nonparametric component, are constructed. It is shown that the resulting statistics are asymptotically chi-squared distributed under some suitable conditions. Some simulations are conducted to evaluate the performance of the proposed methods. The results indicate that the empirical likelihood method is superior to the profile nonlinear least-squares method in terms of the confidence regions of the parameter and the point-wise confidence intervals of the nonparametric function.
Journal: Journal of Applied Statistics
Pages: 586-603
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1288201
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1288201
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:586-603
Template-Type: ReDIF-Article 1.0
Author-Name: Qing Li
Author-X-Name-First: Qing
Author-X-Name-Last: Li
Author-Name: Feng Guo
Author-X-Name-First: Feng
Author-X-Name-Last: Guo
Author-Name: Inyoung Kim
Author-X-Name-First: Inyoung
Author-X-Name-Last: Kim
Author-Name: Sheila G. Klauer
Author-X-Name-First: Sheila G.
Author-X-Name-Last: Klauer
Author-Name: Bruce G. Simons-Morton
Author-X-Name-First: Bruce G.
Author-X-Name-Last: Simons-Morton
Title: A Bayesian finite mixture change-point model for assessing the risk of novice teenage drivers
Abstract:
The driving risk during the initial period after licensure for novice teenage drivers is typically the highest but decreases rapidly soon after. The change-point of driving risk is a critical parameter for evaluating teenage driving risk, and it also varies substantially among drivers. This paper presents latent class recurrent-event change-point models for detecting the change-points. The proposed model is applied to the Naturalistic Teenage Driving Study, which continuously recorded the driving data of 42 novice teenage drivers for 18 months using advanced in-vehicle instrumentation. We propose a hierarchical Bayesian finite mixture model (BFMM) to estimate the change-points by clusters of drivers with similar risk profiles. The model is based on a non-homogeneous Poisson process with piecewise-constant intensity functions. Latent variables that identify the membership of the subjects are used to detect potential clusters among subjects. Application to the Naturalistic Teenage Driving Study identifies three distinct clusters with change-points at 52.30, 108.99 and 150.20 hours of driving after first licensure, respectively. The overall intensity rate and the pattern of change also differ substantially among clusters. The results of this research provide more insight into teenagers' driving behaviour and will be critical to improving young drivers' safety education and parent management programs, as well as providing a crucial reference for graduated driver licensing (GDL) regulations to encourage safer driving.
Journal: Journal of Applied Statistics
Pages: 604-625
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1288202
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1288202
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:604-625
Template-Type: ReDIF-Article 1.0
Author-Name: Kyle M. Irimata
Author-X-Name-First: Kyle M.
Author-X-Name-Last: Irimata
Author-Name: Jeffrey R. Wilson
Author-X-Name-First: Jeffrey R.
Author-X-Name-Last: Wilson
Title: Identifying intraclass correlations necessitating hierarchical modeling
Abstract:
Hierarchical binary outcome data with three levels, such as disease remission for patients nested within physicians, nested within clinics, are frequently encountered in practice. One important aspect of such data is the correlation that occurs at each level. In parametric modeling, accounting for these correlations increases the complexity. These models may also yield results that lead to the same conclusions as simpler models. We developed a measure of intraclass correlation at each stage of a three-level nested structure and identified guidelines for determining when the dependencies in hierarchical models need to be taken into account. These guidelines are supported by simulations of hierarchical data sets, as well as an analysis of AIDS knowledge in Bangladesh from the 2011 Demographic Health Survey. We also provide a simple rule of thumb to assist researchers faced with the challenge of choosing an appropriately complex model when analyzing hierarchical binary data.
Journal: Journal of Applied Statistics
Pages: 626-641
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1288203
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1288203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:626-641
Template-Type: ReDIF-Article 1.0
Author-Name: Oscar Palmeros
Author-X-Name-First: Oscar
Author-X-Name-Last: Palmeros
Author-Name: Jose A. Villaseñor
Author-X-Name-First: Jose A.
Author-X-Name-Last: Villaseñor
Author-Name: Elizabeth González
Author-X-Name-First: Elizabeth
Author-X-Name-Last: González
Title: On computing estimates of a change-point in the Weibull regression hazard model
Abstract:
The hazard function describes the instantaneous rate of failure at a time t, given that the individual survives up to t. In applications, the effect of covariates produces changes in the hazard function. When dealing with survival analysis, it is of interest to identify when a change-point in time has occurred. In this work, covariates and censored variables are considered in order to estimate a change-point in the Weibull regression hazard model, which is a generalization of the exponential model. For this more general model, it is possible to obtain maximum likelihood estimators for the change-point and for the parameters involved. A Monte Carlo simulation study shows that it is indeed possible to implement this model in practice. An application with clinical trial data from a treatment of chronic granulomatous disease is also included.
Journal: Journal of Applied Statistics
Pages: 642-648
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1289366
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1289366
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:642-648
Template-Type: ReDIF-Article 1.0
Author-Name: Diego Casadei
Author-X-Name-First: Diego
Author-X-Name-Last: Casadei
Author-Name: Cornelius Grunwald
Author-X-Name-First: Cornelius
Author-X-Name-Last: Grunwald
Author-Name: Kevin Kröninger
Author-X-Name-First: Kevin
Author-X-Name-Last: Kröninger
Author-Name: Florian Mentzel
Author-X-Name-First: Florian
Author-X-Name-Last: Mentzel
Title: Objective Bayesian analysis of counting experiments with correlated sources of background
Abstract:
Searches for faint signals in counting experiments are often encountered in particle physics and astrophysics, as well as in other fields. Many problems can be reduced to the case of a model with independent and Poisson-distributed signal and background. Often several background contributions are present at the same time, possibly correlated. We provide the analytic solution of the statistical inference problem of estimating the signal in the presence of multiple backgrounds, in the framework of objective Bayes statistics. The model can be written in the form of a product of a single Poisson distribution with a multinomial distribution. The first is related to the total number of events, whereas the latter describes the fraction of events coming from each individual source. Correlations among different backgrounds can be included in the inference problem by a suitable choice of the priors.
Journal: Journal of Applied Statistics
Pages: 649-667
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1289367
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1289367
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:649-667
Template-Type: ReDIF-Article 1.0
Author-Name: I. Wilms
Author-X-Name-First: I.
Author-X-Name-Last: Wilms
Author-Name: C. Croux
Author-X-Name-First: C.
Author-X-Name-Last: Croux
Title: An algorithm for the multivariate group lasso with covariance estimation
Abstract:
We study a group lasso estimator for the multivariate linear regression model that accounts for correlated error terms. A block coordinate descent algorithm is used to compute this estimator. We perform a simulation study with categorical data and multivariate time series data, typical settings with a natural grouping among the predictor variables. Our simulation studies show the good performance of the proposed group lasso estimator compared to alternative estimators. We illustrate the method on a time series data set of gene expressions.
Journal: Journal of Applied Statistics
Pages: 668-681
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1289503
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1289503
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:668-681
Template-Type: ReDIF-Article 1.0
Author-Name: Bishal Gurung
Author-X-Name-First: Bishal
Author-X-Name-Last: Gurung
Author-Name: K. N. Singh
Author-X-Name-First: K. N.
Author-X-Name-Last: Singh
Author-Name: Ravindra Singh Shekhawat
Author-X-Name-First: Ravindra Singh
Author-X-Name-Last: Shekhawat
Author-Name: Md Yeasin
Author-X-Name-First: Md
Author-X-Name-Last: Yeasin
Title: An insight into technology diffusion of tractor through Weibull growth model
Abstract:
Most technological innovation diffusion follows an S-shaped curve, but in many practical situations this may not hold true. To this end, the Weibull model was proposed to capture the diffusion of new technological innovations that do not follow any specific pattern. Nonlinear growth models play a very important role in gaining insight into the underlying mechanism. These models are generally 'mechanistic', as the parameters have meaningful interpretations. The standard nonlinear method of estimating the parameters of the Weibull model fails to converge. Taking this problem into consideration, we propose the use of the powerful technique of genetic algorithms for parameter estimation. The methodology is also validated by a simulation study to check whether the parameter estimates are close to the true values. For illustration purposes, we model the tractor density time-series data of India as a whole and of some major states of India. The fitted Weibull model is able to capture the technology diffusion process in a reasonable manner. Further, a comparison is made with the logistic and Gompertz models, and the Weibull model is found to perform better for the data sets under consideration.
Journal: Journal of Applied Statistics
Pages: 682-696
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1289504
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1289504
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:682-696
Template-Type: ReDIF-Article 1.0
Author-Name: Zicheng Hu
Author-X-Name-First: Zicheng
Author-X-Name-Last: Hu
Author-Name: Jessica N Lancaster
Author-X-Name-First: Jessica N
Author-X-Name-Last: Lancaster
Author-Name: Lauren I. R. Ehrlich
Author-X-Name-First: Lauren I. R.
Author-X-Name-Last: Ehrlich
Author-Name: Peter Müller
Author-X-Name-First: Peter
Author-X-Name-Last: Müller
Title: Detecting T cell activation using a varying dimension Bayesian model
Abstract:
The detection of T cell activation is critical in many immunological assays. However, detecting T cell activation in live tissues remains a challenge due to highly noisy data. We developed a Bayesian probabilistic model to identify T cell activation based on calcium flux, an increase in intracellular calcium concentration that occurs during T cell activation. Because a T cell has an unknown number of flux events, the implementation of posterior inference requires trans-dimensional posterior simulation. The model is able to detect calcium flux events at the single-cell level from simulated data, as well as from noisy biological data.
Journal: Journal of Applied Statistics
Pages: 697-713
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1290789
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1290789
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:697-713
Template-Type: ReDIF-Article 1.0
Author-Name: Amanpreet Kaur
Author-X-Name-First: Amanpreet
Author-X-Name-Last: Kaur
Author-Name: Ravinder Agarwal
Author-X-Name-First: Ravinder
Author-X-Name-Last: Agarwal
Author-Name: Amod Kumar
Author-X-Name-First: Amod
Author-X-Name-Last: Kumar
Title: Adaptive threshold method for peak detection of surface electromyography signal from around shoulder muscles
Abstract:
This paper illustrates the accurate identification of the surface electromyography (SEMG) signal obtained from the shoulder muscles (Teres, Trapezius and Pectoralis) of amputee subjects for three different arm motions (elevation, protraction and retraction). During acquisition, a variety of variations (in amplitude, frequency and noise) were introduced into the signal, which can mislead the prediction of shoulder motion. Therefore, a novel approach is proposed that adaptively adjusts the threshold of the Teager energy operator in order to filter out unwanted peaks in the pre-processing stage of the SEMG signal. Results show that the proposed approach is accurate and effective in the analysis of biomedical signals where peaks must be detected without knowledge of the waveform's shape. As clinical research continues, these algorithms help to process the SEMG signal, and the identified signal can be used to design more accurate and efficient controllers for upper-limb amputees.
Journal: Journal of Applied Statistics
Pages: 714-726
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1293624
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1293624
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:714-726
Template-Type: ReDIF-Article 1.0
Author-Name: Mariano Ruiz Espejo
Author-X-Name-First: Mariano Ruiz
Author-X-Name-Last: Espejo
Author-Name: Adalbert Marqués Vilallonga
Author-X-Name-First: Adalbert Marqués
Author-X-Name-Last: Vilallonga
Title: Principles of Scientific Methods
Journal: Journal of Applied Statistics
Pages: 775-776
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1295522
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1295522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:775-776
Template-Type: ReDIF-Article 1.0
Author-Name: Fadel Hamid Hadi Alhusseini
Author-X-Name-First: Fadel Hamid Hadi
Author-X-Name-Last: Alhusseini
Author-Name: Vasile Georgescu
Author-X-Name-First: Vasile
Author-X-Name-Last: Georgescu
Title: Bayesian composite Tobit quantile regression
Abstract:
Composite quantile regression models have been shown to be effective techniques for improving prediction accuracy [H. Zou and M. Yuan, Composite quantile regression and the oracle model selection theory, Ann. Statist. 36 (2008), pp. 1108–1126; J. Bradic, J. Fan, and W. Wang, Penalized composite quasi-likelihood for ultrahigh-dimensional variable selection, J. R. Stat. Soc. Ser. B 73 (2011), pp. 325–349; Z. Zhao and Z. Xiao, Efficient regressions via optimally combining quantile information, Econometric Theory 30(06) (2014), pp. 1272–1314]. This paper studies composite Tobit quantile regression (TQReg) from a Bayesian perspective. A simple and efficient MCMC-based computation method is derived for posterior inference, using a mixture of an exponential and a scaled normal distribution from the skewed Laplace distribution. The approach is illustrated via simulation studies and a real data set. Results show that combining information across different quantiles provides a useful method for efficient statistical estimation. This is the first work to discuss composite TQReg from a Bayesian perspective.
Journal: Journal of Applied Statistics
Pages: 727-739
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1299697
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1299697
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:727-739
Template-Type: ReDIF-Article 1.0
Author-Name: Ján Dolinský
Author-X-Name-First: Ján
Author-X-Name-Last: Dolinský
Author-Name: Kei Hirose
Author-X-Name-First: Kei
Author-X-Name-Last: Hirose
Author-Name: Sadanori Konishi
Author-X-Name-First: Sadanori
Author-X-Name-Last: Konishi
Title: Readouts for echo-state networks built using locally regularized orthogonal forward regression
Abstract:
An echo state network (ESN) can be viewed as a temporal expansion which naturally gives rise to regressors of varying relevance to a teacher output. We illustrate that often only a certain number of the generated echo-regressors effectively explain the teacher output, and we propose to determine the importance of the echo-regressors by a joint calculation of the individual variance contributions and Bayesian relevance using locally regularized orthogonal forward regression (LROFR). This information can be used advantageously in a variety of ways for the analysis of an ESN structure. We present a locally regularized linear readout built using LROFR. The readout may have a smaller dimensionality than the ESN model itself, and it improves the robustness and accuracy of an ESN. Its main advantage is the ability to determine what type of additional readout is suitable for the task at hand. A comparison with PCA is also provided. We further propose a radial basis function (RBF) readout built using LROFR, since the flexibility of the linear readout has limitations and might be insufficient for complex tasks. Its excellent generalization abilities make it a viable alternative to feed-forward neural networks or relevance vector machines. For cases where more temporal capacity is required, we propose the well-studied delay&sum readout.
Journal: Journal of Applied Statistics
Pages: 740-762
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1305331
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1305331
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:740-762
Template-Type: ReDIF-Article 1.0
Author-Name: K. Nichols
Author-X-Name-First: K.
Author-X-Name-Last: Nichols
Author-Name: E. Trevino
Author-X-Name-First: E.
Author-X-Name-Last: Trevino
Author-Name: N. Ikeda
Author-X-Name-First: N.
Author-X-Name-Last: Ikeda
Author-Name: D. Philo
Author-X-Name-First: D.
Author-X-Name-Last: Philo
Author-Name: A. Garcia
Author-X-Name-First: A.
Author-X-Name-Last: Garcia
Author-Name: D. Bowman
Author-X-Name-First: D.
Author-X-Name-Last: Bowman
Title: Interdependency amongst earthquake magnitudes in Southern California
Abstract:
Recent research has shown that for the larger earthquakes recorded ($M\ge 5.2$) within the global centroid moment tensor (gCMT) catalog there is a positive correlation between the magnitudes of earthquakes and the magnitudes of their aftershocks [13]. Through a modification of model-independent stochastic de-clustering [12] and a more localized catalog provided by the Southern California Earthquake Data Center (SCEDC), the methodologies of Nichols and Schoenberg can be extended to catalogs complete down to a much lower minimum magnitude of completeness ($M\ge 2.2$). Results indicate that the positive correlation observed between larger earthquakes in the gCMT catalog and their aftershocks is also evident in the relationship between the magnitudes of earthquakes in the SCEDC data and their aftershocks. However, with the lower minimum magnitude of completeness found in the SCEDC catalog and with short periods of extreme earthquake activity evident within the data, the statistical power of the stochastic de-clustering algorithms to distinguish between mainshocks and aftershocks is diminished.
Journal: Journal of Applied Statistics
Pages: 763-774
Issue: 4
Volume: 45
Year: 2018
Month: 3
X-DOI: 10.1080/02664763.2017.1313965
File-URL: http://hdl.handle.net/10.1080/02664763.2017.1313965
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:45:y:2018:i:4:p:763-774
Template-Type: ReDIF-Article 1.0
Author-Name: Connie Stewart
Author-X-Name-First: Connie
Author-X-Name-Last: Stewart
Title: An approach to measure distance between compositional diet estimates containing essential zeros
Abstract:
For many applications involving compositional data, it is necessary to establish a valid measure of distance, yet when essential zeros are present traditional distance measures are problematic. In quantitative fatty acid signature analysis (QFASA), compositional diet estimates are produced that often contain many zeros. In order to test for a difference in diet between two populations of predators using the QFASA diet estimates, a legitimate measure of distance for use in the test statistic is necessary. Since ecologists using QFASA must first select the potential species of prey in the predator's diet, the chosen measure of distance should be such that the distance between samples does not decrease as the number of species considered increases, a property known in general as subcompositional coherence. In this paper we compare three measures of distance for compositional data that are capable of handling zeros, but do not satisfy some of the well-accepted principles of compositional data analysis. For compositional diet estimates, the most relevant of these is subcompositional coherence, and we show that this property may be approximately satisfied. Based on the results of a simulation study and an application to real-life QFASA diet estimates of grey seals, we recommend the chi-square measure of distance.
Journal: Journal of Applied Statistics
Pages: 1137-1152
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1193846
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1193846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1137-1152
Template-Type: ReDIF-Article 1.0
Author-Name: Diego I. Gallardo
Author-X-Name-First: Diego I.
Author-X-Name-Last: Gallardo
Author-Name: Héctor W. Gómez
Author-X-Name-First: Héctor W.
Author-X-Name-Last: Gómez
Author-Name: Heleno Bolfarine
Author-X-Name-First: Heleno
Author-X-Name-Last: Bolfarine
Title: A new cure rate model based on the Yule–Simon distribution with application to a melanoma data set
Abstract:
In this paper, a new survival cure rate model is introduced, considering the Yule–Simon distribution [12] to model the number of concurrent causes. We study some properties of this distribution and the model arising when the distribution of the competing causes is the Weibull model. We call this distribution the Weibull–Yule–Simon distribution. Maximum likelihood estimation is conducted for the model parameters. A small-scale simulation study indicates satisfactory parameter recovery by the estimation approach. Results are applied to a real data set (melanoma), illustrating that the proposed model can outperform traditional alternative models in terms of model fit.
Journal: Journal of Applied Statistics
Pages: 1153-1164
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1194385
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1194385
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1153-1164
Template-Type: ReDIF-Article 1.0
Author-Name: Tatsuya Kubota
Author-X-Name-First: Tatsuya
Author-X-Name-Last: Kubota
Author-Name: Takeshi Kurosawa
Author-X-Name-First: Takeshi
Author-X-Name-Last: Kurosawa
Title: Bayesian prediction of unobserved values for Type-II censored data
Abstract:
In this paper, we consider posterior predictive distributions of Type-II censored data for an inverse Weibull distribution. These functions are given by using conditional density functions and conditional survival functions. Although the conditional survival functions were expressed by integral forms in previous studies, we derive the conditional survival functions in closed forms and thereby reduce the computation cost. In addition, we calculate the predictive confidence intervals of unobserved values and coverage probabilities of unobserved values by using the posterior predictive survival functions.
Journal: Journal of Applied Statistics
Pages: 1165-1180
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201792
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1165-1180
Template-Type: ReDIF-Article 1.0
Author-Name: Öznur İşçi Güneri
Author-X-Name-First: Öznur İşçi
Author-X-Name-Last: Güneri
Author-Name: Atilla Göktaş
Author-X-Name-First: Atilla
Author-X-Name-Last: Göktaş
Author-Name: Uğur Kayalı
Author-X-Name-First: Uğur
Author-X-Name-Last: Kayalı
Title: Path analysis and determining the distribution of indirect effects via simulation
Abstract:
The difference between path analysis and other multivariate analyses is that path analysis has the ability to compute indirect effects apart from the direct effects. The aim of this study is to investigate the distribution of the indirect effects, one of the components of path analysis, via generated data. To this end, a simulation study has been conducted with four different sample sizes, three different numbers of explanatory variables and three different correlation matrices, with 1000 replications for every combination. According to the results obtained, irrespective of the sample size, path coefficients tend to be stable; nor are they affected by the correlation types. Since the replication number of 1000 is fairly large, the indirect effects from the path models have been treated as normal and their confidence intervals are presented as well. It is also found that path analysis should not be used with three explanatory variables. We think that this study will help scientists working in both the natural and social sciences to determine the sample size and the number of variables in path analysis.
Journal: Journal of Applied Statistics
Pages: 1181-1210
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201793
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201793
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1181-1210
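The simulation of indirect effects described in the abstract above can be sketched as follows. This is a minimal illustration, assuming a simple single-mediator model (x → m → y) with hypothetical path coefficients; the authors' actual design varies sample sizes, numbers of explanatory variables and correlation structures.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Indirect effect of x on y through m: product of the two path slopes."""
    # path a: slope of m on x
    a = np.linalg.lstsq(np.c_[np.ones_like(x), x], m, rcond=None)[0][1]
    # path b: slope of y on m, controlling for x
    b = np.linalg.lstsq(np.c_[np.ones_like(x), x, m], y, rcond=None)[0][2]
    return a * b

def simulate_indirect(n, a=0.5, b=0.5, reps=1000, seed=0):
    """Replicate the data-generating process and collect indirect-effect
    estimates, so their empirical distribution can be inspected."""
    rng = np.random.default_rng(seed)
    est = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)            # mediator equation
        y = b * m + 0.2 * x + rng.normal(size=n)  # outcome equation
        est[r] = indirect_effect(x, m, y)
    return est

est = simulate_indirect(n=200)   # true indirect effect is a*b = 0.25
```

With 1000 replications the histogram of `est` is close to normal around the true product a·b, which is the kind of normality treatment the abstract relies on for the confidence intervals.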
Template-Type: ReDIF-Article 1.0
Author-Name: Rafael Bernardo Carmona-Benítez
Author-X-Name-First: Rafael Bernardo
Author-X-Name-Last: Carmona-Benítez
Author-Name: María Rosa Nieto
Author-X-Name-First: María Rosa
Author-X-Name-Last: Nieto
Title: Comparison of bootstrap estimation intervals to forecast arithmetic mean and median air passenger demand
Abstract:
The aim of this paper is to compare passenger (pax) demand between airports based on the arithmetic mean pax demand (MPD) and the median pax demand (MePD). A three-phase approach is applied. In the first phase, we use bootstrap procedures to estimate the distributions of the MPD and the MePD for each block of route distances; in the second, we use the percentile, standard, bias-corrected and bias-corrected accelerated methods to calculate bootstrap confidence bands for the MPD and the MePD; and in the third, we implement Monte Carlo (MC) experiments to analyse the finite-sample performance of the applied bootstrap. Our results indicate that it is more meaningful to use the estimate of the MePD than that of the MPD in the air transport industry. By carrying out MC experiments, we demonstrate that the bootstrap methods produce coverage close to nominal for the MPD and the MePD.
Journal: Journal of Applied Statistics
Pages: 1211-1224
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201794
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201794
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1211-1224
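The percentile bootstrap, one of the four interval methods named in the abstract above, can be sketched as follows. The log-normal "demand" sample is purely illustrative, standing in for route-level passenger counts; it is not the authors' data.

```python
import numpy as np

def percentile_bootstrap_ci(data, stat, b=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = np.random.default_rng(seed)
    n = len(data)
    reps = np.array([stat(rng.choice(data, size=n, replace=True))
                     for _ in range(b)])
    # the interval is read off the empirical quantiles of the replicates
    return tuple(np.quantile(reps, [alpha / 2, 1 - alpha / 2]))

# skewed toy data standing in for passenger demand on a block of routes
rng = np.random.default_rng(0)
demand = rng.lognormal(mean=5.0, sigma=1.0, size=500)

mean_ci = percentile_bootstrap_ci(demand, np.mean)
median_ci = percentile_bootstrap_ci(demand, np.median)
```

For heavily skewed demand data the median interval is typically much narrower than the mean interval, which is one way to motivate the paper's preference for the MePD over the MPD.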
Template-Type: ReDIF-Article 1.0
Author-Name: Izabella A. R. A. Santos
Author-X-Name-First: Izabella A. R. A.
Author-X-Name-Last: Santos
Author-Name: Denise Duarte
Author-X-Name-First: Denise
Author-X-Name-Last: Duarte
Author-Name: Marcelo Azevedo Costa
Author-X-Name-First: Marcelo Azevedo
Author-X-Name-Last: Costa
Title: Use of jump process to model mobility in massive multiplayer on-line games
Abstract:
This paper proposes a methodology for modeling the mobility of characters in Massively Multiplayer On-line (MMO) games. We model the mobility of characters on the map of an MMO game as a jump process, using two approaches, parametric and non-parametric, to model the times spent in the states of the process. Furthermore, a simulator for the mobility is presented. We analyze geographic position data of characters on the map of the game World of Warcraft and compare the observed and simulated data. The proposed methodology and the simulator can be used to optimize the computing load allocation of servers, which is extremely important for game performance, service quality and cost.
Journal: Journal of Applied Statistics
Pages: 1225-1247
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201795
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201795
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1225-1247
Template-Type: ReDIF-Article 1.0
Author-Name: Jouchi Nakajima
Author-X-Name-First: Jouchi
Author-X-Name-Last: Nakajima
Author-Name: Tsuyoshi Kunihama
Author-X-Name-First: Tsuyoshi
Author-X-Name-Last: Kunihama
Author-Name: Yasuhiro Omori
Author-X-Name-First: Yasuhiro
Author-X-Name-Last: Omori
Title: Bayesian modeling of dynamic extreme values: extension of generalized extreme value distributions with latent stochastic processes
Abstract:
This paper develops Bayesian inference of extreme value models with a flexible time-dependent latent structure. The generalized extreme value distribution is utilized to incorporate state variables that follow an autoregressive moving average (ARMA) process with Gumbel-distributed innovations. The time-dependent extreme value distribution is combined with heavy-tailed error terms. An efficient Markov chain Monte Carlo algorithm is proposed using a state-space representation with a finite mixture of normal distributions to approximate the Gumbel distribution. The methodology is illustrated with simulated data and two different sets of real data. Monthly minima of daily returns of a stock price index and monthly maxima of hourly electricity demand are fit to the proposed model and used for model comparison. Estimation results show the usefulness of the proposed model and methodology, and provide evidence that the latent autoregressive process and heavy-tailed errors play an important role in describing the monthly series of minimum stock returns and maximum electricity demand.
Journal: Journal of Applied Statistics
Pages: 1248-1268
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201796
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201796
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1248-1268
Template-Type: ReDIF-Article 1.0
Author-Name: Camillo Cammarota
Author-X-Name-First: Camillo
Author-X-Name-Last: Cammarota
Title: Estimating the turning point location in shifted exponential model of time series
Abstract:
We consider the distribution of the turning point location of time series modeled as the sum of a deterministic trend plus random noise. If the variables are modeled by shifted exponentials, whose location parameters define the trend, we provide a formula for computing the distribution of the turning point location and consequently for estimating a confidence interval for the location. We test this formula on simulated data series having a trend with an asymmetric minimum, investigating the coverage rate as a function of a bandwidth parameter. The method is applied to estimate the confidence interval of the minimum location for two types of real time series: the RT intervals extracted from electrocardiograms recorded during exercise tests, and an economic indicator, the current account balance. We discuss the connection with stochastic ordering.
Journal: Journal of Applied Statistics
Pages: 1269-1281
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201797
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201797
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1269-1281
Template-Type: ReDIF-Article 1.0
Author-Name: M. Moghimbeygi
Author-X-Name-First: M.
Author-X-Name-Last: Moghimbeygi
Author-Name: M. Golalizadeh
Author-X-Name-First: M.
Author-X-Name-Last: Golalizadeh
Title: Longitudinal shape analysis by using the spherical coordinates
Abstract:
One of the important topics in morphometry that has received much attention recently is the longitudinal analysis of shape variation. According to Kendall's definition of shape, the shape of an object lies in a non-Euclidean space, making the longitudinal study of configurations somewhat difficult. To simplify this task, this paper pursues triangulation of the objects and then constructs a non-parametric regression-type model on the unit sphere. The configurations at some time instances are predicted using both properties of the triangulation and the sizes of great baselines. Moreover, minimizing a Euclidean risk function is proposed to select feasible weights when constructing smoothers in a non-parametric smoothing manner. These provide proper shape growth models for analysing objects that vary in time. The proposed models are applied to the analysis of two real-life data sets.
Journal: Journal of Applied Statistics
Pages: 1282-1295
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201798
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201798
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1282-1295
Template-Type: ReDIF-Article 1.0
Author-Name: C. Masci
Author-X-Name-First: C.
Author-X-Name-Last: Masci
Author-Name: F. Ieva
Author-X-Name-First: F.
Author-X-Name-Last: Ieva
Author-Name: T. Agasisti
Author-X-Name-First: T.
Author-X-Name-Last: Agasisti
Author-Name: A. M. Paganoni
Author-X-Name-First: A. M.
Author-X-Name-Last: Paganoni
Title: Bivariate multilevel models for the analysis of mathematics and reading pupils' achievements
Abstract:
The purpose of this paper is to identify a relationship between pupils' mathematics and reading test scores and the characteristics of the students themselves, stratifying by class, school and geographical area. The data set of interest contains detailed information about more than 500,000 students in the first year of junior secondary school in 2012/2013, provided by the Italian Institute for the Evaluation of the Educational System. The innovation of this work lies in the use of multivariate multilevel models in which the outcome is bivariate: reading and mathematics achievement. Using a bivariate outcome enables researchers to analyze the correlation between achievement levels in the two fields and to predict statistically significant school and class effects after adjusting for pupils' characteristics. The statistical model employed here explicitly accounts for the potential covariance between the two subjects, while at the same time allowing the school effect to vary between them. The results show that, while in most cases the direction of a school's effect is coherent for reading and mathematics (i.e. positive/negative), there are cases where internal school factors lead to different performances in the two fields.
Journal: Journal of Applied Statistics
Pages: 1296-1317
Issue: 7
Volume: 44
Year: 2017
Month: 5
X-DOI: 10.1080/02664763.2016.1201799
File-URL: http://hdl.handle.net/10.1080/02664763.2016.1201799
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:44:y:2017:i:7:p:1296-1317
Template-Type: ReDIF-Article 1.0
Author-Name: Yanqin Feng
Author-X-Name-First: Yanqin
Author-X-Name-Last: Feng
Author-Name: Shurong Lin
Author-X-Name-First: Shurong
Author-X-Name-Last: Lin
Author-Name: Yang Li
Author-X-Name-First: Yang
Author-X-Name-Last: Li
Title: Semiparametric regression of clustered current status data
Abstract:
This paper discusses regression analysis of clustered current status data under semiparametric additive hazards models. In particular, we consider the situation when cluster sizes can be informative about correlated failure times from the same cluster. To address the problem, we present estimating equation-based estimation procedures and establish asymptotic properties of the resulting estimates. Finite sample performance of the proposed method is assessed through an extensive simulation study, which indicates the procedure works well. The method is applied to a motivating data set from a lung tumorigenicity study.
Journal: Journal of Applied Statistics
Pages: 1724-1737
Issue: 10
Volume: 46
Year: 2019
Month: 7
X-DOI: 10.1080/02664763.2018.1564022
File-URL: http://hdl.handle.net/10.1080/02664763.2018.1564022
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:10:p:1724-1737
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Aslam Mohd Safari
Author-X-Name-First: Muhammad Aslam Mohd
Author-X-Name-Last: Safari
Author-Name: Nurulkamal Masseran
Author-X-Name-First: Nurulkamal
Author-X-Name-Last: Masseran
Author-Name: Kamarulzaman Ibrahim
Author-X-Name-First: Kamarulzaman
Author-X-Name-Last: Ibrahim
Title: On the identification of extreme outliers and dragon-kings mechanisms in the upper tail of income distribution
Abstract:
The presence of extreme outliers in the upper-tail data of an income distribution affects Pareto tail modeling. A simulation study is carried out to compare the performance of three types of boxplot in the detection of extreme outliers for Pareto data: the standard boxplot, the adjusted boxplot and the generalized boxplot. It is found that the generalized boxplot is the best method for determining extreme outliers in Pareto distributed data. As an application, the generalized boxplot is utilized to determine the extreme outliers in the upper tail of the Malaysian income distribution. In addition, for this data set, the confidence interval method is applied to examine the presence of dragon-kings, extreme outliers that lie beyond the Pareto or power-law distribution.
Journal: Journal of Applied Statistics
Pages: 1886-1902
Issue: 10
Volume: 46
Year: 2019
Month: 7
X-DOI: 10.1080/02664763.2019.1566447
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1566447
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:46:y:2019:i:10:p:1886-1902
Template-Type: ReDIF-Article 1.0
Author-Name: Shuang Xu
Author-X-Name-First: Shuang
Author-X-Name-Last: Xu
Author-Name: Chun-Xia Zhang
Author-X-Name-First: Chun-Xia
Author-X-Name-Last: Zhang
Title: Robust sparse regression by modeling noise as a mixture of gaussians
Abstract:
Regression analysis has proven to be a quite effective tool in a large variety of fields. In many regression models, the noise is assumed to follow a specific distribution. Although this greatly facilitates theoretical analysis, the model-fitting performance may be poor if the assumed noise distribution deviates from the real noise to a large extent. Meanwhile, the model is also expected to be robust in view of the complexity of real-world data. Without any assumption about the noise, we propose in this paper a novel sparse regression method called MoG-Lasso to directly model the noise in linear regression models via a mixture of Gaussian distributions (MoG). Meanwhile, the $L_1$ penalty is included as part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. To estimate the parameters of MoG-Lasso, we present an efficient algorithm based on the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. Experiments with simulated and real data contaminated by complex noise show that the novel model MoG-Lasso performs better than several other popular methods in both 'p>n' and 'p<n' situations.
Journal: Journal of Applied Statistics
Pages: 1848-1884
Issue: 10
Volume: 47
Year: 2020
Month: 7
X-DOI: 10.1080/02664763.2019.1697651
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1697651
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:10:p:1848-1884
Template-Type: ReDIF-Article 1.0
Author-Name: Hadi Alizadeh Noughabi
Author-X-Name-First: Hadi
Author-X-Name-Last: Alizadeh Noughabi
Author-Name: Jalil Jarrahiferiz
Author-X-Name-First: Jalil
Author-X-Name-Last: Jarrahiferiz
Title: Tests of fit for the Gumbel distribution: EDF-based tests against entropy-based tests
Abstract:
In this article, we propose some tests of fit based on sample entropy for the composite Gumbel (extreme value) hypothesis. The proposed test statistics are constructed using different entropy estimates. Critical values of the test statistics for various sample sizes are obtained through a Monte Carlo simulation. Since tests based on the empirical distribution function (EDF) are commonly used in practice, the power values of the entropy-based tests are compared with those of the EDF tests against various alternatives and for different sample sizes. Finally, two real data sets are modeled by the Gumbel distribution.
Journal: Journal of Applied Statistics
Pages: 1885-1900
Issue: 10
Volume: 47
Year: 2020
Month: 7
X-DOI: 10.1080/02664763.2019.1698522
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1698522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:10:p:1885-1900
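A typical ingredient of the entropy-based tests in the abstract above is a sample entropy estimate. Below is a minimal sketch of Vasicek's spacing estimator; the default window size m and the checks are illustrative choices, not taken from the article, which compares several different entropy estimators.

```python
import numpy as np

def vasicek_entropy(x, m=None):
    """Vasicek's spacing-based estimate of differential entropy.

    H = (1/n) * sum_i log( n/(2m) * (x_(i+m) - x_(i-m)) ), where order
    statistics outside 1..n are clamped to the sample extremes.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    if m is None:
        m = max(1, int(round(np.sqrt(n))))      # common heuristic window
    upper = x[np.minimum(np.arange(n) + m, n - 1)]
    lower = x[np.maximum(np.arange(n) - m, 0)]
    return float(np.mean(np.log(n / (2.0 * m) * (upper - lower))))

rng = np.random.default_rng(0)
sample = rng.gumbel(loc=0.0, scale=1.0, size=2000)
h = vasicek_entropy(sample)   # standard Gumbel entropy is 1 + gamma ~ 1.577
```

Under the composite Gumbel null, a test statistic can then be built by comparing such an estimate with the entropy implied by fitted Gumbel parameters; the exact statistics differ across the entropy estimators the article considers.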
Template-Type: ReDIF-Article 1.0
Author-Name: Seyed Ehsan Saffari
Author-X-Name-First: Seyed Ehsan
Author-X-Name-Last: Saffari
Author-Name: John Carson Allen
Author-X-Name-First: John Carson
Author-X-Name-Last: Allen
Title: Bivariate negative binomial regression model with excess zeros and right censoring: an application to Indonesian data
Abstract:
We propose a bivariate hurdle negative binomial (BHNB) regression model with right censoring to model correlated bivariate count data with excess zeros and few extreme observations. The parameters of the BHNB regression model are obtained using maximum likelihood with conjugate gradient optimization. The proposed model is applied to actual survey data where the bivariate outcome is number of days missed from primary activities and number of days spent in bed due to illness during the 4-week period preceding the inquiry date. We compared the right censored BHNB model to the right censored bivariate negative binomial (BNB) model. A simulation study is conducted to discuss some properties of the BHNB model. Our proposed model demonstrated superior performance in goodness-of-fit of estimated frequencies.
Journal: Journal of Applied Statistics
Pages: 1901-1914
Issue: 10
Volume: 47
Year: 2020
Month: 7
X-DOI: 10.1080/02664763.2019.1695761
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1695761
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:10:p:1901-1914
Template-Type: ReDIF-Article 1.0
Author-Name: Jörn Schulz
Author-X-Name-First: Jörn
Author-X-Name-Last: Schulz
Author-Name: Jan Terje Kvaløy
Author-X-Name-First: Jan Terje
Author-X-Name-Last: Kvaløy
Author-Name: Kjersti Engan
Author-X-Name-First: Kjersti
Author-X-Name-Last: Engan
Author-Name: Trygve Eftestøl
Author-X-Name-First: Trygve
Author-X-Name-Last: Eftestøl
Author-Name: Samwel Jatosh
Author-X-Name-First: Samwel
Author-X-Name-Last: Jatosh
Author-Name: Hussein Kidanto
Author-X-Name-First: Hussein
Author-X-Name-Last: Kidanto
Author-Name: Hege Ersdal
Author-X-Name-First: Hege
Author-X-Name-Last: Ersdal
Title: State transition modeling of complex monitored health data
Abstract:
This article considers the analysis of complex monitored health data in which one or several signals reflect the current health status, which can be represented by a finite number of states, alongside a set of covariates. In particular, we consider a novel application of a non-parametric state intensity regression method to study time-dependent effects of covariates on the state transition intensities. The method can handle baseline, time-varying and dynamic covariates. Because of its non-parametric nature, the method can handle different data types and challenges under minimal assumptions. If the signal reflecting the current health status is continuous, we propose applying a weighted median and a hysteresis filter as data pre-processing steps in order to facilitate robust analysis. In intensity regression, covariates can be aggregated by a suitable functional form over a time-history window. We propose studying the estimated cumulative regression parameters for different choices of the time-history window in order to investigate short- and long-term effects of the given covariates. The proposed framework is discussed and applied to resuscitation data of newborns collected in Tanzania.
Journal: Journal of Applied Statistics
Pages: 1915-1935
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1698523
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1698523
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:1915-1935
Template-Type: ReDIF-Article 1.0
Author-Name: Yuantao Zhang
Author-X-Name-First: Yuantao
Author-X-Name-Last: Zhang
Author-Name: Yue Kuen Kwok
Author-X-Name-First: Yue
Author-X-Name-Last: Kuen Kwok
Title: Saddlepoint approximations to tail expectations under non-Gaussian base distributions: option pricing applications
Abstract:
The saddlepoint approximation formulas provide versatile tools for analytic approximation of the tail expectation of a random variable by approximating the complex Laplace integral of the tail expectation expressed in terms of the cumulant generating function of the random variable. We generalize the saddlepoint approximation formulas for calculating tail expectations from the usual Gaussian base distribution to an arbitrary base distribution. Specific discussion is presented on the criteria of choosing the base distribution that fits better the underlying distribution. Numerical performance and comparison of accuracy are made among different saddlepoint approximation formulas. Improved accuracy of the saddlepoint approximations to tail expectations is revealed when proper base distributions are chosen. We also demonstrate enhanced accuracy of the generalized saddlepoint approximation formulas under non-Gaussian base distributions in pricing European options on continuous integrated variance under the Heston stochastic volatility model.
Journal: Journal of Applied Statistics
Pages: 1936-1956
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1703915
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1703915
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:1936-1956
Template-Type: ReDIF-Article 1.0
Author-Name: Junhyeon Kwon
Author-X-Name-First: Junhyeon
Author-X-Name-Last: Kwon
Author-Name: Hee-Seok Oh
Author-X-Name-First: Hee-Seok
Author-X-Name-Last: Oh
Author-Name: Yaeji Lim
Author-X-Name-First: Yaeji
Author-X-Name-Last: Lim
Title: Dynamic principal component analysis with missing values
Abstract:
Dynamic principal component analysis (DPCA), also known as frequency-domain principal component analysis, was developed by Brillinger [Time Series: Data Analysis and Theory, Vol. 36, SIAM, 1981] to decompose multivariate time-series data into a few principal component series. A primary advantage of DPCA is its capability of extracting essential components from the data while reflecting their serial dependence. It is also used to estimate the common component in a dynamic factor model, which is frequently used in econometrics. However, this beneficial property cannot be exploited when missing values are present, and these should not simply be ignored when estimating the spectral density matrix in the DPCA procedure. Based on a novel combination of conventional DPCA and the self-consistency concept, we propose a DPCA method for data with missing values. We demonstrate the advantage of the proposed method over some existing imputation methods through Monte Carlo experiments and real data analysis.
Journal: Journal of Applied Statistics
Pages: 1957-1969
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1699910
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1699910
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:1957-1969
Template-Type: ReDIF-Article 1.0
Author-Name: Stephan Schlüter
Author-X-Name-First: Stephan
Author-X-Name-Last: Schlüter
Author-Name: Milena Kresoja
Author-X-Name-First: Milena
Author-X-Name-Last: Kresoja
Title: Two preprocessing algorithms for climate time series
Abstract:
We propose two preprocessing algorithms suitable for climate time series. The first algorithm detects outliers based on an autoregressive cost update mechanism. The second is based on the wavelet transform, a method from pattern recognition. In order to benchmark the algorithms' performance, we compare them to existing methods on a synthetic data set. Finally, for exemplary purposes, the proposed methods are applied to a data set of high-frequency temperature measurements from Novi Sad, Serbia. The results show that the two methods together form a powerful tool for signal preprocessing: in the case of solitary outliers the autoregressive cost update mechanism prevails, whereas the wavelet-based mechanism is the method of choice in the presence of multiple consecutive outliers.
Journal: Journal of Applied Statistics
Pages: 1970-1989
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1701637
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1701637
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:1970-1989
Template-Type: ReDIF-Article 1.0
Author-Name: Catalina B. García
Author-X-Name-First: Catalina B.
Author-X-Name-Last: García
Author-Name: Román Salmerón
Author-X-Name-First: Román
Author-X-Name-Last: Salmerón
Author-Name: Claudia García
Author-X-Name-First: Claudia
Author-X-Name-Last: García
Author-Name: José García
Author-X-Name-First: José
Author-X-Name-Last: García
Title: Residualization: justification, properties and application
Abstract:
Although collinearity is common in econometric models, it is frequently disregarded. A widespread solution is to eliminate the variable causing the problem but, in some cases, this decision can affect the goal of the research. Alternatively, residualization not only mitigates collinearity but also provides an alternative interpretation of the coefficients by isolating the effect of the residualized variable. This paper fully develops the residualization procedure and justifies its application not only for dealing with multicollinearity but also for separating the individual effects of the regressor variables. The contribution is illustrated by two econometric models with financial and ecological data, although it can also be extended to many other fields.
Journal: Journal of Applied Statistics
Pages: 1990-2010
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1701638
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1701638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:1990-2010
Template-Type: ReDIF-Article 1.0
Author-Name: Jan Graffelman
Author-X-Name-First: Jan
Author-X-Name-Last: Graffelman
Title: Goodness-of-fit filtering in classical metric multidimensional scaling with large datasets
Abstract:
Metric multidimensional scaling (MDS) is a widely used multivariate method with applications in almost all scientific disciplines. The eigenvalues obtained in the analysis are usually reported in order to calculate the overall goodness of fit of the distance matrix. In this paper, we refine MDS goodness-of-fit calculations, proposing additional point and pairwise goodness-of-fit statistics that can be used to filter poorly represented observations in MDS maps. The proposed statistics are especially relevant for large data sets containing outliers, which typically have many poorly fitted observations, and are helpful for improving MDS output and emphasizing the most important features of the data set. Several goodness-of-fit statistics are considered, for both Euclidean and non-Euclidean distance matrices. Examples with data from demographic, genetic and geographic studies are shown.
Journal: Journal of Applied Statistics
Pages: 2011-2024
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1702929
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1702929
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:2011-2024
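The eigenvalue-based overall goodness of fit that the abstract above starts from can be sketched as follows. This is classical (Torgerson) scaling; normalizing by the sum of the positive eigenvalues is one common convention, not necessarily the refinement proposed in the paper.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical metric MDS with an eigenvalue-based overall goodness of fit."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]               # eigenvalues, descending
    vals, vecs = vals[order], vecs[:, order]
    X = vecs[:, :k] * np.sqrt(np.clip(vals[:k], 0.0, None))
    gof = vals[:k].sum() / np.clip(vals, 0.0, None).sum()
    return X, float(gof)

# an exactly two-dimensional configuration: the 2-D map should fit perfectly
rng = np.random.default_rng(0)
P = rng.normal(size=(30, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
X, gof = classical_mds(D, k=2)
```

Point and pairwise statistics of the kind the paper proposes would then compare, per observation, the original distances in `D` with those reconstructed from the map `X`.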
Template-Type: ReDIF-Article 1.0
Author-Name: Fulya Gokalp Yavuz
Author-X-Name-First: Fulya
Author-X-Name-Last: Gokalp Yavuz
Author-Name: Olcay Arslan
Author-X-Name-First: Olcay
Author-X-Name-Last: Arslan
Title: Variable selection in elliptical linear mixed model
Abstract:
Variable selection in elliptical linear mixed models (LMMs) with a shrinkage penalty function (SPF) is the main scope of this study. SPFs are applied for parameter estimation and variable selection simultaneously. The smoothly clipped absolute deviation (SCAD) penalty is one such SPF, and it is adapted to the elliptical LMM in this study. The proposed idea is highly applicable to a variety of models set up with different distributions, such as the normal, Student-t, Pearson VII and power exponential. Simulation studies and a real data example with one of the elliptical distributions show that, when variable selection is also a concern, it is worthwhile to carry out the variable selection and the parameter estimation simultaneously in the elliptical LMM.
Journal: Journal of Applied Statistics
Pages: 2025-2043
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1702928
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1702928
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:2025-2043
Template-Type: ReDIF-Article 1.0
Author-Name: Massimiliano Bonamente
Author-X-Name-First: Massimiliano
Author-X-Name-Last: Bonamente
Title: Distribution of the C statistic with applications to the sample mean of Poisson data
Abstract:
The $C$ statistic, also known as the Cash statistic, is often used in astronomy for the analysis of low-count Poisson data. The main advantage of this statistic, compared to the more commonly used $\chi^2$ statistic, is its applicability without the need to combine data points. This feature has made the $C$ statistic a very useful method for analyzing Poisson data that have small (or even null) counts in each resolution element. One of the challenges of the $C$ statistic is that its probability distribution, under the null hypothesis that the data follow a parent model, is not known exactly. This paper presents an effort towards improving our understanding of the $C$ statistic by studying (a) the distribution of the $C$ statistic for a fully specified model, (b) the distribution of $C_{\min}$ resulting from a maximum-likelihood fit to a simple one-parameter constant model, i.e. a model that represents the sample mean of N Poisson measurements, and (c) the distribution of the associated $\Delta C$ statistic that is used for parameter estimation. The results confirm the expectation that, in the high-count limit, both the $C$ statistic and $C_{\min}$ have the same mean and variance as a $\chi^2$ statistic with the same number of degrees of freedom. It is also found that, in the low-count regime, the expectation of the $C$ statistic and $C_{\min}$ can be substantially lower than for a $\chi^2$ distribution. The paper makes use of recent X-ray observations of the astronomical source PG 1116+215 to illustrate the application of the $C$ statistic to Poisson data.
Journal: Journal of Applied Statistics
Pages: 2044-2065
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1704703
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1704703
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:2044-2065
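The statistic discussed in the abstract above is simple to compute. A sketch for Poisson counts n_i against model expectations m_i, using the standard form C = 2 Σ [m_i − n_i + n_i ln(n_i/m_i)] with zero-count bins reducing to 2 m_i, is:

```python
import numpy as np

def cash_statistic(counts, model):
    """Cash (C) statistic for Poisson-distributed counts.

    C = 2 * sum(m_i - n_i + n_i * ln(n_i / m_i)); bins with n_i = 0
    contribute 2 * m_i, since n * ln(n/m) -> 0 as n -> 0.
    """
    n = np.asarray(counts, dtype=float)
    m = np.asarray(model, dtype=float)
    term = m - n                       # zero-count bins stop here
    pos = n > 0
    term[pos] += n[pos] * np.log(n[pos] / m[pos])
    return 2.0 * term.sum()
```

In the high-count limit C behaves like a $\chi^2$ statistic with the same number of degrees of freedom, which is the regime the paper contrasts with the low-count case.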
Template-Type: ReDIF-Article 1.0
Author-Name: Nadja Klein
Author-X-Name-First: Nadja
Author-X-Name-Last: Klein
Author-Name: Andrew Entwistle
Author-X-Name-First: Andrew
Author-X-Name-Last: Entwistle
Author-Name: Albert Rosenberger
Author-X-Name-First: Albert
Author-X-Name-Last: Rosenberger
Author-Name: Thomas Kneib
Author-X-Name-First: Thomas
Author-X-Name-Last: Kneib
Author-Name: Heike Bickeböller
Author-X-Name-First: Heike
Author-X-Name-Last: Bickeböller
Title: Candidate-gene association analysis for a continuous phenotype with a spike at zero using parent-offspring trios
Abstract:
In this paper, we propose the class of generalized additive models for location, scale and shape in a test for the association of genetic markers with non-normally distributed phenotypes comprising a spike at zero. The resulting statistical test is a generalization of the quantitative transmission disequilibrium test with mating type indicator, which was originally designed for normally distributed quantitative traits and parent-offspring data. As a motivational example, we consider coronary artery calcification (CAC), which can accurately be identified by electron beam tomography. In the investigated regions, individuals will have a continuous measure of the extent of calcium found or they will be calcium-free. Hence, the resulting distribution is a mixed discrete-continuous distribution with spike at zero. We carry out parent-offspring simulations motivated by such CAC measurement values in a screening population to study statistical properties of the proposed test for genetic association. Furthermore, we apply the approach to data of the Genetic Analysis Workshop 16 that are based on real genotype and family data of the Framingham Heart Study, and test the association of selected genetic markers with simulated coronary artery calcification.
Journal: Journal of Applied Statistics
Pages: 2066-2080
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1704226
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1704226
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:2066-2080
Template-Type: ReDIF-Article 1.0
Author-Name: Christopher J. Cappelli
Author-X-Name-First: Christopher J.
Author-X-Name-Last: Cappelli
Author-Name: Audrey J. Leroux
Author-X-Name-First: Audrey J.
Author-X-Name-Last: Leroux
Author-Name: Congying Sun
Author-X-Name-First: Congying
Author-X-Name-Last: Sun
Title: A new way for handling mobility in longitudinal data
Abstract:
In the social sciences, applied researchers often face a statistical dilemma when multilevel data are structured such that lower-level units are not purely clustered within higher-level units. To aid applied researchers in appropriately analyzing such data structures, this study proposes a multiple membership growth curve model (MM-GCM). The MM-GCM offers some advantages over other similar modeling approaches, including greater flexibility in modeling the intercept at the time point most desired for interpretation. A real longitudinal dataset from the field of education with a multiple membership structure, where some students changed schools over time, was used to demonstrate the application of the MM-GCM. Baseline and conditional MM-GCMs are presented, and parameter estimates were compared with two other common approaches to handling such data structures – the final-school GCM, which ignores mobility by modeling only the final school attended, and the delete-GCM, which deletes mobile students. Additionally, a simulation study was conducted to further assess the impact of ignoring mobility on parameter estimates. The results indicate that ignoring mobility produces substantial bias in model estimates, especially for cluster-level coefficients and variance components.
Journal: Journal of Applied Statistics
Pages: 2081-2096
Issue: 11
Volume: 47
Year: 2020
Month: 8
X-DOI: 10.1080/02664763.2019.1704224
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1704224
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:11:p:2081-2096
Template-Type: ReDIF-Article 1.0
Author-Name: Mustafa Ç. Korkmaz
Author-X-Name-First: Mustafa Ç.
Author-X-Name-Last: Korkmaz
Title: A new heavy-tailed distribution defined on the bounded interval: the logit slash distribution and its application
Abstract:
This paper proposes a new heavy-tailed, alternative slash-type distribution on a bounded interval, obtained by relating a slash random variable to the standard logistic function, to model real data sets that are skewed and have high kurtosis, including outlier observations. Some basic statistical properties of the newly defined distribution are studied. We derive the maximum likelihood, least-squares, and weighted least-squares estimators of its parameters, and assess the performance of these estimation methods in a simulation study. Moreover, an application to real data demonstrates that the proposed distribution can provide a better fit than well-known bounded distributions in the literature when a skewed data set with high kurtosis contains outlier observations.
Journal: Journal of Applied Statistics
Pages: 2097-2119
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1704701
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1704701
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2097-2119
Template-Type: ReDIF-Article 1.0
Author-Name: Rodolfo Metulini
Author-X-Name-First: Rodolfo
Author-X-Name-Last: Metulini
Author-Name: Mael Le Carre
Author-X-Name-First: Mael
Author-X-Name-Last: Le Carre
Title: Measuring sport performances under pressure by classification trees with application to basketball shooting
Abstract:
Measuring players' performance in team sports is fundamental, since managers need to evaluate players with respect to their ability to score during crucial moments of the game. Using Classification and Regression Trees (CART) and play-by-play basketball data, we estimate the probability of scoring a shot with respect to a selection of game covariates related to game pressure. We use the scoring probabilities to develop a player-specific shooting performance index that takes into account the difficulty of scoring different types of shots. By applying this procedure to a large sample of 2016–2017 Basketball Champions League (BCL) and 2017–2018 National Basketball Association (NBA) games, we compare the factors affecting shooting performance in Europe and in the United States, and we evaluate a selection of players in terms of the proposed shooting performance index, with the final aim of providing useful guidelines for team strategy.
Journal: Journal of Applied Statistics
Pages: 2120-2135
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1704702
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1704702
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2120-2135
Template-Type: ReDIF-Article 1.0
Author-Name: Zainab Abbasi Ganji
Author-X-Name-First: Zainab
Author-X-Name-Last: Abbasi Ganji
Author-Name: Bahram Sadeghpour Gildeh
Author-X-Name-First: Bahram
Author-X-Name-Last: Sadeghpour Gildeh
Title: Fuzzy process capability indices for simple linear profile
Abstract:
Process capability indices are numerical tools that quantify how well a process can meet customer requirements, specifications or engineering tolerances. Fuzzy logic is incorporated to deal with imprecise, incomplete data and the associated uncertainty. This paper develops two fuzzy methods for measuring process capability in simple linear profiles for circumstances in which the lower and upper specification limits are imprecise. To guide practitioners, a numerical example is provided.
Journal: Journal of Applied Statistics
Pages: 2136-2158
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1704225
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1704225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2136-2158
Template-Type: ReDIF-Article 1.0
Author-Name: E. M. Hashimoto
Author-X-Name-First: E. M.
Author-X-Name-Last: Hashimoto
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: A. K. Suzuki
Author-X-Name-First: A. K.
Author-X-Name-Last: Suzuki
Author-Name: M. W. Kattan
Author-X-Name-First: M. W.
Author-X-Name-Last: Kattan
Title: The multinomial logistic regression model for predicting the discharge status after liver transplantation: estimation and diagnostics analysis
Abstract:
The multinomial logistic regression model (MLRM) can be interpreted as a natural extension of the binomial model with logit link function to situations where the response variable can have three or more possible outcomes. In addition, when the categories of the response variable are nominal, the MLRM can be expressed in terms of two or more logistic models and analyzed under both frequentist and Bayesian approaches. However, few discussions of post-modeling diagnostics for categorical data models are found in the literature, and those mainly use Bayesian inference. The objective of this work is to present classical and Bayesian diagnostic measures for categorical data models. These measures are applied to a dataset (status) of patients undergoing kidney transplantation.
Journal: Journal of Applied Statistics
Pages: 2159-2177
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1706725
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1706725
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2159-2177
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaodong Fan
Author-X-Name-First: Xiaodong
Author-X-Name-Last: Fan
Author-Name: Shi-shun Zhao
Author-X-Name-First: Shi-shun
Author-X-Name-Last: Zhao
Author-Name: Qingchun Zhang
Author-X-Name-First: Qingchun
Author-X-Name-Last: Zhang
Author-Name: Jianguo Sun
Author-X-Name-First: Jianguo
Author-X-Name-Last: Sun
Title: Nonparametric tests for stratified additive hazards model based on current status data
Abstract:
Stratified regression models are commonly employed when study subjects may come from different strata, such as different medical centers, and in this situation one common question of interest is to test for the existence of a stratum effect. Some literature exists on testing for stratum effects under the framework of the proportional hazards model when one observes right-censored or interval-censored data. In this paper, we consider the situation under the additive hazards model when one faces current status data, for which no established test procedure seems to exist. The asymptotic distributions of the proposed test procedure are provided. A simulation study performed to evaluate the proposed method indicates that it works well in practical situations. The approach is applied to a set of real current status data from a tumorigenicity study.
Journal: Journal of Applied Statistics
Pages: 2178-2191
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1707515
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1707515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2178-2191
Template-Type: ReDIF-Article 1.0
Author-Name: Elif Akça
Author-X-Name-First: Elif
Author-X-Name-Last: Akça
Author-Name: Ceylan Yozgatlıgil
Author-X-Name-First: Ceylan
Author-X-Name-Last: Yozgatlıgil
Title: Mutual information model selection algorithm for time series
Abstract:
Time series model selection has been widely studied in recent years. It is important to select the best model among the candidate models proposed for a series, both in terms of explaining the process that governs the series and in terms of providing the most accurate forecasts of future observations. In this study, we aim to create an algorithm for order selection in Box–Jenkins models based on the penalized natural logarithm of the mutual information between the original series and the predictions coming from each candidate model. The penalization is achieved by subtracting the number of parameters in each candidate and the empirical information the data provide. Simulation studies under various scenarios and applications on real data sets imply that our algorithm offers a promising and satisfactory alternative to its counterparts.
Journal: Journal of Applied Statistics
Pages: 2192-2207
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1707516
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1707516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2192-2207
Template-Type: ReDIF-Article 1.0
Author-Name: E. Gómez–Déniz
Author-X-Name-First: E.
Author-X-Name-Last: Gómez–Déniz
Author-Name: D. I. Gallardo
Author-X-Name-First: D. I.
Author-X-Name-Last: Gallardo
Author-Name: H. W. Gómez
Author-X-Name-First: H. W.
Author-X-Name-Last: Gómez
Title: Quasi-binomial zero-inflated regression model suitable for variables with bounded support
Abstract:
In recent years, a variety of regression models, including zero-inflated and hurdle versions, have been proposed to explain the behavior of a dependent variable in terms of exogenous covariates. Apart from the classical Poisson, negative binomial and generalised Poisson distributions, many proposals have appeared in the statistical literature, perhaps in response to the new possibilities offered by advanced software that now enables researchers to implement numerous special functions in a relatively simple way. However, we believe that a significant research gap remains, since very little attention has been paid to the quasi-binomial distribution, which was first proposed over fifty years ago. We believe this distribution might constitute a valid alternative to existing regression models in situations in which the variable has bounded support. Therefore, in this paper we present a zero-inflated regression model based on the quasi-binomial distribution, taking into account the moments and maximum likelihood estimators, and perform a score test to compare the zero-inflated quasi-binomial distribution with the zero-inflated binomial distribution, and the zero-inflated model with the homogeneous model (the model in which covariates are not considered). This analysis is illustrated with two data sets that are well known in the statistical literature and which contain a large number of zeros.
Journal: Journal of Applied Statistics
Pages: 2208-2229
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1707517
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1707517
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2208-2229
Template-Type: ReDIF-Article 1.0
Author-Name: Ioannis S. Triantafyllou
Author-X-Name-First: Ioannis S.
Author-X-Name-Last: Triantafyllou
Author-Name: Nikolaos I. Panayiotou
Author-X-Name-First: Nikolaos I.
Author-X-Name-Last: Panayiotou
Title: Distribution-free monitoring schemes based on order statistics: a general approach
Abstract:
In this article, we establish a new class of distribution-free Shewhart-type monitoring schemes based on order statistics. The setup of the proposed family of nonparametric control charts is presented in detail. Specific monitoring schemes already introduced in the literature are confirmed to be members of the new class. In addition, a new nonparametric monitoring scheme that belongs to the class is established, and explicit formulae for its basic characteristics are derived. The numerical study carried out reveals that the proposed scheme achieves competitive in-control and out-of-control performance.
Journal: Journal of Applied Statistics
Pages: 2230-2257
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1707518
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1707518
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2230-2257
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Qasim
Author-X-Name-First: Muhammad
Author-X-Name-Last: Qasim
Author-Name: B. M. G. Kibria
Author-X-Name-First: B. M. G.
Author-X-Name-Last: Kibria
Author-Name: Kristofer Månsson
Author-X-Name-First: Kristofer
Author-X-Name-Last: Månsson
Author-Name: Pär Sjölander
Author-X-Name-First: Pär
Author-X-Name-Last: Sjölander
Title: A new Poisson Liu Regression Estimator: method and application
Abstract:
This paper considers the estimation of parameters of the Poisson regression model in the presence of high, but imperfect, multicollinearity. To mitigate this problem, we suggest using the Poisson Liu Regression Estimator (PLRE) and propose some new approaches to estimating its shrinkage parameter. The small-sample statistical properties of these estimators are systematically scrutinized using Monte Carlo simulations. To evaluate the performance of these estimators, we assess the Mean Square Errors (MSE) and the Mean Absolute Percentage Errors (MAPE). The simulation results clearly illustrate the benefit of these methods of estimating the shrinkage parameter in finite samples. Finally, we illustrate the empirical relevance of our newly proposed methods with a standard empirical application. In summary, via simulations of empirically relevant parameter values and an empirical application, we demonstrate that our technique yields more precise estimators than traditional techniques – at least when multicollinearity exists among the regressors.
Journal: Journal of Applied Statistics
Pages: 2258-2271
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1707485
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1707485
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2258-2271
Template-Type: ReDIF-Article 1.0
Author-Name: Leili Tapak
Author-X-Name-First: Leili
Author-X-Name-Last: Tapak
Author-Name: Omid Hamidi
Author-X-Name-First: Omid
Author-X-Name-Last: Hamidi
Author-Name: Payam Amini
Author-X-Name-First: Payam
Author-X-Name-Last: Amini
Author-Name: Geert Verbeke
Author-X-Name-First: Geert
Author-X-Name-Last: Verbeke
Title: Random effect exponentiated-exponential geometric model for clustered/longitudinal zero-inflated count data
Abstract:
For count responses, there are situations in biomedical and sociological applications in which extra zeroes occur. Modeling correlated (e.g. repeated measures and clustered) zero-inflated count data poses special challenges, because the correlation between measurements for a subject or a cluster needs to be taken into account. Moreover, zero-inflated count data often exhibit over- or under-dispersion. In this paper, we propose a random effect model for repeated measurements or clustered data with an over/under-dispersed response, called the random effect zero-inflated exponentiated-exponential geometric regression model. The proposed method is illustrated through real examples. The performance of the model and the asymptotic properties of the estimators were investigated using simulation studies.
Journal: Journal of Applied Statistics
Pages: 2272-2288
Issue: 12
Volume: 47
Year: 2020
Month: 9
X-DOI: 10.1080/02664763.2019.1706726
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1706726
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:12:p:2272-2288
Template-Type: ReDIF-Article 1.0
Author-Name: M. Stehlík
Author-X-Name-First: M.
Author-X-Name-Last: Stehlík
Author-Name: L. M. Grilo
Author-X-Name-First: L. M.
Author-X-Name-Last: Grilo
Author-Name: P. K. Jordanova
Author-X-Name-First: P. K.
Author-X-Name-Last: Jordanova
Title: Editorial to special issue V WCDANM 2018
Journal: Journal of Applied Statistics
Pages: 2289-2298
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1818489
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1818489
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2289-2298
Template-Type: ReDIF-Article 1.0
Author-Name: Barry C. Arnold
Author-X-Name-First: Barry C.
Author-X-Name-Last: Arnold
Author-Name: Matthew A. Arvanitis
Author-X-Name-First: Matthew A.
Author-X-Name-Last: Arvanitis
Title: On bivariate pseudo-exponential distributions
Abstract:
A bivariate conditionally specified distribution is one in which the dependence relationship between the two random variables is accomplished by defining the distribution of one of the random variables, given the other. One such conditionally specified model is called the pseudo-exponential distribution, where both the marginal distribution of one and the conditional distribution of the other, given the first, are exponential. In this paper, a variation of this conditioning regime is introduced, and its characteristics are contrasted with the original. An example is used to demonstrate the applicability of the new model. Per-capita Gross Domestic Product (GDP) is a measure of a nation's total annual production of goods and services, divided by its population. Two variations of both the original and the new conditioning regime are applied to GDP and infant mortality data across nations and territories. Possible generalizations are considered.
Journal: Journal of Applied Statistics
Pages: 2299-2311
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1686132
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1686132
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2299-2311
Template-Type: ReDIF-Article 1.0
Author-Name: Ryan A. Peterson
Author-X-Name-First: Ryan A.
Author-X-Name-Last: Peterson
Author-Name: Joseph E. Cavanaugh
Author-X-Name-First: Joseph E.
Author-X-Name-Last: Cavanaugh
Title: Ordered quantile normalization: a semiparametric transformation built for the cross-validation era
Abstract:
Normalization transformations have recently experienced a resurgence in popularity in the era of machine learning, particularly in data preprocessing. However, the classical methods that can be adapted to cross-validation are not always effective. We introduce Ordered Quantile (ORQ) normalization, a one-to-one transformation that is designed to consistently and effectively transform a vector of arbitrary distribution into a vector that follows a normal (Gaussian) distribution. In the absence of ties, ORQ normalization is guaranteed to produce normally distributed transformed data. Once trained, an ORQ transformation can be readily and effectively applied to new data. We compare the effectiveness of the ORQ technique with other popular normalization methods in a simulation study where the true data generating distributions are known. We find that ORQ normalization is the only method that works consistently and effectively, regardless of the underlying distribution. We also explore the use of repeated cross-validation to identify the best normalizing transformation when the true underlying distribution is unknown. We apply our technique and other normalization methods via the bestNormalize R package on a car pricing data set. We built bestNormalize to evaluate the normalization efficacy of many candidate transformations; the package is freely available via the Comprehensive R Archive Network.
Journal: Journal of Applied Statistics
Pages: 2312-2327
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1630372
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1630372
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2312-2327
Template-Type: ReDIF-Article 1.0
Author-Name: Salvatore D. Tomarchio
Author-X-Name-First: Salvatore D.
Author-X-Name-Last: Tomarchio
Author-Name: Antonio Punzo
Author-X-Name-First: Antonio
Author-X-Name-Last: Punzo
Title: Dichotomous unimodal compound models: application to the distribution of insurance losses
Abstract:
Correctly modeling the distribution of insurance losses is crucial in the insurance industry. This distribution is generally highly positively skewed, unimodal hump-shaped, and with a heavy right tail. Compound models are a profitable way to accommodate situations in which some of the probability mass is shifted to the tails of the distribution. Therefore, in this work, a general approach to compound unimodal hump-shaped distributions with a mixing dichotomous distribution is introduced. A 2-parameter unimodal hump-shaped distribution, defined on a positive support, is considered and reparametrized with respect to the mode and to another parameter related to the distribution variability. The compounding is performed by scaling the latter parameter by means of a dichotomous mixing distribution that governs the tail behavior of the resulting model. The proposed model can also allow for automatic detection of typical and atypical losses via a simple procedure based on maximum a posteriori probabilities. The unimodal gamma and log-normal are considered as examples of unimodal hump-shaped distributions. The resulting models are first evaluated in a sensitivity study and then fitted to two real insurance loss datasets, along with several well-known competitors. Likelihood-based information criteria and risk measures are used to compare the models.
Journal: Journal of Applied Statistics
Pages: 2328-2353
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1789076
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1789076
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2328-2353
Template-Type: ReDIF-Article 1.0
Author-Name: Vlad Stefan Barbu
Author-X-Name-First: Vlad Stefan
Author-X-Name-Last: Barbu
Author-Name: Alex Karagrigoriou
Author-X-Name-First: Alex
Author-X-Name-Last: Karagrigoriou
Author-Name: Andreas Makrides
Author-X-Name-First: Andreas
Author-X-Name-Last: Makrides
Title: Statistical inference for a general class of distributions with time-varying parameters
Abstract:
In this article we are interested in a general class of distributions for independent not necessarily identically distributed random variables, closed under minima, that includes a number of discrete and continuous distributions like the Geometric, Exponential, Weibull or Pareto. The main parameter involved in this class of distributions is assumed to be time varying with several possible modeling options. This is of particular interest in reliability and survival analysis for describing the time to event or failure. The maximum likelihood estimation of the parameters is addressed and the asymptotic properties of the estimators are discussed. We provide real and simulated examples and we explore the accuracy of the estimating procedure as well as the performance of classical model selection criteria in choosing the correct model among a number of competing models for the time-varying parameters of interest.
Journal: Journal of Applied Statistics
Pages: 2354-2373
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1763271
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1763271
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2354-2373
Template-Type: ReDIF-Article 1.0
Author-Name: Patrícia Antunes
Author-X-Name-First: Patrícia
Author-X-Name-Last: Antunes
Author-Name: Sandra S. Ferreira
Author-X-Name-First: Sandra S.
Author-X-Name-Last: Ferreira
Author-Name: Dário Ferreira
Author-X-Name-First: Dário
Author-X-Name-Last: Ferreira
Author-Name: Célia Nunes
Author-X-Name-First: Célia
Author-X-Name-Last: Nunes
Author-Name: João Tiago Mexia
Author-X-Name-First: João
Author-X-Name-Last: Tiago Mexia
Title: Estimation in additive models and ANOVA-like applications
Abstract:
A well-known property of the cumulant generating function is used to estimate the first four cumulants, using least-squares estimators. In the case of additive models, empirical best linear unbiased predictors are also obtained. Pairs of independent and identically distributed models associated with the treatments of a base design are used to obtain unbiased estimators of the fourth-order cumulants. An application to real data is presented, showing the good behaviour of the least-squares estimators and the great flexibility of our approach.
Journal: Journal of Applied Statistics
Pages: 2374-2383
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1723501
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1723501
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2374-2383
Template-Type: ReDIF-Article 1.0
Author-Name: Knute D. Carter
Author-X-Name-First: Knute D.
Author-X-Name-Last: Carter
Author-Name: Joseph E. Cavanaugh
Author-X-Name-First: Joseph E.
Author-X-Name-Last: Cavanaugh
Title: Best-subset model selection based on multitudinal assessments of likelihood improvements
Abstract:
A common model selection approach is to select the best model, according to some criterion, from among the collection of models defined by all possible subsets of the explanatory variables. Identifying an optimal subset has proven to be a challenging problem, both statistically and computationally. Our model selection procedure allows the researcher to nominate, a priori, the probability at which models containing false or spurious variables will be selected from among all possible subsets. The procedure determines whether inclusion of each candidate variable results in a sufficiently improved fitting term – and is hence named the SIFT procedure. Two variants are proposed: a naive method based on a set of restrictive assumptions and an empirical permutation-based method. Properties of these methods are investigated within the standard linear modeling framework and performance is evaluated against other model selection techniques. The SIFT procedure behaves as designed – asymptotically selecting variables that characterize the underlying data generating mechanism, while limiting selection of spurious variables to the desired level. The SIFT methodology offers researchers a promising new approach to model selection, providing the ability to control the probability of selecting a model that includes spurious variables to a level based on the context of the application.
Journal: Journal of Applied Statistics
Pages: 2384-2420
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1645097
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1645097
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2384-2420
Template-Type: ReDIF-Article 1.0
Author-Name: C. Santos
Author-X-Name-First: C.
Author-X-Name-Last: Santos
Author-Name: C. Nunes
Author-X-Name-First: C.
Author-X-Name-Last: Nunes
Author-Name: C. Dias
Author-X-Name-First: C.
Author-X-Name-Last: Dias
Author-Name: J.T. Mexia
Author-X-Name-First: J.T.
Author-X-Name-Last: Mexia
Title: Models with commutative orthogonal block structure: a general condition for commutativity
Abstract:
A linear mixed model whose variance-covariance matrix is a linear combination of known pairwise orthogonal projection matrices that add up to the identity matrix is a model with orthogonal block structure (OBS). OBS have estimators with good behavior for estimable vectors and variance components; moreover, it may be desirable that the least-squares estimators be best linear unbiased estimators for estimable vectors. We can achieve this by requiring commutativity between the orthogonal projection matrix on the space spanned by the mean vector and the orthogonal projection matrices involved in the expression of the variance-covariance matrix. This commutativity condition defines a more restricted class of OBS, named COBS (models with commutative orthogonal block structure). With this work we aim to present a commutativity condition, resorting to a special class of matrices named U-matrices.
Journal: Journal of Applied Statistics
Pages: 2421-2430
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1765322
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1765322
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2421-2430
Template-Type: ReDIF-Article 1.0
Author-Name: Nancy Flournoy
Author-X-Name-First: Nancy
Author-X-Name-Last: Flournoy
Author-Name: Assaf P. Oron
Author-X-Name-First: Assaf P.
Author-X-Name-Last: Oron
Title: Bias induced by adaptive dose-finding designs
Abstract:
There is a long literature on bias in maximum likelihood estimators. Here we demonstrate that adaptive dose-finding procedures (such as Continual Reassessment Methods, Up-and-Down and Interval Designs) themselves induce bias. In particular, with Bernoulli responses and dose assignments that depend on prior responses, we provide an explicit formula for the bias of observed response rates. We illustrate the patterns of bias for designs that aim to concentrate dose allocations around a target dose, which represents a specific quantile of a cumulative response-threshold distribution. For such designs, bias tends to be positive above the target dose and negative below it. To our knowledge, this property of dose-finding designs has not previously been recognized by design developers. We discuss the implications of this bias and suggest a simple shrinkage mitigation formula that improves estimation at doses away from the target.
Journal: Journal of Applied Statistics
Pages: 2431-2442
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1649375
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1649375
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2431-2442
Template-Type: ReDIF-Article 1.0
Author-Name: M. Jabbari Nooghabi
Author-X-Name-First: M.
Author-X-Name-Last: Jabbari Nooghabi
Title: Process capability indices in normal distribution with the presence of outliers
Abstract:
Process capability indices (PCIs) are useful measures for evaluating the performance and capability of a process when it is under control. Assuming the specification variable follows a normal distribution, several PCIs have been derived by researchers. Many scientists have also studied these indices when the data are contaminated with outliers, as well as in the homogeneous case. However, in almost all of these studies, the effect of outliers on the PCIs was evaluated nonparametrically, using robust methods. Here, a parametric model for outliers is considered, and PCIs based on this outliers model are introduced. These indices are then estimated using the maximum-likelihood and moment estimators of the unknown parameters of a normal distribution contaminated by outliers. Finally, the performance of these measures, as well as of their parametric and nonparametric estimators, is assessed through simulation studies and several numerical examples. The parametric estimators are seen to perform better than the nonparametric method.
Journal: Journal of Applied Statistics
Pages: 2443-2478
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1796934
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796934
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2443-2478
Template-Type: ReDIF-Article 1.0
Author-Name: Aboubacar Y. Touré
Author-X-Name-First: Aboubacar Y.
Author-X-Name-Last: Touré
Author-Name: Simplice Dossou-Gbété
Author-X-Name-First: Simplice
Author-X-Name-Last: Dossou-Gbété
Author-Name: Célestin C. Kokonendji
Author-X-Name-First: Célestin C.
Author-X-Name-Last: Kokonendji
Title: Asymptotic normality of the test statistics for the unified relative dispersion and relative variation indexes
Abstract:
Dispersion indexes with respect to the Poisson and binomial distributions are widely used to assess the conformity of the underlying distribution of an observed sample of counts with one or the other of these theoretical distributions. Recently, the exponential variation index has been proposed as an extension to nonnegative continuous data. This paper studies a unified definition of these indexes with respect to the relative variability of a nonnegative natural exponential family of distributions through its variance function. We establish the strong consistency of the plug-in estimators of the indexes as well as their asymptotic normality. Since the exact distributions of the estimators are not available in closed form, we consider tests of hypotheses relying on these estimators as test statistics, with their asymptotic distributions. Simulation studies globally suggest good behavior of these testing procedures. Applied examples are analysed, including lesser-known cases such as the negative binomial and inverse Gaussian, and improving on the very common case of the Poisson dispersion index. Concluding remarks are made, with suggestions of possible extensions.
Journal: Journal of Applied Statistics
Pages: 2479-2491
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1779193
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1779193
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2479-2491
Template-Type: ReDIF-Article 1.0
Author-Name: Essam A. Ahmed
Author-X-Name-First: Essam A.
Author-X-Name-Last: Ahmed
Author-Name: Ziyad Ali Alhussain
Author-X-Name-First: Ziyad
Author-X-Name-Last: Ali Alhussain
Author-Name: Mukhtar M. Salah
Author-X-Name-First: Mukhtar M.
Author-X-Name-Last: Salah
Author-Name: Hanan Haj Ahmed
Author-X-Name-First: Hanan
Author-X-Name-Last: Haj Ahmed
Author-Name: M. S. Eliwa
Author-X-Name-First: M. S.
Author-X-Name-Last: Eliwa
Title: Inference of progressively type-II censored competing risks data from Chen distribution with an application
Abstract:
In this paper, the estimation of the unknown parameters of the Chen distribution is considered under progressive Type-II censoring in the presence of competing failure causes. It is assumed that the latent failure causes have independent Chen distributions with a common shape parameter but different scale parameters. From a frequentist perspective, the maximum likelihood estimates of the parameters are obtained via the expectation–maximization (EM) algorithm. The expected Fisher information matrix based on the missing information principle is also computed. Using the obtained expected Fisher information matrix of the MLEs, asymptotic 95% confidence intervals for the parameters are constructed. We also apply bootstrap methods (Bootstrap-p and Bootstrap-t) to construct confidence intervals. From the Bayesian perspective, the Bayes estimates of the unknown parameters are computed by applying the Markov chain Monte Carlo (MCMC) procedure, and the average length and coverage rate of the credible intervals are also reported. The Bayes inference is based on the squared error, LINEX, and general entropy loss functions. The performance of the point estimators and confidence intervals is evaluated by a simulation study. Finally, a real-life example is considered for illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 2492-2524
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1815670
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815670
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2492-2524
Template-Type: ReDIF-Article 1.0
Author-Name: Lisandro J. Fermín
Author-X-Name-First: Lisandro J.
Author-X-Name-Last: Fermín
Author-Name: Jacques Lévy-Véhel
Author-X-Name-First: Jacques
Author-X-Name-Last: Lévy-Véhel
Title: Variability and singularity arising from a Piecewise-Deterministic Markov Process applied to model poor patient compliance in the multi-IV case
Abstract:
We propose a Piecewise-Deterministic Markov Process (PDMP) to model the drug concentration in the case of multiple intravenous-bolus (multi-IV) doses under poor patient adherence: the scheduled times and doses of drug administration are not respected by the patient, so drug administration follows a switching regime with random drug intake times. We study the randomness of the drug concentration and derive probability results on the stochastic dynamics using PDMP theory, focusing on two aspects of practical relevance: the variability of the concentration and the regularity of its stationary probability distribution. The main result shows that the regularity of the concentration is governed by a parameter which quantifies, in a precise way, the situations where drug intake times are too scarce relative to the elimination rate. Our approach is novel for the study of the regularity of the stationary distribution in PDMP models. This article extends the results given in [J. Lévy-Véhel and P.E. Lévy-Véhel, Variability and singularity arising from poor compliance in a pharmacodynamical model I: The multi-IV case, J. Pharmacokinet. Pharmacodyn. 40 (2013), pp. 15–39] by considering more realistic irregular dosing schedules. The computations permit precise assessment of the effect of various significant parameters, such as the mean rate of intake, the elimination rate, and the mean dose. They quantify how much poor adherence affects the regimen. Our results help to understand the consequences of poor adherence.
Journal: Journal of Applied Statistics
Pages: 2525-2545
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1711030
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1711030
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2525-2545
Template-Type: ReDIF-Article 1.0
Author-Name: Isabel Silva
Author-X-Name-First: Isabel
Author-X-Name-Last: Silva
Author-Name: Maria Eduarda Silva
Author-X-Name-First: Maria
Author-X-Name-Last: Eduarda Silva
Author-Name: Cristina Torres
Author-X-Name-First: Cristina
Author-X-Name-Last: Torres
Title: Inference for bivariate integer-valued moving average models based on binomial thinning operation
Abstract:
Time series of (small) counts are common in practice and appear in a wide variety of fields. In the last three decades, several models that explicitly account for the discreteness of the data have been proposed in the literature. However, for multivariate time series of counts several difficulties arise and the literature is not so detailed. This work considers Bivariate INteger-valued Moving Average, BINMA, models based on the binomial thinning operation. The main probabilistic and statistical properties of BINMA models are studied. Two parametric cases are analysed, one with the cross-correlation generated through a Bivariate Poisson innovation process and another with a Bivariate Negative Binomial innovation process. Moreover, parameter estimation is carried out by the Generalized Method of Moments. The performance of the model is illustrated with synthetic data as well as with real datasets.
Journal: Journal of Applied Statistics
Pages: 2546-2564
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1747411
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1747411
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2546-2564
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Cullan
Author-X-Name-First: Michael
Author-X-Name-Last: Cullan
Author-Name: Scott Lidgard
Author-X-Name-First: Scott
Author-X-Name-Last: Lidgard
Author-Name: Beckett Sterner
Author-X-Name-First: Beckett
Author-X-Name-Last: Sterner
Title: Controlling the error probabilities of model selection information criteria using bootstrapping
Abstract:
The Akaike Information Criterion (AIC) and related information criteria are powerful and increasingly popular tools for comparing multiple, non-nested models without the specification of a null model. However, existing procedures for information-theoretic model selection do not provide explicit and uniform control over error rates for the choice between models, a key feature of classical hypothesis testing. We show how to extend notions of Type-I and Type-II error to more than two models without requiring a null. We then present the Error Control for Information Criteria (ECIC) method, a bootstrap approach to controlling Type-I error using Difference of Goodness of Fit (DGOF) distributions. We apply ECIC to empirical and simulated data in time series and regression contexts to illustrate its value for parametric Neyman–Pearson classification. An R package implementing the bootstrap method is publicly available.
Journal: Journal of Applied Statistics
Pages: 2565-2581
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1701636
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1701636
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2565-2581
Template-Type: ReDIF-Article 1.0
Author-Name: Benjamin Riedle
Author-X-Name-First: Benjamin
Author-X-Name-Last: Riedle
Author-Name: Andrew A. Neath
Author-X-Name-First: Andrew A.
Author-X-Name-Last: Neath
Author-Name: Joseph E. Cavanaugh
Author-X-Name-First: Joseph E.
Author-X-Name-Last: Cavanaugh
Title: Reconceptualizing the p-value from a likelihood ratio test: a probabilistic pairwise comparison of models based on Kullback-Leibler discrepancy measures
Abstract:
Discrepancy measures are often employed in problems involving the selection and assessment of statistical models. A discrepancy gauges the separation between a fitted candidate model and the underlying generating model. In this work, we consider pairwise comparisons of fitted models based on a probabilistic evaluation of the ordering of the constituent discrepancies. An estimator of the probability is derived using the bootstrap. In the framework of hypothesis testing, nested models are often compared on the basis of the p-value. Specifically, the simpler null model is favored unless the p-value is sufficiently small, in which case the null model is rejected and the more general alternative model is retained. Using suitably defined discrepancy measures, we mathematically show that, in general settings, the likelihood ratio test p-value is approximated by the bootstrapped discrepancy comparison probability (BDCP). We argue that the connection between the p-value and the BDCP leads to potentially new insights regarding the utility and limitations of the p-value. The BDCP framework also facilitates discrepancy-based inferences in settings beyond the limited confines of nested model hypothesis testing.
Journal: Journal of Applied Statistics
Pages: 2582-2609
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1754360
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1754360
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2582-2609
Template-Type: ReDIF-Article 1.0
Author-Name: Ya. Yu. Nikitin
Author-X-Name-First: Ya. Yu.
Author-X-Name-Last: Nikitin
Author-Name: I. A. Ragozin
Author-X-Name-First: I. A.
Author-X-Name-Last: Ragozin
Title: Goodness-of-fit tests for the logistic location family
Abstract:
We construct two U-empirical tests for the logistic location family which are based on an appropriate characterization of this family using independent exponential shifts. We study the limiting distributions and local Bahadur efficiency of the corresponding test statistics under close alternatives. It turns out that the present tests are considerably more efficient than recently proposed similar tests based on another characterization. The efficiency calculations are accompanied by simulations of the power of the new tests together with the previous ones. Both the efficiency and the power turn out to be very high. Finally, we consider the application of our tests to a real data example.
Journal: Journal of Applied Statistics
Pages: 2610-2622
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1761952
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1761952
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2610-2622
Template-Type: ReDIF-Article 1.0
Author-Name: M. Li
Author-X-Name-First: M.
Author-X-Name-Last: Li
Author-Name: Q. Guo
Author-X-Name-First: Q.
Author-X-Name-Last: Guo
Author-Name: W. J. Zhai
Author-X-Name-First: W. J.
Author-X-Name-Last: Zhai
Author-Name: B. Z. Chen
Author-X-Name-First: B. Z.
Author-X-Name-Last: Chen
Title: The linearized alternating direction method of multipliers for low-rank and fused LASSO matrix regression model
Abstract:
Datasets in matrix and vector form are increasingly common in modern scientific fields. Based on the structure of such datasets, both matrix and vector coefficients need to be estimated. The matrix regression models proposed to date have mainly focused on matrix variables without vector variables. In order to fully explore the complex structure of such datasets, we propose a novel matrix regression model that combines the fused LASSO and nuclear norm penalties, and can thus handle data containing matrix and vector variables simultaneously. Our main contribution is an efficient algorithm for solving the proposed low-rank and fused LASSO matrix regression model. Following an existing idea, we design a linearized alternating direction method of multipliers and establish its global convergence. Finally, we carry out numerical experiments to demonstrate the efficiency of our method. In particular, we apply our model to two real datasets, i.e. signal shapes and trip time prediction from partial trajectories.
Journal: Journal of Applied Statistics
Pages: 2623-2640
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1742296
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1742296
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2623-2640
Template-Type: ReDIF-Article 1.0
Author-Name: Célia Nunes
Author-X-Name-First: Célia
Author-X-Name-Last: Nunes
Author-Name: Elsa Moreira
Author-X-Name-First: Elsa
Author-X-Name-Last: Moreira
Author-Name: Sandra S. Ferreira
Author-X-Name-First: Sandra S.
Author-X-Name-Last: Ferreira
Author-Name: Dário Ferreira
Author-X-Name-First: Dário
Author-X-Name-Last: Ferreira
Author-Name: João T. Mexia
Author-X-Name-First: João T.
Author-X-Name-Last: Mexia
Title: Considering the sample sizes as truncated Poisson random variables in mixed effects models
Abstract:
When applying analysis of variance, the sample sizes may not be previously known, so it is more appropriate to consider them as realizations of random variables. A motivating example is the collection of observations during a fixed time span in a study comparing, for example, several pathologies of patients arriving at a hospital. This paper extends the theory of analysis of variance to those situations considering mixed effects models. We will assume that the occurrences of observations correspond to a counting process and the sample dimensions have Poisson distribution. The proposed approach is applied to a study of cancer patients.
Journal: Journal of Applied Statistics
Pages: 2641-2657
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1641188
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1641188
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2641-2657
Template-Type: ReDIF-Article 1.0
Author-Name: Mario Giacomazzo
Author-X-Name-First: Mario
Author-X-Name-Last: Giacomazzo
Author-Name: Yiannis Kamarianakis
Author-X-Name-First: Yiannis
Author-X-Name-Last: Kamarianakis
Title: Bayesian estimation of subset threshold autoregressions: short-term forecasting of traffic occupancy
Abstract:
Traffic management authorities in metropolitan areas use real-time systems that analyze high-frequency measurements from fixed sensors, to perform short-term forecasting and incident detection for various locations of a road network. Published research over the last 20 years focused primarily on modeling and forecasting of traffic volumes and speeds. Traffic occupancy approximates vehicular density through the percentage of time a sensor detects a vehicle within a pre-specified time interval. It exhibits weekly periodic patterns and heteroskedasticity and has been used as a metric for characterizing traffic regimes (e.g. free flow, congestion). This article presents a Bayesian three-step model building procedure for parsimonious estimation of Threshold-Autoregressive (TAR) models, designed for location- day- and horizon-specific forecasting of traffic occupancy. In the first step, multiple regime TAR models reformulated as high-dimensional linear regressions are estimated using Bayesian horseshoe priors. Next, significant regimes are identified through a forward selection algorithm based on Kullback-Leibler (KL) distances between the posterior predictive distribution of the full reference model and TAR models with fewer regimes. Given the regimes, the forward selection algorithm can be implemented again to select significant autoregressive terms. In addition to forecasting, the proposed specification and model-building scheme, may assist in determining location-specific congestion thresholds and associations between traffic dynamics observed in different regions of a network. Empirical results applied to data from a traffic forecasting competition, illustrate the efficacy of the proposed procedures in obtaining interpretable models and in producing satisfactory point and density forecasts at multiple horizons.
Journal: Journal of Applied Statistics
Pages: 2658-2689
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1801606
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1801606
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2658-2689
Template-Type: ReDIF-Article 1.0
Author-Name: Jaime Arrué
Author-X-Name-First: Jaime
Author-X-Name-Last: Arrué
Author-Name: Reinaldo B. Arellano-Valle
Author-X-Name-First: Reinaldo B.
Author-X-Name-Last: Arellano-Valle
Author-Name: Héctor W. Gómez
Author-X-Name-First: Héctor W.
Author-X-Name-Last: Gómez
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Title: On a new type of Birnbaum-Saunders models and its inference and application to fatigue data
Abstract:
The Birnbaum-Saunders distribution is a widely studied model with diverse applications. Its origins are in the modeling of lifetimes associated with material fatigue. By using a motivating example, we show that, even when lifetime data related to fatigue are modeled, the Birnbaum-Saunders distribution can be unsuitable to fit these data in the distribution tails. Based on the nice properties of the Birnbaum-Saunders model, in this work, we use a modified skew-normal distribution to construct such a model. This allows us to obtain flexibility in skewness and kurtosis, which is controlled by a shape parameter. We provide a mathematical characterization of this new type of Birnbaum-Saunders distribution and then its statistical characterization is derived by using the maximum-likelihood method, including the associated information matrices. In order to improve the inferential performance, we correct the bias of the corresponding estimators, which is supported by a simulation study. To conclude our investigation, we retake the motivating example based on fatigue life data to show the good agreement between the new type of Birnbaum-Saunders distribution proposed in this work and the data, reporting its potential applications.
Journal: Journal of Applied Statistics
Pages: 2690-2710
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1668365
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1668365
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2690-2710
Template-Type: ReDIF-Article 1.0
Author-Name: Rafael Romero
Author-X-Name-First: Rafael
Author-X-Name-Last: Romero
Author-Name: Jose M. Pavía
Author-X-Name-First: Jose M.
Author-X-Name-Last: Pavía
Author-Name: Jorge Martín
Author-X-Name-First: Jorge
Author-X-Name-Last: Martín
Author-Name: Gerardo Romero
Author-X-Name-First: Gerardo
Author-X-Name-Last: Romero
Title: Assessing uncertainty of voter transitions estimated from aggregated data. Application to the 2017 French presidential election
Abstract:
Inferring electoral individual behaviour from aggregated data is a very active research area, with ramifications in sociology and political science. A new approach based on linear programming is proposed to estimate voter transitions among parties (or candidates) between two elections. Compared to other linear and quadratic programming models previously published, our approach presents two important innovations. Firstly, it explicitly deals with new entries and exits in the election census without assuming unrealistic hypotheses, enabling a reasonable estimation of vote behaviour of young electors voting for the first time. Secondly, by exploiting the information contained in the model residuals, we develop a procedure to assess the uncertainty in the estimates. This significantly distinguishes our model from other published mathematical programming methods. The method is illustrated estimating the vote transfer matrix between the first and second rounds of the 2017 French presidential election and measuring its level of uncertainty. Likewise, compared to the most current alternatives based on ecological regression, our approach is considerably simpler and faster, and has provided reasonable results in all the actual elections to which it has been applied. Interested scholars can easily use our procedure with the aid of the R-function provided in the Supplemental Material.
Journal: Journal of Applied Statistics
Pages: 2711-2736
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1804842
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1804842
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2711-2736
Template-Type: ReDIF-Article 1.0
Author-Name: Sandra Oliveira
Author-X-Name-First: Sandra
Author-X-Name-Last: Oliveira
Author-Name: Célia Nunes
Author-X-Name-First: Célia
Author-X-Name-Last: Nunes
Author-Name: Elsa Moreira
Author-X-Name-First: Elsa
Author-X-Name-Last: Moreira
Author-Name: Miguel Fonseca
Author-X-Name-First: Miguel
Author-X-Name-Last: Fonseca
Author-Name: João T. Mexia
Author-X-Name-First: João T.
Author-X-Name-Last: Mexia
Title: Balanced prime basis factorial fixed effects model with random number of observations
Abstract:
Factorial designs are in general more efficient for experiments that involve the study of the effects of two or more factors. In this paper we consider a $p^U$ factorial model with U factors, each having a prime number p of levels. We consider a balanced (r replicates per treatment) prime factorial with fixed effects. Our goal is to extend these models to the case where the number of treatment replicates, r, cannot be known in advance. In these situations it is more appropriate to consider r as a realization of a random variable R, which will be assumed to be geometrically distributed. The proposed approach is illustrated through an application with simulated data.
Journal: Journal of Applied Statistics
Pages: 2737-2748
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1679097
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1679097
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2737-2748
Template-Type: ReDIF-Article 1.0
Author-Name: Viktor Witkovský
Author-X-Name-First: Viktor
Author-X-Name-Last: Witkovský
Title: Computing the exact distribution of the Bartlett's test statistic by numerical inversion of its characteristic function
Abstract:
Application of exact statistical inference frequently leads to non-standard probability distributions of the estimators or test statistics considered. The exact distributions of many estimators and test statistics can be specified by their characteristic functions, as is the case for the null distribution of Bartlett's test statistic. However, analytical inversion of the characteristic function, if possible at all, frequently leads to complicated expressions for computing the distribution function and the corresponding quantiles. An efficient alternative is the well-known method based on numerical inversion of the characteristic function, which is, however, ignored in popular statistical software packages. In this paper, we present the explicit characteristic function of the corrected Bartlett's test statistic, together with a computationally fast and efficient implementation of the numerical inversion approach, suggested for evaluating the exact null distribution used for testing homogeneity of variances in several normal populations with possibly unequal sample sizes.
Journal: Journal of Applied Statistics
Pages: 2749-2764
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1675608
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1675608
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2749-2764
Template-Type: ReDIF-Article 1.0
Author-Name: F. J. Marques
Author-X-Name-First: F. J.
Author-X-Name-Last: Marques
Author-Name: C. A. Coelho
Author-X-Name-First: C. A.
Author-X-Name-Last: Coelho
Title: Testing simultaneously different covariance block diagonal structures – the multi-sample case
Abstract:
In this work, a likelihood ratio test is developed which allows one to test simultaneously whether several covariance matrices are equal and block diagonal, with different specific structures in the diagonal blocks. The distribution of the likelihood ratio statistic is studied and the expression for its hth null moment is derived. In order to make the test useful in practical terms, near-exact approximations are developed for the likelihood ratio statistic. A practical application to a real data set, together with numerical studies and simulations, is provided in order to illustrate the applicability of the test and to assess the precision of the near-exact approximations developed.
Journal: Journal of Applied Statistics
Pages: 2765-2784
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1712590
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1712590
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2765-2784
Template-Type: ReDIF-Article 1.0
Author-Name: Najme Sharifipanah
Author-X-Name-First: Najme
Author-X-Name-Last: Sharifipanah
Author-Name: Rahim Chinipardaz
Author-X-Name-First: Rahim
Author-X-Name-Last: Chinipardaz
Author-Name: Gholam Ali Parham
Author-X-Name-First: Gholam Ali
Author-X-Name-Last: Parham
Title: A new class of weighted bimodal distribution with application to gamma-ray burst duration data
Abstract:
Gamma-ray bursts (GRBs) have been confidently identified and are ascribed to different physical scenarios: black hole mergers and the collapse of massive stars. The distribution of GRB duration, one of the main characteristics of GRBs, is bimodal. Hence, many authors have used mixture distribution models to fit it, which suffer from serious estimation problems under both classical and Bayesian approaches. Therefore, in this article we introduce a more flexible class of weighted bimodal distributions, called alpha two-piece skew normal (ATPSN), for modeling the GRB duration data set. Some of the main probabilistic and inferential properties of the distribution are discussed, and a simulation study is carried out to illustrate the performance of the MLEs.
Journal: Journal of Applied Statistics
Pages: 2785-2807
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1815669
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2785-2807
Template-Type: ReDIF-Article 1.0
Author-Name: P. Economou
Author-X-Name-First: P.
Author-X-Name-Last: Economou
Author-Name: G. Tzavelas
Author-X-Name-First: G.
Author-X-Name-Last: Tzavelas
Author-Name: A. Batsidis
Author-X-Name-First: A.
Author-X-Name-Last: Batsidis
Title: Robust inference under r-size-biased sampling without replacement from finite population
Abstract:
The case of size-biased sampling of known order from a finite population without replacement is considered. The behavior of such a sampling scheme is studied with respect to the sampling fraction. Based on a simulation study, it is concluded that such a sample can be treated neither as a random sample from the parent distribution nor as a random sample from the corresponding r-size weighted distribution, and that, as the sampling fraction increases, the bias in the sample decreases, resulting in a transition from an r-size-biased sample to a random sample. A modified version of a likelihood-free method is adopted for making statistical inference about the unknown population parameters, as well as about the size of the population when it is unknown. A simulation study, which takes the sampling fraction into consideration, demonstrates that the proposed method exhibits better and more robust behavior than the approaches that treat the r-size-biased sample either as a random sample from the parent distribution or as a random sample from the corresponding r-size weighted distribution. Finally, a numerical example which motivated this study illustrates our results.
Journal: Journal of Applied Statistics
Pages: 2808-2824
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1711031
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1711031
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2808-2824
Template-Type: ReDIF-Article 1.0
Author-Name: Helena Penalva
Author-X-Name-First: Helena
Author-X-Name-Last: Penalva
Author-Name: M. Ivette Gomes
Author-X-Name-First: M. Ivette
Author-X-Name-Last: Gomes
Author-Name: Frederico Caeiro
Author-X-Name-First: Frederico
Author-X-Name-Last: Caeiro
Author-Name: M. Manuela Neves
Author-X-Name-First: M. Manuela
Author-X-Name-Last: Neves
Title: Lehmer's mean-of-order-p extreme value index estimation: a simulation study and applications
Abstract:
The main objective of extreme value theory is essentially the estimation of quantities related to extreme events. One of its main issues has been the estimation of the extreme value index (EVI), a parameter directly related to the tail weight of the distribution. Here we deal with the semi-parametric estimation of the EVI for heavy tails. A recent class of EVI-estimators, based on Lehmer's mean-of-order-p ($L_p$), which generalizes the arithmetic mean, is considered. An asymptotic comparison at optimal levels performed in previous works has revealed the competitiveness of this class of EVI-estimators. A large-scale Monte Carlo simulation study for finite simulated samples is performed here, showing the behaviour of $L_p$ as a function of p. A bootstrap adaptive choice of $(k,p)$, where k is the number of upper order statistics used in the estimation, and a second algorithm based on a stability criterion are computationally studied and applied to simulated and real data.
Journal: Journal of Applied Statistics
Pages: 2825-2845
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2019.1694871
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1694871
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2825-2845
Template-Type: ReDIF-Article 1.0
Author-Name: D. Prata Gomes
Author-X-Name-First: D. Prata
Author-X-Name-Last: Gomes
Author-Name: M. Manuela Neves
Author-X-Name-First: M. Manuela
Author-X-Name-Last: Neves
Title: Extremal index blocks estimator: the threshold and the block size choice
Abstract:
The main objective of Statistics of Extremes is the estimation of probabilities of rare events. When the analysis of the limiting behaviour of extreme values is extended from independent and identically distributed sequences to stationary sequences, a key parameter appears: the extremal index θ, whose accurate estimation is not easy. Here we focus on the estimation of θ using blocks estimators, which can be constructed using disjoint or sliding blocks. The asymptotic properties of both procedures have been studied and compared, but both blocks estimators require the choice of a threshold and a block length. Some criteria for the choice of those nuisance quantities have appeared, but further research is still needed. We show how the threshold and block size choices can affect the estimates. However, the main objective of this work is to revisit another estimation procedure that depends only on the block length, although some conditions on the underlying process need to be verified. The associated estimator presents nice asymptotic properties, and for finite samples a stability criterion for choosing the block length, and hence obtaining the θ estimate, is illustrated here. A large simulation study has been performed, and an application is made to daily mean flow discharge rates at the hydrometric station of Fragas da Torre on the river Paiva, with data collected from 1 October 1946 to 30 April 2012.
Journal: Journal of Applied Statistics
Pages: 2846-2861
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1720626
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1720626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2846-2861
Template-Type: ReDIF-Article 1.0
Author-Name: A. Tchorbadjieff
Author-X-Name-First: A.
Author-X-Name-Last: Tchorbadjieff
Author-Name: P. Mayster
Author-X-Name-First: P.
Author-X-Name-Last: Mayster
Title: Models induced from critical birth–death process with random initial conditions
Abstract:
In this work, we study a linear birth–death process starting from random initial conditions. First, we consider these initial conditions as a random number of particles following different standard probabilistic distributions – Negative-Binomial and its closest Geometric, Poisson or Pólya–Aeppli distributions. It is proved analytically and numerically that in these cases the random number of particles alive at any positive time follows the same probability law as the initial condition, but with different time-dependent parameters. The random initial conditions cannot change the critical parameter of the branching mechanism, but they impact the extinction probability. Finally, the numerical model is extended to an application studying branching processes with more complex initial conditions. This is demonstrated with a linear birth–death process initialised with a Pólya urn sampling scheme. The preliminary results obtained for the particle distribution show a close relation to the Pólya–Aeppli distribution.
Journal: Journal of Applied Statistics
Pages: 2862-2878
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1732309
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1732309
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2862-2878
Template-Type: ReDIF-Article 1.0
Author-Name: Eliana Costa e Silva
Author-X-Name-First: Eliana
Author-X-Name-Last: Costa e Silva
Author-Name: Isabel Cristina Lopes
Author-X-Name-First: Isabel Cristina
Author-X-Name-Last: Lopes
Author-Name: Aldina Correia
Author-X-Name-First: Aldina
Author-X-Name-Last: Correia
Author-Name: Susana Faria
Author-X-Name-First: Susana
Author-X-Name-Last: Faria
Title: A logistic regression model for consumer default risk
Abstract:
In this study, a logistic regression model is applied to credit scoring data from a given Portuguese financial institution to evaluate the default risk of consumer loans. It was found that the risk of default increases with the loan spread, loan term and age of the customer, but decreases if the customer owns more credit cards. Clients receiving their salary in the same banking institution as the loan are less likely to default than clients receiving their salary in another institution. We also found that clients in the lowest income tax echelon have a higher propensity to default. The model predicted default correctly in 89.79% of the cases.
Journal: Journal of Applied Statistics
Pages: 2879-2894
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1759030
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1759030
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2879-2894
Template-Type: ReDIF-Article 1.0
Author-Name: Olusola Samuel Makinde
Author-X-Name-First: Olusola
Author-X-Name-Last: Samuel Makinde
Title: On rank distribution classifiers for high-dimensional data
Abstract:
Spatial sign and rank-based methods have been studied in the recent literature, especially when the dimension is smaller than the sample size. In this paper, a classification method based on the distribution of rank functions for high-dimensional data is considered, with an extension to functional data. The method is fully nonparametric in nature. The performance of the classification method is illustrated in comparison with some other classifiers using simulated and real data sets. Supporting R code is provided for the computational implementation of the classification method and will be of use to others.
Journal: Journal of Applied Statistics
Pages: 2895-2911
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1768227
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1768227
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2895-2911
Template-Type: ReDIF-Article 1.0
Author-Name: A. N. Oliveira
Author-X-Name-First: A. N.
Author-X-Name-Last: Oliveira
Author-Name: R. Menezes
Author-X-Name-First: R.
Author-X-Name-Last: Menezes
Author-Name: S. Faria
Author-X-Name-First: S.
Author-X-Name-Last: Faria
Author-Name: P. Afonso
Author-X-Name-First: P.
Author-X-Name-Last: Afonso
Title: Mixed-effects modelling for crossed and nested data: an analysis of dengue fever in the state of Goiás, Brazil
Abstract:
Dengue fever is a viral disease transmitted by the mosquito Aedes aegypti. In order to avoid epidemics and deaths, this transmitting vector must be controlled. This work assembles, for the first time, data from multiple governmental bodies describing the number of dengue cases reported, and meteorological conditions, in 20 cities in the Goiás state, Brazil, from 2008 to 2015. We then apply generalised linear mixed modelling to this novel data set to model dengue occurrences in this state, where the tropical climate favours the proliferation of the main transmitting vector of this disease. The number of reported dengue cases is estimated using meteorological variables as fixed effects, and city and year data are included in the model as random effects. The proposed models can cope with complex data structures, such as nested data, while taking into account the particularities of each year depending on the city under analysis. The results confirm that precipitation, minimum temperature, and relative air humidity contribute to an increase in the number of dengue cases, while year and city location are determining factors. Public policies, based on these new results, together with joint actions involving local populations, are essential to combat the vector transmitting dengue and avoid epidemics.
Journal: Journal of Applied Statistics
Pages: 2912-2926
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1736528
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1736528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2912-2926
Template-Type: ReDIF-Article 1.0
Author-Name: Isabel Silva
Author-X-Name-First: Isabel
Author-X-Name-Last: Silva
Author-Name: Hugo Alonso
Author-X-Name-First: Hugo
Author-X-Name-Last: Alonso
Title: New developments in the forecasting of monthly overnight stays in the North Region of Portugal
Abstract:
The Tourism sector is of strategic importance to the North Region of Portugal and is growing. Forecasting monthly overnight stays in this region is, therefore, a relevant problem. In this paper, we analyze data more recent than those considered in previous studies and use them to develop and compare several forecasting models and methods. We conclude that the best results are achieved by models based on a non-parametric approach not considered so far for these data, the singular spectrum analysis.
Journal: Journal of Applied Statistics
Pages: 2927-2940
Issue: 13-15
Volume: 47
Year: 2020
Month: 11
X-DOI: 10.1080/02664763.2020.1795812
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795812
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:13-15:p:2927-2940
Template-Type: ReDIF-Article 1.0
Author-Name: Junhui Yin
Author-X-Name-First: Junhui
Author-X-Name-Last: Yin
Author-Name: Liucang Wu
Author-X-Name-First: Liucang
Author-X-Name-Last: Wu
Author-Name: Lin Dai
Author-X-Name-First: Lin
Author-X-Name-Last: Dai
Title: Variable selection in finite mixture of regression models using the skew-normal distribution
Abstract:
Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the parameters of the model, a modified EM algorithm for numerical computations is developed. The methodology is illustrated through numerical experiments and a real data example.
Journal: Journal of Applied Statistics
Pages: 2941-2960
Issue: 16
Volume: 47
Year: 2020
Month: 12
X-DOI: 10.1080/02664763.2019.1709051
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709051
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:16:p:2941-2960
Template-Type: ReDIF-Article 1.0
Author-Name: Zs. T. Kosztyán
Author-X-Name-First: Zs. T.
Author-X-Name-Last: Kosztyán
Author-Name: É. Orbán-Mihálykó
Author-X-Name-First: É.
Author-X-Name-Last: Orbán-Mihálykó
Author-Name: Cs. Mihálykó
Author-X-Name-First: Cs.
Author-X-Name-Last: Mihálykó
Author-Name: V. V. Csányi
Author-X-Name-First: V. V.
Author-X-Name-Last: Csányi
Author-Name: A. Telcs
Author-X-Name-First: A.
Author-X-Name-Last: Telcs
Title: Analyzing and clustering students' application preferences in higher education
Abstract:
We present a framework based on a higher education application preference list that allows a different type of flexible aggregation and, hence, the analysis and clustering of application data. Preference lists are converted into scores. The proposed approach is demonstrated in the context of higher education applications in Hungary over the period of 2006–2015. Our method reveals that efforts to leverage center-periphery differences do not fulfill expectations. Furthermore, the student's top preference is very hard to influence, and recruiters may build their strategy on information about the first and second choices.
Journal: Journal of Applied Statistics
Pages: 2961-2983
Issue: 16
Volume: 47
Year: 2020
Month: 12
X-DOI: 10.1080/02664763.2019.1709052
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709052
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:16:p:2961-2983
Template-Type: ReDIF-Article 1.0
Author-Name: J. van Doorn
Author-X-Name-First: J.
Author-X-Name-Last: van Doorn
Author-Name: A. Ly
Author-X-Name-First: A.
Author-X-Name-Last: Ly
Author-Name: M. Marsman
Author-X-Name-First: M.
Author-X-Name-Last: Marsman
Author-Name: E.-J. Wagenmakers
Author-X-Name-First: E.-J.
Author-X-Name-Last: Wagenmakers
Title: Bayesian rank-based hypothesis testing for the rank sum test, the signed rank test, and Spearman's ρ
Abstract:
Bayesian inference for rank-order problems is frustrated by the absence of an explicit likelihood function. This hurdle can be overcome by assuming a latent normal representation that is consistent with the ordinal information in the data: the observed ranks are conceptualized as an impoverished reflection of an underlying continuous scale, and inference concerns the parameters that govern the latent representation. We apply this generic data-augmentation method to obtain Bayes factors for three popular rank-based tests: the rank sum test, the signed rank test, and Spearman's $\rho_s$.
Journal: Journal of Applied Statistics
Pages: 2984-3006
Issue: 16
Volume: 47
Year: 2020
Month: 12
X-DOI: 10.1080/02664763.2019.1709053
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709053
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:16:p:2984-3006
Template-Type: ReDIF-Article 1.0
Author-Name: Farzane Hashemi
Author-X-Name-First: Farzane
Author-X-Name-Last: Hashemi
Author-Name: Mehrdad Naderi
Author-X-Name-First: Mehrdad
Author-X-Name-Last: Naderi
Author-Name: Ahad Jamalizadeh
Author-X-Name-First: Ahad
Author-X-Name-Last: Jamalizadeh
Author-Name: Tsung-I Lin
Author-X-Name-First: Tsung-I
Author-X-Name-Last: Lin
Title: A skew factor analysis model based on the normal mean–variance mixture of Birnbaum–Saunders distribution
Abstract:
This paper presents a robust extension of the factor analysis model by assuming the multivariate normal mean–variance mixture of Birnbaum–Saunders distribution for the unobservable factors and errors. A computationally analytical EM-based algorithm is developed to find maximum likelihood estimates of the parameters. The asymptotic standard errors of parameter estimates are derived under an information-based paradigm. Numerical merits of the proposed methodology are illustrated using both simulated and real datasets.
Journal: Journal of Applied Statistics
Pages: 3007-3029
Issue: 16
Volume: 47
Year: 2020
Month: 12
X-DOI: 10.1080/02664763.2019.1709054
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709054
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:16:p:3007-3029
Template-Type: ReDIF-Article 1.0
Author-Name: Mohammad Mahdi Maghami
Author-X-Name-First: Mohammad Mahdi
Author-X-Name-Last: Maghami
Author-Name: Mohammad Bahrami
Author-X-Name-First: Mohammad
Author-X-Name-Last: Bahrami
Author-Name: Farkhondeh Alsadat Sajadi
Author-X-Name-First: Farkhondeh Alsadat
Author-X-Name-Last: Sajadi
Title: On bias reduction estimators of skew-normal and skew-t distributions
Abstract:
A particular concern of researchers in statistical inference is bias in parameter estimation. Maximum likelihood estimators are often biased, and for small sample sizes their first-order bias can be large, which may affect the efficiency of the estimator. There are different methods for reducing this bias. In this paper, we propose a modified maximum likelihood estimator for the shape parameter of two popular skew distributions, namely skew-normal and skew-t, by offering a new method. We show that this estimator has lower asymptotic bias than the maximum likelihood estimator and is more efficient than those based on the existing methods.
Journal: Journal of Applied Statistics
Pages: 3030-3052
Issue: 16
Volume: 47
Year: 2020
Month: 12
X-DOI: 10.1080/02664763.2019.1710114
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1710114
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:47:y:2020:i:16:p:3030-3052
Template-Type: ReDIF-Article 1.0
Author-Name: Jie Chen
Author-X-Name-First: Jie
Author-X-Name-Last: Chen
Title: Editorial
Journal: Journal of Applied Statistics
Pages: 1-3
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1830613
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1830613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:1-3
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Shujaat Nawaz
Author-X-Name-First: Muhammad Shujaat
Author-X-Name-Last: Nawaz
Author-Name: Muhammad Azam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Azam
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Title: EWMA and DEWMA repetitive control charts under non-normal processes
Abstract:
In this paper, we present a repetitive sampling method to construct control charts using exponentially weighted moving averages (EWMA) and double exponentially weighted moving averages (DEWMA) to monitor shifts in the process. For non-normal processes, the t-distribution with various degrees of freedom (i.e. df = 4, 10, 20, 40, 50) is used as a symmetric distribution, the gamma distribution with unit scale parameter and various shape parameters (i.e. 0.5, 1, 2, 3, 4) is used as a positively skewed distribution, and the Weibull distribution with unit scale parameter and various shape parameters (i.e. 10 and 20) is used as a negatively skewed distribution. We use Monte Carlo simulations to check whether the process is out of control. We use the average run length as a tool to assess the ability of the proposed control charts to identify a shift in a process earlier than other control charts currently used to monitor the same type of process. The proposed control charts are applied to two real datasets.
Journal: Journal of Applied Statistics
Pages: 4-40
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2019.1709809
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709809
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:4-40
Template-Type: ReDIF-Article 1.0
Author-Name: Xu Zhang
Author-X-Name-First: Xu
Author-X-Name-Last: Zhang
Author-Name: Sean Barnes
Author-X-Name-First: Sean
Author-X-Name-Last: Barnes
Author-Name: Bruce Golden
Author-X-Name-First: Bruce
Author-X-Name-Last: Golden
Author-Name: Paul Smith
Author-X-Name-First: Paul
Author-X-Name-Last: Smith
Title: A continuous-time Markov model for estimating readmission risk for hospital inpatients
Abstract:
Research concerning hospital readmissions has mostly focused on statistical and machine learning models that attempt to predict this unfortunate outcome for individual patients. These models are useful in certain settings, but their performance in many cases is insufficient for implementation in practice, and the dynamics of how readmission risk changes over time is often ignored. Our objective is to develop a model for aggregated readmission risk over time – using a continuous-time Markov chain – beginning at the point of discharge. We derive point and interval estimators for readmission risk, and find the asymptotic distributions for these probabilities. Finally, we validate our derived estimators using simulation, and apply our methods to estimate readmission risk over time using discharge and readmission data for surgical patients.
Journal: Journal of Applied Statistics
Pages: 41-60
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2019.1709810
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709810
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:41-60
Template-Type: ReDIF-Article 1.0
Author-Name: Rohan D. Koshti
Author-X-Name-First: Rohan D.
Author-X-Name-Last: Koshti
Author-Name: Kirtee K. Kamalja
Author-X-Name-First: Kirtee K.
Author-X-Name-Last: Kamalja
Title: Parameter estimation of Cambanis-type bivariate uniform distribution with Ranked Set Sampling
Abstract:
The concept of ranked set sampling (RSS) is applicable whenever ranking on a set of sampling units can be done easily using a judgment method or based on an auxiliary variable. In this paper, we consider a study variable $Y$ correlated with the auxiliary variable $X$, which is used to rank the sampling units. Further, $(X,Y)$ is assumed to have a Cambanis-type bivariate uniform (CTBU) distribution. We obtain an unbiased estimator of a scale parameter associated with the study variable $Y$ based on different RSS schemes. We compare the efficiency of the proposed estimators numerically. We present the trends in the efficiency performance of the estimators under various RSS schemes with respect to the parameters through line and surface plots. Further, we develop a Matlab function to simulate data from the CTBU distribution and present the performance of the proposed estimators through a simulation study. The results developed are also applied to real-life data.
Journal: Journal of Applied Statistics
Pages: 61-83
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2019.1709808
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1709808
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:61-83
Template-Type: ReDIF-Article 1.0
Author-Name: Philippe Gagnon
Author-X-Name-First: Philippe
Author-X-Name-Last: Gagnon
Author-Name: Mylène Bédard
Author-X-Name-First: Mylène
Author-X-Name-Last: Bédard
Author-Name: Alain Desgagné
Author-X-Name-First: Alain
Author-X-Name-Last: Desgagné
Title: An automatic robust Bayesian approach to principal component regression
Abstract:
Principal component regression uses principal components (PCs) as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature on Bayesian approaches is relatively sparse. We introduce a Bayesian approach that is robust to outliers in both the dependent variable and the covariates. Outliers can be thought of as observations that are not in line with the general trend. The proposed approach automatically penalises these observations so that their impact on the posterior gradually vanishes as they move further and further away from the general trend, corresponding to a concept in Bayesian statistics called whole robustness. The predictions produced are thus consistent with the bulk of the data. The approach also exploits the geometry of PCs to efficiently identify those that are significant. Individual predictions obtained from the resulting models are consolidated according to model-averaging mechanisms to account for model uncertainty. The approach is evaluated on real data and compared to its nonrobust Bayesian counterpart, the traditional frequentist approach and a commonly employed robust frequentist method. Detailed guidelines to automate the entire statistical procedure are provided. All required code is made available, see ArXiv:1711.06341.
Journal: Journal of Applied Statistics
Pages: 84-104
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2019.1710478
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1710478
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:84-104
Template-Type: ReDIF-Article 1.0
Author-Name: Zhihang Dong
Author-X-Name-First: Zhihang
Author-X-Name-Last: Dong
Author-Name: Yen-Chi Chen
Author-X-Name-First: Yen-Chi
Author-X-Name-Last: Chen
Author-Name: Adrian Dobra
Author-X-Name-First: Adrian
Author-X-Name-Last: Dobra
Title: A statistical framework for measuring the temporal stability of human mobility patterns
Abstract:
Despite the growing popularity of human mobility studies that collect GPS location data, the problem of determining the minimum required length of GPS monitoring has not been addressed in the current statistical literature. In this paper, we tackle this problem by laying out a theoretical framework for assessing the temporal stability of human mobility based on GPS location data. We define several measures of the temporal dynamics of human spatiotemporal trajectories based on the average velocity process, and on activity distributions in a spatial observation window. We demonstrate the use of our methods with data that comprise the GPS locations of 185 individuals over the course of 18 months. Our empirical results suggest that GPS monitoring should be performed over periods of time that are significantly longer than what has been previously suggested. Furthermore, we argue that GPS study designs should take into account demographic groups.
Journal: Journal of Applied Statistics
Pages: 105-123
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2019.1711363
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1711363
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:105-123
Template-Type: ReDIF-Article 1.0
Author-Name: Christophe Chesneau
Author-X-Name-First: Christophe
Author-X-Name-Last: Chesneau
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Author-Name: Tassaddaq Hussain
Author-X-Name-First: Tassaddaq
Author-X-Name-Last: Hussain
Author-Name: Bilal A. Para
Author-X-Name-First: Bilal A.
Author-X-Name-Last: Para
Title: The cosine geometric distribution with count data modeling
Abstract:
In this paper, a new two-parameter discrete distribution is introduced. It belongs to the family of the weighted geometric distribution (GD), with the feature of using a particular trigonometric weight. This configuration adds an oscillating property to the former GD, which can be helpful in analyzing data with over-dispersion, as developed in this study. First, we present the basic statistical properties of the new distribution, including the cumulative distribution function, hazard rate function and moment generating function. Estimation of the related model parameters is investigated using the maximum likelihood method. A simulation study is performed to illustrate the convergence of the estimators. Applications to two practical datasets are given to show that the new model performs at least as well as some competitors.
Journal: Journal of Applied Statistics
Pages: 124-137
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2019.1711364
File-URL: http://hdl.handle.net/10.1080/02664763.2019.1711364
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:124-137
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Aslam
Author-X-Name-First: Muhammad
Author-X-Name-Last: Aslam
Author-Name: Muhammad Ali Raza
Author-X-Name-First: Muhammad Ali
Author-X-Name-Last: Raza
Author-Name: Rehan Ahmad Khan Sherwani
Author-X-Name-First: Rehan Ahmad Khan
Author-X-Name-Last: Sherwani
Author-Name: Muhammad Farooq
Author-X-Name-First: Muhammad
Author-X-Name-Last: Farooq
Author-Name: Jun Yong Jeong
Author-X-Name-First: Jun Yong
Author-X-Name-Last: Jeong
Author-Name: Chi-Hyuck Jun
Author-X-Name-First: Chi-Hyuck
Author-X-Name-Last: Jun
Title: A mixed control chart for monitoring failure times under accelerated hybrid censoring
Abstract:
In an accelerated hybrid censoring scheme, several stress factors can be accelerated to make the products fail more quickly than under normal operating conditions. In such situations, the control charts available in the literature cover only attribute characteristics for monitoring the performance of the process over time. This study extends the idea by proposing an optimal mixed attribute-variable control chart for the Weibull distribution under an accelerated hybrid censoring scheme, keeping the advantages of both attribute and variable control charts. It first monitors the number of defectives under accelerated conditions and switches to the variable control chart to investigate the mean failure times when the process stability is dubious. The performance of the proposed chart is evaluated by using run-length characteristics, and the optimality of the design parameter is achieved by minimizing the out-of-control average run length. A simulation study showed better performance of the proposed control chart than the traditional charts in detecting shifts in the process. A real-life application is also included.
Journal: Journal of Applied Statistics
Pages: 138-153
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1713060
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1713060
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:138-153
Template-Type: ReDIF-Article 1.0
Author-Name: Chen Feng
Author-X-Name-First: Chen
Author-X-Name-Last: Feng
Author-Name: Paul Griffin
Author-X-Name-First: Paul
Author-X-Name-Last: Griffin
Author-Name: Shravan Kethireddy
Author-X-Name-First: Shravan
Author-X-Name-Last: Kethireddy
Author-Name: Yajun Mei
Author-X-Name-First: Yajun
Author-X-Name-Last: Mei
Title: A boosting inspired personalized threshold method for sepsis screening
Abstract:
Sepsis is one of the biggest risks to patient safety, with a natural mortality rate between 25% and 50%. It is difficult to diagnose, and no validated standard for diagnosis currently exists. A commonly used scoring criterion is the quick sequential organ failure assessment (qSOFA); however, it demonstrates very low specificity in ICU populations. We develop a method to personalize thresholds in qSOFA that incorporates easily measured patient baseline characteristics. We compare the personalized threshold method to qSOFA, to five previously published methods that obtain an optimal constant threshold for a single biomarker, and to machine learning algorithms based on logistic regression and AdaBoosting, using patient data in the MIMIC-III database. The personalized threshold method achieves higher accuracy than qSOFA and the five published methods and has comparable performance to the machine learning methods. Personalized thresholds, however, are much easier to adopt in real-life monitoring than machine learning methods, as they are computed once for a patient and used in the same way as qSOFA, whereas the machine learning methods are hard to implement and interpret.
Journal: Journal of Applied Statistics
Pages: 154-175
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1716695
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1716695
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:154-175
Template-Type: ReDIF-Article 1.0
Author-Name: Jan van den Broek
Author-X-Name-First: Jan
Author-X-Name-Last: van den Broek
Title: Modelling the reproductive power function
Abstract:
This paper discusses methods of estimating the reproductive power and the accompanying survival function of communicable events, e.g. infectious disease transmission. The early stage of an outbreak can be described by the infectiousness of the outbreak process, but in later stages of the outbreak, this is complicated by factors such as changing contact patterns and the impact of control measures. It is important to take these factors into account in order to get a good, if approximate, model for an outbreak process. This paper proposes a non-homogeneous birth process and regression model for the reproductive power function, similar to models in discrete survival analysis. A baseline reproductive power function gives a description of the outbreak when covariates are at their baseline values. As an illustration these methods are applied to an avian influenza (H5N1) outbreak among poultry in Thailand.
Journal: Journal of Applied Statistics
Pages: 176-190
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1716696
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1716696
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:176-190
Template-Type: ReDIF-Article 1.0
Author-Name: Hossein Hassani
Author-X-Name-First: Hossein
Author-X-Name-Last: Hassani
Author-Name: Mansi Ghodsi
Author-X-Name-First: Mansi
Author-X-Name-Last: Ghodsi
Author-Name: Xu Huang
Author-X-Name-First: Xu
Author-X-Name-Last: Huang
Author-Name: Emmanuel Sirimal Silva
Author-X-Name-First: Emmanuel Sirimal
Author-X-Name-Last: Silva
Title: Is there a causal relationship between oil prices and tourist arrivals?
Abstract:
This application note investigates the causal relationship between oil prices and tourist arrivals to further explain the impact of oil price volatility on tourism-related economic activities. The analysis considers the time domain, frequency domain and information theory domain perspectives. Data relating to the US and nine European countries are exploited in this paper with causality tests that include time domain, frequency domain, and Convergent Cross Mapping (CCM) approaches. The CCM approach is nonparametric and therefore not restricted by parametric assumptions. We contribute to existing research through the successful and introductory application of an advanced method, and via the uncovering of significant causal links from oil prices to tourist arrivals.
Journal: Journal of Applied Statistics
Pages: 191-202
Issue: 1
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1720625
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1720625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:1:p:191-202
Template-Type: ReDIF-Article 1.0
Author-Name: Patrick Borges
Author-X-Name-First: Patrick
Author-X-Name-Last: Borges
Title: Estimating the turning point of the log-logistic hazard function in the presence of long-term survivors with an application for uterine cervical cancer data
Abstract:
The hazard function plays an important role in cancer patient survival studies, as it quantifies the instantaneous risk of death of a patient at any given time. Often in cancer clinical trials, unimodal hazard functions are observed, and it is of interest to detect (estimate) the turning point (mode) of the hazard function, as this may be an important measure in treatment strategies for cancer patients. Moreover, when patient cure is a possibility, estimating cure rates at different stages of cancer, in addition to their proportions, may provide a better summary of the effects of stages on survival rates. Therefore, the main objective of this paper is to consider the problem of estimating the mode of the hazard function of patients at different stages of cervical cancer in the presence of long-term survivors. To this end, a mixture cure rate model is proposed using the log-logistic distribution. The model is conveniently parameterized through the mode of the hazard function, in which cancer stages can affect both the cured fraction and the mode. In addition, we discuss aspects of model inference through the maximum likelihood estimation method. A Monte Carlo simulation study assesses the coverage probability of asymptotic confidence intervals.
Journal: Journal of Applied Statistics
Pages: 203-213
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1720627
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1720627
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:203-213
Template-Type: ReDIF-Article 1.0
Author-Name: J. de Sousa
Author-X-Name-First: J.
Author-X-Name-Last: de Sousa
Author-Name: K. Hron
Author-X-Name-First: K.
Author-X-Name-Last: Hron
Author-Name: K. Fačevicová
Author-X-Name-First: K.
Author-X-Name-Last: Fačevicová
Author-Name: P. Filzmoser
Author-X-Name-First: P.
Author-X-Name-Last: Filzmoser
Title: Robust principal component analysis for compositional tables
Abstract:
A data table arranged according to two factors can often be considered a compositional table. An example is the number of unemployed people, split according to gender and age classes. Analyzed as compositions, the relevant information consists of ratios between different cells of such a table. This is particularly useful when analyzing several compositional tables jointly, where the absolute numbers are in very different ranges, e.g. if unemployment data are considered from different countries. Within the framework of the logratio methodology, compositional tables can be decomposed into independent and interactive parts, and orthonormal coordinates can be assigned to these parts. However, these coordinates usually require some prior knowledge about the data, and they are not easy to handle for exploring the relationships between the given factors. Here we propose a special choice of coordinates with a direct relation to centered logratio (clr) coefficients, which are particularly useful for an interpretation in terms of the original cells of the tables. With these coordinates, robust principal component analysis (rPCA) is performed for dimension reduction, allowing the relationships between the factors to be investigated. The link between orthonormal coordinates and clr coefficients makes it possible to apply rPCA, which would otherwise suffer from the singularity of clr coefficients.
Journal: Journal of Applied Statistics
Pages: 214-233
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1722078
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1722078
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:214-233
Template-Type: ReDIF-Article 1.0
Author-Name: Yunlu Jiang
Author-X-Name-First: Yunlu
Author-X-Name-Last: Jiang
Author-Name: Yan Wang
Author-X-Name-First: Yan
Author-X-Name-Last: Wang
Author-Name: Jiantao Zhang
Author-X-Name-First: Jiantao
Author-X-Name-Last: Zhang
Author-Name: Baojian Xie
Author-X-Name-First: Baojian
Author-X-Name-Last: Xie
Author-Name: Jibiao Liao
Author-X-Name-First: Jibiao
Author-X-Name-Last: Liao
Author-Name: Wenhui Liao
Author-X-Name-First: Wenhui
Author-X-Name-Last: Liao
Title: Outlier detection and robust variable selection via the penalized weighted LAD-LASSO method
Abstract:
This paper studies the outlier detection and robust variable selection problem in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to simultaneously achieve outlier detection and robust variable selection. An iterative algorithm is proposed to solve the resulting optimization problem. Monte Carlo studies are conducted to evaluate the finite-sample performance of the proposed methods. The results indicate that the proposed methods perform better than existing methods in finite samples when there are leverage points or outliers in the response variable or explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.
Journal: Journal of Applied Statistics
Pages: 234-246
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1722079
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1722079
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:234-246
Template-Type: ReDIF-Article 1.0
Author-Name: Sajid Ali
Author-X-Name-First: Sajid
Author-X-Name-Last: Ali
Title: Monitoring time and magnitude based on the renewal reward process with a random failure threshold
Abstract:
Control charts are effective tools to distinguish between special and natural variations and have applications in the medical and business industries besides the manufacturing industry. Due to the advancement of modern technology, we often deal with high-quality products, for which the traditional process monitoring techniques have certain drawbacks. This article presents a control chart for jointly monitoring time and magnitude based on the renewal reward process. In particular, the focus of this study is to model magnitudes by threshold exceedance. More specifically, assuming a random failure threshold, two cases for magnitude are considered: (i) magnitude is cumulative over time, and (ii) magnitude is non-cumulative or independent over time. A comparative study to show the effectiveness of the proposal is also a part of this study.
Journal: Journal of Applied Statistics
Pages: 247-284
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1723502
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1723502
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:247-284
Template-Type: ReDIF-Article 1.0
Author-Name: Fayyaz Bahari
Author-X-Name-First: Fayyaz
Author-X-Name-Last: Bahari
Author-Name: Safar Parsi
Author-X-Name-First: Safar
Author-X-Name-Last: Parsi
Author-Name: Mojtaba Ganjali
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Ganjali
Title: Reliability of a soccer player based on the bivariate Rayleigh distribution with right censored and ignorable missing data
Abstract:
In this paper, we study the performance of a soccer player based on the analysis of an incomplete data set. To achieve this aim, we fit the bivariate Rayleigh distribution to the soccer dataset by the maximum likelihood method. In this way, the missing data and right censoring problems that usually arise in such studies are taken into account. Our aim is to draw inferences about the performance of a soccer player by considering the stress and strength components. The first goal scored by the player of interest in a match is taken as the stress component and the second goal of the match as the strength component. We propose some methods to overcome the incomplete data problem and use these methods to draw inferences about the performance of a soccer player.
Journal: Journal of Applied Statistics
Pages: 285-300
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1723504
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1723504
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:285-300
Template-Type: ReDIF-Article 1.0
Author-Name: Andrea Gabrio
Author-X-Name-First: Andrea
Author-X-Name-Last: Gabrio
Title: Bayesian hierarchical models for the prediction of volleyball results
Abstract:
Statistical modelling of sports data has become more and more popular in recent years, and different types of models have been proposed to achieve a variety of objectives: from identifying the key characteristics which lead a team to win or lose, to predicting the outcome of a game or the team rankings in national leagues. Although not as popular as football or basketball, volleyball is a team sport with both national and international level competitions in almost every country. However, there is almost no study investigating the prediction of volleyball game outcomes and team rankings in national leagues. We propose a Bayesian hierarchical model for the prediction of the rankings of volleyball national teams, which also allows estimation of the results of each match in the league. We consider two alternative model specifications of different complexity, which are validated using data from the women's volleyball Italian Serie A1 2017–2018 season.
Journal: Journal of Applied Statistics
Pages: 301-321
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1723506
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1723506
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:301-321
Template-Type: ReDIF-Article 1.0
Author-Name: Riccardo D'Alberto
Author-X-Name-First: Riccardo
Author-X-Name-Last: D'Alberto
Author-Name: Meri Raggi
Author-X-Name-First: Meri
Author-X-Name-Last: Raggi
Title: How much reliable are the integrated ‘live’ data? A validation strategy proposal for the non-parametric micro statistical matching
Abstract:
The integration of different data sources is a widely discussed topic among both researchers and official statistics agencies. Integrating data helps to contain the costs and time required by new data collections. The non-parametric micro Statistical Matching (SM) makes it possible to integrate ‘live’ data using only the observed information, potentially avoiding misspecification bias and speeding up the computational effort. Despite these pros, the assessment of the integration goodness when this method is used is not robust. Moreover, several applications comply with some commonly accepted practices which recommend, e.g. using the biggest data set as the donor. We propose a validation strategy to assess the integration goodness. We apply it to investigate these practices and to explore how different combinations of SM techniques and distance functions perform in terms of the reliability of the synthetic (complete) data set generated. The validation strategy takes advantage of the relation existing among the variables before and after the integration. The results show that the ‘the biggest, the best’ rule must no longer be considered mandatory. Indeed, the integration goodness increases with the variability of the matching variables rather than with the dimensionality ratio between the recipient and the donor data set.
Journal: Journal of Applied Statistics
Pages: 322-348
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1724272
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1724272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:322-348
Template-Type: ReDIF-Article 1.0
Author-Name: Julio Cezar Souza Vasconcelos
Author-X-Name-First: Julio Cezar Souza
Author-X-Name-Last: Vasconcelos
Author-Name: Gauss Moutinho Cordeiro
Author-X-Name-First: Gauss Moutinho
Author-X-Name-Last: Cordeiro
Author-Name: Edwin Moises Marcos Ortega
Author-X-Name-First: Edwin Moises Marcos
Author-X-Name-Last: Ortega
Author-Name: Édila Maria de Rezende
Author-X-Name-First: Édila Maria de
Author-X-Name-Last: Rezende
Title: A new regression model for bimodal data and applications in agriculture
Abstract:
We define the odd log-logistic exponential Gaussian regression with two systematic components, which extends the heteroscedastic Gaussian regression and is suitable for bimodal data, quite common in the agriculture area. We estimate the parameters by the method of maximum likelihood. Some simulations indicate that the maximum-likelihood estimators are accurate. The model assumptions are checked through case deletion and quantile residuals. The usefulness of the new regression model is illustrated by means of three real data sets from different areas of agriculture in which the data present bimodality.
Journal: Journal of Applied Statistics
Pages: 349-372
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1723503
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1723503
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:349-372
Template-Type: ReDIF-Article 1.0
Author-Name: Rui Martins
Author-X-Name-First: Rui
Author-X-Name-Last: Martins
Author-Name: Jorge Caldeira
Author-X-Name-First: Jorge
Author-X-Name-Last: Caldeira
Author-Name: Inês Lopes
Author-X-Name-First: Inês
Author-X-Name-Last: Lopes
Author-Name: José João Mendes
Author-X-Name-First: José
Author-X-Name-Last: João Mendes
Title: Improving teeth aesthetics using a spatially shared-parameters model for independent regular lattices
Abstract:
An important feature in dentistry is tooth gloss. During an intervention, the doctor applies a resin and a polishing to achieve the lowest roughness and the highest gloss possible. This work aims to evaluate the effect of four polishing protocols on tooth surface roughness and gloss when combined with two different resins, and eventually to indicate the best combination (treatment). An atomic force microscope is used for measuring the in vitro roughness of a dental surface surrogate. We consider a shared-parameters approach for linking the information carried by these two correlated variables. The model fitted to the gloss considers some features of the roughness, namely the information conveyed by a set of spatially structured random effects, specific to each treatment, and the within-treatment variance, which allows interpreting how the heterogeneity and the variability of the surface roughness impact tooth gloss. The statistical model developed here is an alternative to the “traditional” two-way ANOVA used in dentistry journals. The results, obtained using the recent NIMBLE package in R, show that variability characteristics of the surface's roughness are central for explaining differences among the gloss achieved after each treatment, and not just the mean roughness of that surface.
Journal: Journal of Applied Statistics
Pages: 373-392
Issue: 2
Volume: 48
Year: 2021
Month: 1
X-DOI: 10.1080/02664763.2020.1724273
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1724273
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:2:p:373-392
Template-Type: ReDIF-Article 1.0
Author-Name: Ji Hwan Cha
Author-X-Name-First: Ji Hwan
Author-X-Name-Last: Cha
Author-Name: F. G. Badía
Author-X-Name-First: F. G.
Author-X-Name-Last: Badía
Title: Variables acceptance reliability sampling plan for items subject to inverse Gaussian degradation process
Abstract:
To date, a variety of acceptance reliability sampling plans have been developed in the literature based on different life test plans. In most reliability sampling plans, the decision procedures to accept or reject the corresponding lot are based on the lifetimes of the items observed on test, or on the number of failures observed during a pre-specified testing time. Frequently, however, the items are subject to degradation phenomena and, in these cases, the observed degradation level of an item can be used as a decision statistic. In this paper, we develop a variables acceptance sampling plan based on information on the degradation process of the items, assuming that the degradation follows the inverse Gaussian process. It is shown that the developed sampling plan improves the reliability performance of the items conditional on acceptance in the test, and that the lifetimes of items after the reliability sampling test are stochastically larger than those before the test. A study comparing the proposed degradation-based sampling plan with the conventional sampling plan based on a life test is also performed.
Journal: Journal of Applied Statistics
Pages: 393-409
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1723505
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1723505
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:393-409
Template-Type: ReDIF-Article 1.0
Author-Name: Xia Wang
Author-X-Name-First: Xia
Author-X-Name-Last: Wang
Author-Name: Joseph Pancras
Author-X-Name-First: Joseph
Author-X-Name-Last: Pancras
Author-Name: Dipak K. Dey
Author-X-Name-First: Dipak K.
Author-X-Name-Last: Dey
Title: Investigating emergent nested geographic structure in consumer purchases: a Bayesian dynamic multi-scale spatiotemporal modeling approach
Abstract:
Spatial modeling of consumer response data has gained increased interest recently in the marketing literature. In this paper, we extend the (spatial) multi-scale model by incorporating both spatial and temporal dimensions in the dynamic multi-scale spatiotemporal modeling approach. Our empirical application with a US company’s catalog purchase data for the period 1997–2001 reveals a nested geographic market structure that spans geopolitical boundaries such as state borders. This structure identifies spatial clusters of consumers who exhibit similar spatiotemporal behavior, thus pointing to the importance of emergent geographic structure, emergent nested structure and dynamic patterns in multi-resolution methods. The multi-scale model also has better performance in estimation and prediction compared with several spatial and spatiotemporal models and uses a scalable and computationally efficient Markov chain Monte Carlo method that makes it suitable for analyzing large spatiotemporal consumer purchase datasets.
Journal: Journal of Applied Statistics
Pages: 410-433
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1725810
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1725810
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:410-433
Template-Type: ReDIF-Article 1.0
Author-Name: Shu Wu
Author-X-Name-First: Shu
Author-X-Name-Last: Wu
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Author-Name: Giovanni Celano
Author-X-Name-First: Giovanni
Author-X-Name-Last: Celano
Title: A distribution-free EWMA control chart for monitoring time-between-events-and-amplitude data
Abstract:
Many control charts have been developed for the simultaneous monitoring of the time interval T between successive occurrences of an event E and its magnitude X. All these TBEA (Time Between Events and Amplitude) control charts assume a known distribution for the random variables T and X. But, in practice, as it is rather difficult to know their actual distributions, proposing a distribution-free approach could be a way to overcome this ‘distribution choice’ dilemma. For this reason, we propose in this paper a distribution-free upper-sided EWMA (Exponentially Weighted Moving Average) type control chart for simultaneously monitoring the time interval T and the magnitude X of an event. In order to investigate the performance of this control chart and obtain its run length properties, we also develop a specific method called ‘continuousify’ which, coupled with a classical Markov chain technique, allows reliable and replicable results to be obtained. A numerical comparison shows that our distribution-free EWMA TBEA chart performs as well as the parametric Shewhart TBEA chart, but without the need to pre-specify any distribution. An illustrative example obtained from a French forest fire database is also provided to show the implementation of the proposed EWMA TBEA control chart.
Journal: Journal of Applied Statistics
Pages: 434-454
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1729347
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1729347
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:434-454
Template-Type: ReDIF-Article 1.0
Author-Name: Cheng Zhang
Author-X-Name-First: Cheng
Author-X-Name-Last: Zhang
Author-Name: Tapan K. Nayak
Author-X-Name-First: Tapan K.
Author-X-Name-Last: Nayak
Title: Post-randomization for controlling identification risk in releasing microdata from general surveys
Abstract:
Before releasing survey data, statistical agencies usually perturb the original data to keep each survey unit's information confidential. One significant concern in releasing survey microdata is identity disclosure, which occurs when an intruder correctly identifies the records of a survey unit by matching the values of some key (or pseudo-identifying) variables. We examine a recently developed post-randomization method for a strict control of identification risks in releasing survey microdata. While that procedure well preserves the observed frequencies and hence statistical estimates in case of simple random sampling, we show that in general surveys, it may induce considerable bias in commonly used survey-weighted estimators. We propose a modified procedure that better preserves weighted estimates. The procedure is illustrated and empirically assessed with an application to a publicly available US Census Bureau data set.
Journal: Journal of Applied Statistics
Pages: 455-470
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1732310
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1732310
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:455-470
Template-Type: ReDIF-Article 1.0
Author-Name: Alejandro Rodriguez
Author-X-Name-First: Alejandro
Author-X-Name-Last: Rodriguez
Author-Name: Gabriel Pino
Author-X-Name-First: Gabriel
Author-X-Name-Last: Pino
Author-Name: Rodrigo Herrera
Author-X-Name-First: Rodrigo
Author-X-Name-Last: Herrera
Title: A non-parametric statistic for testing conditional heteroscedasticity for unobserved component models
Abstract:
When prediction intervals are constructed using unobserved component models (UCM), problems can arise due to the possible existence of components that may or may not be conditionally heteroscedastic. Accurate coverage depends on correctly identifying the source of the heteroscedasticity. Different proposals for testing heteroscedasticity have been applied to UCM; however, in most cases, these procedures are unable to identify the heteroscedastic component correctly. The main issue is that test statistics are affected by the presence of serial correlation, causing the distribution of the statistic under conditional homoscedasticity to remain unknown. We propose a nonparametric statistic for testing heteroscedasticity based on the well-known Wilcoxon rank statistic. We study the asymptotic validity of the statistic and examine bootstrap procedures for approximating its finite sample distribution. Simulation results show an improvement in the size of the homoscedasticity tests and a power that is clearly comparable with the best alternative in the literature. We also apply the test to real inflation data. Looking for the presence of a conditionally heteroscedastic effect on the error terms, we arrive at conclusions that, in almost all cases, differ from those given by the alternative test statistics presented in the literature.
Journal: Journal of Applied Statistics
Pages: 471-497
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1732885
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1732885
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:471-497
Template-Type: ReDIF-Article 1.0
Author-Name: Nedka Dechkova Nikiforova
Author-X-Name-First: Nedka Dechkova
Author-X-Name-Last: Nikiforova
Author-Name: Rossella Berni
Author-X-Name-First: Rossella
Author-X-Name-Last: Berni
Author-Name: Gabriele Arcidiacono
Author-X-Name-First: Gabriele
Author-X-Name-Last: Arcidiacono
Author-Name: Luciano Cantone
Author-X-Name-First: Luciano
Author-X-Name-Last: Cantone
Author-Name: Pierpaolo Placidoli
Author-X-Name-First: Pierpaolo
Author-X-Name-Last: Placidoli
Title: Latin hypercube designs based on strong orthogonal arrays and Kriging modelling to improve the payload distribution of trains
Abstract:
Nowadays, computer experiments are increasingly used to solve complex engineering and technological issues. Computer experiments are analysed through suitable metamodels acting as statistical interpolators of the simulated input-output data; Kriging is the most appropriate and widely used one. We optimise the braking performance of freight trains through computer experiments and Kriging modelling by focussing on the payload distribution along the train, so as to reduce the effects of in-train forces among wagons during a train emergency braking. One contribution of this manuscript is that, to improve freight train efficiency in terms of braking performance, we consider the train to be composed of several train sections, each characterised by its own overall payload. A suitable Latin hypercube design is planned for the computer experiment that achieves excellent space-filling properties with a relatively low number of experimental runs. Kriging models with anisotropic covariance functions are subsequently applied to assess which payload distribution is best capable of reducing the in-train forces according to the specific train-set arrangement considered. The results are very satisfactory and confirm that our approach represents a valid method to be successfully applied by interested Railway Undertakings.
Journal: Journal of Applied Statistics
Pages: 498-516
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1733943
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1733943
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:498-516
Template-Type: ReDIF-Article 1.0
Author-Name: Farrukh Javed
Author-X-Name-First: Farrukh
Author-X-Name-Last: Javed
Author-Name: Stepan Mazur
Author-X-Name-First: Stepan
Author-X-Name-Last: Mazur
Author-Name: Edward Ngailo
Author-X-Name-First: Edward
Author-X-Name-Last: Ngailo
Title: Higher order moments of the estimated tangency portfolio weights
Abstract:
In this paper, we consider the estimated weights of the tangency portfolio. We derive analytical expressions for the higher-order non-central and central moments of these weights when the returns are assumed to be independently and multivariate normally distributed. Moreover, the expressions for the mean, variance, skewness and kurtosis of the estimated weights are obtained in closed form. We then complement our results with a simulation study in which data from the multivariate normal and t-distributions are simulated, and the first four moments of the estimated weights are computed by Monte Carlo experiment. It is noteworthy that the distributional assumption on returns is found to be important, especially for the first two moments. Finally, through an empirical illustration utilizing the returns of four financial indices listed on the NASDAQ stock exchange, we observe the presence of time dynamics in the higher moments.
Journal: Journal of Applied Statistics
Pages: 517-535
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1736523
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1736523
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:517-535
Template-Type: ReDIF-Article 1.0
Author-Name: Sancharee Basak
Author-X-Name-First: Sancharee
Author-X-Name-Last: Basak
Author-Name: Ayanendranath Basu
Author-X-Name-First: Ayanendranath
Author-X-Name-Last: Basu
Author-Name: M. C. Jones
Author-X-Name-First: M. C.
Author-X-Name-Last: Jones
Title: On the ‘optimal’ density power divergence tuning parameter
Abstract:
The density power divergence, indexed by a single tuning parameter α, has proved to be a very useful tool in minimum distance inference. The family of density power divergences provides a generalized estimation scheme which includes likelihood-based procedures (represented by the choice $\alpha = 0$ for the tuning parameter) as a special case. However, under data contamination, this scheme provides several more stable choices for model fitting and analysis (provided by positive values for the tuning parameter α). As larger values of α necessarily lead to a drop in model efficiency, determining the optimal value of α to provide the best compromise between model-efficiency and stability against data contamination in any real situation is a major challenge. In this paper, we provide a refinement of an existing technique with the aim of eliminating the dependence of the procedure on an initial pilot estimator. Numerical evidence is provided to demonstrate the very good performance of the method. Our technique has a general flavour, and we expect that similar tuning parameter selection algorithms will work well for other M-estimators, or any robust procedure that depends on the choice of a tuning parameter.
Journal: Journal of Applied Statistics
Pages: 536-556
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1736524
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1736524
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:536-556
Template-Type: ReDIF-Article 1.0
Author-Name: Julio Cezar Souza Vasconcelos
Author-X-Name-First: Julio Cezar
Author-X-Name-Last: Souza Vasconcelos
Author-Name: Cristian Villegas
Author-X-Name-First: Cristian
Author-X-Name-Last: Villegas
Title: Generalized symmetrical partial linear model
Abstract:
In this work, we propose a new model, called the generalized symmetrical partial linear model, based on the theory of generalized linear models and symmetrical distributions. In our model, the response variable follows a symmetrical distribution such as the normal, Student-t or power exponential, among others. Following the context of generalized linear models, we consider replacing the traditional linear predictor by a more general predictor in which one covariate is related to the response variable in a non-parametric fashion, that is, we do not specify a parametric function. As an example, we could imagine a regression model in which the intercept term is believed to vary in time or geographical location. The backfitting algorithm is used for estimating the parameters of the proposed model. We perform a simulation study to assess the behavior of the penalized maximum likelihood estimators. We use quantile residuals for checking the assumptions of the model. Finally, we analyze a real data set related to the pH of rivers in Ireland.
Journal: Journal of Applied Statistics
Pages: 557-572
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1726301
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1726301
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:557-572
Template-Type: ReDIF-Article 1.0
Author-Name: Tomson Ogwang
Author-X-Name-First: Tomson
Author-X-Name-Last: Ogwang
Author-Name: Danny I. Cho
Author-X-Name-First: Danny I.
Author-X-Name-Last: Cho
Title: Olympic rankings based on objective weighting schemes
Abstract:
In this paper, we propose an objective principal components weighting scheme for all-time Winter Olympic gold, silver and bronze medals based solely on the number of medals won. Our results suggest that approximately equal weights be assigned (or the total medal counts be used regardless of color) if all three medal types are retained for ranking purposes. When the proposed methodology is tested against five alternative weighting schemes suggested in the literature, using the results for the 2010 Vancouver Winter Olympics, we find significant agreement in the country rankings. Furthermore, our implementation of a principal components variable-reduction strategy results in the identification of silver as the best representative medal count for parsimonious Winter Olympics rankings.
Journal: Journal of Applied Statistics
Pages: 573-582
Issue: 3
Volume: 48
Year: 2021
Month: 2
X-DOI: 10.1080/02664763.2020.1736525
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1736525
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:3:p:573-582
Template-Type: ReDIF-Article 1.0
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Bayesian bandwidth estimation and semi-metric selection for a functional partial linear model with unknown error density
Abstract:
This study examines the optimal selections of bandwidth and semi-metric for a functional partial linear model. Our proposed method begins by estimating the unknown error density using a kernel density estimator of residuals, where the regression function, consisting of parametric and nonparametric components, can be estimated by functional principal component and functional Nadaraya-Watson estimators. The estimation accuracy of the regression function and error density crucially depends on the optimal estimations of bandwidth and semi-metric. A Bayesian method is utilized to simultaneously estimate the bandwidths in the regression function and kernel error density by minimizing the Kullback-Leibler divergence. For estimating the regression function and error density, a series of simulation studies demonstrate that the functional partial linear model gives improved estimation and forecast accuracies compared with the functional principal component regression and functional nonparametric regression. Using a spectroscopy dataset, the functional partial linear model yields better forecast accuracy than some commonly used functional regression models. As a by-product of the Bayesian method, a pointwise prediction interval can be obtained, and marginal likelihood can be used to select the optimal semi-metric.
Journal: Journal of Applied Statistics
Pages: 583-604
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1736527
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1736527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:583-604
Template-Type: ReDIF-Article 1.0
Author-Name: Özlem Kaymaz
Author-X-Name-First: Özlem
Author-X-Name-Last: Kaymaz
Author-Name: Khaled Alqahtani
Author-X-Name-First: Khaled
Author-X-Name-Last: Alqahtani
Author-Name: Henry M. Wood
Author-X-Name-First: Henry M.
Author-X-Name-Last: Wood
Author-Name: Arief Gusnanto
Author-X-Name-First: Arief
Author-X-Name-Last: Gusnanto
Title: Prediction of tumour pathological subtype from genomic profile using sparse logistic regression with random effects
Abstract:
The purpose of this study is to highlight the application of sparse logistic regression models in dealing with the prediction of tumour pathological subtypes based on lung cancer patients' genomic information. We consider sparse logistic regression models to deal with the high dimensionality and correlation between genomic regions. In a hierarchical likelihood (HL) method, the random effects are assumed to follow a normal distribution whose variance is assumed to follow a gamma distribution. This formulation includes the ridge and lasso penalties as special cases. We extend the HL penalty to include a ridge penalty (called ‘HLnet’), in a similar spirit to the elastic net penalty, which is constructed from the lasso penalty. The results indicate that the HL penalty creates sparser estimates than the lasso penalty with comparable prediction performance, while the HLnet and elastic net penalties have the best prediction performance on real data. We illustrate the methods in a lung cancer study.
Journal: Journal of Applied Statistics
Pages: 605-622
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1738358
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1738358
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:605-622
Template-Type: ReDIF-Article 1.0
Author-Name: Ayan Pal
Author-X-Name-First: Ayan
Author-X-Name-Last: Pal
Author-Name: Sharmishtha Mitra
Author-X-Name-First: Sharmishtha
Author-X-Name-Last: Mitra
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Title: Order restricted classical inference of a Weibull multiple step-stress model
Abstract:
In this paper, a multiple step-stress model is designed and analyzed when the data are Type-I censored. Lifetime distributions of the experimental units at each stress level are assumed to follow a two-parameter Weibull distribution. Further, distributions under each of the stress levels are connected through a tampered failure-rate based model. In a step-stress experiment, as the stress level increases, the load on the experimental units increases and hence the mean lifetime is expected to be shortened. Taking this into account, the aim of this paper is to develop the order restricted inference of the model parameters of a multiple step-stress model based on the frequentist approach. An extensive simulation study has been carried out and two real data sets have been analyzed for illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 623-645
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1736526
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1736526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:623-645
Template-Type: ReDIF-Article 1.0
Author-Name: Jamil Ownuk
Author-X-Name-First: Jamil
Author-X-Name-Last: Ownuk
Author-Name: Hossein Baghishani
Author-X-Name-First: Hossein
Author-X-Name-Last: Baghishani
Author-Name: Ahmad Nezakati
Author-X-Name-First: Ahmad
Author-X-Name-Last: Nezakati
Title: Heavy or semi-heavy tail, that is the question
Abstract:
While there has been considerable research on the analysis of extreme values and outliers using heavy-tailed distributions, little is known about the semi-heavy-tailed behavior of data when there are a few suspicious outliers. To address the situation where data are skewed and possess semi-heavy tails, we introduce two new skewed distribution families of the hyperbolic secant with exciting properties. We extend the semi-heavy-tailedness property of data to a linear regression model. In particular, we investigate the asymptotic properties of the ML estimators of the regression parameters when the error term has a semi-heavy-tailed distribution. We conduct simulation studies comparing the ML estimators of the regression parameters under various assumptions on the distribution of the error term. We also provide three real examples to show the advantage of a semi-heavy-tailed error term over a heavy-tailed one. Online supplementary materials for this article are available. All the new models proposed in this work are implemented in the shs R package, which can be found on the GitHub webpage.
Journal: Journal of Applied Statistics
Pages: 646-668
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1738360
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1738360
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:646-668
Template-Type: ReDIF-Article 1.0
Author-Name: Guangbao Guo
Author-X-Name-First: Guangbao
Author-X-Name-Last: Guo
Title: Taylor quasi-likelihood for limited generalized linear models
Abstract:
Limited generalized linear models, namely generalized linear models with limited dependent variables, are a major research topic, and such models have been developed in many research fields. However, quasi-likelihood estimation of these models is an unresolved issue due to the inclusion of limited dependent variables. We propose a novel quasi-likelihood, called the Taylor quasi-likelihood, to handle the unified estimation problem of the limited models. It is based on a Taylor expansion of the distribution function or likelihood function. We also extend the likelihood to a generalized version and an adaptive version, and propose a distributed procedure to obtain the likelihood estimator. In the low-dimensional setting, we give selection criteria for the proposed method and establish the consistency and asymptotic normality of the estimator. In the high-dimensional setting, we discuss feature selection and the oracle properties of the proposed method. Simulation results confirm the advantages of the proposed method.
Journal: Journal of Applied Statistics
Pages: 669-692
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1743650
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1743650
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:669-692
Template-Type: ReDIF-Article 1.0
Author-Name: Uchenna C. Nduka
Author-X-Name-First: Uchenna C.
Author-X-Name-Last: Nduka
Author-Name: Tobias E. Ugah
Author-X-Name-First: Tobias E.
Author-X-Name-Last: Ugah
Author-Name: Chinyeaka H. Izunobi
Author-X-Name-First: Chinyeaka H.
Author-X-Name-Last: Izunobi
Title: Robust estimation using multivariate t innovations for vector autoregressive models via ECM algorithm
Abstract:
This paper considers the vector autoregressive model of order p, VAR(p), with multivariate t error distributions, the latter being more prevalent in real life than the usual multivariate normal distribution. It is believed that the maximum-likelihood equations for the multivariate t distribution have convergence problems, hence we develop estimation procedures for the VAR(p) model using the normal mean–variance mixture representation of the multivariate t distribution. The procedure relies on the computational ease available in Expectation-Maximization-based algorithms. The estimators obtained are explicit functions of sample observations and therefore are easy to compute. Extensive simulation experiments show that the estimators have negligible bias and are considerably more efficient than an existing method that uses the least-squares error approach. It is shown that the proposed estimators are robust to plausible deviations from an assumed distribution and hence are more advantageous when compared with the other estimator. One real-life example is given for illustration purposes.
Journal: Journal of Applied Statistics
Pages: 693-711
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1742297
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1742297
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:693-711
Template-Type: ReDIF-Article 1.0
Author-Name: Subrata Chakraborty
Author-X-Name-First: Subrata
Author-X-Name-Last: Chakraborty
Author-Name: Dhrubajyoti Chakravarty
Author-X-Name-First: Dhrubajyoti
Author-X-Name-Last: Chakravarty
Author-Name: Josmar Mazucheli
Author-X-Name-First: Josmar
Author-X-Name-Last: Mazucheli
Author-Name: Wesley Bertoli
Author-X-Name-First: Wesley
Author-X-Name-Last: Bertoli
Title: A discrete analog of Gumbel distribution: properties, parameter estimation and applications
Abstract:
A discrete version of the Gumbel distribution (Type-I Extreme Value distribution) has been derived by using the general approach of discretization of a continuous distribution. Important distributional and reliability properties have been explored. It has been shown that, depending on the choice of parameters, the proposed distribution can be positively or negatively skewed and can possess long tail(s). Log-concavity of the distribution and consequent results have been established. Estimation of parameters by the method of maximum likelihood, the method of moments, and the method of proportions has been discussed. A method of checking model adequacy and a regression-type estimation based on the empirical survival function have also been examined. A simulation study has been carried out to compare and check the efficacy of the three methods of estimation. The distribution has been applied to model three real count data sets from diverse application areas, namely survival times in number of days, maximum annual flood data from Brazil, and goal differences in the English Premier League, and the results show the relevance of the proposed distribution.
Journal: Journal of Applied Statistics
Pages: 712-737
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1744538
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1744538
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:712-737
Template-Type: ReDIF-Article 1.0
Author-Name: Yang-Jin Kim
Author-X-Name-First: Yang-Jin
Author-X-Name-Last: Kim
Title: Joint model for bivariate zero-inflated recurrent event data with terminal events
Abstract:
Bivariate recurrent event data are observed when subjects are at risk of experiencing two different types of recurrent events. In this paper, our interest is to suggest a statistical model for the case where a substantial portion of subjects do not experience recurrent events but have a terminal event. In the context of recurrent event data, zero events can be related to either a risk-free group or a terminal event. To simultaneously reflect both zero inflation and a terminal event in the context of bivariate recurrent event data, a joint model is implemented with bivariate frailty effects. Simulation studies are performed to evaluate the suggested models. Infection data from AML (acute myeloid leukemia) patients are analyzed as an application.
Journal: Journal of Applied Statistics
Pages: 738-749
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1744539
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1744539
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:738-749
Template-Type: ReDIF-Article 1.0
Author-Name: Qingzhao Yu
Author-X-Name-First: Qingzhao
Author-X-Name-Last: Yu
Author-Name: Bin Li
Author-X-Name-First: Bin
Author-X-Name-Last: Li
Title: A multivariate multiple third-variable effect analysis with an application to explore racial and ethnic disparities in obesity
Abstract:
Third-variable effect refers to the intervening effect of a third variable (called a mediator or confounder) on the observed relationship between an exposure and an outcome. The general multiple third-variable effect analysis method (TVEA) allows consideration of multiple mediators/confounders (MCs) simultaneously and the use of linear and nonlinear predictive models for estimating MC effects. Previous studies have found that, compared with the non-Hispanic White population, Blacks and Hispanic Whites suffer disproportionately from obesity and related chronic diseases. In this paper, we extend the general TVEA to deal with multivariate/multi-categorical predictors and multivariate response variables. We designed algorithms and an R package for this extension and applied MMA to the NHANES data to identify MCs and quantify the indirect effect of each MC in explaining both racial and ethnic disparities in obesity and the body mass index (BMI) simultaneously. We considered a number of socio-demographic variables, individual factors, and environmental variables as potential MCs and found that some of the ethnic/racial differences in obesity and BMI were explained by the included variables.
Journal: Journal of Applied Statistics
Pages: 750-764
Issue: 4
Volume: 48
Year: 2021
Month: 3
X-DOI: 10.1080/02664763.2020.1738359
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1738359
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:4:p:750-764
Template-Type: ReDIF-Article 1.0
Author-Name: Elham Tabrizi
Author-X-Name-First: Elham
Author-X-Name-Last: Tabrizi
Author-Name: Ehsan Bahrami Samani
Author-X-Name-First: Ehsan
Author-X-Name-Last: Bahrami Samani
Author-Name: Mojtaba Ganjali
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Ganjali
Title: General location multivariate latent variable models for mixed correlated bounded continuous, ordinal, and nominal responses with non-ignorable missing data
Abstract:
Using a multivariate latent variable approach, this article proposes some new general models to analyze correlated bounded continuous and categorical (nominal or/and ordinal) responses with and without non-ignorable missing values. First, we discuss regression methods for jointly analyzing continuous, nominal, and ordinal responses, motivated by the analysis of data from studies of toxicity development. Second, using the beta and Dirichlet distributions, we extend the models so that some continuous responses are replaced by bounded continuous responses. The joint distribution of the bounded continuous, nominal and ordinal variables is decomposed into a marginal multinomial distribution for the nominal variable and a conditional multivariate joint distribution for the bounded continuous and ordinal variables given the nominal variable. We estimate the regression parameters under the new general location models using the maximum-likelihood method. Sensitivity analysis is also performed to study the influence of small perturbations of the parameters of the missing mechanisms of the model on the maximal normal curvature. The proposed models are applied to two data sets: BMI, Steatosis and Osteoporosis data and Tehran household expenditure budgets.
Journal: Journal of Applied Statistics
Pages: 765-785
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1745765
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1745765
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:765-785
Template-Type: ReDIF-Article 1.0
Author-Name: Mohammed Alqawba
Author-X-Name-First: Mohammed
Author-X-Name-Last: Alqawba
Author-Name: Norou Diawara
Author-X-Name-First: Norou
Author-X-Name-Last: Diawara
Title: Copula-based Markov zero-inflated count time series models with application
Abstract:
Count time series data with excess zeros are observed in several applied disciplines. When these zero-inflated counts are sequentially recorded, they might exhibit serial dependence. Ignoring the zero-inflation and the serial dependence might produce inaccurate results. In this paper, Markov zero-inflated count time series models based on a joint distribution on consecutive observations are proposed. The joint distribution function of the consecutive observations is constructed through copula functions. First- and second-order Markov chains are considered with the univariate margins of zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), or zero-inflated Conway–Maxwell–Poisson (ZICMP) distributions. Under the Markov models, bivariate copula functions such as the bivariate Gaussian, Frank, and Gumbel are chosen to construct a bivariate distribution of two consecutive observations. Moreover, the trivariate Gaussian and max-infinitely divisible copula functions are considered to build the joint distribution of three consecutive observations. Likelihood-based inference is performed and asymptotic properties are studied. To evaluate the estimation method and the asymptotic results, simulated examples are studied. The proposed class of models is applied to a sandstorm count example. The results suggest that the proposed models have some advantages over some of the models in the literature for modeling zero-inflated count time series data.
Journal: Journal of Applied Statistics
Pages: 786-803
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1748581
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1748581
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:786-803
Template-Type: ReDIF-Article 1.0
Author-Name: António Casimiro Puindi
Author-X-Name-First: António Casimiro
Author-X-Name-Last: Puindi
Author-Name: Maria Eduarda Silva
Author-X-Name-First: Maria Eduarda
Author-X-Name-Last: Silva
Title: Dynamic structural models with covariates for short-term forecasting of time series with complex seasonal patterns
Abstract:
This work presents a framework of dynamic structural models with covariates for short-term forecasting of time series with complex seasonal patterns. The framework is based on the multiple sources of randomness formulation. A noise model is formulated to allow the incorporation of randomness into the seasonal component and to propagate this same randomness in the coefficients of the time-varying trigonometric terms. A unique, recursive and systematic computational procedure based on maximum likelihood estimation under the hypothesis of Gaussian errors is introduced. This procedure combines the Kalman filter with recursive adjustment of the covariance matrices and a method for selecting the number of harmonics in the trigonometric terms. A key feature of this method is that it allows estimating not only the states of the system but also the standard errors of the estimated parameters and the prediction intervals. In addition, this work presents a non-parametric bootstrap approach to improve the forecasting method based on Kalman filter recursions. The proposed framework is empirically explored with two real time series.
Journal: Journal of Applied Statistics
Pages: 804-826
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1748178
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1748178
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:804-826
Template-Type: ReDIF-Article 1.0
Author-Name: Paravee Maneejuk
Author-X-Name-First: Paravee
Author-X-Name-Last: Maneejuk
Author-Name: Woraphon Yamaka
Author-X-Name-First: Woraphon
Author-X-Name-Last: Yamaka
Title: Significance test for linear regression: how to test without P-values?
Abstract:
The discussion on the use and misuse of p-values in 2016 by the American Statistical Association was a timely assertion that statistical concepts should be properly used in science. Some researchers, especially economists, who adopt significance testing and p-values to report their results may have felt confused by the statement, leading to misinterpretations of it. In this study, we aim to re-examine the accuracy of the p-value and introduce an alternative way of testing the hypothesis. We conduct a simulation study to investigate the reliability of the p-value. Apart from investigating the performance of the p-value, we also introduce some existing approaches, Minimum Bayes Factors and belief functions, for replacing the p-value. Results from the simulation study confirm that the p-value is unreliable in some cases and that our proposed approaches seem useful as substitute tools in statistical inference. Moreover, our results show that the plausibility approach is more accurate for making decisions about the null hypothesis than the traditionally used p-values when the null hypothesis is true. However, the MBFs of Edwards et al. [Bayesian statistical inference for psychological research. Psychol. Rev. 70(3) (1963), pp. 193–242]; Vovk [A logic of probability, with application to the foundations of statistics. J. Royal Statistical Soc. Series B (Methodological) 55 (1993), pp. 317–351] and Sellke et al. [Calibration of p values for testing precise null hypotheses. Am. Stat. 55(1) (2001), pp. 62–71] provide more reliable results compared to all other methods when the null hypothesis is false.
Journal: Journal of Applied Statistics
Pages: 827-845
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1748180
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1748180
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:827-845
Template-Type: ReDIF-Article 1.0
Author-Name: Mingyue Du
Author-X-Name-First: Mingyue
Author-X-Name-Last: Du
Author-Name: Qingning Zhou
Author-X-Name-First: Qingning
Author-X-Name-Last: Zhou
Author-Name: Shishun Zhao
Author-X-Name-First: Shishun
Author-X-Name-Last: Zhao
Author-Name: Jianguo Sun
Author-X-Name-First: Jianguo
Author-X-Name-Last: Sun
Title: Regression analysis of case-cohort studies in the presence of dependent interval censoring
Abstract:
The case-cohort design is widely used as a means of reducing the cost of large cohort studies, especially when the disease rate is low and covariate measurements may be expensive, and has been discussed by many authors. In this paper, we discuss regression analysis of case-cohort studies that produce interval-censored failure time data with dependent censoring, a situation for which there does not seem to exist an established approach. For inference, a sieve inverse probability weighting estimation procedure is developed with the use of Bernstein polynomials to approximate the unknown baseline cumulative hazard functions. The proposed estimators are shown to be consistent, and the asymptotic normality of the resulting regression parameter estimators is established. A simulation study is conducted to assess the finite-sample properties of the proposed approach and indicates that it works well in practical situations. The proposed method is applied to an HIV/AIDS case-cohort study that motivated this investigation.
Journal: Journal of Applied Statistics
Pages: 846-865
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1752633
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1752633
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:846-865
Template-Type: ReDIF-Article 1.0
Author-Name: Yoonsuh Jung
Author-X-Name-First: Yoonsuh
Author-X-Name-Last: Jung
Author-Name: Steven N. MacEachern
Author-X-Name-First: Steven N.
Author-X-Name-Last: MacEachern
Author-Name: Hang Joon Kim
Author-X-Name-First: Hang
Author-X-Name-Last: Joon Kim
Title: Modified check loss for efficient estimation via model selection in quantile regression
Abstract:
The check loss function is used to define quantile regression. In cross-validation, it is also employed as a validation function when the true distribution is unknown. However, our empirical study indicates that validation with the check loss often leads to overfitting the data. In this work, we suggest a modified, or L2-adjusted, check loss which rounds the sharp corner in the middle of the check loss. This has the effect of guarding against overfitting to some extent. The adjustment is devised to shrink to zero as the sample size grows. Through various simulation settings of linear and nonlinear regressions, the improvement due to modification of the check loss by quadratic adjustment is examined empirically.
Journal: Journal of Applied Statistics
Pages: 866-886
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1753023
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1753023
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:866-886
Template-Type: ReDIF-Article 1.0
Author-Name: Mingjing Chen
Author-X-Name-First: Mingjing
Author-X-Name-Last: Chen
Author-Name: Xiangyong Tan
Author-X-Name-First: Xiangyong
Author-X-Name-Last: Tan
Author-Name: Jian Wu
Author-X-Name-First: Jian
Author-X-Name-Last: Wu
Title: Time varying factor models with possibly strongly correlated noises
Abstract:
In factor models, noises are often assumed to be weakly correlated; otherwise, separation of factors from noises becomes difficult, if not impossible. This paper addresses this problem. We utilize an econometric idea, the so-called common correlated effects (CCE), to estimate time-varying factor models. We first cross-sectionally average the covariates and then project the responses onto the space spanned by the averaged covariates. By doing so, noises are diminished while factors are distinguished. The advantages of our new estimators are twofold. First, the convergence rates of the estimated factors and loadings are independent of the cross-sectional dimension. Second, our new estimators are robust to the correlation of noises. Hence our new estimators can, on the one hand, separate market factors for the stock data set used in this paper even if noises exhibit strong correlations within industries due to industry-specific factors and, on the other hand, avoid inappropriately absorbing industry-specific factors into market factors.
Journal: Journal of Applied Statistics
Pages: 887-906
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1753024
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1753024
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:887-906
Template-Type: ReDIF-Article 1.0
Author-Name: Magda Carvalho Pires
Author-X-Name-First: Magda Carvalho
Author-X-Name-Last: Pires
Author-Name: Enrico Antônio Colosimo
Author-X-Name-First: Enrico Antônio
Author-X-Name-Last: Colosimo
Author-Name: Guilherme Augusto Veloso
Author-X-Name-First: Guilherme Augusto
Author-X-Name-Last: Veloso
Author-Name: Raquel de Souza Borges Ferreira
Author-X-Name-First: Raquel de Souza Borges
Author-X-Name-Last: Ferreira
Title: Interval-censored data with misclassification: a Bayesian approach
Abstract:
Survival data involving silent events are often subject to interval censoring (the event is known to occur within a time interval) and to classification errors if a test without perfect sensitivity and specificity is applied. Considering the nature of these data plays an important role in estimating the time distribution until the occurrence of the event. In this context, we incorporate validation subsets into the parametric proportional hazards model, and show that this additional data, combined with Bayesian inference, compensates for the lack of knowledge about test sensitivity and specificity, improving the parameter estimates. The proposed model is evaluated through simulation studies, and Bayesian analysis is conducted within a Gibbs sampling procedure. The posterior estimates obtained under validation subset models present lower bias and standard deviation compared to the scenario with no validation subset or the model that assumes perfect sensitivity and specificity. Finally, we illustrate the usefulness of the new methodology with an analysis of real data on HIV acquisition in female sex workers that has been discussed in the literature.
Journal: Journal of Applied Statistics
Pages: 907-923
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1753025
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1753025
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:907-923
Template-Type: ReDIF-Article 1.0
Author-Name: Özge Kuran
Author-X-Name-First: Özge
Author-X-Name-Last: Kuran
Author-Name: M. Revan Özkale
Author-X-Name-First: M. Revan
Author-X-Name-Last: Özkale
Title: Improvement of mixed predictors in linear mixed models
Abstract:
In this paper, we introduce stochastic-restricted Liu predictors, defined by combining in a special way the two approaches followed in obtaining the mixed predictors and the Liu predictors in linear mixed models. Superiority of the linear combination of the new predictor over the Liu and mixed predictors is established in the sense of the mean square error matrix criterion. Finally, numerical examples and a simulation study are given to illustrate the findings. In the numerical examples, we took some arbitrary observations from the data as the prior information, since we did not have historical data or additional information about the data sets. The results show that in this case the new predictor gains efficiency over the constituent predictors and provides accurate estimation and prediction of the data.
Journal: Journal of Applied Statistics
Pages: 924-942
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1833182
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1833182
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:924-942
Template-Type: ReDIF-Article 1.0
Author-Name: Šárka Večeřová
Author-X-Name-First: Šárka
Author-X-Name-Last: Večeřová
Author-Name: Arnošt Komárek
Author-X-Name-First: Arnošt
Author-X-Name-Last: Komárek
Author-Name: Luk Bruyneel
Author-X-Name-First: Luk
Author-X-Name-Last: Bruyneel
Author-Name: Emmanuel Lesaffre
Author-X-Name-First: Emmanuel
Author-X-Name-Last: Lesaffre
Title: Identifying influential observations in a Bayesian multi-level mediation model
Abstract:
Increasingly complex models are being fit to data these days. This is especially the case for Bayesian modelling making use of Markov chain Monte Carlo methods, yet tailored model diagnostics usually lag behind. This is also the case for Bayesian mediation models. In this paper, we develop a method for the detection of influential observations for a popular mediation model and its extensions in a Bayesian context. Detection of influential observations is based on the case-deletion principle. Importance sampling, with weights that take advantage of the dependence structure in hierarchical models, is utilized in order to identify the part of the model which is influenced most. We use the variance of the log importance sampling weights as the measure of influence. It is demonstrated that this approach is useful when interest lies in the impact of individual observations on a subset of model parameters. The method is illustrated on a three-level data set from the field of nursing research, which was previously used to fit a mediation model of patient satisfaction with care. We focus on influential cases on both the second and the third level of the data.
Journal: Journal of Applied Statistics
Pages: 943-960
Issue: 5
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1748179
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1748179
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:5:p:943-960
Template-Type: ReDIF-Article 1.0
Author-Name: Xu Guo
Author-X-Name-First: Xu
Author-X-Name-Last: Guo
Author-Name: Yan Wang
Author-X-Name-First: Yan
Author-X-Name-Last: Wang
Author-Name: Niwen Zhou
Author-X-Name-First: Niwen
Author-X-Name-Last: Zhou
Author-Name: Xuehu Zhu
Author-X-Name-First: Xuehu
Author-X-Name-Last: Zhu
Title: Optimal weighted two-sample t-test with partially paired data in a unified framework
Abstract:
In this paper, we provide a unified framework for the two-sample t-test with partially paired data. We show that many existing two-sample t-tests with partially paired data can be viewed as special members of our unified framework. Some shortcomings of these t-tests are discussed. We also propose the asymptotically optimal weighted linear combination of the test statistics comparing all four paired and unpaired data sets. Simulation studies are used to illustrate the performance of our proposed asymptotically optimal weighted combinations of test statistics and to compare them with some existing methods. It is found that our proposed test statistic is generally more powerful. Three real data sets, concerning CD4 counts, DNA extraction concentrations, and quality of sleep, are also analyzed using our newly introduced test statistic.
Journal: Journal of Applied Statistics
Pages: 961-976
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1753027
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1753027
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:961-976
Template-Type: ReDIF-Article 1.0
Author-Name: Nusrat Harun
Author-X-Name-First: Nusrat
Author-X-Name-Last: Harun
Author-Name: Bo Cai
Author-X-Name-First: Bo
Author-X-Name-Last: Cai
Author-Name: Yu Shen
Author-X-Name-First: Yu
Author-X-Name-Last: Shen
Title: A Bayesian semiparametric method for analyzing length-biased data
Abstract:
Survival data obtained from prevalent cohort study designs are often subject to length-biased sampling. Frequentist methods, including estimating equation approaches as well as full likelihood methods, are available for assessing covariate effects on survival from such data. Bayesian methods allow a probability interpretation of the parameters of interest, and may easily provide the predictive distribution for future observations while incorporating weak prior knowledge on the baseline hazard function. There is a lack of Bayesian methods for analyzing length-biased data. In this paper, we propose Bayesian methods for analyzing length-biased data under a proportional hazards model. The prior distribution for the cumulative hazard function is specified semiparametrically using I-Splines. Bayesian conditional and full likelihood approaches are developed for analyzing simulated and real data.
Journal: Journal of Applied Statistics
Pages: 977-992
Issue: 6
Volume: 48
Year: 2021
Month: 04
X-DOI: 10.1080/02664763.2020.1753028
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1753028
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:977-992
Template-Type: ReDIF-Article 1.0
Author-Name: Yu Zhang
Author-X-Name-First: Yu
Author-X-Name-Last: Zhang
Author-Name: Xiangzhong Fang
Author-X-Name-First: Xiangzhong
Author-X-Name-Last: Fang
Title: On the lengths of t-based confidence intervals
Abstract:
The confidence interval is a basic type of interval estimation in statistics. When dealing with samples from a normal population with unknown mean and variance, the traditional method of constructing t-based confidence intervals for the mean parameter is to treat the n sampled units as n groups and build the intervals. Here we propose a generalized method: we first divide the units into several equal-sized groups and then calculate the confidence intervals with the mean values of these groups. If we define 'better' in terms of the expected length of the confidence interval, then the first method is better because the expected length of its confidence interval is shorter. We prove this intuition theoretically. We also show that when the elements in each group are correlated, the first method is invalid, while the second can give us correct results in terms of the coverage probability. We illustrate this with analytical expressions. In practice, when the data set is extremely large and distributed across several data centers, the second method is a good tool for obtaining confidence intervals, in both the independent and correlated cases. Some simulations and real data analyses are presented to verify our theoretical results.
Journal: Journal of Applied Statistics
Pages: 993-1008
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1754357
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1754357
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:993-1008
Template-Type: ReDIF-Article 1.0
Author-Name: Milind A. Phadnis
Author-X-Name-First: Milind A.
Author-X-Name-Last: Phadnis
Author-Name: Matthew S. Mayo
Author-X-Name-First: Matthew S.
Author-X-Name-Last: Mayo
Title: Sample size calculations for noninferiority trials for time-to-event data using the concept of proportional time
Abstract:
Noninferiority trials intend to show that a new treatment is ‘not worse' than a standard-of-care active control and can be used as an alternative when it is likely to cause fewer side effects compared to the active control. In the case of time-to-event endpoints, existing methods of sample size calculation are done either assuming proportional hazards between the two study arms, or assuming exponentially distributed lifetimes. In scenarios where these assumptions are not true, there are few reliable methods for calculating the sample sizes for a time-to-event noninferiority trial. Additionally, the choice of the non-inferiority margin is obtained either from a meta-analysis of prior studies, or strongly justifiable ‘expert opinion', or from a ‘well conducted' definitive large-sample study. Thus, when historical data do not support the traditional assumptions, it would not be appropriate to use these methods to design a noninferiority trial. For such scenarios, an alternate method of sample size calculation based on the assumption of Proportional Time is proposed. This method utilizes the generalized gamma ratio distribution to perform the sample size calculations. A practical example is discussed, followed by insights on choice of the non-inferiority margin, and the indirect testing of superiority of treatment compared to placebo.
Journal: Journal of Applied Statistics
Pages: 1009-1032
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1753026
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1753026
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1009-1032
Template-Type: ReDIF-Article 1.0
Author-Name: Zhensheng Huang
Author-X-Name-First: Zhensheng
Author-X-Name-Last: Huang
Author-Name: Wen Lou
Author-X-Name-First: Wen
Author-X-Name-Last: Lou
Title: Statistical inferences for single-index models with measurement errors
Abstract:
An important factor in a house's price is its location. However, measurement errors arise frequently in the process of observing variables such as the latitude and longitude of the house. Single-index models with measurement errors are used to study the relationship between house location and house price. We obtain the estimators by a SIMEX method based on the local linear method and the estimating equation. To test the significance of the index coefficient and the linearity of the link function, we establish generalized likelihood ratio (GLR) tests for the models. We demonstrate that the asymptotic null distributions of the established GLR tests follow $\chi^{2}$-distributions which are independent of nuisance parameters or functions. Finally, two simulated examples and a real estate valuation data set are given to illustrate the effect of the GLR tests.
Journal: Journal of Applied Statistics
Pages: 1033-1052
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1754358
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1754358
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1033-1052
Template-Type: ReDIF-Article 1.0
Author-Name: Alex Paynter
Author-X-Name-First: Alex
Author-X-Name-Last: Paynter
Author-Name: Amy D. Willis
Author-X-Name-First: Amy D.
Author-X-Name-Last: Willis
Title: Tuning parameter selection for a penalized estimator of species richness
Abstract:
Our goal is to estimate the true number of classes in a population, called the species richness. We consider the case where multiple frequency count tables have been collected from a homogeneous population and investigate a penalized maximum likelihood estimator under a negative binomial model. Because high probabilities of unobserved classes increase the variance of species richness estimates, our method penalizes the probability of a class being unobserved. Tuning the penalization parameter is challenging because the true species richness is never known, and so we propose and validate four novel methods for tuning the penalization parameter. We illustrate and contrast the performance of the proposed methods by estimating the strain-level microbial diversity of Lake Champlain over three consecutive years, and global human host-associated species-level microbial richness.
Journal: Journal of Applied Statistics
Pages: 1053-1070
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1754359
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1754359
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1053-1070
Template-Type: ReDIF-Article 1.0
Author-Name: Vahid Nekoukhou
Author-X-Name-First: Vahid
Author-X-Name-Last: Nekoukhou
Author-Name: Ashkan Khalifeh
Author-X-Name-First: Ashkan
Author-X-Name-Last: Khalifeh
Author-Name: Hamid Bidram
Author-X-Name-First: Hamid
Author-X-Name-Last: Bidram
Title: A bivariate discrete inverse resilience family of distributions with resilience marginals
Abstract:
In this paper, a new bivariate discrete generalized exponential distribution, whose marginals are discrete generalized exponential distributions, is studied. It is observed that the proposed bivariate distribution is a flexible distribution whose cumulative distribution function has an analytical structure. In addition, a new bivariate geometric distribution can be obtained as a special case. We study different properties of this distribution and propose estimation of its parameters. We will see that the maximum of the variables involved in the proposed bivariate distribution defines some new classes of univariate discrete distributions, which are interesting in their own right and can be used to analyze some reliability systems whose components are positively dependent. Some important features of this new univariate family of discrete distributions are also studied in detail. In addition, a general class of bivariate discrete distributions, whose marginals are exponentiated discrete distributions, is introduced. Moreover, the analysis of two real bivariate data sets is performed to indicate the effectiveness of the proposed models.
Journal: Journal of Applied Statistics
Pages: 1071-1090
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1755618
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1755618
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1071-1090
Template-Type: ReDIF-Article 1.0
Author-Name: Fang Xia
Author-X-Name-First: Fang
Author-X-Name-Last: Xia
Author-Name: Stephen L. George
Author-X-Name-First: Stephen L.
Author-X-Name-Last: George
Author-Name: Jing Ning
Author-X-Name-First: Jing
Author-X-Name-Last: Ning
Author-Name: Liang Li
Author-X-Name-First: Liang
Author-X-Name-Last: Li
Author-Name: Xuelin Huang
Author-X-Name-First: Xuelin
Author-X-Name-Last: Huang
Title: A signature enrichment design with Bayesian adaptive randomization
Abstract:
Clinical trials in the era of precision cancer medicine aim to identify and validate biomarker signatures which can guide the assignment of individually optimal treatments to patients. In this article, we propose a group sequential randomized phase II design, which updates the biomarker signature as the trial goes on, utilizes enrichment strategies for patient selection, and uses Bayesian response-adaptive randomization for treatment assignment. To evaluate the performance of the new design, in addition to the commonly considered criteria of Type I error and power, we propose four new criteria measuring the benefits and losses for individuals both inside and outside of the clinical trial. Compared with designs with equal randomization, the proposed design gives trial participants a better chance to receive their personalized optimal treatments and thus results in a higher response rate on the trial. This design increases the chance to discover a successful new drug by an adaptive enrichment strategy, i.e. identification and selective enrollment of a subset of patients who are sensitive to the experimental therapies. Simulation studies demonstrate these advantages of the proposed design. It is illustrated by an example based on an actual clinical trial in non-small-cell lung cancer.
Journal: Journal of Applied Statistics
Pages: 1091-1110
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1757048
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1757048
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1091-1110
Template-Type: ReDIF-Article 1.0
Author-Name: Erfan Ghasemi
Author-X-Name-First: Erfan
Author-X-Name-Last: Ghasemi
Author-Name: Alireza Akbarzadeh Baghban
Author-X-Name-First: Alireza
Author-X-Name-Last: Akbarzadeh Baghban
Author-Name: Farid Zayeri
Author-X-Name-First: Farid
Author-X-Name-Last: Zayeri
Author-Name: Asma Pourhoseingholi
Author-X-Name-First: Asma
Author-X-Name-Last: Pourhoseingholi
Author-Name: Seyed Mohammadreza Safavi
Author-X-Name-First: Seyed Mohammadreza
Author-X-Name-Last: Safavi
Title: A doubly-inflated Poisson regression for correlated count data
Abstract:
Count data have emerged in many applied research areas, and in recent years there has been considerable interest in models for count data. In modelling such data, it is common to face a large frequency of zeroes. The data are regarded as zero-inflated when the frequency of observed zeroes is larger than what is expected from a theoretical distribution such as the Poisson distribution, a standard model for analysing count data. Data analysis using the simple Poisson model may lead to over-dispersion. Several classes of mixture models have been proposed for handling zero-inflated data, but they do not apply to cases where inflated counts occur at some other point in addition to zero. For these cases, a doubly-inflated Poisson model has been suggested, which can only be used for cross-sectional data and cannot account for correlations between observations. However, correlated count data arise widely, especially in the health and medical fields. The present study introduces a doubly-inflated Poisson model with random effects for correlated doubly-inflated data. The performance of the proposed method is then shown via different simulation scenarios. Finally, the proposed model is applied to a dental study.
Journal: Journal of Applied Statistics
Pages: 1111-1127
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1757049
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1757049
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1111-1127
Template-Type: ReDIF-Article 1.0
Author-Name: Rhoda Nandai Muse
Author-X-Name-First: Rhoda Nandai
Author-X-Name-Last: Muse
Author-Name: Satheesh Aradhyula
Author-X-Name-First: Satheesh
Author-X-Name-Last: Aradhyula
Title: Correlated discrete and continuous outcomes with endogeneity and lagged effects: past season yield impact on improved corn seed adoption
Abstract:
Farmers in Sub-Saharan Africa have lower agricultural technology adoption rates compared to the rest of the world. It is believed that the past season's yield affects a farmer's capacity to take on the riskier improved seed variety, but this effect has not been studied. We quantify the effect of past season yield on improved corn seed use in future seasons while addressing the impact of the seed variety on yield. We develop a maximum likelihood method that addresses the fact that farmers self-select into a technology, so that its effect on yield is endogenous. The method is unique in that it models both lagged and endogenous effects in correlated discrete and continuous outcomes simultaneously. Due to the presence of the lagged effect in a three-year dataset, we also propose a solution to the initial conditions problem and demonstrate its effectiveness with simulations. We used longitudinal survey data collected from Kenyan corn farmers over three years. Our results show that higher past season yield increased the likelihood of adoption in future seasons. The simulation and empirical studies indicate that ignoring the self-selection of improved seed use biases the results; we obtain a different sign in the covariance.
Journal: Journal of Applied Statistics
Pages: 1128-1153
Issue: 6
Volume: 48
Year: 2021
Month: 4
X-DOI: 10.1080/02664763.2020.1757050
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1757050
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:6:p:1128-1153
Template-Type: ReDIF-Article 1.0
Author-Name: Mahdi Teimouri
Author-X-Name-First: Mahdi
Author-X-Name-Last: Teimouri
Title: EM algorithm for mixture of skew-normal distributions fitted to grouped data
Abstract:
Grouped data are frequently used in several fields of study. In this work, we use the expectation-maximization (EM) algorithm for fitting the skew-normal (SN) mixture model to grouped data. Implementing the EM algorithm requires computing one-dimensional integrals for each group or class. Our simulation study and real data analyses reveal that the EM algorithm not only always converges but also can be implemented in just a few seconds even when the number of components is large, contrary to the Bayesian paradigm, which is computationally expensive. The accuracy of the EM algorithm and the superiority of the SN mixture model over the traditional normal mixture model in modelling grouped data are demonstrated through the simulation and three real data illustrations. For implementing the EM algorithm, we use the ForestFit package developed for the R environment, available at https://cran.r-project.org/web/packages/ForestFit/index.html.
Journal: Journal of Applied Statistics
Pages: 1154-1179
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1759032
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1759032
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1154-1179
Template-Type: ReDIF-Article 1.0
Author-Name: Ingo Klein
Author-X-Name-First: Ingo
Author-X-Name-Last: Klein
Author-Name: Monika Doll
Author-X-Name-First: Monika
Author-X-Name-Last: Doll
Title: Tests on asymmetry for ordered categorical variables
Abstract:
Skewness is a well-established statistical concept for continuous and, to a lesser extent, for discrete quantitative statistical variables. However, for ordered categorical variables, limited literature concerning skewness exists, although this type of variable is common in the behavioral, educational, and social sciences. Suitable measures of skewness for ordered categorical variables have to be invariant with respect to the group of strictly increasing, continuous transformations. Therefore, they have to depend on the corresponding maximal-invariants. Based on these maximal-invariants, we propose a new class of skewness functionals, show that members of this class preserve a suitable ordering of skewness, and derive the asymptotic distribution of the corresponding skewness statistic. Finally, we show the good power behavior of the corresponding skewness tests and illustrate these tests by applying them to real data examples.
Journal: Journal of Applied Statistics
Pages: 1180-1198
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1757045
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1757045
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1180-1198
Template-Type: ReDIF-Article 1.0
Author-Name: T. Šimková
Author-X-Name-First: T.
Author-X-Name-Last: Šimková
Title: Confidence intervals based on L-moments for quantiles of the GP and GEV distributions with application to market-opening asset prices data
Abstract:
In a ground-breaking paper published in 1990 in the Journal of the Royal Statistical Society, J.R.M. Hosking defined the L-moment of a random variable as an expectation of certain linear combinations of order statistics. L-moments are an alternative to conventional moments and have recently been used often in inferential statistics. L-moments have several advantages over conventional moments, including robustness to the presence of outliers, which may in some cases lead to more accurate estimates of the characteristics of distributions. In this contribution, asymptotic theory and L-moments are used to derive confidence intervals for the population parameters and quantiles of the three-parameter generalized Pareto and extreme-value distributions. Computer simulations are performed to determine the performance of confidence intervals for the population quantiles based on L-moments and to compare them to those obtained by traditional estimation techniques. The results show that the L-moment-based intervals perform well, or even best, in comparison to the method-of-moments and maximum likelihood methods when interest is in higher quantiles. L-moments are especially recommended when the tail of the distribution is rather heavy and the sample size is small. The derived intervals are applied to real economic data, specifically to market-opening asset prices.
Journal: Journal of Applied Statistics
Pages: 1199-1226
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1757046
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1757046
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1199-1226
Template-Type: ReDIF-Article 1.0
Author-Name: Harsh Tripathi
Author-X-Name-First: Harsh
Author-X-Name-Last: Tripathi
Author-Name: Sanku Dey
Author-X-Name-First: Sanku
Author-X-Name-Last: Dey
Author-Name: Mahendra Saha
Author-X-Name-First: Mahendra
Author-X-Name-Last: Saha
Title: Double and group acceptance sampling plan for truncated life test based on inverse log-logistic distribution
Abstract:
This paper introduces double and group acceptance sampling plans based on time-truncated lifetimes when the lifetime of an item follows the inverse log-logistic (ILL) distribution with known shape parameter. The operating characteristic function and average sample number (ASN) values of the double acceptance sampling inspection plan are provided. The values of the minimum number of groups and the operating characteristic function for various quality levels are obtained for a group acceptance sampling inspection plan. A comparative study between the single and double acceptance sampling inspection plans is carried out in terms of sample size. One simulated example and four real-life examples are discussed to show the applicability of the proposed double and group acceptance sampling inspection plans for ILL-distributed quality parameters.
Journal: Journal of Applied Statistics
Pages: 1227-1242
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1759031
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1759031
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1227-1242
Template-Type: ReDIF-Article 1.0
Author-Name: Sandile Charles Shongwe
Author-X-Name-First: Sandile Charles
Author-X-Name-Last: Shongwe
Author-Name: Jean-Claude Malela-Majika
Author-X-Name-First: Jean-Claude
Author-X-Name-Last: Malela-Majika
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Title: A combined mixed-s-skip sampling strategy to reduce the effect of autocorrelation on the X̄ scheme with and without measurement errors
Abstract:
In order to reduce the effect of autocorrelation on the $\bar{X}$ monitoring scheme, a new sampling strategy is proposed to form rational subgroup samples of size n. It requires sampling to be done such that: (i) observations from two consecutive samples are merged, and (ii) some consecutive observations are skipped before sampling. This technique, a generalized version of the mixed samples strategy, is shown to yield a better reduction of the negative effect of autocorrelation when monitoring the mean of processes with and without measurement errors. For processes subject to a combined effect of autocorrelation and measurement errors, the proposed sampling technique, together with the multiple measurement strategy, yields a uniformly better zero-state run-length performance than its two main existing competitors for any autocorrelation level. However, in steady-state mode, it yields the best performance only when the monitored process is subject to a high level of autocorrelation, for any given level of measurement errors. A real-life example is used to illustrate the implementation of the proposed sampling strategy.
Journal: Journal of Applied Statistics
Pages: 1243-1268
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1759033
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1759033
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1243-1268
Template-Type: ReDIF-Article 1.0
Author-Name: Stefano Nasini
Author-X-Name-First: Stefano
Author-X-Name-Last: Nasini
Author-Name: Victor Martínez-de-Albéniz
Author-X-Name-First: Victor
Author-X-Name-Last: Martínez-de-Albéniz
Title: Pairwise influences in dynamic choice: network-based model and application
Abstract:
In this paper, we study the problem of network discovery and influence propagation, and propose an integrated approach for the analysis of lead-lag synchronization in multiple choices. Network models for the processes by which decisions propagate through social interaction have been studied before, but only a few consider unknown structures of interacting agents. In fact, while individual choices are typically observed, inferring individual influences – who influences whom – from sequences of dynamic choices requires strong modeling assumptions on the cross-sectional dependencies of the observed panels. We propose a class of parametric models which extends the vector autoregression to the case of pairwise influences between individual choices over multiple items and supports the analysis of influence propagation. After uncovering a collection of theoretical properties (conditional moments, parameter sensitivity, identifiability and estimation), we provide an economic application to music broadcasting, where a set of songs is diffused over radio stations; we infer station-to-station influences based on the proposed methodology and assess the propagation effect of initial launching stations to maximize song diffusion. Both on the theoretical and empirical sides, the proposed approach connects fields which are traditionally treated as separate areas: the problem of network discovery and that of influence propagation.
Journal: Journal of Applied Statistics
Pages: 1269-1302
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1761948
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1761948
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1269-1302
Template-Type: ReDIF-Article 1.0
Author-Name: Pejman Bordbar
Author-X-Name-First: Pejman
Author-X-Name-Last: Bordbar
Author-Name: Sodeif Ahadpour
Author-X-Name-First: Sodeif
Author-X-Name-Last: Ahadpour
Title: Type-I intermittency from Markov binary block visibility graph perspective
Abstract:
In this work, type-I intermittency is studied from the perspective of optimized Markov binary visibility graphs. We consider a local Poincaré map, such as the logistic map, that is a simple model exhibiting this type of intermittency. Taking the acceptance gate as $G \ll 0.01$, we show that the transition between laminar and non-laminar zones in type-I intermittency passes through distinct phases and regions. According to their behavioral characteristics, we call them the pure, switching, threshold, trapping, and transforming phases for the laminar zone, and the initial, terminal reinjection, and chaotic burst regions for the non-laminar zone. We investigate their properties using statistical tools such as the maximum and mean length of the laminar zone, as well as length distributions of the laminar zone. For further investigation, we study the degree distribution of the complex network generated by the type-I intermittency time series and, finally, predict various behaviors of the phases and regions through the proposed theoretical degree distributions.
Journal: Journal of Applied Statistics
Pages: 1303-1318
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1761949
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1761949
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1303-1318
Template-Type: ReDIF-Article 1.0
Author-Name: Vasileios Alevizakos
Author-X-Name-First: Vasileios
Author-X-Name-Last: Alevizakos
Author-Name: Christos Koukouvinos
Author-X-Name-First: Christos
Author-X-Name-Last: Koukouvinos
Title: Monitoring of zero-inflated binomial processes with a DEWMA control chart
Abstract:
Control charts are widely used for monitoring quality characteristics of high-yield processes. In such processes where a large number of zero observations exists in count data, the zero-inflated binomial (ZIB) models are more appropriate than the ordinary binomial models. In ZIB models, random shocks occur with probability θ, and upon the occurrence of random shocks, the number of non-conforming items in a sample of size n follows the binomial distribution with proportion p. In the present article, we study in more detail the exponentially weighted moving average control chart based on ZIB distribution (ZIB-EWMA) and we also propose a new control chart based on the double exponentially weighted moving average statistic for monitoring ZIB data (ZIB-DEWMA). The two control charts are studied in detecting upward shifts in θ or p individually, as well as in both parameters simultaneously. Through a simulation study, we compare the performance of the proposed chart with the ZIB-Shewhart, ZIB-EWMA and ZIB-CUSUM charts. Finally, an illustrative example is also presented to display the practical application of the ZIB charts.
Journal: Journal of Applied Statistics
Pages: 1319-1338
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1761950
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1761950
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1319-1338
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel K. Sewell
Author-X-Name-First: Daniel K.
Author-X-Name-Last: Sewell
Author-Name: Journey Penney
Author-X-Name-First: Journey
Author-X-Name-Last: Penney
Author-Name: Melissa Jay
Author-X-Name-First: Melissa
Author-X-Name-Last: Jay
Author-Name: Ying Zhang
Author-X-Name-First: Ying
Author-X-Name-Last: Zhang
Author-Name: Jane S. Paulsen
Author-X-Name-First: Jane S.
Author-X-Name-Last: Paulsen
Title: Predicting an optimal composite outcome variable for Huntington's disease clinical trials
Abstract:
While there is no known cure for Huntington's disease (HD), there are early-phase clinical trials aimed at altering disease progression patterns. There is, however, no obvious single outcome for these trials to evaluate treatment efficacy. Currently used outcomes are, while reasonable, not optimal in any sense. In this paper we derive a method for constructing a composite variable via a linear combination of clinical measures. Our composite variable optimizes the signal-to-noise ratio (SNR) within the context of a longitudinal study design. We also demonstrate how to induce sparsity using a soft approximation of an $L_1$ penalty on the coefficients of the composite variable. We applied our method to data from the TRACK-HD study, a longitudinal study aimed at establishing good outcome measures for HD, and found that, compared to the existing composite measurement, our composite variable provides a larger SNR and allows clinical trials with smaller sample sizes to achieve equivalent power.
Journal: Journal of Applied Statistics
Pages: 1339-1348
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1759034
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1759034
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:1339-1348
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Correction
Journal: Journal of Applied Statistics
Pages: i-i
Issue: 7
Volume: 48
Year: 2021
Month: 5
X-DOI: 10.1080/02664763.2020.1807783
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1807783
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:7:p:i-i
Template-Type: ReDIF-Article 1.0
Author-Name: Bojuan Barbara Zhao
Author-X-Name-First: Bojuan Barbara
Author-X-Name-Last: Zhao
Author-Name: Ruijuan Su
Author-X-Name-First: Ruijuan
Author-X-Name-Last: Su
Title: Determinants of the heavily right-tailed residential housing price in Tianjin
Abstract:
The housing price in Tianjin, one of the typical monocentric cities of China, exhibits a heavily right-tailed distribution even after the logarithm transformation of the price, which might lead to a biased estimation of the parameters under normal distribution assumption. Therefore, the extended Cox proportional hazards regression model and the generalized concept of relative risk are used to identify factors associated with the housing price. The analysis shows that the implementation dates of the macro regulation policies were related to the price changing trends. Qualities of public elementary and secondary schools were significantly associated with the housing price, and the associations between the structure and neighborhood characteristics and the housing price were influenced by the distance of the residential property to downtown Tianjin.
Journal: Journal of Applied Statistics
Pages: 1457-1474
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1840534
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1840534
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1457-1474
Template-Type: ReDIF-Article 1.0
Author-Name: Yang Liu
Author-X-Name-First: Yang
Author-X-Name-Last: Liu
Author-Name: JunJia Zhu
Author-X-Name-First: JunJia
Author-X-Name-Last: Zhu
Author-Name: Dennis K. J. Lin
Author-X-Name-First: Dennis K. J.
Author-X-Name-Last: Lin
Title: A generalized likelihood ratio test for monitoring profile data
Abstract:
Profile data emerge when the quality of a product or process is characterized by a functional relationship among (input and output) variables. In this paper, we focus on the case where each profile has one response variable Y and one explanatory variable x, and the functional relationship between these two variables can be rather arbitrary; the basic concept, however, can be applied much more widely. We propose a general method based on the Generalized Likelihood Ratio Test (GLRT) for monitoring profile data. The proposed method uses nonparametric regression to estimate the on-line profiles and thus does not require any functional form for the profiles. Both Shewhart-type and EWMA-type control charts are considered. The average run length (ARL) performance of the proposed method is studied. It is shown that the proposed GLRT-based control chart can efficiently detect both location and dispersion shifts of the on-line profiles from the baseline profile. An upper control limit (UCL) corresponding to a desired in-control ARL value is constructed.
Journal: Journal of Applied Statistics
Pages: 1402-1415
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2021.1880555
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1880555
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1402-1415
Template-Type: ReDIF-Article 1.0
Author-Name: Antai Wang
Author-X-Name-First: Antai
Author-X-Name-Last: Wang
Author-Name: Xieyang Jia
Author-X-Name-First: Xieyang
Author-X-Name-Last: Jia
Author-Name: Zhezhen Jin
Author-X-Name-First: Zhezhen
Author-X-Name-Last: Jin
Title: Estimation of the cumulative baseline hazard function for dependently right-censored failure time data
Abstract:
In this paper, we study the properties of a special class of frailty models when the frailty is common to several failure times. The models are closely linked to Archimedean copula models. We establish a useful formula for cumulative baseline hazard functions and develop a new estimator for cumulative baseline hazard functions in bivariate frailty regression models. Based on our proposed estimator, we present a graphical model checking procedure. We fit a leukemia data set using our model and end our paper with some discussions.
Journal: Journal of Applied Statistics
Pages: 1416-1428
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1795818
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795818
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1416-1428
Template-Type: ReDIF-Article 1.0
Author-Name: Liang Zhu
Author-X-Name-First: Liang
Author-X-Name-Last: Zhu
Author-Name: Xingwei Tong
Author-X-Name-First: Xingwei
Author-X-Name-Last: Tong
Author-Name: Dingjiao Cai
Author-X-Name-First: Dingjiao
Author-X-Name-Last: Cai
Author-Name: Yimei Li
Author-X-Name-First: Yimei
Author-X-Name-Last: Li
Author-Name: Ryan Sun
Author-X-Name-First: Ryan
Author-X-Name-Last: Sun
Author-Name: Deo K. Srivastava
Author-X-Name-First: Deo K.
Author-X-Name-Last: Srivastava
Author-Name: Melissa M. Hudson
Author-X-Name-First: Melissa M.
Author-X-Name-Last: Hudson
Title: Maximum likelihood estimation for the proportional odds model with mixed interval-censored failure time data
Abstract:
This article discusses regression analysis of mixed interval-censored failure time data. Such data frequently occur across a variety of settings, including clinical trials, epidemiologic investigations, and many other biomedical studies with a follow-up component. For example, mixed failure times are commonly found in the two largest studies of long-term survivorship after childhood cancer, the datasets that motivated this work. However, most existing methods for failure time data consider only right-censored or only interval-censored failure times, not the more general case where times may be mixed. Additionally, among regression models developed for mixed interval-censored failure times, the proportional hazards formulation is generally assumed. It is well-known that the proportional hazards model may be inappropriate in certain situations, and alternatives are needed to analyze mixed failure time data in such cases. To fill this need, we develop a maximum likelihood estimation procedure for the proportional odds regression model with mixed interval-censored data. We show that the resulting estimators are consistent and asymptotically Gaussian. An extensive simulation study is performed to assess the finite-sample properties of the method, and this investigation indicates that the proposed method works well for many practical situations. We then apply our approach to examine the impact of age at cranial radiation therapy on risk of growth hormone deficiency in long-term survivors of childhood cancer.
Journal: Journal of Applied Statistics
Pages: 1496-1512
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1789077
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1789077
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1496-1512
Template-Type: ReDIF-Article 1.0
Author-Name: Myeonggyun Lee
Author-X-Name-First: Myeonggyun
Author-X-Name-Last: Lee
Author-Name: Anne Zeleniuch-Jacquotte
Author-X-Name-First: Anne
Author-X-Name-Last: Zeleniuch-Jacquotte
Author-Name: Mengling Liu
Author-X-Name-First: Mengling
Author-X-Name-Last: Liu
Title: Empirical evaluation of sub-cohort sampling designs for risk prediction modeling
Abstract:
Sub-cohort sampling designs, such as nested case-control (NCC) and case-cohort (CC) studies, have been widely used to estimate biomarker-disease associations because of their cost effectiveness. These designs have been well studied and shown to maintain relatively high efficiency compared to full-cohort designs, but their performance of building risk prediction models has been less studied. Moreover, sub-cohort sampling designs often use matching (or stratifying) to further control for confounders or to reduce measurement error. Their predictive performance depends on both the design and matching procedures. Based on a dataset from the NYU Women's Health Study (NYUWHS), we performed Monte Carlo simulations to systematically evaluate risk prediction performance under NCC, CC, and full-cohort studies. Our simulations demonstrate that sub-cohort sampling designs can have predictive accuracy (i.e. discrimination and calibration) similar to that of the full-cohort design, but could be sensitive to the matching procedure used. Our results suggest that researchers can have the option of performing NCC and CC studies with huge potential benefits in cost and resources, but need to pay particular attention to the matching procedure when developing a risk prediction model in biomarker studies.
Journal: Journal of Applied Statistics
Pages: 1374-1401
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1861225
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1861225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1374-1401
Template-Type: ReDIF-Article 1.0
Author-Name: Zhezhen Jin
Author-X-Name-First: Zhezhen
Author-X-Name-Last: Jin
Author-Name: Jianguo Sun
Author-X-Name-First: Jianguo
Author-X-Name-Last: Sun
Title: Editorial to special issue Frontiers of Data Analysis
Journal: Journal of Applied Statistics
Pages: 1349-1351
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2021.1922853
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1922853
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1349-1351
Template-Type: ReDIF-Article 1.0
Author-Name: Chunjie Wang
Author-X-Name-First: Chunjie
Author-X-Name-Last: Wang
Author-Name: Jingjing Jiang
Author-X-Name-First: Jingjing
Author-X-Name-Last: Jiang
Author-Name: Linlin Luo
Author-X-Name-First: Linlin
Author-X-Name-Last: Luo
Author-Name: Shuying Wang
Author-X-Name-First: Shuying
Author-X-Name-Last: Wang
Title: Bayesian analysis of the Box-Cox transformation model based on left-truncated and right-censored data
Abstract:
In this paper, we discuss the inference problem for the Box-Cox transformation model when one faces left-truncated and right-censored data, which often occur in studies involving, for example, the cross-sectional sampling scheme. It is well-known that the Box-Cox transformation model includes many commonly used models as special cases, such as the proportional hazards model and the additive hazards model. For inference, a Bayesian estimation approach is proposed in which a piecewise function is used to approximate the baseline hazard function. Also, the conditional marginal prior, whose marginal part is free of any constraints, is employed to deal with many computational challenges caused by the constraints on the parameters, and an MCMC sampling procedure is developed. A simulation study is conducted to assess the finite sample performance of the proposed method and indicates that it works well for practical situations. We apply the approach to a set of data arising from a retirement center.
Journal: Journal of Applied Statistics
Pages: 1429-1441
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1784854
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1784854
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1429-1441
Template-Type: ReDIF-Article 1.0
Author-Name: Xinwei He
Author-X-Name-First: Xinwei
Author-X-Name-Last: He
Author-Name: Xiaoqiang Sun
Author-X-Name-First: Xiaoqiang
Author-X-Name-Last: Sun
Author-Name: Yongzhao Shao
Author-X-Name-First: Yongzhao
Author-X-Name-Last: Shao
Title: Network-based survival analysis to discover target genes for developing cancer immunotherapies and predicting patient survival
Abstract:
Recently, cancer immunotherapies have been life-savers; however, only a fraction of treated patients have durable responses. Consequently, statistical methods that enable the discovery of target genes for developing new treatments and predicting patient survival are of importance. This paper introduces a network-based survival analysis method and applies it to identify candidate genes as possible targets for developing new treatments. RNA-seq data from a mouse study were used to select differentially expressed genes, which were then translated to their human counterparts. We constructed a gene network and identified gene clusters using a training set of 310 human gliomas. We then conducted gene set enrichment analysis to select the gene clusters with significant biological function. A penalized Cox model was built to identify a small set of candidate genes to predict survival. An independent set of 690 human glioma samples was used to evaluate the predictive accuracy of the survival model. The areas under time-dependent ROC curves in both the training and validation sets are more than 90%, indicating a strong association between the selected genes and patient survival. Consequently, potential biomedical interventions targeting these genes might be able to alter their expression and prolong patient survival.
Journal: Journal of Applied Statistics
Pages: 1352-1373
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1812543
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1812543
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1352-1373
Template-Type: ReDIF-Article 1.0
Author-Name: Jaehong Yu
Author-X-Name-First: Jaehong
Author-X-Name-Last: Yu
Author-Name: Hua Zhong
Author-X-Name-First: Hua
Author-X-Name-Last: Zhong
Title: Time varying mixed effects model with fused lasso regularization
Abstract:
The associations between covariates and the outcomes often vary over time, regardless of whether the covariate is time-varying or time-invariant. For example, we hypothesize that the impact of chronic diseases, such as diabetes and heart disease, on people’s physical functions differ with aging. However, the age-varying effect would be missed if one models the covariate simply as a time-invariant covariate (yes/no) with a time-constant coefficient. We propose a fused lasso-based time-varying linear mixed effect (FTLME) model and an efficient two-stage parameter estimation algorithm to estimate the longitudinal trajectories of fixed-effect coefficients. Simulation studies are presented to demonstrate the efficacy of the method and its computational efficiency in estimating smooth time-varying effects in high dimensional settings. A real data example on the Health and Retirement Study (HRS) analysis is used to demonstrate the practical usage of our method to infer age-varying impact of chronic disease on older people’s physical functions.
Journal: Journal of Applied Statistics
Pages: 1513-1526
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1791805
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1791805
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1513-1526
Template-Type: ReDIF-Article 1.0
Author-Name: Yuna Zhao
Author-X-Name-First: Yuna
Author-X-Name-Last: Zhao
Author-Name: Dennis K. J. Lin
Author-X-Name-First: Dennis K. J.
Author-X-Name-Last: Lin
Author-Name: Min-Qian Liu
Author-X-Name-First: Min-Qian
Author-X-Name-Last: Liu
Title: Designs for order-of-addition experiments
Abstract:
The order-of-addition experiment aims at determining the optimal order of adding components such that the response of interest is optimized. Order of addition is widely involved in many areas, including biochemistry, food science, nutritional science, pharmaceutical science, etc. However, the statistical literature on this important topic remains rather primitive. In this paper, a thorough study of pair-wise ordering designs for order of addition is provided. The recursive relation between two successive full pair-wise ordering designs is developed. Based on this recursive relation, the full pair-wise ordering design can be obtained without evaluating all the orders of components. The value of the D-efficiency for the full pair-wise ordering model is then derived. It provides a benchmark for choosing fractional pair-wise ordering designs. To overcome the unaffordability of the full pair-wise ordering design, a new class of minimal-point pair-wise ordering designs is proposed. A job scheduling problem as well as simulation studies are used to illustrate the performance of the pair-wise ordering designs in determining the optimal orders. It is shown that the proposed designs are very efficient in determining the optimal order of addition.
Journal: Journal of Applied Statistics
Pages: 1475-1495
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1801607
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1801607
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1475-1495
Template-Type: ReDIF-Article 1.0
Author-Name: Zhengyu Yang
Author-X-Name-First: Zhengyu
Author-X-Name-Last: Yang
Author-Name: Guo-Liang Tian
Author-X-Name-First: Guo-Liang
Author-X-Name-Last: Tian
Author-Name: Xiaobin Liu
Author-X-Name-First: Xiaobin
Author-X-Name-Last: Liu
Author-Name: Chang-Xing Ma
Author-X-Name-First: Chang-Xing
Author-X-Name-Last: Ma
Title: Simultaneous confidence interval construction for many-to-one comparisons of proportion differences based on correlated paired data
Abstract:
In some medical research fields, such as ophthalmological, orthopaedic and otolaryngologic studies, it is often of interest to compare multiple groups with a control using data collected from paired organs of patients. The major difficulty in the data analysis is to adjust for both the multiplicity of comparing multiple groups and the correlation within the same patient's paired organs. In this article, we construct asymptotic simultaneous confidence intervals (SCIs) for many-to-one comparisons of proportion differences, adjusting for multiplicity and the correlation. The coverage probabilities and widths of the proposed SCIs are evaluated by Monte Carlo simulation studies. The methods are illustrated by a real data example.
Journal: Journal of Applied Statistics
Pages: 1442-1456
Issue: 8
Volume: 48
Year: 2021
Month: 06
X-DOI: 10.1080/02664763.2020.1795815
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795815
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:8:p:1442-1456
Template-Type: ReDIF-Article 1.0
Author-Name: Yunfei Ye
Author-X-Name-First: Yunfei
Author-X-Name-Last: Ye
Author-Name: Dong Han
Author-X-Name-First: Dong
Author-X-Name-Last: Han
Title: An optimal control chart for finite matrix sequences at some unknown change point
Abstract:
We present a new measure for evaluating the performance of control charts in detecting abrupt changes in finite matrix sequences. The objective is to minimize the probability that the control chart fails to raise the alarm at an unknown change point for a given in-control average run length. We construct a control chart with dynamic control limits under different pre- and post-change distributions and prove its optimality. We validate the optimality of the proposed chart through exhaustive experiments on both simulated and real-world data.
Journal: Journal of Applied Statistics
Pages: 1628-1643
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1772208
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1772208
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1628-1643
Template-Type: ReDIF-Article 1.0
Author-Name: Maurizio Carpita
Author-X-Name-First: Maurizio
Author-X-Name-Last: Carpita
Author-Name: Silvia Golia
Author-X-Name-First: Silvia
Author-X-Name-Last: Golia
Title: Discovering associations between players' performance indicators and matches' results in the European Soccer Leagues
Abstract:
The application of data mining techniques and statistical analysis to the sports field has received increasing attention in the last decade. One of the most famous sports in the world is soccer, and the present work deals with it, using data from the 2009/2010 season to the 2015/2016 season from nine European leagues extracted from the Kaggle European Soccer database. Overall performance indicators of the four roles in a soccer team (forward, midfielder, defender and goalkeeper) for home and away teams are used to investigate the relationships between them and the results of matches, and to predict the wins of the home team. The model used to answer both these demands is the Bayesian Network. This study shows that this model can be very useful for mining the relations between players' performance indicators and for improving knowledge of the game strategies applied by coaches in different leagues. Moreover, it is shown that the ability to predict match results of the proposed Bayesian Network is roughly the same as that of the Naive Bayes model.
Journal: Journal of Applied Statistics
Pages: 1696-1711
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1772210
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1772210
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1696-1711
Template-Type: ReDIF-Article 1.0
Author-Name: Camillo Cammarota
Author-X-Name-First: Camillo
Author-X-Name-Last: Cammarota
Author-Name: Alessandro Pinto
Author-X-Name-First: Alessandro
Author-X-Name-Last: Pinto
Title: Variable selection and importance in presence of high collinearity: an application to the prediction of lean body mass from multi-frequency bioelectrical impedance
Abstract:
In prediction problems both response and covariates may have high correlation with a second group of influential regressors, that can be considered as background variables. An important challenge is to perform variable selection and importance assessment among the covariates in the presence of these variables. A clinical example is the prediction of the lean body mass (response) from bioimpedance (covariates), where anthropometric measures play the role of background variables. We introduce a reduced dataset in which the variables are defined as the residuals with respect to the background, and perform variable selection and importance assessment both in linear and random forest models. Using a clinical dataset of multi-frequency bioimpedance, we show the effectiveness of this method to select the most relevant predictors of the lean body mass beyond anthropometry.
Journal: Journal of Applied Statistics
Pages: 1644-1658
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1763930
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1763930
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1644-1658
Template-Type: ReDIF-Article 1.0
Author-Name: M. I. Alheety
Author-X-Name-First: M. I.
Author-X-Name-Last: Alheety
Author-Name: Kristofer Månsson
Author-X-Name-First: Kristofer
Author-X-Name-Last: Månsson
Author-Name: B. M. Golam Kibria
Author-X-Name-First: B. M.
Author-X-Name-Last: Golam Kibria
Title: A new kind of stochastic restricted biased estimator for logistic regression model
Abstract:
In the logistic regression model, the variance of the maximum likelihood estimator is inflated and unstable when multicollinearity exists in the data. Several methods are available in the literature to overcome this problem. We propose a new stochastic restricted biased estimator. We study the statistical properties of the proposed estimator and compare its performance with some existing estimators in terms of the scalar mean squared error criterion. An example and a simulation study are provided to illustrate the performance of the proposed estimator.
Journal: Journal of Applied Statistics
Pages: 1559-1578
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1769576
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1769576
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1559-1578
Template-Type: ReDIF-Article 1.0
Author-Name: Kai-Sheng Song
Author-X-Name-First: Kai-Sheng
Author-X-Name-Last: Song
Title: Simultaneous statistical modelling of excess zeros, over/underdispersion, and multimodality with applications in hotel industry
Abstract:
We propose zero-inflated statistical models based on the generalized Hermite distribution for simultaneous modelling of excess zeros, over/underdispersion, and multimodality. These new models are parsimonious yet remarkably flexible, allowing the covariates to be introduced directly through the mean, dispersion, and zero-inflated parameters. To accommodate the interval inequality constraint for the dispersion parameter, we present a new link function for the covariate-dependent dispersion regression model. We derive score tests for zero inflation in both covariate-free and covariate-dependent models. Both the score test and the likelihood-ratio test are conducted to examine the validity of zero inflation. The score test provides a useful tool when computing the likelihood-ratio statistic proves to be difficult. We analyse several hotel booking cancellation datasets extracted from two recently published real datasets from a resort hotel and a city hotel. These extracted cancellation datasets simultaneously exhibit excess zeros, over/underdispersion, and multimodality, making them difficult to analyse with existing approaches. The application of the proposed methods to the cancellation datasets illustrates the usefulness and flexibility of the models.
Journal: Journal of Applied Statistics
Pages: 1603-1627
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1769577
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1769577
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1603-1627
Template-Type: ReDIF-Article 1.0
Author-Name: Anne Helby Petersen
Author-X-Name-First: Anne Helby
Author-X-Name-Last: Petersen
Author-Name: Bo Markussen
Author-X-Name-First: Bo
Author-X-Name-Last: Markussen
Author-Name: Karl Bang Christensen
Author-X-Name-First: Karl Bang
Author-X-Name-Last: Christensen
Title: Exploratory data structure comparisons: three new visual tools based on principal component analysis
Abstract:
Datasets are sometimes divided into distinct subsets, e.g. due to multi-center sampling, or to variations in instruments, questionnaire item ordering or mode of administration, and the data analyst then needs to assess whether a joint analysis is meaningful. The Principal Component Analysis-based Data Structure Comparisons (PCADSC) tools are three new non-parametric, visual diagnostic tools for investigating differences in structure for two subsets of a dataset through covariance matrix comparisons by use of principal component analysis. The PCADSC tools are demonstrated in a data example using European Social Survey data on psychological well-being in three countries: Denmark, Sweden, and Bulgaria. The data structures are found to be different in Denmark and Bulgaria, and thus a comparison of, for example, mean psychological well-being scores is not meaningful. However, when comparing Denmark and Sweden, very similar data structures, and thus comparable concepts of well-being, are found. Therefore, inter-country comparisons are warranted for these countries.
Journal: Journal of Applied Statistics
Pages: 1675-1695
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1773772
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1773772
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1675-1695
Template-Type: ReDIF-Article 1.0
Author-Name: Satya Prakash Singh
Author-X-Name-First: Satya Prakash
Author-X-Name-Last: Singh
Author-Name: Pradeep Yadav
Author-X-Name-First: Pradeep
Author-X-Name-Last: Yadav
Title: Optimal allocation of subjects in a matched pair cluster-randomized trial with fixed number of heterogeneous clusters
Abstract:
In cluster-randomized trials, investigators randomize clusters of individuals such as households, medical practices, schools or classrooms, even though the units of interest are the individuals. This results in a loss of efficiency, both in the estimation of the unknown parameters and in the power of the test for the treatment effects. To recoup this efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, the clusters within a treatment arm may still be heterogeneous. In this article, we propose a locally optimal design that accounts for cluster heterogeneity and optimally allocates the subjects within each cluster. To address the dependence of the design on the unknown parameters, we also discuss Bayesian optimal designs. The performance of the proposed designs is investigated numerically through some data examples.
Journal: Journal of Applied Statistics
Pages: 1527-1540
Issue: 9
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1779195
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1779195
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1527-1540
Template-Type: ReDIF-Article 1.0
Author-Name: Shin-Fu Tsai
Author-X-Name-First: Shin-Fu
Author-X-Name-Last: Tsai
Author-Name: Tse-Le Huang
Author-X-Name-First: Tse-Le
Author-X-Name-Last: Huang
Title: Confidence limits for conformance proportions in normal mixture models
Abstract:
Conformance proportions are important numerical indices for quality assessments. When the population is characterized by a normal mixture model, estimating conformance proportions can be a practical issue. To account for the inherent structure of normal mixture models, universal and individual conformance proportions are first defined for the purpose of evaluating the overall population and specific subpopulations of interest, respectively. On the basis of generalized fiducial quantities, a systematic method is then proposed in this paper to obtain confidence limits for the two classes of conformance proportions. The simulation results demonstrate that the proposed method can maintain the empirical coverage rate sufficiently close to the nominal level. In addition, two examples are given to illustrate the proposed method.
Journal: Journal of Applied Statistics
Pages: 1579-1602
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1769578
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1769578
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1579-1602
Template-Type: ReDIF-Article 1.0
Author-Name: J. P. Burgard
Author-X-Name-First: J. P.
Author-X-Name-Last: Burgard
Author-Name: J. Krause
Author-X-Name-First: J.
Author-X-Name-Last: Krause
Author-Name: R. Münnich
Author-X-Name-First: R.
Author-X-Name-Last: Münnich
Title: An elastic net penalized small area model combining unit- and area-level data for regional hypertension prevalence estimation
Abstract:
Hypertension is a highly prevalent cardiovascular disease. It marks a considerable cost factor to many national health systems. Despite its prevalence, regional disease distributions are often unknown and must be estimated from survey data. However, health surveys frequently lack regional observations due to limited resources. The resulting prevalence estimates suffer from unacceptably large sampling variances and are not reliable. Small area estimation solves this problem by linking auxiliary data from multiple regions in suitable regression models. Typically, either unit- or area-level observations are considered for this purpose. But with respect to hypertension, both levels should be used. Hypertension has characteristic comorbidities and is strongly related to lifestyle features, which are unit-level information. It is also correlated with socioeconomic indicators that are usually measured on the area level. But the level combination is challenging, as it requires multi-level model parameter estimation from small samples. We use a multi-level small area model with level-specific penalization to overcome this issue. Model parameter estimation is performed via stochastic coordinate gradient descent. A jackknife estimator of the mean squared error is presented. The methodology is applied to combine health survey data and administrative records to estimate regional hypertension prevalence in Germany.
Journal: Journal of Applied Statistics
Pages: 1659-1674
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1765323
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1765323
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1659-1674
Template-Type: ReDIF-Article 1.0
Author-Name: Ngan Hoang-Nguyen-Thuy
Author-X-Name-First: Ngan
Author-X-Name-Last: Hoang-Nguyen-Thuy
Author-Name: K. Krishnamoorthy
Author-X-Name-First: K.
Author-X-Name-Last: Krishnamoorthy
Title: Estimation of the probability content in a specified interval using fiducial approach
Abstract:
Statistical methods for constructing confidence intervals for the probability content in a specified interval are proposed. Exact and approximate solutions based on the fiducial approach are described for the case where the measurements on the variable of interest can be modelled by a location-scale (or log-location-scale) distribution. Methods are described for the normal, Weibull, two-parameter exponential and two-parameter Rayleigh distributions. For each case, the solutions are evaluated for their merits. Three examples, in which it is desired to estimate the percentages of engineering products that meet the specification limits, are provided to illustrate the methods.
Journal: Journal of Applied Statistics
Pages: 1541-1558
Issue: 9
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1768228
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1768228
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:9:p:1541-1558
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas J. Zhou
Author-X-Name-First: Thomas J.
Author-X-Name-Last: Zhou
Author-Name: Sughra Raza
Author-X-Name-First: Sughra
Author-X-Name-Last: Raza
Author-Name: Kerrie P. Nelson
Author-X-Name-First: Kerrie P.
Author-X-Name-Last: Nelson
Title: Methods of assessing categorical agreement between correlated screening tests in clinical studies
Abstract:
Advances in breast imaging and other screening tests have prompted studies to evaluate and compare the consistency between experts' ratings of existing and new screening tests. In clinical settings, medical experts make subjective assessments of screening test results such as mammograms. Consistency between experts' ratings is evaluated by measures of inter-rater agreement or association. However, conventional measures, such as Cohen's and Fleiss' kappas, cannot be applied or may perform poorly when studies involve many experts, unbalanced data, or dependencies between experts' ratings. Here we assess the performance of existing approaches, including recently developed summary measures, for assessing the agreement between experts' binary and ordinal ratings when patients undergo two screening procedures. Methods to assess consistency between repeated measurements by the same experts are also described. We present applications to three large-scale clinical screening studies. Properties of these agreement measures are illustrated via simulation studies. Generally, a model-based approach provides several advantages over alternative methods, including the ability to flexibly incorporate various measurement scales (i.e. binary or ordinal), large numbers of experts and patients, and sparse data, as well as robustness to the prevalence of underlying disease.
Journal: Journal of Applied Statistics
Pages: 1861-1881
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1777394
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1777394
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1861-1881
Template-Type: ReDIF-Article 1.0
Author-Name: Qazi J. Azhad
Author-X-Name-First: Qazi J.
Author-X-Name-Last: Azhad
Author-Name: Mohd. Arshad
Author-X-Name-First: Mohd.
Author-X-Name-Last: Arshad
Author-Name: Amit Kumar Misra
Author-X-Name-First: Amit Kumar
Author-X-Name-Last: Misra
Title: Estimation of common location parameter of several heterogeneous exponential populations based on generalized order statistics
Abstract:
In this article, several independent populations following exponential distributions with a common location parameter and unknown, unequal scale parameters are considered. From these populations, several independent samples of generalized order statistics (gos) are drawn. Under the setup of gos, the problem of estimating the common location parameter is discussed and various estimators are derived. The authors obtain the maximum likelihood estimator (MLE), a modified MLE and the uniformly minimum variance unbiased estimator of the common location parameter. Furthermore, under a scaled-squared error loss function, a general inadmissibility result for invariant estimators is established. The derived results are further specialized to upper record values, which are a special case of gos. Finally, a simulation study and a real-life example are reported to show the performance of the various competing estimators in terms of percentage risk improvement.
Journal: Journal of Applied Statistics
Pages: 1798-1815
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1777395
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1777395
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1798-1815
Template-Type: ReDIF-Article 1.0
Author-Name: Joseph Wu
Author-X-Name-First: Joseph
Author-X-Name-Last: Wu
Author-Name: Mayetri Gupta
Author-X-Name-First: Mayetri
Author-X-Name-Last: Gupta
Author-Name: Amira I. Hussein
Author-X-Name-First: Amira I.
Author-X-Name-Last: Hussein
Author-Name: Louis Gerstenfeld
Author-X-Name-First: Louis
Author-X-Name-Last: Gerstenfeld
Title: Bayesian modeling of factorial time-course data with applications to a bone aging gene expression study
Abstract:
Many scientific studies, especially in the biomedical sciences, generate data measured simultaneously over a multitude of units, over a period of time, and under different conditions or combinations of factors. Often, an important question of interest asked relates to which units behave similarly under different conditions, but measuring the variation over time complicates the analysis significantly. In this article we address such a problem arising from a gene expression study relating to bone aging, and develop a Bayesian statistical method that can simultaneously detect and uncover signals on three levels within such data: factorial, longitudinal, and transcriptional. Our model framework considers both cluster and time-point-specific parameters and these parameters uniquely determine the shapes of the temporal gene expression profiles, allowing the discovery and characterization of latent gene clusters based on similar underlying biological mechanisms. Our methodology was successfully applied to discover transcriptional networks in a microarray data set comparing the transcriptomic changes that occurred during bone aging in male and female mice expressing one or both copies of the bromodomain (Brd2) gene, a transcriptional regulator which exhibits an age-dependent sex-linked bone loss phenotype.
Journal: Journal of Applied Statistics
Pages: 1730-1754
Issue: 10
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1772733
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1772733
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1730-1754
Template-Type: ReDIF-Article 1.0
Author-Name: Richard Puurbalanta
Author-X-Name-First: Richard
Author-X-Name-Last: Puurbalanta
Title: A Clipped Gaussian Geo-Classification model for poverty mapping
Abstract:
The importance of discrete spatial models cannot be overemphasized, especially when measuring living standards. The battery of measurements is generally categorical with nearer geo-referenced observations featuring stronger dependencies. This study presents a Clipped Gaussian Geo-Classification (CGG-C) model for spatially-dependent ordered data, and compares its performance with existing methods to classify household poverty using Ghana living standards survey (GLSS 6) data. Bayesian inference was performed on data sampled by MCMC. Model evaluation was based on measures of classification and prediction accuracy. Spatial associations, given some household features, were quantified, and a poverty classification map for Ghana was developed. Overall, the results of estimation showed that many of the statistically significant covariates were generally strongly related with the ordered response variable. Households at specific locations tended to uniformly experience specific levels of poverty, thus, providing an empirical spatial character of poverty in Ghana. A comparative analysis of validation results showed that the CGG-C model (with 14.2% misclassification rate) outperformed the Cumulative Probit (CP) model with misclassification rate of 17.4%. This approach to poverty analysis is relevant for policy design and the implementation of cost-effective programmes to reduce category and site-specific poverty incidence, and monitor changes in both category and geographical trends thereof.
Journal: Journal of Applied Statistics
Pages: 1882-1895
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1779191
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1779191
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1882-1895
Template-Type: ReDIF-Article 1.0
Author-Name: H. Karamikabir
Author-X-Name-First: H.
Author-X-Name-Last: Karamikabir
Author-Name: M. Afshari
Author-X-Name-First: M.
Author-X-Name-Last: Afshari
Author-Name: F. Lak
Author-X-Name-First: F.
Author-X-Name-Last: Lak
Title: Wavelet threshold based on Stein's unbiased risk estimators of restricted location parameter in multivariate normal
Abstract:
In this paper, the problem of estimating the mean vector under non-negativity constraints on the location vector of the multivariate normal distribution is investigated. The value of the wavelet threshold based on Stein's unbiased risk estimator is calculated for the shrinkage estimator in the restricted parameter space. We suppose that the covariance matrix is unknown, and we find the dominant class of shrinkage estimators under the balanced loss function. The performance of the proposed class of estimators is evaluated through a simulation study using risk and average mean square error values.
Journal: Journal of Applied Statistics
Pages: 1712-1729
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1772209
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1772209
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1712-1729
Template-Type: ReDIF-Article 1.0
Author-Name: Soyoung Park
Author-X-Name-First: Soyoung
Author-X-Name-Last: Park
Author-Name: Alicia Carriquiry
Author-X-Name-First: Alicia
Author-X-Name-Last: Carriquiry
Title: Quantifying the similarity of 2D images using edge pixels: an application to the forensic comparison of footwear impressions
Abstract:
We propose a novel method to quantify the similarity between an impression (Q) from an unknown source and a test impression (K) from a known source. Using the property of geometrical congruence in the impressions, the degree of correspondence is quantified using ideas from graph theory and maximum clique (MC). The algorithm uses the x and y coordinates of the edges in the images as the data. We focus on local areas in Q and the corresponding regions in K and extract features for comparison. Using pairs of images with known origin, we train a random forest to classify pairs into mates and non-mates. We collected impressions from 60 pairs of shoes of the same brand and model, worn over six months. Using a different set of very similar shoes, we evaluated the performance of the algorithm in terms of the accuracy with which it correctly classified images into source classes. Using classification error rates and ROC curves, we compare the proposed method to other algorithms in the literature and show that for these data, our method shows good classification performance relative to other methods. The algorithm can be implemented with the R package shoeprintr.
Journal: Journal of Applied Statistics
Pages: 1833-1860
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1779194
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1779194
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1833-1860
Template-Type: ReDIF-Article 1.0
Author-Name: Lahiru Wickramasinghe
Author-X-Name-First: Lahiru
Author-X-Name-Last: Wickramasinghe
Author-Name: Alexandre Leblanc
Author-X-Name-First: Alexandre
Author-X-Name-Last: Leblanc
Author-Name: Saman Muthukumarana
Author-X-Name-First: Saman
Author-X-Name-Last: Muthukumarana
Title: Model-based estimation of baseball batting metrics
Abstract:
We introduce an approach to model the batting outcomes of baseball batters based on the weighted likelihood approach and make use of our methodology to estimate commonly used baseball batting metrics. The weighted likelihood allows the sharing of relevant information among players. Specifically, this allows the inference on each batter to make use of the batting data from all other players in the league and, in the process, allows for improved inference. MAMSE (Minimum Averaged Mean Squared Error) weights are used as the likelihood weights. For comparison, we implemented a semi-parametric Bayesian approach based on the Dirichlet process, which enables the borrowing of information across batters while providing a natural clustering mechanism. We demonstrate and compare these approaches using 2018 Major League Baseball (MLB) batters data.
Journal: Journal of Applied Statistics
Pages: 1775-1797
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1775792
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1775792
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1775-1797
Template-Type: ReDIF-Article 1.0
Author-Name: Jing Zhang
Author-X-Name-First: Jing
Author-X-Name-Last: Zhang
Author-Name: Yanyan Liu
Author-X-Name-First: Yanyan
Author-X-Name-Last: Liu
Title: Model-free slice screening for ultrahigh-dimensional survival data
Abstract:
For ultrahigh-dimensional data, independent feature screening has been demonstrated, both theoretically and empirically, to be an effective dimension reduction method with low computational cost. Motivated by the Buckley–James method to accommodate censoring, we propose a fused Kolmogorov–Smirnov filter to screen out the irrelevant covariates for ultrahigh-dimensional survival data. The proposed model-free screening method can work with many types of covariates (e.g. continuous, discrete and categorical variables) and is shown to enjoy the sure independent screening property under mild regularity conditions, without requiring any moment conditions on the covariates. In particular, the proposed procedure can remain powerful when covariates are strongly dependent on each other. We further develop an iterative algorithm to enhance the performance of our method in practical situations where some covariates may be marginally unrelated but jointly related to the response. We conduct extensive simulations to evaluate the finite-sample performance of the proposed method, showing that it compares favourably with existing typical methods. As an illustration, we apply the proposed method to the diffuse large-B-cell lymphoma study.
Journal: Journal of Applied Statistics
Pages: 1755-1774
Issue: 10
Volume: 48
Year: 2021
Month: 7
X-DOI: 10.1080/02664763.2020.1772734
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1772734
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1755-1774
Template-Type: ReDIF-Article 1.0
Author-Name: Opeoluwa F. Oyedele
Author-X-Name-First: Opeoluwa F.
Author-X-Name-Last: Oyedele
Title: Extension of biplot methodology to multivariate regression analysis
Abstract:
At the core of multivariate statistics is the investigation of relationships between different sets of variables, more precisely, the inter-variable relationships and the causal relationships. The latter is a regression problem, where one set of variables is referred to as the response variables and the other set as the predictor variables. In this situation, the effect of the predictors on the response variables is revealed through the regression coefficients. Results from the resulting regression analysis can be viewed graphically using the biplot. The resulting biplot provides a single graphical representation of the samples together with the predictor variables and response variables. In addition, their effect in terms of the regression coefficients can be visualized, although sub-optimally, in the said biplot.
Journal: Journal of Applied Statistics
Pages: 1816-1832
Issue: 10
Volume: 48
Year: 2021
Month: 07
X-DOI: 10.1080/02664763.2020.1779192
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1779192
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:10:p:1816-1832
Template-Type: ReDIF-Article 1.0
Author-Name: Muhammad Ijaz
Author-X-Name-First: Muhammad
Author-X-Name-Last: Ijaz
Author-Name: Wali Khan Mashwani
Author-X-Name-First: Wali Khan
Author-X-Name-Last: Mashwani
Author-Name: Atilla Göktaş
Author-X-Name-First: Atilla
Author-X-Name-Last: Göktaş
Author-Name: Yuksel Akay Unvan
Author-X-Name-First: Yuksel Akay
Author-X-Name-Last: Unvan
Title: RETRACTED ARTICLE: A novel alpha power transformed exponential distribution with real-life applications
Abstract:
We, the Editor-in-Chief and Publisher of Journal of Applied Statistics, have retracted the following article, which was due to appear in a special issue: Muhammad Ijaz, Wali Khan Mashwani, Atilla Göktaş & Yuksel Akay Unvan (2021): A novel alpha power transformed exponential distribution with real-life applications, Journal of Applied Statistics. DOI: 10.1080/02664763.2020.1870673. The Editor-in-Chief and the Publisher are cognisant of clear evidence that the findings presented are unreliable. The probability distribution is only valid if α > 1, and numerous mathematical properties in Section 2 have been shown to be incorrect. This has then impacted at least two figures in the article. We are further cognisant that the article contained a number of similarities to previously published papers where some of the findings had been published without proper cross-referencing, including: Gupta, R.D. and Kundu, D. (2001), Exponentiated Exponential Family: An Alternative to Gamma and Weibull Distributions. Biom. J., 43: 117–130. https://doi.org/10.1002/1521-4036(200102)43:1<117::AID-BIMJ117>3.0.CO;2-R. We have been informed in our decision-making by our corrections and editorial policies and the Committee on Publication Ethics (COPE) guidelines on retractions. The retracted article will remain online to maintain the scholarly record, but it will be digitally watermarked on each page as ‘Retracted’. The Editor-in-Chief and the Publisher would like to thank the anonymous reader/s for their comments which alerted JAS to these major errors in the first instance.
Journal: Journal of Applied Statistics
Pages: I-XVI
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1870673
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1870673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:I-XVI
Template-Type: ReDIF-Article 1.0
Author-Name: Sugnet Gardner-Lubbe
Author-X-Name-First: Sugnet
Author-X-Name-Last: Gardner-Lubbe
Title: Linear discriminant analysis for multiple functional data analysis
Abstract:
In multivariate data analysis, Fisher linear discriminant analysis is useful to optimally separate two classes of observations by finding a linear combination of p variables. Functional data analysis deals with the analysis of continuous functions and thus can be seen as a generalisation of multivariate analysis in which the dimension of the analysis space p tends to infinity. Several authors have proposed methods to perform discriminant analysis in this infinite-dimensional space. Here, methodology is introduced to perform discriminant analysis, not on single infinite-dimensional functions, but by finding a linear combination of p infinite-dimensional continuous functions, providing a set of continuous canonical functions that are optimally separated in the canonical space.
Journal: Journal of Applied Statistics
Pages: 1917-1933
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1780569
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1780569
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:1917-1933
Template-Type: ReDIF-Article 1.0
Author-Name: Fernando de Souza Bastos
Author-X-Name-First: Fernando de Souza
Author-X-Name-Last: Bastos
Author-Name: Wagner Barreto-Souza
Author-X-Name-First: Wagner
Author-X-Name-Last: Barreto-Souza
Title: Birnbaum–Saunders sample selection model
Abstract:
The sample selection bias problem occurs when the outcome of interest is only observed according to some selection rule, where there is a dependence structure between the outcome and the selection rule. In a pioneering work, J. Heckman proposed a sample selection model based on a bivariate normal distribution for dealing with this problem. Due to the non-robustness of the normal distribution, many alternatives have been introduced in the literature by assuming extensions of the normal distribution such as the Student-t and skew-normal models. One common limitation of the existing sample selection models is that they require a transformation of the outcome of interest, which is commonly $\mathbb{R}^+$-valued, such as income and wage. With this, data are analyzed on a non-original scale, which complicates the interpretation of the parameters. In this paper, we propose a sample selection model based on the bivariate Birnbaum–Saunders distribution, which has the same number of parameters as the classical Heckman model. Further, our associated outcome equation is $\mathbb{R}^+$-valued. We discuss estimation by maximum likelihood and present some Monte Carlo simulation studies. An empirical application to the ambulatory expenditures data from the 2001 Medical Expenditure Panel Survey is presented.
Journal: Journal of Applied Statistics
Pages: 1896-1916
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1780570
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1780570
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:1896-1916
Template-Type: ReDIF-Article 1.0
Author-Name: Ana R. S. Silva
Author-X-Name-First: Ana R. S.
Author-X-Name-Last: Silva
Author-Name: Caio L. N. Azevedo
Author-X-Name-First: Caio L. N.
Author-X-Name-Last: Azevedo
Author-Name: Jorge L. Bazán
Author-X-Name-First: Jorge L.
Author-X-Name-Last: Bazán
Author-Name: Juvêncio S. Nobre
Author-X-Name-First: Juvêncio S.
Author-X-Name-Last: Nobre
Title: Augmented-limited regression models with an application to the study of the risk perceived using continuous scales
Abstract:
Studies of perceived risk using continuous scales of [0,100], which can be transformed to the unit interval, were recently introduced in psychometrics, but the presence of zeros or ones is commonly observed. Motivated by this, we introduce a full set of inferential tools that allows for augmented and limited data modeling. We consider parameter estimation, residual analysis, influence diagnostics and model selection for zero-and/or-one augmented beta rectangular (ZOABR) regression models and their particular nested models, based on a new parameterization of the beta rectangular distribution. Differently from other alternatives, we perform maximum-likelihood estimation using a combination of the EM algorithm (for the continuous part) and the Fisher scoring algorithm (for the discrete part). In an additional step, we consider other link functions, besides the usual logistic link, for modeling the response mean. Using randomized quantile residuals, (local) influence diagnostics and model selection tools, we identify the ZOABR regression model as the best one. We also conduct extensive simulation studies, which indicate that all developed tools work properly. Finally, we discuss the use of this type of model to treat psychometric data. It is worthwhile to mention that applications of the developed methods go beyond psychometric data: indeed, they can be useful whenever the response variable is bounded, whether or not it includes the respective limits.
Journal: Journal of Applied Statistics
Pages: 1998-2021
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1783518
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1783518
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:1998-2021
Template-Type: ReDIF-Article 1.0
Author-Name: Xingli Yang
Author-X-Name-First: Xingli
Author-X-Name-Last: Yang
Author-Name: Yu Wang
Author-X-Name-First: Yu
Author-X-Name-Last: Wang
Author-Name: Wennan Yan
Author-X-Name-First: Wennan
Author-X-Name-Last: Yan
Author-Name: Jihong Li
Author-X-Name-First: Jihong
Author-X-Name-Last: Li
Title: Variance estimation based on blocked 3×2 cross-validation in high-dimensional linear regression
Abstract:
In high-dimensional linear regression, the dimension of the variables is greater than the sample size. In this situation, the traditional variance estimation technique based on ordinary least squares consistently exhibits a high bias, even under the sparsity assumption. One of the major reasons is the high spurious correlation between the unobserved realized noise and several predictors. To alleviate this problem, a refitted cross-validation (RCV) method has been proposed in the literature. However, for a complicated model, the RCV exhibits a lower probability that the selected model includes the true model in finite samples. This phenomenon may easily result in a large bias of the variance estimation. Thus, a model selection method based on the ranks of the frequency of occurrences in six votes from a blocked 3×2 cross-validation is proposed in this study. The proposed method has a considerably larger probability of including the true model in practice than the RCV method. The variance estimation obtained using the model selected by the proposed method also shows a lower bias and a smaller variance. Furthermore, theoretical analysis proves the asymptotic normality of the proposed variance estimation.
Journal: Journal of Applied Statistics
Pages: 1934-1947
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1780571
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1780571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:1934-1947
Template-Type: ReDIF-Article 1.0
Author-Name: Zequn Sun
Author-X-Name-First: Zequn
Author-X-Name-Last: Sun
Author-Name: Thomas J. Fisher
Author-X-Name-First: Thomas J.
Author-X-Name-Last: Fisher
Title: Testing for correlation between two time series using a parametric bootstrap
Abstract:
We study the problem of determining if two time series are correlated in the mean and variance. Several test statistics, originally designed for determining the correlation between two mean processes or goodness-of-fit testing, are explored and formally introduced for determining cross-correlation in variance. Simulations demonstrate the theoretical asymptotic distribution can be ineffective in finite samples. Parametric bootstrapping is shown to be an effective tool in such an enterprise. A large simulation study is provided demonstrating the efficacy of the bootstrapping method. Lastly, an empirical example explores a correlation between the Standard & Poor's 500 index and the Euro/US dollar exchange rate while also demonstrating a level of robustness for the proposed method.
Journal: Journal of Applied Statistics
Pages: 2042-2063
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1783519
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1783519
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:2042-2063
Template-Type: ReDIF-Article 1.0
Author-Name: Corrine F. Elliott
Author-X-Name-First: Corrine F.
Author-X-Name-Last: Elliott
Author-Name: Joshua W. Lambert
Author-X-Name-First: Joshua W.
Author-X-Name-Last: Lambert
Author-Name: Arnold J. Stromberg
Author-X-Name-First: Arnold J.
Author-X-Name-Last: Stromberg
Author-Name: Pei Wang
Author-X-Name-First: Pei
Author-X-Name-Last: Wang
Author-Name: Ting Zeng
Author-X-Name-First: Ting
Author-X-Name-Last: Zeng
Author-Name: Katherine L. Thompson
Author-X-Name-First: Katherine L.
Author-X-Name-Last: Thompson
Title: Feasibility as a mechanism for model identification and validation
Abstract:
As new technologies permit the generation of hitherto unprecedented volumes of data (e.g. genome-wide association study data), researchers struggle to keep up with the added complexity and time commitment required for its analysis. For this reason, model selection commonly relies on machine learning and data-reduction techniques, which tend to afford models with obscure interpretations. Even in cases with straightforward explanatory variables, the so-called ‘best’ model produced by a given model-selection technique may fail to capture information of vital importance to the domain-specific questions at hand. Herein we propose a new concept for model selection, feasibility, for use in identifying multiple models that are in some sense optimal and may unite to provide a wider range of information relevant to the topic of interest, including (but not limited to) interaction terms. We further provide an R package and associated Shiny Applications for use in identifying or validating feasible models, the performance of which we demonstrate on both simulated and real-life data.
Journal: Journal of Applied Statistics
Pages: 2022-2041
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1783522
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1783522
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:2022-2041
Template-Type: ReDIF-Article 1.0
Author-Name: M. S. Eliwa
Author-X-Name-First: M. S.
Author-X-Name-Last: Eliwa
Author-Name: M. El-Morshedy
Author-X-Name-First: M.
Author-X-Name-Last: El-Morshedy
Author-Name: Sajid Ali
Author-X-Name-First: Sajid
Author-X-Name-Last: Ali
Title: Exponentiated odd Chen-G family of distributions: statistical properties, Bayesian and non-Bayesian estimation with applications
Abstract:
In this paper, a new flexible generator of distributions is proposed. Some of its fundamental properties are studied, including the quantile function, skewness, kurtosis, hazard rate function, moments, mean deviations, mean time to failure, mean time between failures, and the availability and reliability functions of consecutive linear and circular systems. The hazard rate function can be increasing, decreasing, unimodal-bathtub, unimodal, bathtub, J-shaped or inverse J-shaped, depending on its parameter values. After introducing the general class, two special models of the new family are discussed in detail. Maximum likelihood and Bayesian methods are used to estimate the model parameters. A detailed simulation study is carried out to examine the bias and mean square error of the maximum likelihood and Bayesian estimators. We also illustrate the importance of the new family by means of two distinctive real data sets. It can serve as an alternative to other lifetime distributions in the existing statistical literature for modeling positive and negative real data in many areas.
Journal: Journal of Applied Statistics
Pages: 1948-1974
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1783520
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1783520
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:1948-1974
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Statement of Retraction: A novel alpha power transformed exponential distribution with real-life applications
Journal: Journal of Applied Statistics
Pages: 2064-2064
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2021.1955521
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1955521
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:2064-2064
Template-Type: ReDIF-Article 1.0
Author-Name: Masoumeh Shirozhan
Author-X-Name-First: Masoumeh
Author-X-Name-Last: Shirozhan
Author-Name: Mehrnaz Mohammadpour
Author-X-Name-First: Mehrnaz
Author-X-Name-Last: Mohammadpour
Title: A dependent counting INAR model with serially dependent innovation
Abstract:
To provide a more flexible model of count data, we extend the first-order integer-valued autoregressive model with serially dependent innovations based on the dependent thinning operator. This model is appropriate for modelling the number of dependent random events affecting each other, when the number of new cases depends on the previous count through a linear functional relationship. Several statistical properties of the model are determined, the parameters are estimated by several methods, and their properties are studied via simulations. The efficiency of the new model is investigated on two real count data sets: the numbers of contagious disease cases and robberies.
Journal: Journal of Applied Statistics
Pages: 1975-1997
Issue: 11
Volume: 48
Year: 2021
Month: 08
X-DOI: 10.1080/02664763.2020.1783521
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1783521
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:11:p:1975-1997
Template-Type: ReDIF-Article 1.0
Author-Name: Katherine Davies
Author-X-Name-First: Katherine
Author-X-Name-Last: Davies
Author-Name: Suvra Pal
Author-X-Name-First: Suvra
Author-X-Name-Last: Pal
Author-Name: Joynob A. Siddiqua
Author-X-Name-First: Joynob A.
Author-X-Name-Last: Siddiqua
Title: Stochastic EM algorithm for generalized exponential cure rate model and an empirical study
Abstract:
In this paper, we consider two well-known parametric long-term survival models, namely, the Bernoulli cure rate model and the promotion time (or Poisson) cure rate model. Assuming the long-term survival probability to depend on a set of risk factors, the main contribution is the development of the stochastic expectation maximization (SEM) algorithm to determine the maximum likelihood estimates of the model parameters. We carry out a detailed simulation study to demonstrate the performance of the proposed SEM algorithm. For this purpose, we assume the lifetimes due to each competing cause to follow a two-parameter generalized exponential distribution. We also compare the results obtained from the SEM algorithm with those obtained from the well-known expectation maximization (EM) algorithm. Furthermore, we investigate a simplified estimation procedure for both the SEM and EM algorithms that allows the objective function to be split into simpler functions of lower dimension with respect to the model parameters. Moreover, we present examples where the EM algorithm fails to converge but the SEM algorithm still works. For illustrative purposes, we analyze breast cancer survival data. Finally, we use a graphical method to assess the goodness-of-fit of the model with generalized exponential lifetimes.
Journal: Journal of Applied Statistics
Pages: 2112-2135
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1786676
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1786676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2112-2135
Template-Type: ReDIF-Article 1.0
Author-Name: Piotr Sulewski
Author-X-Name-First: Piotr
Author-X-Name-Last: Sulewski
Title: Equal-bin-width histogram versus equal-bin-count histogram
Abstract:
In the equal-bin-width histogram (EBWH), all bin widths are equal to some non-random number arbitrarily set by an analyst; as a result, the particular bin counts are random variables. This paper also presents a histogram constructed in the converse manner, the equal-bin-count histogram (EBCH), in which all bin counts are equal to some non-random number arbitrarily set by an analyst; as a result, the particular bin widths are random variables. The first goal of the paper is to choose the constant bin width (i.e. the number of bins k) in the EBWH that maximizes a similarity measure in a Monte Carlo simulation. The second goal is to choose the constant bin count in the EBCH that maximizes the similarity measure in the Monte Carlo simulation. The third goal is to present similarity measures between empirical and theoretical data. The fourth goal is a comparative analysis of the two histogram methods by means of the frequency formula. The first additional goal is a tip on how to proceed in the EBCH when modulo(n,k)≠0. The second additional goal is software in the form of a Mathcad file with implementations of the EBWH and EBCH.
Journal: Journal of Applied Statistics
Pages: 2092-2111
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1784853
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1784853
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2092-2111
Template-Type: ReDIF-Article 1.0
Author-Name: Yousri Slaoui
Author-X-Name-First: Yousri
Author-X-Name-Last: Slaoui
Title: Two new nonparametric kernel distribution estimators based on a transformation of the data
Abstract:
In this paper, we propose two kernel distribution estimators based on a data transformation. We study the properties of these estimators and compare them with two conventional estimators. It appears that, with an appropriate choice of the parameters of the two proposed estimators, their convergence rate is faster and their Mean Integrated Squared Error smaller than those of the two conventional estimators. We corroborate these theoretical results through simulations as well as a real data set.
Journal: Journal of Applied Statistics
Pages: 2065-2091
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1786675
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1786675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2065-2091
Template-Type: ReDIF-Article 1.0
Author-Name: Phuong Hanh Tran
Author-X-Name-First: Phuong Hanh
Author-X-Name-Last: Tran
Author-Name: Cédric Heuchenne
Author-X-Name-First: Cédric
Author-X-Name-Last: Heuchenne
Author-Name: Huu Du Nguyen
Author-X-Name-First: Huu Du
Author-X-Name-Last: Nguyen
Author-Name: Hélène Marie
Author-X-Name-First: Hélène
Author-X-Name-Last: Marie
Title: Monitoring coefficient of variation using one-sided run rules control charts in the presence of measurement errors
Abstract:
We investigate, in this paper, the effect of measurement error (ME) on the performance of Run Rules control charts monitoring the squared coefficient of variation (CV). The previous Run Rules CV chart in the literature is improved slightly by monitoring the CV squared using two one-sided Run Rules charts instead of monitoring the CV itself using a two-sided chart. The numerical results show that this improvement gives better performance in detecting process shifts. Moreover, we show through simulation that precision and accuracy errors do have a negative effect on the performance of the proposed Run Rules charts. We also find that taking multiple measurements per item is not an effective way to reduce these negative effects. The proposed Run Rules control charts can be applied in the anomaly detection area.
Journal: Journal of Applied Statistics
Pages: 2178-2204
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1787356
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1787356
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2178-2204
Template-Type: ReDIF-Article 1.0
Author-Name: Q.F. Xu
Author-X-Name-First: Q.F.
Author-X-Name-Last: Xu
Author-Name: X.H. Ding
Author-X-Name-First: X.H.
Author-X-Name-Last: Ding
Author-Name: C.X. Jiang
Author-X-Name-First: C.X.
Author-X-Name-Last: Jiang
Author-Name: K.M. Yu
Author-X-Name-First: K.M.
Author-X-Name-Last: Yu
Author-Name: L. Shi
Author-X-Name-First: L.
Author-X-Name-Last: Shi
Title: An elastic-net penalized expectile regression with applications
Abstract:
To perform variable selection in expectile regression, we introduce the elastic-net penalty into expectile regression and propose an elastic-net penalized expectile regression (ER-EN) model. We then adopt the semismooth Newton coordinate descent (SNCD) algorithm to solve the proposed ER-EN model in high-dimensional settings. The advantages of ER-EN model are illustrated via extensive Monte Carlo simulations. The numerical results show that the ER-EN model outperforms the elastic-net penalized least squares regression (LSR-EN), the elastic-net penalized Huber regression (HR-EN), the elastic-net penalized quantile regression (QR-EN) and conventional expectile regression (ER) in terms of variable selection and predictive ability, especially for asymmetric distributions. We also apply the ER-EN model to two real-world applications: relative location of CT slices on the axial axis and metabolism of tacrolimus (Tac) drug. Empirical results also demonstrate the superiority of the ER-EN model.
Journal: Journal of Applied Statistics
Pages: 2205-2230
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1787355
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1787355
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2205-2230
Template-Type: ReDIF-Article 1.0
Author-Name: Sukru Acitas
Author-X-Name-First: Sukru
Author-X-Name-Last: Acitas
Author-Name: Ismail Yenilmez
Author-X-Name-First: Ismail
Author-X-Name-Last: Yenilmez
Author-Name: Birdal Senoglu
Author-X-Name-First: Birdal
Author-X-Name-Last: Senoglu
Author-Name: Yeliz Mert Kantar
Author-X-Name-First: Yeliz Mert
Author-X-Name-Last: Kantar
Title: Modified maximum likelihood estimator under the Jones and Faddy's skew t-error distribution for censored regression model
Abstract:
It is well known that the classical Tobit estimator of the parameters of the censored regression (CR) model is inefficient in the case of non-normal error terms. In this paper, we propose to use the modified maximum likelihood (MML) estimator under the Jones and Faddy's skew t-error distribution, which covers a wide range of skew and symmetric distributions, for the CR model. The MML estimators, providing an alternative to the Tobit estimator, are explicitly expressed and are asymptotically equivalent to the maximum likelihood estimator. A simulation study is conducted to compare the efficiencies of the MML estimators with classical estimators such as the ordinary least squares, Tobit, censored least absolute deviations and symmetrically trimmed least squares estimators. The results of the simulation study show that the MML estimators perform well among the others with respect to the root mean square error criterion for the CR model. A real-life example is also provided to show the suitability of the MML methodology.
Journal: Journal of Applied Statistics
Pages: 2136-2151
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1786673
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1786673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2136-2151
Template-Type: ReDIF-Article 1.0
Author-Name: Chong Sheng
Author-X-Name-First: Chong
Author-X-Name-Last: Sheng
Author-Name: Matthias Hwai Yong Tan
Author-X-Name-First: Matthias Hwai Yong
Author-X-Name-Last: Tan
Author-Name: Lu Zou
Author-X-Name-First: Lu
Author-X-Name-Last: Zou
Title: Maximum expected entropy transformed Latin hypercube designs
Abstract:
Existing projection designs (e.g. maximum projection designs) attempt to achieve good space-filling properties in all projections. However, when using a Gaussian process (GP), a model-based design criterion such as the entropy criterion is more appropriate. We employ the entropy criterion averaged over a set of projections, called the expected entropy criterion (EEC), to generate projection designs. We show that maximum EEC designs are invariant to monotonic transformations of the response, i.e. they are optimal for a wide class of stochastic process models. We also demonstrate that transformation of each column of a Latin hypercube design (LHD) based on a monotonic function can substantially improve the EEC. Two types of input transformations are considered: a quantile function of a symmetric Beta distribution chosen to optimize the EEC, and a nonparametric transformation corresponding to the quantile function of a symmetric density chosen to optimize the EEC. Numerical studies show that the proposed transformations of the LHD are efficient and effective for building robust maximum EEC designs. These designs give projections with markedly higher entropies and lower maximum prediction variances (MPVs) at the cost of small increases in average prediction variances (APVs) compared to state-of-the-art space-filling designs over wide ranges of covariance parameter values.
Journal: Journal of Applied Statistics
Pages: 2152-2177
Issue: 12
Volume: 48
Year: 2021
Month: 09
X-DOI: 10.1080/02664763.2020.1786674
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1786674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:12:p:2152-2177
Template-Type: ReDIF-Article 1.0
Author-Name: Atakan Gerger
Author-X-Name-First: Atakan
Author-X-Name-Last: Gerger
Author-Name: Ali Riza Firuzan
Author-X-Name-First: Ali Riza
Author-X-Name-Last: Firuzan
Title: Taguchi based Case study in the automotive industry: nonconformity decreasing with use of Six Sigma methodology
Abstract:
In this study, we applied a conceptual Six Sigma/design of experiment hybrid framework that aims to integrate the Taguchi method and Six Sigma for process improvement in a complex industrial environment. In this context, the Six Sigma methodology was employed at a company operating within the automotive industry to improve a manufacturing process that had caused a customer complaint. Studies employing the Taguchi experiment design usually focus on a single variable and neglect the effects of adjustments on the remaining quality characteristics. In this study, a multi-response Taguchi design of experiment was preferred, and all of the quality characteristics were taken into account. The define, measure, analyze, improve and control phases were used to reduce the nonconformity rate from 23.940 percent (baseline) to 0.049 percent. As a result of implementing Six Sigma, the sigma level increased from 2.21 (baseline) to 4.80.
Journal: Journal of Applied Statistics
Pages: 2889-2905
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1837086
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1837086
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2889-2905
Template-Type: ReDIF-Article 1.0
Author-Name: Necla Gündüz
Author-X-Name-First: Necla
Author-X-Name-Last: Gündüz
Author-Name: Celal Aydın
Author-X-Name-First: Celal
Author-X-Name-Last: Aydın
Title: Optimal bandwidth estimators of kernel density functionals for contaminated data
Abstract:
In this study, we provide a simulation-based exploration and characterization of the two most crucial kernel density functionals that play a central role in kernel density estimation, considering probability density functions that are members of the location-scale family. Kernel density functional estimates are known to rely on the choice of a preliminary bandwidth. Normal-scale estimators are commonly used to obtain preliminary bandwidth estimates, under the assumption that the data come from a normal distribution. Here, we present an alternative approach, called the Cauchy-scale estimators, in which the data are assumed to come from a Cauchy distribution. Furthermore, results related to the sampling distribution of bandwidth estimators based on the normal- and Cauchy-scale approaches are presented. As a case study, we provide a comprehensive characterization of different contamination levels with a simulation study constructed for random samples from normal distributions with various parameters and various contamination levels. The proposed preliminary bandwidth selection shows lower variance for both mixture and contaminated data in our simulations. Moreover, in applications to a real data set, the functional bandwidth yields results similar to those of the simulations.
Journal: Journal of Applied Statistics
Pages: 2239-2258
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1944999
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1944999
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2239-2258
Template-Type: ReDIF-Article 1.0
Author-Name: Pakize Taylan
Author-X-Name-First: Pakize
Author-X-Name-Last: Taylan
Author-Name: Fatma Yerlikaya-Özkurt
Author-X-Name-First: Fatma
Author-X-Name-Last: Yerlikaya-Özkurt
Author-Name: Burcu Bilgiç Uçak
Author-X-Name-First: Burcu
Author-X-Name-Last: Bilgiç Uçak
Author-Name: Gerhard-Wilhelm Weber
Author-X-Name-First: Gerhard-Wilhelm
Author-X-Name-Last: Weber
Title: A new outlier detection method based on convex optimization: application to diagnosis of Parkinson’s disease
Abstract:
Neuroscience is a combination of different scientific disciplines that investigate the nervous system to understand its biological basis. Recently, applications to the diagnosis of neurodegenerative diseases such as Parkinson’s disease have become very promising through the use of different statistical regression models. However, well-known statistical regression models may give misleading results for the diagnosis of neurodegenerative diseases when experimental data contain outlier observations that lie an abnormal distance from the other observations. The main achievement of this study is a novel mathematics-supported approach, alongside statistical regression models, to identify and treat outlier observations without direct elimination, for a major and emerging challenge to humankind such as neurodegenerative diseases. Within this approach, a new method named CMTMSOM is proposed with the contributions of powerful convex and continuous optimization techniques referred to as conic quadratic programming. The method, based on the mean-shift outlier regression model, is developed by combining the robustness of M-estimation with the stability of Tikhonov regularization. We apply our method and other parametric models to the Parkinson’s telemonitoring dataset, a real-world dataset in neuroscience, and compare the methods using well-known method-free performance measures. The results indicate that CMTMSOM performs better than current parametric models.
Journal: Journal of Applied Statistics
Pages: 2421-2440
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1864815
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864815
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2421-2440
Template-Type: ReDIF-Article 1.0
Author-Name: Asuman Yılmaz
Author-X-Name-First: Asuman
Author-X-Name-Last: Yılmaz
Author-Name: Mahmut Kara
Author-X-Name-First: Mahmut
Author-X-Name-Last: Kara
Author-Name: Onur Özdemir
Author-X-Name-First: Onur
Author-X-Name-Last: Özdemir
Title: Comparison of different estimation methods for extreme value distribution
Abstract:
The extreme value distribution was developed for modeling extreme-order statistics or extreme events. In this study, we discuss the distribution of the largest extreme. The main objective of this paper is to determine the best estimators of the unknown parameters of the extreme value distribution; thus, both classical and Bayesian methods are used. The classical estimation methods under consideration are maximum likelihood estimators, moments estimators, least squares estimators, weighted least squares estimators, percentile estimators, ordinary least squares estimators, best linear unbiased estimators, L-moments estimators, trimmed L-moments estimators, and Bain and Engelhardt estimators. We also propose new estimators for the unknown parameters. Bayesian estimators of the parameters are derived by using Lindley’s approximation and Markov chain Monte Carlo methods. Asymptotic confidence intervals are constructed from the maximum likelihood estimators, and Bayesian credible intervals are obtained by using Gibbs sampling. The performances of these estimation methods are compared with respect to their biases and mean square errors through a simulation study. The maximum annual daily flood discharge data sets of the Meriç River and Feather River are analyzed at the end of the study for a better understanding of the methods presented in this paper.
Journal: Journal of Applied Statistics
Pages: 2259-2284
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1940109
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1940109
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2259-2284
Template-Type: ReDIF-Article 1.0
Author-Name: Esmat Jamshidi Eini
Author-X-Name-First: Esmat Jamshidi
Author-X-Name-Last: Eini
Author-Name: Hamid Khaloozadeh
Author-X-Name-First: Hamid
Author-X-Name-Last: Khaloozadeh
Title: Tail conditional moment for generalized skew-elliptical distributions
Abstract:
Substantial changes in the financial markets and insurance companies have necessitated the development of the structure of risk benchmarks, which is the challenge addressed in this paper. We propose a theorem that extends the tail conditional moment (TCM) measure from elliptical distributions to the wider class of skew-elliptical distributions. This family of distributions is suitable for modeling asymmetric phenomena. We obtain an analytical formula for the $ {n^{\textrm{th}}} $ TCM for skew-elliptical distributions, which helps to characterize the risk behavior along the tail of loss distributions. We derive four significant results and generalize the tail conditional skewness (TCS) and the tail conditional kurtosis (TCK) measures for generalized skew-elliptical distributions, which are used to determine the skewness and the kurtosis in the tail of loss distributions. The proposed TCM measure is applied to well-known families of generalized skew-elliptical distributions. We also provide a practical example of a portfolio problem by calculating the proposed TCM measure for a weighted sum of generalized skew-elliptical distributions.
Journal: Journal of Applied Statistics
Pages: 2285-2305
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1896687
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1896687
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2285-2305
Template-Type: ReDIF-Article 1.0
Author-Name: O. Ozan Evkaya
Author-X-Name-First: O.
Author-X-Name-Last: Ozan Evkaya
Author-Name: Fatma Sevinç Kurnaz
Author-X-Name-First: Fatma
Author-X-Name-Last: Sevinç Kurnaz
Title: Forecasting drought using neural network approaches with transformed time series data
Abstract:
Drought is one of the most important and costliest disasters worldwide. With the accelerating progress of climate change, its frequency of occurrence and negative impacts are rapidly increasing. It is crucial to initiate and sustain an early warning system to monitor and predict the possible impacts of future droughts. Recently, with the rise of data-driven models, various case studies have been conducted using machine learning algorithms instead of purely statistical approaches. The main goal of this paper is to conduct a drought forecasting study for a weather station located in the Marmara Region. For that purpose, the widely used univariate drought index, the Standardized Precipitation Index, is first calculated for the Bursa station. Thereafter, both the historical information retrieved from the time series data and its wavelet transformation are used to build Nonlinear Auto-Regressive and Nonlinear Auto-Regressive with External Input (NARX) type Neural Network (NN) models. According to a pool of Goodness-of-Fit (GOF) tests, the forecasting performance of models with various numbers of hidden neurons is compared. The findings of the study show that considering the data together with its wavelet transformation in the NARX-NN models increases the capacity to forecast the drought index.
Journal: Journal of Applied Statistics
Pages: 2591-2606
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1867829
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1867829
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2591-2606
Template-Type: ReDIF-Article 1.0
Author-Name: Berna Yazici
Author-X-Name-First: Berna
Author-X-Name-Last: Yazici
Author-Name: Mustafa Cavus
Author-X-Name-First: Mustafa
Author-X-Name-Last: Cavus
Title: A comparative study of computation approaches of the generalized F-test
Abstract:
The Generalized F-test is derived based on the Generalized P-value method to test the equality of normally distributed group means under unequal variances. There are two approaches to compute the p-value of the Generalized F-test, based on beta and chi-squared random numbers. From prior work in the literature, it appears that the two computation approaches of the generalized tests are equivalent. In this study, the equivalence of these approaches is investigated in an extensive Monte-Carlo simulation study in terms of Type I error probability and penalized power. It is found that the computation approaches are not in fact equivalent: their conclusions can differ, and researchers should decide which one is more powerful according to the structure of the data, such as the sample size and the number of groups. Real data examples are also given to show the opposite decisions that the computation approaches can reach.
Journal: Journal of Applied Statistics
Pages: 2906-2919
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1939660
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1939660
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2906-2919
Template-Type: ReDIF-Article 1.0
Author-Name: M. Esra Atukalp
Author-X-Name-First: M. Esra
Author-X-Name-Last: Atukalp
Title: Determining the relationship between stock return and financial performance: an analysis on Turkish deposit banks
Abstract:
Banks play a very important role in financial markets due to their intermediary function. The availability of financing to businesses and individuals, the prevalence of branches throughout the country, and savers’ habitual preference for deposit banks when placing their savings have made deposit banks more active than other financial institutions. Since the banking system affects the whole economy, the performance of banks and its evaluation become important; indeed, performance measurement is one of the most important issues in the financial field. In this study, the relationship between the stock return and financial performance of Turkish deposit banks was examined via the CRITIC method, the TOPSIS method, and Spearman’s rank correlation analysis for the 2014–2018 period. According to the results of the analysis, there is no statistically significant correlation between the stock return ranking and the financial performance rankings of deposit banks in Turkey.
Journal: Journal of Applied Statistics
Pages: 2643-2657
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1849056
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1849056
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2643-2657
Template-Type: ReDIF-Article 1.0
Author-Name: Ding-Geng Chen
Author-X-Name-First: Ding-Geng
Author-X-Name-Last: Chen
Author-Name: Haipeng Gao
Author-X-Name-First: Haipeng
Author-X-Name-Last: Gao
Author-Name: Chuanshu Ji
Author-X-Name-First: Chuanshu
Author-X-Name-Last: Ji
Author-Name: Xinguang Chen
Author-X-Name-First: Xinguang
Author-X-Name-Last: Chen
Title: Stochastic cusp catastrophe model and its Bayesian computations
Abstract:
This paper revitalizes the investigation of the classical cusp catastrophe model in catastrophe theory and tackles the unsolved statistical inference problem for the stochastic cusp differential equation. This model is challenging because its associated transition density, and hence the likelihood function, is analytically intractable. We propose a novel Bayesian approach combining Hamiltonian Monte Carlo with two likelihood approximation methods, namely Euler approximation and Hermite expansion. We validate this approach through a series of simulation studies and further demonstrate a potential application using real USD/EUR exchange rate data.
Journal: Journal of Applied Statistics
Pages: 2714-2733
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1922993
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1922993
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2714-2733
Template-Type: ReDIF-Article 1.0
Author-Name: Atacan Erdiş
Author-X-Name-First: Atacan
Author-X-Name-Last: Erdiş
Author-Name: M. Akif Bakir
Author-X-Name-First: M.
Author-X-Name-Last: Akif Bakir
Author-Name: Muhammed I. Jaiteh
Author-X-Name-First: Muhammed
Author-X-Name-Last: I. Jaiteh
Title: A method for detection of Mode-Mixing problem
Abstract:
Classical Empirical Mode Decomposition (EMD) is a data-driven method used to analyze non-linear and non-stationary time series data. Besides being adaptive by nature, EMD assumes that every data set consists of oscillations of intrinsic mode functions (IMFs). EMD also requires that the IMFs, which represent the characteristic structures in the data, each exhibit only a unique sub-characteristic of the data. However, in some cases, depending on how the sub-characteristics that make up sophisticated data coexist, the IMFs may not be unique. This is called the mode-mixing problem. Although there are many studies and successful methods (such as EEMD and CEEMDAN) for eliminating the mode-mixing problem, only a limited number of studies exist on detecting the presence of this problem. In this study, a method for detecting the mode-mixing problem is proposed. In the suggested method, the Itakura–Saito distance, a measure of the similarity of stationary signals based on Fourier spectra, is modified by applying a Kaiser filter to short-time signals. The performance of the method is tested in various applications with simulated and real data, and the results show successful detection of mode-mixing when it exists in a time series.
Journal: Journal of Applied Statistics
Pages: 2847-2863
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1908969
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1908969
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2847-2863
Template-Type: ReDIF-Article 1.0
Author-Name: Naoya Yamaguchi
Author-X-Name-First: Naoya
Author-X-Name-Last: Yamaguchi
Author-Name: Yuka Yamaguchi
Author-X-Name-First: Yuka
Author-X-Name-Last: Yamaguchi
Author-Name: Ryuei Nishii
Author-X-Name-First: Ryuei
Author-X-Name-Last: Nishii
Title: Minimizing the expected value of the asymmetric loss function and an inequality for the variance of the loss
Abstract:
Regression coefficients are usually estimated by solving minimization problems with asymmetric loss functions. In this paper, we instead correct predictions so that the prediction error follows a generalized Gaussian distribution. Our method not only minimizes the expected value of the asymmetric loss but also lowers the variance of the loss. Predictions inevitably contain errors, so they should be used with these errors in mind; our approach explicitly accounts for the prediction error. Furthermore, even if the prediction method itself is opaque, as is possible in, e.g. deep learning, our method can be used as long as the prediction error distribution and the asymmetric loss function are known. The method can be applied to the procurement of electricity from electricity markets.
Journal: Journal of Applied Statistics
Pages: 2348-2368
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1761951
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1761951
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2348-2368
Template-Type: ReDIF-Article 1.0
Author-Name: Bükre Yıldırım Külekci
Author-X-Name-First: Bükre
Author-X-Name-Last: Yıldırım Külekci
Author-Name: A. Sevtap Selcuk-Kestel
Author-X-Name-First: A. Sevtap
Author-X-Name-Last: Selcuk-Kestel
Title: Assessment of longevity risk: credibility approach
Abstract:
To correctly measure the effect of mortality rates on the stability of insurance and pension providers’ financial risk, longevity risk should be considered. This paper investigates future mortality and longevity risk for different age structures in different countries. The Lee–Carter mortality model is fitted to historical census data to forecast future mortality rates. Turkey, Germany, and Japan are chosen for their differing life expectancies and population distributions. The longevity risk of a hypothetical portfolio is then assessed based on static and dynamic mortality table approaches. To determine the impact of longevity risk, which is retrieved using a stochastic mortality model, a pension insurance product is considered, and the net single premium for an annuity is quantified under the proposed setup for the selected countries. Additionally, a credibility approach is proposed to establish a reliable estimate of the annuity net single premium.
Journal: Journal of Applied Statistics
Pages: 2695-2713
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1922613
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1922613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2695-2713
Template-Type: ReDIF-Article 1.0
Author-Name: Adnan Karaibrahimoglu
Author-X-Name-First: Adnan
Author-X-Name-Last: Karaibrahimoglu
Author-Name: Seren Ayhan
Author-X-Name-First: Seren
Author-X-Name-Last: Ayhan
Author-Name: Mustafa Karaagac
Author-X-Name-First: Mustafa
Author-X-Name-Last: Karaagac
Author-Name: Mehmet Artac
Author-X-Name-First: Mehmet
Author-X-Name-Last: Artac
Title: Circular analyses of dates on patients with gastric carcinoma
Abstract:
Dates have great importance in cancer diseases; however, the date variables themselves are usually not analyzed. This study aims to evaluate the descriptive statistics of the diagnosis, operation, and last examination dates of gastric carcinoma patients using circular analysis methods. In total, 502 gastric carcinoma patients were enrolled in the study. The mean month of the diagnosis date was near November (∼10.86) for female and May (∼5.17) for male patients. The mean month of the operation date was March (∼3.24) for females and July–August (∼7.79) for males. The mean month of the last examination date was February–March (∼2.61) for females and May (∼4.85) for males. Moreover, the mean day of the week for the diagnosis date was Thursday (∼5.50) for both female and male patients. The distributional fit of all variables was also checked using the von Mises, Rayleigh, and Kuiper’s tests. When the days and months were analyzed by classical descriptive statistics, the results were completely different from those of the circular analyses. Therefore, dates and times should be analyzed in certain diseases to provide insight for physicians.
Journal: Journal of Applied Statistics
Pages: 2931-2943
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1977259
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1977259
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2931-2943
Template-Type: ReDIF-Article 1.0
Author-Name: Meral Ebegil
Author-X-Name-First: Meral
Author-X-Name-Last: Ebegil
Author-Name: Yaprak Arzu Özdemir
Author-X-Name-First: Yaprak Arzu
Author-X-Name-Last: Özdemir
Author-Name: Fikri Gökpinar
Author-X-Name-First: Fikri
Author-X-Name-Last: Gökpinar
Title: Some Shrinkage estimators based on median ranked set sampling
Abstract:
In this study, shrinkage estimators based on a median ranked set sample in the presence of multicollinearity were studied. We first constructed the multiple regression model using median ranked set sampling and adapted the Ridge and Liu-type estimators to this model. To investigate the efficiency of these estimators, a simulation study was performed for different numbers of explanatory variables, sample sizes, correlation coefficients, and error variances in both perfect and imperfect ranking cases. These estimators were also compared, via simulation, with other estimators based on ranked set samples. It is shown that when the collinearity is moderate, the Ridge estimator using a median ranked set sample performs better than the other estimators, and as the collinearity increases, the Liu-type estimator using a median ranked set sample outperforms all the other estimators. When the collinearity is smaller than 0.95, the Ridge estimator based on a median ranked set sample is more efficient than the Liu-type estimator based on the same sample; however, this threshold increases as the sample size increases and the number of explanatory variables decreases. In addition, a real data example is presented to illustrate how collinearity affects the estimators under median ranked set sampling and ranked set sampling.
Journal: Journal of Applied Statistics
Pages: 2473-2498
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1895088
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1895088
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2473-2498
Template-Type: ReDIF-Article 1.0
Author-Name: Cenker Burak Metin
Author-X-Name-First: Cenker Burak
Author-X-Name-Last: Metin
Author-Name: Sinem Tuğba Şahin Tekin
Author-X-Name-First: Sinem Tuğba
Author-X-Name-Last: Şahin Tekin
Author-Name: Yaprak Arzu Özdemir
Author-X-Name-First: Yaprak Arzu
Author-X-Name-Last: Özdemir
Title: Restricted calibration and weight trimming approaches for estimation of the population total in business statistics
Abstract:
Adjustments are made to design weights to reduce the negative effects of non-response and out-of-scope problems. The calibration approach is a weighting process that makes the weighted estimates agree with known population values by using auxiliary information. In this study, alternative calibration approaches and a weight trimming process that can be used in large data sets with extreme weights and different correlation structures were analysed. In addition, the effect of the correlation structure of the auxiliary variables on the efficiency of the calibration estimators was investigated by a simulation study. The 2017 Annual Industry and Service Statistics data were used in the simulation study, and it was seen that restricted calibration estimators were more efficient than the generalized regression estimator in estimating variables with high variance, such as turnover. Especially for small sample fractions, we recommend applying restricted calibration estimators, as they are more efficient than weight trimming in solving the problem of negative weights, and of weights less than one, that can arise after the calibration process.
Journal: Journal of Applied Statistics
Pages: 2658-2672
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1869703
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1869703
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2658-2672
Template-Type: ReDIF-Article 1.0
Author-Name: Berivan Çakmak
Author-X-Name-First: Berivan
Author-X-Name-Last: Çakmak
Author-Name: Fatma Zehra Doğru
Author-X-Name-First: Fatma Zehra
Author-X-Name-Last: Doğru
Title: Optimal B-robust estimators for the parameters of the power Lindley distribution
Abstract:
Parameters of a distribution are generally estimated by using classical methods such as maximum likelihood (ML) and least squares (LS) estimation. However, these classical methods are very sensitive to outliers. This study therefore proposes the application of the optimal B-robust (OBR) estimation method, which is resistant to outliers, to estimate the parameters of the power Lindley (PL) distribution. We also provide a simulation study and a real data example to compare the performance of the OBR estimators with that of the ML, LS, and regression M estimators.
Journal: Journal of Applied Statistics
Pages: 2369-2388
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1854201
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1854201
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2369-2388
Template-Type: ReDIF-Article 1.0
Author-Name: H. A. Rusyda
Author-X-Name-First: H. A.
Author-X-Name-Last: Rusyda
Author-Name: L. Noviyanti
Author-X-Name-First: L.
Author-X-Name-Last: Noviyanti
Author-Name: A. Z. Soleh
Author-X-Name-First: A. Z.
Author-X-Name-Last: Soleh
Author-Name: A. Chadidjah
Author-X-Name-First: A.
Author-X-Name-Last: Chadidjah
Author-Name: F. Indrayatna
Author-X-Name-First: F.
Author-X-Name-Last: Indrayatna
Title: The design of multiple crop insurance in Indonesia based on revenue risk using the copula model approach
Abstract:
As a country with an agricultural base, it is important for Indonesia to develop crop insurance. Until now, Indonesia has had no insurance for horticultural crops, only for corn. This paper discusses horticultural multicrop insurance products based on revenue risk, which can be triggered by low prices, low yields, or a combination of both. In designing multicrop insurance products, it is important to model the variability of revenue risk by applying copulas to crop yield and price, and to estimate the indemnity of the revenue-based multicrop insurance. The analysis employed the Gumbel and Clayton copulas to model the dependency structure between the crop yields and prices of multiple crops, with each marginal variable modeled by an ARIMA model. The results show that multicrop revenue insurance tends to reduce the price of agricultural insurance in Indonesia, and thus the program has the potential for good acceptance in agricultural insurance.
Journal: Journal of Applied Statistics
Pages: 2920-2930
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1897089
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1897089
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2920-2930
Template-Type: ReDIF-Article 1.0
Author-Name: Asiye Nur Yildirim
Author-X-Name-First: Asiye Nur
Author-X-Name-Last: Yildirim
Author-Name: Eren Bas
Author-X-Name-First: Eren
Author-X-Name-Last: Bas
Author-Name: Erol Egrioglu
Author-X-Name-First: Erol
Author-X-Name-Last: Egrioglu
Title: Threshold single multiplicative neuron artificial neural networks for non-linear time series forecasting
Abstract:
Single multiplicative neuron artificial neural networks differ in importance from many other artificial neural networks because they do not suffer from the complex architecture selection problem or an excess of parameters, and they require little computation time to use. In a single multiplicative neuron artificial neural network, it is assumed that the time series has a single data generation process. Many time series, however, require the assumption of two or more data generation processes. Based on this idea, a threshold model structure can be employed in a single multiplicative neuron artificial neural network to account for multiple data generation processes. In this study, a new type of artificial neural network, called the threshold single multiplicative neuron artificial neural network, is proposed. It assumes that the time series has two data generation processes, reflected in the architecture of the single multiplicative neuron artificial neural network. Training algorithms based on the harmony search algorithm and particle swarm optimization are proposed for the threshold single multiplicative neuron artificial neural network. The proposed method is tested on various time series data sets and compared with well-known forecasting methods using different error measures. Finally, the performance of the proposed method is evaluated by a simulation study.
Journal: Journal of Applied Statistics
Pages: 2809-2825
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1869702
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1869702
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2809-2825
Template-Type: ReDIF-Article 1.0
Author-Name: Huseyin Guler
Author-X-Name-First: Huseyin
Author-X-Name-Last: Guler
Author-Name: Ebru Ozgur Guler
Author-X-Name-First: Ebru Ozgur
Author-X-Name-Last: Guler
Title: Mixed Lasso estimator for stochastic restricted regression models
Abstract:
Parameters of a linear regression model can be estimated with the help of traditional methods like generalized least squares and the mixed estimator. However, recent developments have increased the importance of big data sets, which have many more predictors than observations and in which some predictors have no impact on the dependent variable. The estimation and model selection problem for big data sets can be solved using the least absolute shrinkage and selection operator (Lasso). However, to the authors’ knowledge, there is no study that incorporates stochastic restrictions within a Lasso framework. In this paper, we propose a Mixed Lasso (M-Lasso) estimator that incorporates stochastic linear restrictions into big data sets for selecting the true model and estimating parameters simultaneously. We conduct a simulation study to compare the performance of M-Lasso with existing estimators based on mean squared error
$ (\textrm{mse}) $ and model selection performance. Results show that M-Lasso is superior in terms of
$ \textrm{mse} $ and generally dominates the compared estimators according to the model selection criteria. We employ M-Lasso to estimate the parameters of a widely analysed production function under stochastic restrictions derived from economic theory. Our results show that M-Lasso can provide reasonable and more precise estimates of model parameters that are in line with economic theory.
Journal: Journal of Applied Statistics
Pages: 2795-2808
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1922614
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1922614
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2795-2808
Template-Type: ReDIF-Article 1.0
Author-Name: O. Ozan Evkaya
Author-X-Name-First: O.
Author-X-Name-Last: Ozan Evkaya
Author-Name: Ceylan Yozgatlıgil
Author-X-Name-First: Ceylan
Author-X-Name-Last: Yozgatlıgil
Author-Name: A. Sevtap Selcuk-Kestel
Author-X-Name-First: A.
Author-X-Name-Last: Sevtap Selcuk-Kestel
Title: CD-vine model for capturing complex dependence
Abstract:
Copula-based finite mixture models allow us to capture the dependence between random variables more flexibly. Although the bivariate case of finite mixture models has been commonly studied, limited effort has been spent on finite mixtures of vines. Instead of using classical mixture models, it is possible to incorporate C-vines into the D-vine model (CD-vine) to understand the dependence among the variables over different time points. The aim of this study is to create a CD-vine mixture model expressing the dependencies between variables in temporal order. To achieve this, cumulative distribution function values generated within the time components are tied together with a D-vine probabilistically. With this approach, the dependence structure between variables at each time point is explained by a C-vine, and the dependence among the time points is captured by the D-vine model. The performance of the proposed CD-vine model is validated using simulated data and applied to four stock market indices.
Journal: Journal of Applied Statistics
Pages: 2406-2420
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1834519
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1834519
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2406-2420
Template-Type: ReDIF-Article 1.0
Author-Name: Beyza Cetin
Author-X-Name-First: Beyza
Author-X-Name-Last: Cetin
Author-Name: Idil Yavuz
Author-X-Name-First: Idil
Author-X-Name-Last: Yavuz
Title: Comparison of forecast accuracy of Ata and exponential smoothing
Abstract:
Forecasting is a crucial step in almost all scientific research and is essential in many areas of industrial, commercial, clinical and economic activity. There are many forecasting methods in the literature, but exponential smoothing stands out due to its simplicity and accuracy. Despite the fact that exponential smoothing is widely used and has been in the literature for a long time, it suffers from some problems that potentially affect the model's forecast accuracy. An alternative forecasting framework, called Ata, was recently proposed to overcome these problems and to provide improved forecasts. In this study, the forecast accuracy of Ata and exponential smoothing is compared on data sets with no trend or a linear trend. The results of this study are obtained using simulated data sets with different sample sizes and variances. Forecast errors are compared within both short- and long-term forecasting horizons. The results show that the proposed approach outperforms exponential smoothing for both types of time series data when forecasting the near and distant future. The methods are implemented on the U.S. annualized monthly interest rates for services data, and their forecasting performance is also compared for this data set.
Journal: Journal of Applied Statistics
Pages: 2580-2590
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1803813
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2580-2590
Template-Type: ReDIF-Article 1.0
Author-Name: Julide Yildirim
Author-X-Name-First: Julide
Author-X-Name-Last: Yildirim
Author-Name: Barış Alpaslan
Author-X-Name-First: Barış
Author-X-Name-Last: Alpaslan
Author-Name: Erdener Emin Eker
Author-X-Name-First: Erdener Emin
Author-X-Name-Last: Eker
Title: The role of social capital in environmental protection efforts: evidence from Turkey
Abstract:
The existing literature has recognized the role and importance of social capital in natural resource management. Several studies provide empirical evidence that higher levels of social capital may positively affect individuals' behavior towards natural resource management. This study is therefore an attempt to investigate the environmental quality impacts of social capital and of central government expenditures on environmental protection, taking the spatial dimension into account, for Turkey from 2009 to 2017. A general-to-specific approach has been adopted in which spatial variations in the relationships are examined with a dynamic spatial Durbin model, using panel data at the NUTS3 level. The empirical results do not support the validity of an environmental Kuznets curve; rather, a U-shaped environmental Kuznets curve is validated, which exhibits spatial dependence. Estimation results show that industrial production has detrimental effects on the environment, while social capital improves it. Central government expenditures on environmental protection are effective in the abatement of pollution, and their effectiveness is enhanced when social capital is controlled for. In addition to spatial spillover effects, our results show the presence of strong path dependency; that is, there is a certain pollution inertia. Moreover, environmental protection policies would be more effective if social capital levels were improved.
Journal: Journal of Applied Statistics
Pages: 2626-2642
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1843609
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1843609
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2626-2642
Template-Type: ReDIF-Article 1.0
Author-Name: Timothy E. O’Brien
Author-X-Name-First: Timothy E.
Author-X-Name-Last: O’Brien
Author-Name: Jack Silcox
Author-X-Name-First: Jack
Author-X-Name-Last: Silcox
Title: Efficient experimental design for dose response modelling
Abstract:
The binomial logistic dose-response model is commonly used in applied research to model binary outcomes as a function of the dose or concentration of a substance. This model is easily tailored to assess the relative potency of two substances. Consequently, in instances where two such dose-response curves are parallel, so that one substance can be viewed as a dilution of the other, the degree of that dilution is captured in the relative potency model parameter. It is incumbent on experimental researchers working in fields including biomedicine, environmental science, toxicology and the applied sciences to choose efficient experimental designs for their studies, both to fit their dose-response curves and to garner important information regarding drug or substance potency. This article provides far-reaching practical design strategies for dose-response model fitting and estimation of relative potency using key illustrations. These results are subsequently extended to handle situations where the assessment of parallelism and the proper dose scale are also of interest. Conclusions and recommended strategies are supported by both theoretical and simulation results.
Journal: Journal of Applied Statistics
Pages: 2864-2888
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1880556
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1880556
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2864-2888
Template-Type: ReDIF-Article 1.0
Author-Name: Paul P. Momtaz
Author-X-Name-First: Paul P.
Author-X-Name-Last: Momtaz
Title: Econometric models of duration data in entrepreneurship with an application to start-ups' time-to-funding by venture capitalists (VCs)
Abstract:
Because time is a key determinant of entrepreneurial decision making, time-to-event models are ubiquitous in entrepreneurship. A widespread econometric misconception, however, may cause complicated biases in existing studies. The reason is spurious duration dependency, a complicated form of endogeneity caused by unobserved heterogeneity, which is particularly pronounced in entrepreneurship data. This article discusses the endogeneity problem and methods to ‘debias’ time-to-event models in entrepreneurship. Simulations and empirical evidence indicate that only the frailty approach yields consistently unbiased parameter estimates. An application to start-up firms' time-to-funding shows that other methods lead to dramatic biases. Therefore, this article advocates a paradigm shift in the modeling of time variables in entrepreneurship.
Journal: Journal of Applied Statistics
Pages: 2673-2694
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1896686
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1896686
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2673-2694
Template-Type: ReDIF-Article 1.0
Author-Name: Luis A. Gil-Alana
Author-X-Name-First: Luis A.
Author-X-Name-Last: Gil-Alana
Author-Name: OlaOluwa S. Yaya
Author-X-Name-First: OlaOluwa S.
Author-X-Name-Last: Yaya
Title: Testing fractional unit roots with non-linear smooth break approximations using Fourier functions
Abstract:
In this paper, we present a testing procedure for fractional orders of integration in the context of non-linear terms approximated by Fourier functions. The test statistic has an asymptotic standard normal distribution and several Monte Carlo experiments conducted in the paper show that it performs well in finite samples. Various applications using real life time series, such as US unemployment rates, US GNP and Purchasing Power Parity (PPP) of G7 countries are presented at the end of the paper.
Journal: Journal of Applied Statistics
Pages: 2542-2559
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1757047
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1757047
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2542-2559
Template-Type: ReDIF-Article 1.0
Author-Name: Mehmet Ali Balcı
Author-X-Name-First: Mehmet Ali
Author-X-Name-Last: Balcı
Author-Name: Ömer Akgüller
Author-X-Name-First: Ömer
Author-X-Name-Last: Akgüller
Author-Name: Serdar Can Güzel
Author-X-Name-First: Serdar
Author-X-Name-Last: Can Güzel
Title: Hierarchies in communities of UK stock market from the perspective of Brexit
Abstract:
Nowadays, the increasing tendency to analyze stock markets as complex systems has led graph theory to play a key role. For instance, detecting graph communities is an important task in the analysis of stocks, and planar maximally filtered graphs let us obtain important information about the topology of the market. In this study, we first obtain a correlation network representation of the UK's leading stock market by using a novel threshold method. Then, we determine vertex clusters by using modularity and analyze the clusters in planar maximally filtered graph substructures. Our analysis includes a new measure, called the weighted Gini index, for measuring sparsity. The main goal of this paper is to study the hierarchical evolution of the market communities throughout the Brexit referendum, which is known as a stress period for the stock market. Hence, the overall sample is divided into two sub-periods, pre-referendum and post-referendum, to obtain communities and hierarchical structures. Our results indicate that financial companies are the leading elements of the clusters. Moreover, significant changes in the network topologies are observed for the insurance, consumer goods, consumer services, mining, and technology sectors, whereas the oil and gas and health care sectors have not been affected by Brexit stress.
Journal: Journal of Applied Statistics
Pages: 2607-2625
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1796942
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796942
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2607-2625
Template-Type: ReDIF-Article 1.0
Author-Name: Fatma Yerlikaya-Özkurt
Author-X-Name-First: Fatma
Author-X-Name-Last: Yerlikaya-Özkurt
Author-Name: Pakize Taylan
Author-X-Name-First: Pakize
Author-X-Name-Last: Taylan
Author-Name: Müjgan Tez
Author-X-Name-First: Müjgan
Author-X-Name-Last: Tez
Title: Estimation in the partially nonlinear model by continuous optimization
Abstract:
A useful model for data analysis is the partially nonlinear model, in which the response variable is represented as the sum of a nonparametric and a parametric component. In this study, we propose a new procedure for estimating the parameters in partially nonlinear models. We consider the penalized profile nonlinear least squares problem, where the nonparametric component is expressed in terms of a B-spline basis; the estimation problem is then expressed as conic quadratic programming, a continuous optimization problem, and solved by an interior point method. An application study is conducted to evaluate the performance of the proposed method by considering some well-known performance measures. The results are compared against the parametric nonlinear model.
Journal: Journal of Applied Statistics
Pages: 2826-2846
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1864816
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864816
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2826-2846
Template-Type: ReDIF-Article 1.0
Author-Name: Necla Gündüz
Author-X-Name-First: Necla
Author-X-Name-Last: Gündüz
Author-Name: Ernest Fokoué
Author-X-Name-First: Ernest
Author-X-Name-Last: Fokoué
Title: Understanding students' evaluations of professors using non-negative matrix factorization
Abstract:
In this paper, we use Nonnegative Matrix Factorization (NMF) and several other state-of-the-art statistical machine learning techniques to provide an in-depth study of university professor evaluations by their students. We specifically use the Kullback–Leibler divergence as our loss function, in keeping with the type of the data, and extract revealing patterns consistent with the educational objectives underlying the questionnaire design. In particular, the application of our techniques to a dataset gathered at Gazi University in Turkey reveals compelling patterns, such as the strong association between students' seriousness and dedication (measured by attendance) and the kind of scores they tend to assign to the courses and the corresponding professors. Insights emerging from our study suggest that more aspects of students' evaluations should be explored at greater depth.
Journal: Journal of Applied Statistics
Pages: 2961-2981
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1991288
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1991288
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2961-2981
Template-Type: ReDIF-Article 1.0
Author-Name: J. M. Muñoz-Pichardo
Author-X-Name-First: J. M.
Author-X-Name-Last: Muñoz-Pichardo
Author-Name: R. Pino-Mejías
Author-X-Name-First: R.
Author-X-Name-Last: Pino-Mejías
Author-Name: J. García-Heras
Author-X-Name-First: J.
Author-X-Name-Last: García-Heras
Author-Name: F. Ruiz-Muñoz
Author-X-Name-First: F.
Author-X-Name-Last: Ruiz-Muñoz
Author-Name: M. Luz González-Regalado
Author-X-Name-First: M.
Author-X-Name-Last: Luz González-Regalado
Title: A multivariate Poisson regression model for count data
Abstract:
We propose a new technique for the study of multivariate count data. The proposed model is applied to the study of the number of individuals of several fossil species found at a set of geographical observation points. First, we propose a multivariate model based on Poisson distributions, which allows positive and negative correlations between the components. We extend the log-linear Poisson model to the multivariate case through the conditional distributions. For this model, we obtain the maximum likelihood estimates and compute several goodness-of-fit statistics. Finally, we illustrate the application of the proposed method on several data sets: various simulated data sets and a count data set of various fossil species.
Journal: Journal of Applied Statistics
Pages: 2525-2541
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1877637
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1877637
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2525-2541
Template-Type: ReDIF-Article 1.0
Author-Name: Fatma Başoğlu Kabran
Author-X-Name-First: Fatma
Author-X-Name-Last: Başoğlu Kabran
Author-Name: Kamil Demirberk Ünlü
Author-X-Name-First: Kamil Demirberk
Author-X-Name-Last: Ünlü
Title: A two-step machine learning approach to predict S&P 500 bubbles
Abstract:
In this paper, we are interested in predicting bubbles in the S&P 500 stock market with a two-step machine learning approach that employs a real-time bubble detection test and a support vector machine (SVM). SVM, a nonparametric binary classification technique, is already a widely used method in financial time series forecasting. In the literature, a bubble is often defined as a situation where the asset price exceeds its fundamental value. As one of the early warning signals, the prediction of bubbles is vital for policymakers and regulators, who are responsible for taking preemptive measures against future crises. Therefore, many attempts have been made to understand the main factors in bubble formation and to predict bubbles in their earlier phases. Our analysis consists of two steps. The first step is to identify the bubbles in the S&P 500 index using a widely recognized right-tailed unit root test. Then, SVM is employed to predict the bubbles from macroeconomic indicators. We also compare SVM with different supervised learning algorithms by using k-fold cross-validation. The experimental results show that the proposed approach, with its high predictive power, could be a favourable alternative in bubble prediction.
Journal: Journal of Applied Statistics
Pages: 2776-2794
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1823947
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1823947
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2776-2794
Template-Type: ReDIF-Article 1.0
Author-Name: Hatice Tul Kubra Akdur
Author-X-Name-First: Hatice Tul Kubra
Author-X-Name-Last: Akdur
Title: Unit-Lindley mixed-effect model for proportion data
Abstract:
Recently, the unit-Lindley distribution and its associated regression models have been developed as an alternative to the Beta regression model for continuous outcomes in the unit interval $(0,1)$. Proportion data usually occur in clinical trials, economics and social studies with hierarchical structures. In this study, a unit-Lindley mixed-effect model is proposed, and appropriate likelihood analysis methods for parameter estimation are investigated. In the case of clustered or longitudinal proportion data in mixed-effect models, the full likelihood function does not have a closed form. Parameter estimates of the unit-Lindley mixed-effect model are obtained with Laplace and adaptive Gaussian quadrature approximation methods. We analyzed a dataset on the proportion of households with insufficient water supply and sewage, together with some sociodemographic variables, in the cities of Brazil by using a unit-Lindley mixed-effect model including a random intercept for the federative states of Brazil. The analysis results indicate that the proposed unit-Lindley mixed-effect model provides a better fit than the unit-Lindley regression model and the beta mixed model. Also, the accuracy of the estimates from the approximation methods is evaluated and compared via a Monte Carlo simulation study in terms of bias and mean square error.
Journal: Journal of Applied Statistics
Pages: 2389-2405
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1823946
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1823946
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2389-2405
Template-Type: ReDIF-Article 1.0
Author-Name: Nuriye Sancar
Author-X-Name-First: Nuriye
Author-X-Name-Last: Sancar
Author-Name: Deniz Inan
Author-X-Name-First: Deniz
Author-X-Name-Last: Inan
Title: A new alternative estimation method for Liu-type logistic estimator via particle swarm optimization: an application to data of collapse of Turkish commercial banks during the Asian financial crisis
Abstract:
In the presence of a multicollinearity problem in the logistic model, some important problems may occur in the analysis of the model, such as an unstable maximum likelihood estimator with very high standard errors and false inferences. The Liu-type logistic estimator was proposed as a two-parameter estimator to overcome the multicollinearity problem in the logistic model. In previous studies, the (k, d) pair in this shrinkage estimator is estimated by two-phase methods. However, since different estimators can be utilized in the estimation of d, the choice of the (k, d) pair provided by the two-phase approaches is not guaranteed to overcome multicollinearity. In this article, a new alternative method based on particle swarm optimization is suggested to estimate the (k, d) pair in the Liu-type logistic estimator simultaneously. For this purpose, an objective function is developed that eliminates the multicollinearity problem and provides minimization of the bias of the model and improvement of the model’s predictive performance. A Monte Carlo simulation study is conducted to show the performance of the proposed method by comparing it with existing methods. The performance of the proposed method is also demonstrated on a real dataset related to the collapse of commercial banks in Turkey during the Asian financial crisis.
Journal: Journal of Applied Statistics
Pages: 2499-2514
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1837085
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1837085
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2499-2514
Template-Type: ReDIF-Article 1.0
Author-Name: Atila Göktaş
Author-X-Name-First: Atila
Author-X-Name-Last: Göktaş
Author-Name: Özge Akkuş
Author-X-Name-First: Özge
Author-X-Name-Last: Akkuş
Author-Name: Aykut Kuvat
Author-X-Name-First: Aykut
Author-X-Name-Last: Kuvat
Title: A new robust ridge parameter estimator based on search method for linear regression model
Abstract:
A large and wide variety of ridge parameter estimators for linear regression models exist in the literature. Indeed, the practice of proposing a new ridge parameter estimator and then proving its efficiency in a few cases seems endless. However, so far there is no ridge parameter estimator that serves best for any sample size or any degree of collinearity among the regressors. In this study, we propose a new robust ridge parameter estimator that serves well in any case, regardless of the sample size, the number of regressors and the degree of collinearity. This is realized by choosing the three best performers from the enormous number of ridge parameter estimators that do well in different cases and combining them, via a search method, into a new ridge parameter estimator that provides the smallest mean square error values for the regression parameters. A simulation study is then conducted to show that the proposed estimator is robust. In conclusion, it is found that this ridge parameter estimator is promising in any case. Moreover, a recent data set is used as an illustrative example to show that the proposed ridge parameter estimator performs better.
Journal: Journal of Applied Statistics
Pages: 2457-2472
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1803814
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803814
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2457-2472
Template-Type: ReDIF-Article 1.0
Author-Name: G. Deliorman
Author-X-Name-First: G.
Author-X-Name-Last: Deliorman
Author-Name: D. Inan
Author-X-Name-First: D.
Author-X-Name-Last: Inan
Title: Binary particle swarm optimization as a detection tool for influential subsets in linear regression
Abstract:
An influential observation is any point that has a large effect on the coefficients of a regression line fitted to the data. The presence of such observations in the data set reduces the sensitivity and validity of the statistical analysis. In the literature there are many methods for identifying influential observations. However, many of those methods are highly affected by masking and swamping effects and require distributional assumptions. Especially in the presence of influential subsets, most of these methods are insufficient to detect these observations. This study aims to develop a new diagnostic tool for identifying influential observations using the meta-heuristic binary particle swarm optimization algorithm. The proposed approach does not require any distributional assumptions and is not affected by masking and swamping effects, unlike the known methods. The performance of the proposed method is analyzed via simulations and real data set applications.
Journal: Journal of Applied Statistics
Pages: 2441-2456
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1779196
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1779196
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2441-2456
Template-Type: ReDIF-Article 1.0
Author-Name: Adrien Ehrhardt
Author-X-Name-First: Adrien
Author-X-Name-Last: Ehrhardt
Author-Name: Christophe Biernacki
Author-X-Name-First: Christophe
Author-X-Name-Last: Biernacki
Author-Name: Vincent Vandewalle
Author-X-Name-First: Vincent
Author-X-Name-Last: Vandewalle
Author-Name: Philippe Heinrich
Author-X-Name-First: Philippe
Author-X-Name-Last: Heinrich
Author-Name: Sébastien Beben
Author-X-Name-First: Sébastien
Author-X-Name-Last: Beben
Title: Reject inference methods in credit scoring
Abstract:
The loan granting process is based on the probability that the applicant will repay his/her loan given his/her characteristics. This probability, also called the score, is learnt from a dataset in which rejected applicants are excluded. Thus, the population on which the score is used is different from the learning population. Many “reject inference” methods try to exploit the data available from the rejected applicants in the learning process. However, most of these methods are empirical and lack formalization of their assumptions and of their expected theoretical properties. We formalize such hidden assumptions in a general missing data setting for some of the most common reject inference methods. This reveals that the hidden modelling is mostly incomplete, thus prohibiting comparison of existing methods within the general model selection mechanism (except by financing “non-fundable” applicants). We therefore assess the performance of the methods on both simulated data and real data (from CACF, a major European loan issuer). Unsurprisingly, no method seems uniformly dominant. Both these theoretical and empirical results not only reinforce the idea of using the classical reject inference methods carefully but also of investing in future research on designing model-based reject inference methods (without financing “non-fundable” applicants).
Journal: Journal of Applied Statistics
Pages: 2734-2754
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1929090
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1929090
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2734-2754
Template-Type: ReDIF-Article 1.0
Author-Name: Burak A. Erog̃lu
Author-X-Name-First: Burak A.
Author-X-Name-Last: Erog̃lu
Author-Name: Selim Yıldırım
Author-X-Name-First: Selim
Author-X-Name-Last: Yıldırım
Title: On the performance of the variance ratio unit root tests with flexible Fourier form
Abstract:
This article introduces a new unit root test that combines the variance ratio framework with the Flexible Fourier Form under the generalized least squares detrending mechanism. The advantage of the proposed method against its alternatives can be listed as: (1) it suggests a non-parametric procedure that does not require any parametric or semi-parametric model to remove serial correlation in the innovation process; (2) it can reasonably adapt itself to deal with the multiple structural breaks with various functional specifications. In the simulation exercises, we show that the proposed method exhibits satisfactory performance in the size and size-adjusted power analysis.
Journal: Journal of Applied Statistics
Pages: 2560-2579
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1796939
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2560-2579
Template-Type: ReDIF-Article 1.0
Author-Name: Atif Evren
Author-X-Name-First: Atif
Author-X-Name-Last: Evren
Author-Name: Elif Tuna
Author-X-Name-First: Elif
Author-X-Name-Last: Tuna
Author-Name: Erhan Ustaoglu
Author-X-Name-First: Erhan
Author-X-Name-Last: Ustaoglu
Author-Name: Busra Sahin
Author-X-Name-First: Busra
Author-X-Name-Last: Sahin
Title: Some dominance indices to determine market concentration
Abstract:
This study intends to provide new insight into concentration and dominance indices, as concerns grow about increasing concentration in markets around the world. Most studies attempting to measure concentration or dominance in a market employ popular concentration/dominance indices like the Herfindahl–Hirschmann, Hannah–Kay and Rosenbluth–Hall–Tidemann indices and the concentration ratio. On the other hand, measures of qualitative variation are closely related to entropy, diversity and concentration/dominance measures. In this study, two normalized dominance measures that can be derived from the work of Wilcox on qualitative variation are proposed. The limiting distributions of these normalized dominance measures are formulated. Through simulations, the asymptotic behaviors of these indices are analyzed under some assumptions about the market structure. Finally, through an application to Turkish car sales in 2019, it is determined that the values of the dominance indices vary over a considerably large range. One of the dominance indices is found to have the advantages of less estimation error, less sensitivity to smaller market shares, and less sampling variability.
Journal: Journal of Applied Statistics
Pages: 2755-2775
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1963421
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1963421
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2755-2775
Template-Type: ReDIF-Article 1.0
Author-Name: Hao Chen
Author-X-Name-First: Hao
Author-X-Name-Last: Chen
Author-Name: Minguang Zhang
Author-X-Name-First: Minguang
Author-X-Name-Last: Zhang
Author-Name: Lanshan Han
Author-X-Name-First: Lanshan
Author-X-Name-Last: Han
Author-Name: Alvin Lim
Author-X-Name-First: Alvin
Author-X-Name-Last: Lim
Title: Hierarchical marketing mix models with sign constraints
Abstract:
Marketing mix models (MMMs) are statistical models for measuring the effectiveness of various marketing activities such as promotion, media advertisement, etc. In this research, we propose a comprehensive marketing mix model that captures the hierarchical structure and the carryover, shape and scale effects of certain marketing activities, as well as sign restrictions on certain coefficients that are consistent with common business sense. In contrast to commonly adopted approaches in practice, which estimate parameters in a multi-stage process, the proposed approach estimates all the unknown parameters simultaneously using a constrained maximum likelihood approach and a Hamiltonian Monte Carlo algorithm. We present results on real datasets to illustrate the use of the proposed solution algorithms.
Journal: Journal of Applied Statistics
Pages: 2944-2960
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1946020
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1946020
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2944-2960
Template-Type: ReDIF-Article 1.0
Author-Name: Emrah Altun
Author-X-Name-First: Emrah
Author-X-Name-Last: Altun
Title: The Lomax regression model with residual analysis: an application to insurance data
Abstract:
In this paper, we introduce a new regression model, called the Lomax regression model, as an alternative to the gamma regression model. The maximum-likelihood method is used to estimate the unknown parameters of the proposed model, and the finite-sample performance of the maximum-likelihood estimators is evaluated by means of a Monte Carlo simulation study. Randomized quantile residuals are used to check the adequacy of the fitted model. Insurance data are analyzed to demonstrate the usefulness of the proposed regression model against the gamma regression model.
Journal: Journal of Applied Statistics
Pages: 2515-2524
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2020.1834515
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1834515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2515-2524
Template-Type: ReDIF-Article 1.0
Author-Name: Barry C. Arnold
Author-X-Name-First: Barry C.
Author-X-Name-Last: Arnold
Author-Name: B. G. Manjunath
Author-X-Name-First: B. G.
Author-X-Name-Last: Manjunath
Title: Statistical inference for distributions with one Poisson conditional
Abstract:
It will be recalled that the classical bivariate normal distributions have normal marginals and normal conditionals. It is natural to ask whether a similar phenomenon can be encountered involving Poisson marginals and conditionals. However, it is known, from research on conditionally specified models, that Poisson marginals will be encountered, together with both conditionals being of the Poisson form, only in the case in which the variables are independent. In order to have a flexible dependent bivariate model with some Poisson components, in the present article, we will be focusing on bivariate distributions with one marginal and the other family of conditionals being of the Poisson form. Such distributions are called Pseudo-Poisson distributions. We discuss distributional features of such models, explore inferential aspects and include an example of applications of the Pseudo-Poisson model to sets of over-dispersed data.
Journal: Journal of Applied Statistics
Pages: 2306-2325
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1928017
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1928017
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2306-2325
Template-Type: ReDIF-Article 1.0
Author-Name: Özlem Türkşen
Author-X-Name-First: Özlem
Author-X-Name-Last: Türkşen
Title: A novel perspective for parameter estimation of seemingly unrelated nonlinear regression
Abstract:
Nonlinear regression is commonly used as a modeling tool to obtain a functional form between inputs and response variables when they are nonlinearly related. For multi-response data sets, it is preferable to build the predicted nonlinear models while taking the correlation between the responses into account. For this purpose, seemingly unrelated nonlinear regression (SUNR) has been widely used in the literature. The parameter estimation procedure of the SUNR is based on the nonlinear least squares (NLS) method, defined in the L2-norm. However, it is possible to use different norms in the parameter estimation process. The novelty of this study is to demonstrate the applicability of the least absolute deviation (LAD) method, defined in the L1-norm, together with the NLS method for obtaining parameter estimates of the SUNR model from a multi-objective perspective. The proposed multi-objective SUNR model is called MO-SUNR, and its optimization is achieved by using soft computing methods. Two data set examples are given to illustrate the application of the MO-SUNR model. The results show that MO-SUNR provides many alternative compromise parameter estimates through the simultaneous evaluation of the LAD and NLS methods.
Journal: Journal of Applied Statistics
Pages: 2326-2347
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1877638
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1877638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2326-2347
Template-Type: ReDIF-Article 1.0
Author-Name: A. Göktaş
Author-X-Name-First: A.
Author-X-Name-Last: Göktaş
Author-Name: Ö. Akkuş
Author-X-Name-First: Ö.
Author-X-Name-Last: Akkuş
Title: Editorial to the special issue: Recent statistical methods for data analysis, applied economics, business & finance
Journal: Journal of Applied Statistics
Pages: 2231-2238
Issue: 13-15
Volume: 48
Year: 2021
Month: 11
X-DOI: 10.1080/02664763.2021.1991180
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1991180
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:13-15:p:2231-2238
Template-Type: ReDIF-Article 1.0
Author-Name: Indranil Ghosh
Author-X-Name-First: Indranil
Author-X-Name-Last: Ghosh
Author-Name: Filipe Marques
Author-X-Name-First: Filipe
Author-X-Name-Last: Marques
Author-Name: Subrata Chakraborty
Author-X-Name-First: Subrata
Author-X-Name-Last: Chakraborty
Title: A new bivariate Poisson distribution via conditional specification: properties and applications
Abstract:
In this article, we discuss a bivariate Poisson distribution whose conditionals are univariate Poisson distributions, whose marginals are not Poisson, and which exhibits negative correlation. Some useful structural properties of this distribution, namely marginals, moments, generating functions and stochastic ordering, are investigated. Simple proofs of negative correlation, marginal over-dispersion, the distribution of the sum and the conditional given the sum are also derived. The distribution is shown to be a member of the multi-parameter exponential family, and some natural but useful consequences are outlined. Parameter estimation by maximum likelihood is implemented. Copula-based simulation experiments are carried out using the bivariate normal and the Farlie–Gumbel–Morgenstern copulas to assess how the model behaves in such situations. Finally, the distribution is fitted to seven bivariate count data sets with an inherent negative correlation to illustrate its suitability.
Journal: Journal of Applied Statistics
Pages: 3025-3047
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1793307
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1793307
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3025-3047
Template-Type: ReDIF-Article 1.0
Author-Name: Daniel C. F. Guzmán
Author-X-Name-First: Daniel C. F.
Author-X-Name-Last: Guzmán
Author-Name: Clécio S. Ferreira
Author-X-Name-First: Clécio S.
Author-X-Name-Last: Ferreira
Author-Name: Camila B. Zeller
Author-X-Name-First: Camila B.
Author-X-Name-Last: Zeller
Title: Linear censored regression models with skew scale mixtures of normal distributions
Abstract:
A special source of difficulty in statistical analysis is the possibility that some subjects may not have a complete observation of the response variable. Such incomplete observation of the response variable is called censoring. Censoring can occur for a variety of reasons, including limitations of measurement equipment, the design of the experiment, and non-occurrence of the event of interest until the end of the study. In the presence of censoring, the dependence of the response variable on the explanatory variables can be explored through regression analysis. In this paper, we examine the censoring problem in the context of a class of asymmetric distributions; that is, we propose a linear regression model with censored responses based on skew scale mixtures of normal distributions. We develop a Monte Carlo EM (MCEM) algorithm to perform maximum likelihood inference of the parameters in the proposed linear censored regression models with skew scale mixtures of normal distributions. The MCEM algorithm is discussed with an emphasis on the skew-normal, skew Student-t-normal, skew-slash and skew-contaminated normal distributions. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset.
Journal: Journal of Applied Statistics
Pages: 3060-3085
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1795814
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795814
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3060-3085
Template-Type: ReDIF-Article 1.0
Author-Name: Tao Chen
Author-X-Name-First: Tao
Author-X-Name-Last: Chen
Author-Name: Qingliang Fan
Author-X-Name-First: Qingliang
Author-X-Name-Last: Fan
Author-Name: Kai Liu
Author-X-Name-First: Kai
Author-X-Name-Last: Liu
Author-Name: Lingshan Le
Author-X-Name-First: Lingshan
Author-X-Name-Last: Le
Title: Identifying key factors in momentum in basketball games
Abstract:
Momentum as elaborated under a recent novel definition has been shown quantitatively to have a significant impact on basketball game outcomes. This paper makes two contributions to the analytical literature on sports momentum: (1) two aspects of the new definition are operationalized so that its practicality becomes evident; and (2) through a dimension-reduction technique (elastic net), key factors associated with momentum are identified. Both technical variables such as field goals, assists, rebounds, etc. and environmental variables such as the spectator attendance rate and player salary dispersion are considered, and the potential for useful real-time analyses is illustrated.
Journal: Journal of Applied Statistics
Pages: 3116-3129
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1795819
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795819
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3116-3129
Template-Type: ReDIF-Article 1.0
Author-Name: Vitor Capdeville
Author-X-Name-First: Vitor
Author-X-Name-Last: Capdeville
Author-Name: Kelly C. M. Gonçalves
Author-X-Name-First: Kelly C. M.
Author-X-Name-Last: Gonçalves
Author-Name: João B. M. Pereira
Author-X-Name-First: João B. M.
Author-X-Name-Last: Pereira
Title: Bayesian factor models for multivariate categorical data obtained from questionnaires
Abstract:
Factor analysis is a flexible technique for assessment of multivariate dependence and codependence. Besides being an exploratory tool used to reduce the dimensionality of multivariate data, it allows estimation of common factors that often have an interesting theoretical interpretation in real problems. However, standard factor analysis is only applicable when the variables are scaled, which is often inappropriate, for example, in data obtained from questionnaires in the field of psychology, where the variables are often categorical. In this framework, we propose a factor model for the analysis of multivariate ordered and non-ordered polychotomous data. The inference procedure is done under the Bayesian approach via Markov chain Monte Carlo methods. Two Monte Carlo simulation studies are presented to investigate the performance of this approach in terms of estimation bias, precision and assessment of the number of factors. We also illustrate the proposed method to analyze participants' responses to the Motivational State Questionnaire dataset, developed to study emotions in laboratory and field settings.
Journal: Journal of Applied Statistics
Pages: 3150-3173
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1796935
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796935
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3150-3173
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Correction
Journal: Journal of Applied Statistics
Pages: 3253-3254
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2021.1972659
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1972659
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3253-3254
Template-Type: ReDIF-Article 1.0
Author-Name: Hassan S. Bakouch
Author-X-Name-First: Hassan S.
Author-X-Name-Last: Bakouch
Author-Name: Meitner Cadena
Author-X-Name-First: Meitner
Author-X-Name-Last: Cadena
Author-Name: Christophe Chesneau
Author-X-Name-First: Christophe
Author-X-Name-Last: Chesneau
Title: A new class of skew distributions with climate data analysis
Abstract:
In this paper, we develop a new general class of skew distributions with flexible tail properties; in particular, the class can provide both heavy and light tails. Some of its mathematical properties are studied, including the quantile function, the moments, the moment generating function and the mean deviations. New skew distributions are derived and used to construct new models capturing the asymmetry inherent in data. The estimation of the class parameters is investigated by the method of maximum likelihood, and the performance of the estimators is assessed by a simulation study. Applications of the proposed distribution are explored for two climate data sets. The first data set concerns the annual heat wave index, and the second involves temperature and precipitation measures from the meteorological station located at Schiphol, Netherlands. Data fitting results show that our models perform better than the competitors.
Journal: Journal of Applied Statistics
Pages: 3002-3024
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1791804
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1791804
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3002-3024
Template-Type: ReDIF-Article 1.0
Author-Name: Renata Rojas Guerra
Author-X-Name-First: Renata Rojas
Author-X-Name-Last: Guerra
Author-Name: Fernando A. Peña-Ramírez
Author-X-Name-First: Fernando A.
Author-X-Name-Last: Peña-Ramírez
Author-Name: Marcelo Bourguignon
Author-X-Name-First: Marcelo
Author-X-Name-Last: Bourguignon
Title: The unit extended Weibull families of distributions and its applications
Abstract:
In this paper, two new general families of distributions supported on the unit interval are introduced. The proposed families include several known models as special cases, and each defines at least twenty new special models. Since well-being indicators often include doubly bounded random variables, the applicability to modeling such variables is the major practical motivation for introducing these families. We propose a parametrization of the new families in terms of the median and develop a shiny application to provide interactive density-shape illustrations for some special cases. Various properties of the introduced families are studied, and some special models are discussed. In particular, the complementary unit Weibull distribution is studied in some detail. The method of maximum likelihood for estimating the model parameters is discussed. An extensive Monte Carlo experiment is conducted to evaluate the performance of these estimators in finite samples. Applications to the literacy rate in Brazilian and Colombian municipalities illustrate the usefulness of the two new families for modeling well-being indicators.
Journal: Journal of Applied Statistics
Pages: 3174-3192
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1796936
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796936
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3174-3192
Template-Type: ReDIF-Article 1.0
Author-Name: Julio López-Laborda
Author-X-Name-First: Julio
Author-X-Name-Last: López-Laborda
Author-Name: Carmen Marín-González
Author-X-Name-First: Carmen
Author-X-Name-Last: Marín-González
Author-Name: Jorge Onrubia-Fernández
Author-X-Name-First: Jorge
Author-X-Name-Last: Onrubia-Fernández
Title: Estimating Engel curves: a new way to improve the SILC-HBS matching process using GLM methods
Abstract:
Microdata are required to evaluate the distributive impact of the taxation system as a whole (direct and indirect taxes) on individuals or households. However, in European Union countries this information is usually distributed into two separate surveys: the Household Budget Surveys (HBS), including total household expenditure and its composition, and EU Statistics on Income and Living Conditions (EU-SILC), including detailed information about households' income and direct (but not indirect) taxes paid. We present a parametric statistical matching procedure to merge both surveys. For the first stage of matching, we propose estimating total household expenditure in HBS (Engel curves) using a GLM estimator, instead of the traditionally used OLS method. It is a better alternative, insofar as it can deal with the heteroskedasticity problem of the OLS estimates, while making it unnecessary to retransform the regressors estimated in logarithms. To evaluate these advantages of the GLM estimator, we conducted a computational Monte Carlo simulation. In addition, when an error term is added to the deterministic imputation of expenditure in the EU-SILC, we propose replacing the usual Normal distribution of the error with a Chi-square type, which allows a better approximation to the original expenditures variance in the HBS. An empirical analysis is provided using Spanish surveys for years 2012–2016. In addition, we extend the empirical analysis to the rest of the European Union countries, using the surveys provided by Eurostat (EU-SILC, 2011; HBS, 2010).
Journal: Journal of Applied Statistics
Pages: 3233-3250
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1796933
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796933
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3233-3250
Template-Type: ReDIF-Article 1.0
Author-Name: Sungsu Kim
Author-X-Name-First: Sungsu
Author-X-Name-Last: Kim
Author-Name: Ashis SenGupta
Author-X-Name-First: Ashis
Author-X-Name-Last: SenGupta
Title: Multimodal exponential families of circular distributions with application to daily peak hours of PM2.5 level in a large city
Abstract:
In this paper, we propose two multimodal circular distributions which are suitable for modeling circular data sets with two or more modes. Both distributions belong to the regular exponential family of distributions and can be considered extensions of the von Mises distribution. Hence, they possess highly desirable properties, such as the existence of non-trivial sufficient statistics and optimal inferences for their parameters. Fine particulates (PM2.5) are generally emitted from activities such as industrial and residential combustion and from vehicle exhaust. We illustrate the utility of our proposed models using a real data set consisting of fine particulate (PM2.5) pollutant levels in the Houston region during the Fall season of 2019. Our results provide strong evidence that the diurnal pattern exhibits four modes: two peaks during morning and evening rush hours and two peaks in between.
Journal: Journal of Applied Statistics
Pages: 3193-3207
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1796938
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796938
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3193-3207
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas J. Smith
Author-X-Name-First: Thomas J.
Author-X-Name-Last: Smith
Author-Name: David A. Walker
Author-X-Name-First: David A.
Author-X-Name-Last: Walker
Author-Name: Cornelius M. McKenna
Author-X-Name-First: Cornelius M.
Author-X-Name-Last: McKenna
Title: A coefficient of discrimination for use with nominal and ordinal regression models
Abstract:
This study introduces a coefficient of discrimination for use with nominal and ordinal regression models. Computation of the coefficient is demonstrated with data from the Pew Research Center’s 25th Anniversary of the Web Omnibus Survey pertaining to cell/home phone ownership, where the coefficient of discrimination indicates that respondent age and gender increased the probability of a correct versus incorrect classification by 13.9%. Additionally, the coefficient is compared to existing coefficients.
Journal: Journal of Applied Statistics
Pages: 3208-3219
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1796940
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796940
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3208-3219
Template-Type: ReDIF-Article 1.0
Author-Name: E. P. Sreedevi
Author-X-Name-First: E. P.
Author-X-Name-Last: Sreedevi
Author-Name: P. G. Sankaran
Author-X-Name-First: P. G.
Author-X-Name-Last: Sankaran
Title: Nonparametric inference for panel count data with competing risks
Abstract:
In survival and reliability studies, panel count data arise when we investigate a recurrent event process and each study subject is observed only at discrete time points. If recurrent events of several types are possible, we obtain panel count data with competing risks. Such data arise frequently from transversal studies on recurrent events in demography, epidemiology and reliability experiments where the individuals cannot be observed continuously. In the present paper, we propose an isotonic regression estimator for the cause-specific mean function of the underlying recurrent event process of competing risks panel count data. Further, a nonparametric test is proposed to compare the cause-specific mean functions of panel count competing risks data. Asymptotic properties of the proposed estimator and test statistic are studied. A simulation study is conducted to assess the finite sample behaviour of the proposed estimator and test statistic. Finally, the procedures developed are applied to a real data set arising from a skin cancer chemoprevention trial.
Journal: Journal of Applied Statistics
Pages: 3102-3115
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1795816
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795816
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3102-3115
Template-Type: ReDIF-Article 1.0
Author-Name: Pallab Kumar Ghosh
Author-X-Name-First: Pallab Kumar
Author-X-Name-Last: Ghosh
Title: Box–Cox power transformation unconditional quantile regressions with an application on wage inequality
Abstract:
This study proposes a semi-parametric estimation method, Box–Cox power transformation unconditional quantile regression, to estimate the impact of changes in the distribution of the explanatory variables on the unconditional quantile of the outcome variable. The proposed method consists of running a nonlinear regression of the recentered influence function (RIF) of the outcome variable on the explanatory variables. We also show the asymptotic properties of the proposed estimator and apply the estimation method to address an existing puzzle in labor economics–why the 50th/10th percentile wage gap has been falling in the USA since the late 1980s. Our results show that declining unionization can explain approximately 10% of the decline in the 50/10 wage gap in 1990–2000 and 23% in 2000–2010.
Journal: Journal of Applied Statistics
Pages: 3086-3101
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1795817
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795817
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3086-3101
Template-Type: ReDIF-Article 1.0
Author-Name: Fernando Ferraz do Nascimento
Author-X-Name-First: Fernando Ferraz
Author-X-Name-Last: do Nascimento
Author-Name: Andreson Almeida Azevedo
Author-X-Name-First: Andreson Almeida
Author-X-Name-Last: Azevedo
Author-Name: Valmaria Rocha da Silva Ferraz
Author-X-Name-First: Valmaria Rocha da Silva
Author-X-Name-Last: Ferraz
Title: Regression models to dependence for exceedance
Abstract:
Extreme Value Theory (EVT) studies the tails of probability distributions in order to measure and quantify extreme maximum and minimum events. In river flow data, an extreme level of a river may be related to the level of a neighboring river that flows into it. In this type of data, it is very common for the flooding of a location to have been caused by a very large flow from an affluent river that is tens or hundreds of kilometers from that location. In this sense, an interesting approach is to consider a conditional model for the estimation of a multivariate model. Inspired by this idea, we propose a Bayesian model to describe the dependence of exceedances between rivers, based on a conditionally independent structure. In this model, the dependence between rivers is captured by modeling the marginal excesses of one river as linear functions of the excesses of the other rivers. The results show a strong, positive connection between the excesses in one river and those of the other rivers.
Journal: Journal of Applied Statistics
Pages: 3048-3059
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1795088
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795088
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3048-3059
Template-Type: ReDIF-Article 1.0
Author-Name: Jochen Kruppa
Author-X-Name-First: Jochen
Author-X-Name-Last: Kruppa
Author-Name: Ludwig Hothorn
Author-X-Name-First: Ludwig
Author-X-Name-Last: Hothorn
Title: A comparison study on modeling of clustered and overdispersed count data for multiple comparisons
Abstract:
Count data are collected in many scientific fields. One way to analyze such data is to compare the individual levels of the treatment factor using multiple comparisons. However, the measured individuals are often clustered, e.g. according to litter or rearing, which must be taken into account when estimating the parameters by a repeated measurement model. In addition, ignoring the overdispersion to which count data are prone leads to an increase in the type I error rate. We carry out simulation studies using several different data settings and compare different multiple contrast tests with parameter estimates from generalized estimating equations and generalized linear mixed models in order to observe coverage and rejection probabilities. We generate overdispersed, clustered count data in small samples, as can be observed in many biological settings. We found that generalized estimating equations outperform generalized linear mixed models if the variance-sandwich estimator is correctly specified. Furthermore, generalized linear mixed models show convergence problems under certain data settings, although model implementations with fewer such problems exist. Finally, we use an example of genetic data to demonstrate the application of the multiple contrast test and the problems of ignoring strong overdispersion.
Journal: Journal of Applied Statistics
Pages: 3220-3232
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1788518
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1788518
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3220-3232
Template-Type: ReDIF-Article 1.0
Author-Name: Reza Azimi
Author-X-Name-First: Reza
Author-X-Name-Last: Azimi
Author-Name: Mahdy Esmailian
Author-X-Name-First: Mahdy
Author-X-Name-Last: Esmailian
Title: Correction to: The Weibull Fréchet distribution and its applications
Abstract:
Afify et al. [The Weibull Fréchet distribution and its applications, J. Appl. Stat., 43 (2016), pp. 2608–2626] defined and studied a new four-parameter lifetime model called the Weibull Fréchet distribution. They made some mistakes in presenting the log-likelihood function and the components of the score vector. In this note, we correct them.
Journal: Journal of Applied Statistics
Pages: 3251-3252
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2021.1973388
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1973388
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3251-3252
Template-Type: ReDIF-Article 1.0
Author-Name: Ricardo Puziol de Oliveira
Author-X-Name-First: Ricardo Puziol
Author-X-Name-Last: de Oliveira
Author-Name: Jorge Alberto Achcar
Author-X-Name-First: Jorge Alberto
Author-X-Name-Last: Achcar
Title: Accurate estimation for extra-Poisson variability assuming random effect models
Abstract:
In this study, the components of extra-Poisson variability are estimated assuming random effect models under a Bayesian approach. A standard existing methodology for estimating extra-Poisson variability assumes a negative binomial distribution. The results show that, using the proposed random effect model, it is possible to obtain more accurate estimates of the extra-Poisson variability components than with a negative binomial distribution, under which only one component of extra-Poisson variability can be estimated. Some illustrative examples are presented using real data sets.
Journal: Journal of Applied Statistics
Pages: 2982-3001
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1789075
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1789075
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:2982-3001
Template-Type: ReDIF-Article 1.0
Author-Name: Jiajia Chen
Author-X-Name-First: Jiajia
Author-X-Name-Last: Chen
Author-Name: Xiaoqin Zhang
Author-X-Name-First: Xiaoqin
Author-X-Name-Last: Zhang
Author-Name: Karel Hron
Author-X-Name-First: Karel
Author-X-Name-Last: Hron
Title: Partial least squares regression with compositional response variables and covariates
Abstract:
The common approach to regression analysis with compositional variables is to express the compositions in log-ratio coordinates (coefficients) and then perform standard statistical processing in real space. As in real space, the problem is that standard least squares regression fails when the total number of parts of all compositional covariates exceeds the number of observations. The aim of this study is to analyze in detail partial least squares (PLS) regression, which can deal with this problem. In this paper, we focus on PLS regression between more than one compositional response variable and more than one compositional covariate. First, we give the PLS regression model with log-ratio coordinates of the compositional variables; then we express the PLS model directly in the simplex. We also prove that the PLS model is invariant under a change of coordinate system, such as ilr coordinates with a different contrast matrix or the clr coefficients. Moreover, we give the estimation and inference for the parameters of the PLS model. Finally, the PLS model with clr coefficients is used to analyze the relationship between the chemical metabolites of Astragali Radix and the plasma metabolites of rats after administering Astragali Radix.
Journal: Journal of Applied Statistics
Pages: 3130-3149
Issue: 16
Volume: 48
Year: 2021
Month: 12
X-DOI: 10.1080/02664763.2020.1795813
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1795813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:48:y:2021:i:16:p:3130-3149
Template-Type: ReDIF-Article 1.0
Author-Name: Nabakumar Jana
Author-X-Name-First: Nabakumar
Author-X-Name-Last: Jana
Author-Name: Samadrita Bera
Author-X-Name-First: Samadrita
Author-X-Name-Last: Bera
Title: Estimation of parameters of inverse Weibull distribution and application to multi-component stress-strength model
Abstract:
The problem of estimating the parameters of two-parameter inverse Weibull distributions is considered. We establish the existence and uniqueness of the maximum likelihood estimators of the scale and shape parameters. We derive Bayes estimators of the parameters under the entropy loss function. A hierarchical Bayes estimator, an equivariant estimator and a class of minimax estimators are derived when the shape parameter is known. Ordered Bayes estimators using information about a second population are also derived. We investigate the reliability of a multi-component stress-strength model using classical and Bayesian approaches. Risk comparison of the classical and Bayes estimators is done using Monte Carlo simulations. Applications of the proposed estimators are shown using real data sets.
Journal: Journal of Applied Statistics
Pages: 169-194
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1803815
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803815
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:169-194
Template-Type: ReDIF-Article 1.0
Author-Name: María Dolores Esteban
Author-X-Name-First: María Dolores
Author-X-Name-Last: Esteban
Author-Name: María José Lombardía
Author-X-Name-First: María José
Author-X-Name-Last: Lombardía
Author-Name: Esther López-Vizcaíno
Author-X-Name-First: Esther
Author-X-Name-Last: López-Vizcaíno
Author-Name: Domingo Morales
Author-X-Name-First: Domingo
Author-X-Name-Last: Morales
Author-Name: Agustín Pérez
Author-X-Name-First: Agustín
Author-X-Name-Last: Pérez
Title: Small area estimation of expenditure means and ratios under a unit-level bivariate linear mixed model
Abstract:
Under a unit-level bivariate linear mixed model, this paper introduces small area predictors of expenditure means and ratios, and derives approximations and estimators of the corresponding mean squared errors. For the considered model, the REML estimation method is implemented. Several simulation experiments, designed to analyze the behavior of the introduced fitting algorithm, predictors and mean squared error estimators, are carried out. An application to real data from the Spanish household budget survey illustrates the behavior of the proposed statistical methodology. The target is the estimation of means of food and non-food household annual expenditures and of ratios of food household expenditures by Spanish provinces.
Journal: Journal of Applied Statistics
Pages: 143-168
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1803809
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803809
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:143-168
Template-Type: ReDIF-Article 1.0
Author-Name: F. Prataviera
Author-X-Name-First: F.
Author-X-Name-Last: Prataviera
Author-Name: A. M. Batista
Author-X-Name-First: A. M.
Author-X-Name-Last: Batista
Author-Name: P. L. Libardi
Author-X-Name-First: P. L.
Author-X-Name-Last: Libardi
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Title: Joint regression modeling of location and scale parameters of the skew t distribution with application in soil chemistry data
Abstract:
In regression model applications, the errors frequently present a symmetric shape. In such cases, the normal and Student t distributions are commonly used. In this paper, we are concerned with modeling heavy-tailed, skewed errors and the absence of variance homogeneity through two regression structures based on the skew t distribution. We consider a classical analysis for the parameters of the proposed model. We perform a diagnostic analysis based on global influence and quantile residuals. For different parameter settings and sample sizes, various simulation results are obtained and compared to evaluate the performance of the skew t regression. Further, we illustrate the usefulness of the new regression by means of a real data set (amount of potassium in different soil areas) from a study carried out at the Department of Soil Science of the Luiz de Queiroz School of Agriculture, University of São Paulo.
Journal: Journal of Applied Statistics
Pages: 195-213
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1801608
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1801608
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:195-213
Template-Type: ReDIF-Article 1.0
Author-Name: Shiva S. Dibaj
Author-X-Name-First: Shiva S.
Author-X-Name-Last: Dibaj
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Author-Name: Graham W. Warren
Author-X-Name-First: Graham W.
Author-X-Name-Last: Warren
Author-Name: Gregory E. Wilding
Author-X-Name-First: Gregory E.
Author-X-Name-Last: Wilding
Title: Exact unconditional inference for analyzing contingency tables in finite populations
Abstract:
With recent developments in computer power, the application of exact inferential methods has become more feasible, which has resulted in the increasing popularity of these approaches. However, such methodology is lacking for populations with more complex structure, such as finite populations. When a small sample is drawn from a finite population, the number of individuals with a specific characteristic of interest follows a hypergeometric distribution. To compare two proportions in finite populations, we develop an exact unconditional test. We utilize the information gained from the sample to restrict our search for the maximum p-value. Our proposed test has power equal to that of its competitors while maintaining the pre-specified nominal significance level.
Journal: Journal of Applied Statistics
Pages: 86-97
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1798363
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1798363
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:86-97
Template-Type: ReDIF-Article 1.0
Author-Name: Cindy Feng
Author-X-Name-First: Cindy
Author-X-Name-Last: Feng
Title: Zero-inflated models for adjusting varying exposures: a cautionary note on the pitfalls of using offset
Abstract:
Zero-inflated count data are frequently encountered in public health and epidemiology research. A two-part model is often used to handle the excessive zeros, combining two components: a point mass at zero and a count distribution, such as a Poisson distribution. When the rate of events per unit exposure is of interest, an offset is commonly used to account for the varying extent of exposure; an offset is essentially a predictor whose regression coefficient is fixed at one. Such an assumption about the exposure effect is, however, quite restrictive for many practical problems. Further, for zero-inflated models, the offset is often included only in the count component of the model, yet the probability of the excessive-zero component could also be affected by the amount of ‘exposure’. We therefore propose incorporating the varying exposure as a covariate rather than as an offset term in both the excessive-zero probability and conditional count components of the zero-inflated model. A real example is used to illustrate the usage of the proposed methods, and simulation studies are conducted to assess their performance for a broad variety of situations.
Journal: Journal of Applied Statistics
Pages: 1-23
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1796943
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796943
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:1-23
Template-Type: ReDIF-Article 1.0
Author-Name: Haiqiang Ma
Author-X-Name-First: Haiqiang
Author-X-Name-Last: Ma
Author-Name: Jin Yang
Author-X-Name-First: Jin
Author-X-Name-Last: Yang
Author-Name: Sheng Xu
Author-X-Name-First: Sheng
Author-X-Name-Last: Xu
Author-Name: Chao Liu
Author-X-Name-First: Chao
Author-X-Name-Last: Liu
Author-Name: Qinyi Zhang
Author-X-Name-First: Qinyi
Author-X-Name-Last: Zhang
Title: Combination of multiple functional markers to improve diagnostic accuracy
Abstract:
Combining multiple biomarkers to improve diagnostic accuracy is meaningful for practitioners and clinicians and has attracted many researchers. Nowadays, with the development of modern techniques, functional markers such as curves or images play an important role in diagnosis. There is a rich literature on combination methods for continuous scalar markers. Unfortunately, only sporadic works have studied how functional markers affect diagnosis, and no publication has combined multiple functional markers to improve diagnostic accuracy. Scalar combination methods cannot be applied directly to multiple functional markers because of the infinite dimensionality of functional markers. In this article, we propose a one-dimensional scalar feature motivated by the square loss distance, as an alternative to the original functional curve, in the sense that it retains information to the greatest extent. The square loss distance is defined as a function of the projection scores generated from functional principal component decomposition. Existing scalar combination methods can then be applied to the scalar features of the functional markers after dimension reduction to improve diagnostic accuracy. The area under the receiver operating characteristic curve and the Youden index are used to assess the performance of the various methods in numerical studies. We also analyzed high and low hospital admissions due to respiratory diseases between 2010 and 2017 in Hong Kong by combining weather conditions and media information, which are regarded as functional markers. Finally, we provide an R function for convenient application.
Journal: Journal of Applied Statistics
Pages: 44-63
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1796945
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796945
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:44-63
Template-Type: ReDIF-Article 1.0
Author-Name: D. Scaldelai
Author-X-Name-First: D.
Author-X-Name-Last: Scaldelai
Author-Name: L. C. Matioli
Author-X-Name-First: L. C.
Author-X-Name-Last: Matioli
Author-Name: S. R. Santos
Author-X-Name-First: S. R.
Author-X-Name-Last: Santos
Author-Name: M. Kleina
Author-X-Name-First: M.
Author-X-Name-Last: Kleina
Title: MulticlusterKDE: a new algorithm for clustering based on multivariate kernel density estimation
Abstract:
In this paper, we propose the MulticlusterKDE algorithm, applied to classify elements of a database into categories based on their similarity. MulticlusterKDE is centered on the multiple optimization of the kernel density estimator function with a multivariate Gaussian kernel. One of the main features of the proposed algorithm is that the number of clusters is an optional input parameter. Furthermore, it is very simple, easy to implement, well defined, stops after a finite number of steps, and always converges regardless of the data set. We illustrate our findings by implementing the algorithm in the R software. The results indicate that the MulticlusterKDE algorithm is competitive when compared to the K-means, K-medoids, CLARA, DBSCAN and PdfCluster algorithms. Features such as simplicity and efficiency make the proposed algorithm an attractive and promising basis for further improvement and for the development of new density-based clustering algorithms.
Journal: Journal of Applied Statistics
Pages: 98-121
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1799958
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1799958
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:98-121
Template-Type: ReDIF-Article 1.0
Author-Name: Dariush Najarzadeh
Author-X-Name-First: Dariush
Author-X-Name-Last: Najarzadeh
Title: Conservative confidence intervals on multiple correlation coefficient for high-dimensional elliptical data using random projection methodology
Abstract:
The so-called multiple correlation coefficient (MCC) is a measure of the linear relationship between a given variable and a set of covariates. In multiple correlation and regression analysis, it is common practice to construct a confidence interval for the population MCC. In high-dimensional data settings, in which the data dimension p is much larger than the sample size n, the classical confidence intervals for the MCC are no longer usable, owing to the singularity of the sample covariance matrix. For high-dimensional elliptical data, some (conservative) confidence intervals for the population MCC are presented using the random projection methodology. To evaluate and compare the performance of the proposed confidence intervals, some simulations are conducted in terms of the coverage probability and average interval length. Experimental validation of the proposed intervals is carried out on two real gene expression datasets.
Journal: Journal of Applied Statistics
Pages: 64-85
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1796937
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796937
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:64-85
Template-Type: ReDIF-Article 1.0
Author-Name: M. Carmen Pardo
Author-X-Name-First: M.
Author-X-Name-Last: Carmen Pardo
Author-Name: Ying Lu
Author-X-Name-First: Ying
Author-X-Name-Last: Lu
Author-Name: Alba M. Franco-Pereira
Author-X-Name-First: Alba M.
Author-X-Name-Last: Franco-Pereira
Title: Extensions of empirical likelihood and chi-squared-based tests for ordered alternatives
Abstract:
Several methods for comparing k populations have been proposed in the literature. These methods assess the same null hypothesis of equal distributions but differ in the alternative hypothesis they consider. We focus on two important alternative hypotheses: monotone and umbrella ordering. Two new families of test statistics are proposed, including two known tests, as well as two new powerful tests under monotone ordering. Furthermore, these families are adapted for testing umbrella ordering. We compare some members of the families with respect to power and Type I errors under different simulation scenarios. Finally, the methods are illustrated in several applications to real data.
Journal: Journal of Applied Statistics
Pages: 24-43
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1796944
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1796944
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:24-43
Template-Type: ReDIF-Article 1.0
Author-Name: J. C. S. Vasconcelos
Author-X-Name-First: J. C. S.
Author-X-Name-Last: Vasconcelos
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Title: The semiparametric regression model for bimodal data with different penalized smoothers applied to climatology, ethanol and air quality data
Abstract:
Semiparametric regressions can be used to model data when the covariables and the response variable have a nonlinear relationship. In this work, we propose three flexible regression models for bimodal data, called the additive, additive partial and semiparametric regressions, based on the odd log-logistic generalized inverse Gaussian distribution under three types of penalized smoothers; the main idea is not to pit the three forms of smoothing against one another but to show the versatility of the distribution with three types of penalized smoothers. We present several Monte Carlo simulations carried out for different configurations of the parameters and several sample sizes to verify the precision of the penalized maximum-likelihood estimators. The usefulness of the proposed regressions is demonstrated empirically through three applications to climatology, ethanol and air quality data.
Journal: Journal of Applied Statistics
Pages: 248-267
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1803812
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803812
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:248-267
Template-Type: ReDIF-Article 1.0
Author-Name: Liang Wang
Author-X-Name-First: Liang
Author-X-Name-Last: Wang
Author-Name: Ke Wu
Author-X-Name-First: Ke
Author-X-Name-Last: Wu
Author-Name: Yogesh Mani Tripathi
Author-X-Name-First: Yogesh Mani
Author-X-Name-Last: Tripathi
Author-Name: Chandrakant Lodhi
Author-X-Name-First: Chandrakant
Author-X-Name-Last: Lodhi
Title: Reliability analysis of multicomponent stress–strength reliability from a bathtub-shaped distribution
Abstract:
In this paper, inference for a multicomponent stress–strength model is studied. When the latent strength and stress random variables follow a bathtub-shaped distribution and the failure times are Type-II censored, the maximum likelihood estimate of the multicomponent stress–strength reliability (MSR) is established in the case of common strength and stress parameters. An approximate confidence interval is also constructed by using asymptotic distribution theory and the delta method. Furthermore, alternative generalized point and confidence interval estimators for the MSR are constructed based on pivotal quantities. Moreover, the likelihood and pivotal-quantity-based estimates for the MSR are also provided in the case of unequal strength and stress parameters. To test the equality of the stress and strength parameters, the likelihood ratio test for the hypothesis of interest is also provided. Finally, simulation studies and a real data example are given for illustration.
Journal: Journal of Applied Statistics
Pages: 122-142
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1803808
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803808
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:122-142
Template-Type: ReDIF-Article 1.0
Author-Name: Han Yu
Author-X-Name-First: Han
Author-X-Name-Last: Yu
Author-Name: Rachael Hageman Blair
Author-X-Name-First: Rachael
Author-X-Name-Last: Hageman Blair
Title: Scalable module detection for attributed networks with applications to breast cancer
Abstract:
The objective of network module detection is to identify groups of nodes within a network structure that are tightly connected. Nodes in a network often have attributes (aka metadata) associated with them. It is often desirable to identify groups of nodes that are tightly connected in the network structure but also strongly similar in their attributes. Utilizing attribute information in module detection is a major challenge because it requires bridging the structural network with the attribute data. A Weighted Fast Greedy (WFG) algorithm for attribute-based module detection is proposed. WFG utilizes logistic regression to bridge the structural and attribute spaces. The logistic function naturally emphasizes associations between attributes and network structure and can be easily interpreted. A breast cancer application is presented that connects a protein–protein interaction network, gene expression data and a survival outcome. This application demonstrates the importance of embedding attribute information into the community detection framework on a breast cancer dataset. Five modules were significant for survival, and they contained known pathways and markers for cancer, including the cell cycle, the p53 pathway, BRCA1, BRCA2 and AURKB, among others. Neither the gene expression data nor the network structure alone gave rise to these cancer biomarkers and signatures.
Journal: Journal of Applied Statistics
Pages: 230-247
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1803811
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803811
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:230-247
Template-Type: ReDIF-Article 1.0
Author-Name: Mariangela Sciandra
Author-X-Name-First: Mariangela
Author-X-Name-Last: Sciandra
Author-Name: Irene Carola Spera
Author-X-Name-First: Irene Carola
Author-X-Name-Last: Spera
Title: A model-based approach to Spotify data analysis: a Beta GLMM
Abstract:
Digital music distribution is increasingly powered by automated mechanisms that continuously capture, sort and analyze large amounts of Web-based data. This paper deals with the management of song audio features from a statistical point of view. In particular, it explores the data-catching mechanisms enabled by the Spotify Web API and suggests statistical tools for the analysis of these data. Special attention is devoted to song popularity, and a Beta model including random effects is proposed in order to give a first answer to questions such as: what are the determinants of popularity? The identification of a model able to describe this relationship, and the determination of which characteristics are most important in making a song popular, is a very interesting topic for those who aim to predict the success of new products.
Journal: Journal of Applied Statistics
Pages: 214-229
Issue: 1
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1803810
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1803810
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:1:p:214-229
Template-Type: ReDIF-Article 1.0
Author-Name: José L. Jiménez
Author-X-Name-First: José L.
Author-X-Name-Last: Jiménez
Title: Quantifying treatment differences in confirmatory trials under non-proportional hazards
Abstract:
Proportional hazards are a common assumption when designing confirmatory clinical trials in oncology. With the emergence of immunotherapy and novel targeted therapies, departure from the proportional hazards assumption is not rare in today's clinical research. Under non-proportional hazards, the hazard ratio does not have a straightforward clinical interpretation, and the log-rank test is no longer the most powerful statistical test, even though it is still valid. Nevertheless, the log-rank test and the hazard ratio are still the primary analysis tools, and traditional approaches such as increasing the sample size are still proposed to account for the impact of non-proportional hazards. The weighted log-rank test and the test based on the restricted mean survival time (RMST) are receiving a lot of attention as potential alternatives to the log-rank test. We conduct a simulation study comparing the performance and operating characteristics of the log-rank test, the weighted log-rank test and the test based on the RMST, including a treatment effect estimation, under different non-proportional hazards patterns. Results show that, under non-proportional hazards, the hazard ratio and weighted hazard ratio have no straightforward clinical interpretation, whereas the RMST ratio can be interpreted regardless of the proportional hazards assumption. In terms of power, the RMST-based test achieves a performance similar to that of the log-rank test.
Journal: Journal of Applied Statistics
Pages: 466-484
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1815673
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815673
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:466-484
Template-Type: ReDIF-Article 1.0
Author-Name: Hongwei Tan
Author-X-Name-First: Hongwei
Author-X-Name-Last: Tan
Author-Name: Guodong Wang
Author-X-Name-First: Guodong
Author-X-Name-Last: Wang
Author-Name: Wendong Wang
Author-X-Name-First: Wendong
Author-X-Name-Last: Wang
Author-Name: Zili Zhang
Author-X-Name-First: Zili
Author-X-Name-Last: Zhang
Title: Feature selection based on distance correlation: a filter algorithm
Abstract:
Feature selection (FS) is one of the most powerful techniques for coping with the curse of dimensionality. In this study, a new filter approach to feature selection based on distance correlation is presented (DCFS, for short), which keeps the model-free advantage without any pre-specified parameters. Our method consists of two steps: a hard step (forward selection) and a soft step (backward selection). In the hard step, two types of association, between a univariate feature and the classes and between a group feature and the classes, are involved to pick out the features most relevant to the target classes. Because of the strict screening condition in the first step, some useful features are likely to be removed. Therefore, in the soft step, a feature-relationship gain (like a feature score) based on the distance correlation is introduced, which involves five kinds of associations. We sort the feature gain values and run the backward selection procedure until the errors stop declining. The simulation results show that our method is more competitive on several datasets compared with some representative feature selection methods based on several classification models.
Journal: Journal of Applied Statistics
Pages: 411-426
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1815672
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815672
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:411-426
Template-Type: ReDIF-Article 1.0
Author-Name: Chandima N. P. G. Arachchige
Author-X-Name-First: Chandima N. P. G.
Author-X-Name-Last: Arachchige
Author-Name: Luke A. Prendergast
Author-X-Name-First: Luke A.
Author-X-Name-Last: Prendergast
Author-Name: Robert G. Staudte
Author-X-Name-First: Robert G.
Author-X-Name-Last: Staudte
Title: Robust analogs to the coefficient of variation
Abstract:
The coefficient of variation (CV) is commonly used to measure relative dispersion. However, since it is based on the sample mean and standard deviation, outliers can adversely affect it. Additionally, for skewed distributions the mean and standard deviation may be difficult to interpret and, consequently, that may also be the case for the CV. Here we investigate the extent to which quantile-based measures of relative dispersion can provide appropriate summary information as an alternative to the CV. In particular, we investigate two measures, the first being the interquartile range (in lieu of the standard deviation) divided by the median (in lieu of the mean), and the second being the median absolute deviation divided by the median, as robust estimators of relative dispersion. In addition to comparing the influence functions of the competing estimators and their asymptotic biases and variances, we compare interval estimators using simulation studies to assess coverage.
Journal: Journal of Applied Statistics
Pages: 268-290
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1808599
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1808599
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:268-290
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaoxue Li
Author-X-Name-First: Xiaoxue
Author-X-Name-Last: Li
Author-Name: Stewart J. Anderson
Author-X-Name-First: Stewart J.
Author-X-Name-Last: Anderson
Author-Name: Saul Shiffman
Author-X-Name-First: Saul
Author-X-Name-Last: Shiffman
Author-Name: Bo Zhang
Author-X-Name-First: Bo
Author-X-Name-Last: Zhang
Title: Time-varying coefficient cumulative gap time models for intensive longitudinal ecological momentary assessment data with missingness
Abstract:
Ecological momentary assessment (EMA) studies investigate intensive repeated observations of the current behavior and experiences of subjects in real time. In particular, such studies aim to minimize recall bias and maximize ecological validity, thereby strengthening the investigation and inference of the microprocesses that influence behavior in real-world contexts by gathering intensive information on the temporal patterning of subjects' behavior. Throughout this paper, we focus on the analysis of data from an EMA study that examined the behavior of intermittent smokers (ITS). Specifically, we sought to explore the pattern of clustered smoking behavior of ITS, or smoking ‘bouts’, as well as the covariates that predict such smoking behavior. To do this, we introduce a framework for characterizing the temporal behavior of ITS via functions of the event gap time to distinguish the smoking bouts. We used time-varying coefficient models for the cumulative log gap time to characterize the temporal patterns of smoking behavior while simultaneously adjusting for behavioral covariates, and incorporated inverse probability weighting into the models to accommodate missing data. Simulation studies showed that, whether data were missing by design or missing at random, the model was able to reliably determine prespecified time-varying functional forms of a given covariate coefficient, provided that the within-subject level was small.
Journal: Journal of Applied Statistics
Pages: 498-521
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1815676
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815676
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:498-521
Template-Type: ReDIF-Article 1.0
Author-Name: Mohammad Vali Ahmadi
Author-X-Name-First: Mohammad Vali
Author-X-Name-Last: Ahmadi
Author-Name: Mahdi Doostparast
Author-X-Name-First: Mahdi
Author-X-Name-Last: Doostparast
Title: Predictive analysis for joint progressive censoring plans: a Bayesian approach
Abstract:
Comparative lifetime experiments are of particular importance in production processes when one wishes to determine the relative merits of several competing products with regard to their reliability. This paper confines itself to data obtained by running a joint progressive Type-II censoring plan on samples in a combined manner. The problem of Bayesian prediction of the failure times of surviving units is discussed in detail when the parent populations are exponential. Two real data sets are analyzed in order to illustrate all the inferential procedures developed here. When destructive experiments under a censoring scheme are completed, researchers are usually interested in estimating the remaining lifetimes of surviving units for subsequent experiments. The findings of this paper are useful for these purposes, especially when samples are non-homogeneous, such as those taken from industrial storage.
Journal: Journal of Applied Statistics
Pages: 394-410
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1815671
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815671
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:394-410
Template-Type: ReDIF-Article 1.0
Author-Name: C. Manté
Author-X-Name-First: C.
Author-X-Name-Last: Manté
Author-Name: S. Cornu
Author-X-Name-First: S.
Author-X-Name-Last: Cornu
Author-Name: D. Borschneck
Author-X-Name-First: D.
Author-X-Name-Last: Borschneck
Author-Name: C. Mocuta
Author-X-Name-First: C.
Author-X-Name-Last: Mocuta
Author-Name: R. van den Bogaert
Author-X-Name-First: R.
Author-X-Name-Last: van den Bogaert
Title: Detecting the Guttman effect with the help of ordinal correspondence analysis in synchrotron X-ray diffraction data analysis
Abstract:
We propose a method for detecting a Guttman effect in a complete disjunctive table $\mathbf{U}$ with Q questions. Since such an investigation is nonsensical when the Q variables are independent, we reuse previously unpublished work on the chi-squared independence test for Burt's tables. We then introduce a two-step method that consists in plugging the first singular vector from a preliminary Correspondence Analysis (CA) of $\mathbf{U}$ as a score x into a subsequent singly-ordered Ordinal Correspondence Analysis (OCA) of $\mathbf{U}$. OCA mainly consists of completing x with a sequence of orthogonal polynomials superseding the classical factors of CA. As a consequence, in the presence of a pure Guttman effect, the second singular vector should in principle coincide with the polynomial of degree 2, etc. The hybrid decomposition of the Pearson chi-squared statistic (resulting from OCA), used in association with permutation tests, makes it possible to reveal such relationships, i.e. the presence of a Guttman effect in the structure of $\mathbf{U}$, and to determine its degree, with an accuracy depending on the signal-to-noise ratio. The proposed method is successively tested on artificial data (more or less noisy), a well-known benchmark, and synchrotron X-ray diffraction data of soil samples.
Journal: Journal of Applied Statistics
Pages: 291-316
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1810644
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1810644
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:291-316
Template-Type: ReDIF-Article 1.0
Author-Name: Lingzhe Guo
Author-X-Name-First: Lingzhe
Author-X-Name-Last: Guo
Author-Name: Reza Modarres
Author-X-Name-First: Reza
Author-X-Name-Last: Modarres
Title: Two multivariate online change detection models
Abstract:
Online change point detection methods monitor changes in the distribution of a data stream. This article discusses two non-parametric online change detection methods based on energy statistics and Mahalanobis depth. To apply the energy statistic, we use a sliding-window algorithm with efficient training and updating procedures. For Mahalanobis depth, we propose an algorithm to train the threshold with the desired protective ability against false alarms, and discuss factors that influence the threshold. Numerical studies evaluate and compare the performance of the proposed models with three existing methods in detecting changes in the mean and variability of a data stream. The methods are applied to detect changes in the flow volume of the Mississippi River.
Journal: Journal of Applied Statistics
Pages: 427-448
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1815674
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815674
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:427-448
Template-Type: ReDIF-Article 1.0
Author-Name: Kostas Loumponias
Author-X-Name-First: Kostas
Author-X-Name-Last: Loumponias
Author-Name: George Tsaklidis
Author-X-Name-First: George
Author-X-Name-Last: Tsaklidis
Title: Kalman filtering with censored measurements
Abstract:
This paper concerns Kalman filtering when the measurements of the process are censored. The censored measurements are addressed by the Tobit model of Type I and are one-dimensional with two censoring limits, while the (hidden) state vectors are multidimensional. For this model, Bayesian estimates for the state vectors are provided through a recursive algorithm of Kalman filtering type. Experiments are presented to illustrate the effectiveness and applicability of the algorithm. The experiments show that the proposed method outperforms other filtering methodologies in minimizing the computational cost as well as the overall Root Mean Square Error (RMSE) for synthetic and real data sets.
Journal: Journal of Applied Statistics
Pages: 317-335
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1810645
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1810645
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:317-335
Template-Type: ReDIF-Article 1.0
Author-Name: Paulo C. Marques F.
Author-X-Name-First: Paulo
Author-X-Name-Last: C. Marques F.
Author-Name: Helton Graziadei
Author-X-Name-First: Helton
Author-X-Name-Last: Graziadei
Author-Name: Hedibert F. Lopes
Author-X-Name-First: Hedibert F.
Author-X-Name-Last: Lopes
Title: Bayesian generalizations of the integer-valued autoregressive model
Abstract:
We develop two Bayesian generalizations of the Poisson integer-valued autoregressive model. The AdINAR(1) model accounts for overdispersed data by means of an innovation process whose marginal distributions are finite mixtures, while the DP-INAR(1) model is a hierarchical extension involving a Dirichlet process, which is capable of modeling a latent pattern of heterogeneity in the distribution of the innovations rates. The probabilistic forecasting capabilities of both models are put to test in the analysis of crime data in Pittsburgh, with favorable results.
Journal: Journal of Applied Statistics
Pages: 336-356
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1812544
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1812544
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:336-356
Template-Type: ReDIF-Article 1.0
Author-Name: Luai Al-Labadi
Author-X-Name-First: Luai
Author-X-Name-Last: Al-Labadi
Author-Name: Sean Berry
Author-X-Name-First: Sean
Author-X-Name-Last: Berry
Title: Bayesian estimation of extropy and goodness of fit tests
Abstract:
Extropy, a complementary dual of entropy, is considered in this paper. A Bayesian approach based on the Dirichlet process is proposed for the estimation of extropy. A goodness of fit test is also developed. Many theoretical properties of the procedure are derived. Several examples are discussed to illustrate the approach.
Journal: Journal of Applied Statistics
Pages: 357-370
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1812545
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1812545
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:357-370
Template-Type: ReDIF-Article 1.0
Author-Name: Vikas Kumar Sharma
Author-X-Name-First: Vikas Kumar
Author-X-Name-Last: Sharma
Author-Name: Sudhanshu V. Singh
Author-X-Name-First: Sudhanshu V.
Author-X-Name-Last: Singh
Author-Name: Komal Shekhawat
Author-X-Name-First: Komal
Author-X-Name-Last: Shekhawat
Title: Exponentiated Teissier distribution with increasing, decreasing and bathtub hazard functions
Abstract:
This article introduces a two-parameter exponentiated Teissier distribution. The main advantage of the distribution is that its hazard rate function can take increasing, decreasing and bathtub shapes. Expressions for the ordinary moments, identifiability, quantiles, moments of order statistics, mean residual life function and entropy measure are derived. The skewness and kurtosis of the distribution are explored using the quantiles. In order to study two independent random variables, stress–strength reliability and stochastic orderings are discussed. Estimators based on likelihood, least squares, weighted least squares and product spacings are constructed for estimating the unknown parameters of the distribution. An algorithm is presented for generating random samples from the distribution. Simulation experiments are conducted to compare the performance of the considered estimators of the parameters and percentiles. Three sets of real data are fitted using the proposed distribution and compared with competing distributions.
Journal: Journal of Applied Statistics
Pages: 371-393
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1813694
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1813694
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:371-393
Template-Type: ReDIF-Article 1.0
Author-Name: Caitlin Ward
Author-X-Name-First: Caitlin
Author-X-Name-Last: Ward
Author-Name: Jacob Oleson
Author-X-Name-First: Jacob
Author-X-Name-Last: Oleson
Author-Name: J. Bruce Tomblin
Author-X-Name-First: J. Bruce
Author-X-Name-Last: Tomblin
Author-Name: Elizabeth Walker
Author-X-Name-First: Elizabeth
Author-X-Name-Last: Walker
Title: Modeling population and subject-specific growth in a latent trait measured by multiple instruments over time using a hierarchical Bayesian framework
Abstract:
Psychometric growth curve modeling techniques are used to describe a person’s latent ability and how that ability changes over time based on a specific measurement instrument. However, the same instrument cannot always be used over a period of time to measure that latent ability. This is often the case when measuring traits longitudinally in children: over time, some measurement tools that were difficult for young children become too easy as they age, resulting in floor effects, ceiling effects, or both. We propose a Bayesian hierarchical model for such a scenario. Within the Bayesian model, we combine information from multiple instruments used at different age ranges and having different scoring schemes to examine growth in latent ability over time. The model includes between-subject variance and within-subject variance and does not require linking item-specific difficulty between the measurement tools. The model’s utility is demonstrated on a study of language ability in children aged one to ten who are hard of hearing, where measurement-tool-specific growth and subject-specific growth are shown, in addition to a group-level latent growth curve comparing the hard-of-hearing children to children with normal hearing.
Journal: Journal of Applied Statistics
Pages: 449-465
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1817346
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1817346
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:449-465
Template-Type: ReDIF-Article 1.0
Author-Name: Mikkel Slot Nielsen
Author-X-Name-First: Mikkel Slot
Author-X-Name-Last: Nielsen
Author-Name: Victor Rohde
Author-X-Name-First: Victor
Author-X-Name-Last: Rohde
Title: A surrogate model for estimating extreme tower loads on wind turbines based on random forest proximities
Abstract:
In the present paper, we present a surrogate model, which can be used to estimate extreme tower loads on a wind turbine from a number of signals and a suitable simulation tool. Due to the requirements of the International Electrotechnical Commission (IEC) Standard 61400-1, assessing extreme tower loads on wind turbines constitutes a key component of the design phase. The proposed model imputes tower loads by matching observed signals with simulated quantities using proximities induced by random forests. In this way, the algorithm's adaptability to high-dimensional and sparse settings is exploited without using regression-based surrogate loads (which may display misleading probabilistic characteristics). Finally, the model is applied to estimate tower loads on an operating wind turbine from data on its operational statistics.
Journal: Journal of Applied Statistics
Pages: 485-497
Issue: 2
Volume: 49
Year: 2022
Month: 01
X-DOI: 10.1080/02664763.2020.1815675
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1815675
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:2:p:485-497
Template-Type: ReDIF-Article 1.0
Author-Name: Tasnime Hamdeni
Author-X-Name-First: Tasnime
Author-X-Name-Last: Hamdeni
Author-Name: Soufiane Gasmi
Author-X-Name-First: Soufiane
Author-X-Name-Last: Gasmi
Title: A proportional-hazards model for survival analysis and long-term survivors modeling: application to amyotrophic lateral sclerosis data
Abstract:
The majority of survival data are affected by explanatory variables. We develop a new regression model for survival data analysis. As an alternative to standard mixture models, another model is proposed to describe the possible presence of a surviving fraction. The proposed models are based on the Marshall–Olkin extended generalized Gompertz distribution. Maximum-likelihood inference is presented in the presence of covariates and censoring. Explanatory variables are incorporated into the model through proportional hazards to evaluate the effect of risk factors on overall survival under different assumptions. Parametric, semi-parametric, and non-parametric methods are applied to the survival analysis of patients treated for amyotrophic lateral sclerosis. Interesting results about riluzole use and other treatment effects on patients' survival are obtained.
Journal: Journal of Applied Statistics
Pages: 694-708
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1830954
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1830954
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:694-708
Template-Type: ReDIF-Article 1.0
Author-Name: Xinyan Qin
Author-X-Name-First: Xinyan
Author-X-Name-Last: Qin
Author-Name: Jiao Yu
Author-X-Name-First: Jiao
Author-X-Name-Last: Yu
Author-Name: Wenhao Gui
Author-X-Name-First: Wenhao
Author-X-Name-Last: Gui
Title: Goodness-of-fit test for exponentiality based on spacings for general progressive Type-II censored data
Abstract:
Numerous tests have been proposed to determine whether or not the exponential model is suitable for a given data set. In this article, we propose a new test statistic based on spacings to test whether general progressive Type-II censored samples come from an exponential distribution. The null distribution of the test statistic is discussed, and it can be approximated by the standard normal distribution. Meanwhile, we propose an approximate method for calculating the expectation and variance of samples under the null hypothesis, and the corresponding power function is also given. A simulation study is then conducted. We calculate the normal approximation of the power and compare the results with those obtained by Monte Carlo simulation under different alternatives with distinct types of hazard function. The results of the simulation study show that the power properties of this statistic obtained by Monte Carlo simulation are better for alternatives with a monotone increasing hazard function; otherwise, the normal approximation results are relatively better. Finally, two illustrative examples are presented.
Journal: Journal of Applied Statistics
Pages: 599-620
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1821613
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1821613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:599-620
Template-Type: ReDIF-Article 1.0
Author-Name: B. Bastien
Author-X-Name-First: B.
Author-X-Name-Last: Bastien
Author-Name: T. Boukhobza
Author-X-Name-First: T.
Author-X-Name-Last: Boukhobza
Author-Name: H. Dumond
Author-X-Name-First: H.
Author-X-Name-Last: Dumond
Author-Name: A. Gégout-Petit
Author-X-Name-First: A.
Author-X-Name-Last: Gégout-Petit
Author-Name: A. Muller-Gueudin
Author-X-Name-First: A.
Author-X-Name-Last: Muller-Gueudin
Author-Name: C. Thiébaut
Author-X-Name-First: C.
Author-X-Name-Last: Thiébaut
Title: A statistical methodology to select covariates in high-dimensional data under dependence. Application to the classification of genetic profiles in oncology
Abstract:
We propose a new methodology for selecting and ranking covariates associated with a variable of interest in a context of high-dimensional data under dependence with few observations. The methodology successively intertwines the clustering of covariates, decorrelation of covariates using Factor Latent Analysis, selection by aggregation of adapted methods, and finally ranking. A simulation study shows the benefit of the decorrelation within the different clusters of covariates. We first apply our method to transcriptomic data of 37 patients with advanced non-small-cell lung cancer who received chemotherapy, to select the transcriptomic covariates that explain the survival outcome of the treatment. Second, we apply our method to 79 breast tumor samples to define patient profiles for a new metastatic biomarker and its associated gene network in order to personalize treatment.
Journal: Journal of Applied Statistics
Pages: 764-781
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1837083
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1837083
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:764-781
Template-Type: ReDIF-Article 1.0
Author-Name: Kalanka. P. Jayalath
Author-X-Name-First: Kalanka. P.
Author-X-Name-Last: Jayalath
Author-Name: Raj S. Chhikara
Author-X-Name-First: Raj S.
Author-X-Name-Last: Chhikara
Title: Survival analysis for the inverse Gaussian distribution with the Gibbs sampler
Abstract:
This paper describes a comprehensive survival analysis for the inverse Gaussian distribution employing Bayesian and Fiducial approaches. It focuses on making inferences on the inverse Gaussian (IG) parameters μ and λ and the average remaining time of censored units. A flexible Gibbs sampling approach applicable in the presence of censoring is discussed, and illustrations with Type II, progressive Type II, and randomly right-censored observations are included. The analyses are performed using both simulated IG data and empirical data examples. Further, bootstrap comparisons are made between the Bayesian and Fiducial estimates. It is concluded that the shape parameter ($\phi =\lambda /\mu$) of the inverse Gaussian distribution has the most impact on the two analyses, Bayesian vs. Fiducial, as does, to a lesser extent, the amount of censoring in the data. Overall, both approaches are effective in estimating the IG parameters and the average remaining lifetime. The suggested Gibbs sampler allows a great deal of flexibility in implementation for all types of censoring considered.
Journal: Journal of Applied Statistics
Pages: 656-675
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1828314
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1828314
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:656-675
Template-Type: ReDIF-Article 1.0
Author-Name: A. Baíllo
Author-X-Name-First: A.
Author-X-Name-Last: Baíllo
Author-Name: J. E. Chacón
Author-X-Name-First: J. E.
Author-X-Name-Last: Chacón
Title: A new selection criterion for statistical home range estimation
Abstract:
The home range of an animal describes the geographic area where this individual spends most of the time while doing its usual activities. From a statistical viewpoint, the problem of home range estimation can be considered as a set estimation one. In the ecological literature, there are a variety of home range estimators. We address the open question of choosing the ‘best’ home range from a collection of them constructed on the same sample. We introduce the penalized overestimation ratio, a numerical index to rank the estimated home ranges. The key idea is to balance the excess area covered by the estimator (with respect to the sample) and a shape descriptor measuring the over-adjustment of the home range to the data. To our knowledge, apart from computing the home range area, our ranking procedure is the first one both applicable to real data and to any type of home range estimator. Further, optimization of the selection index provides a way to select the tuning parameters of nonparametric home ranges. For illustration purposes, we apply our selection proposal to a dataset of a Mongolian wolf and we carry out a simulation study.
Journal: Journal of Applied Statistics
Pages: 722-737
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1822302
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1822302
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:722-737
Template-Type: ReDIF-Article 1.0
Author-Name: Shafiqah Azman
Author-X-Name-First: Shafiqah
Author-X-Name-Last: Azman
Author-Name: Dharini Pathmanathan
Author-X-Name-First: Dharini
Author-X-Name-Last: Pathmanathan
Title: The GLM framework of the Lee–Carter model: a multi-country study
Abstract:
The Lee–Carter model is a well-known model in modeling mortality. We aim to compare three probability models (Poisson, negative binomial and binomial) based on the Generalized Linear Model (GLM) framework of the Lee–Carter model. These models are applied to mortality data for 10 selected countries (Japan, United States, United Kingdom, Australia, Sweden, Spain, Belgium, Canada, Netherlands and Bulgaria) and the fit of these models is assessed using the deviance statistics and standardized residuals against fitted value plot. Among these three models, the negative binomial Lee–Carter model gave the best fit based on the deviance statistics and estimates of the log of deaths.
Journal: Journal of Applied Statistics
Pages: 752-763
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1833183
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1833183
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:752-763
Template-Type: ReDIF-Article 1.0
Author-Name: Junhyung Park
Author-X-Name-First: Junhyung
Author-X-Name-Last: Park
Author-Name: Adam W. Chaffee
Author-X-Name-First: Adam W.
Author-X-Name-Last: Chaffee
Author-Name: Ryan J. Harrigan
Author-X-Name-First: Ryan J.
Author-X-Name-Last: Harrigan
Author-Name: Frederic Paik Schoenberg
Author-X-Name-First: Frederic Paik
Author-X-Name-Last: Schoenberg
Title: A non-parametric Hawkes model of the spread of Ebola in west Africa
Abstract:
Recently developed methods for the non-parametric estimation of Hawkes point process models facilitate their application for describing and forecasting the spread of epidemic diseases. We use data from the 2014 Ebola outbreak in West Africa to evaluate how well a simple Hawkes point process model can forecast the spread of Ebola virus in Guinea, Sierra Leone, and Liberia. For comparison, SEIR models previously fitted to the same data are evaluated using identical metrics. To test the predictive power of each of the models, we simulate the ability to make near real-time predictions during an actual outbreak by using the first 75% of the data for estimation and the subsequent 25% of the data for evaluation. Forecasts generated from Hawkes models more accurately describe the spread of Ebola in each of the three countries investigated and result in a 38% reduction in RMSE for weekly case estimation across all countries when compared to SEIR models (total RMSE of 59.8 cases/week using SEIR compared to 37.1 for Hawkes). We demonstrate that the improved fit from Hawkes modeling cannot be attributed to overfitting and evaluate the advantages and disadvantages of Hawkes models in general for forecasting the spread of epidemic diseases.
Journal: Journal of Applied Statistics
Pages: 621-637
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1825646
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1825646
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:621-637
Template-Type: ReDIF-Article 1.0
Author-Name: Massimiliano Bonamente
Author-X-Name-First: Massimiliano
Author-X-Name-Last: Bonamente
Author-Name: David Spence
Author-X-Name-First: David
Author-X-Name-Last: Spence
Title: A semi-analytical solution to the maximum-likelihood fit of Poisson data to a linear model using the Cash statistic
Abstract:
The Cash statistic, also known as the $C$ statistic, is commonly used for the analysis of low-count Poisson data, including data with null counts for certain values of the independent variable. The use of this statistic is especially attractive for low-count data that cannot be combined, or re-binned, without loss of resolution. This paper presents a new maximum-likelihood solution for the best-fit parameters of a linear model using the Poisson-based Cash statistic. The solution presented in this paper provides a new and simple method to measure the best-fit parameters of a linear model for any Poisson-based data, including data with null counts. In particular, the method enforces the requirement that the best-fit linear model be non-negative throughout the support of the independent variable. The method is summarized in a simple algorithm to fit Poisson counting data of any size and counting rate with a linear model, bypassing entirely the use of the traditional $\chi^2$ statistic.
Journal: Journal of Applied Statistics
Pages: 522-552
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1820960
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1820960
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:522-552
Template-Type: ReDIF-Article 1.0
Author-Name: T. Baghfalaki
Author-X-Name-First: T.
Author-X-Name-Last: Baghfalaki
Author-Name: M. Ganjali
Author-X-Name-First: M.
Author-X-Name-Last: Ganjali
Author-Name: A. Kabir
Author-X-Name-First: A.
Author-X-Name-Last: Kabir
Author-Name: A. Pazouki
Author-X-Name-First: A.
Author-X-Name-Last: Pazouki
Title: A Bayesian shared parameter model for joint modeling of longitudinal continuous and binary outcomes
Abstract:
Joint modeling of associated mixed biomarkers in longitudinal studies leads to better clinical decisions by improving the efficiency of parameter estimates. In many clinical studies, the observation times for two biomarkers may not be equivalent, and one of the longitudinal responses may have been recorded over a longer period than the other. In addition, the response variables may have different missing-data patterns. In this paper, we propose a new joint model of associated continuous and binary responses that accounts for the different missing-data patterns of the two longitudinal outcomes. A conditional model for joint modeling of the two responses is used, and two shared random effects models are considered for the intermittent missingness of the two responses. A Bayesian approach using Markov chain Monte Carlo (MCMC) is adopted for parameter estimation and model implementation. The validity and performance of the proposed model are investigated using simulation studies. The proposed model is also applied to the analysis of a real data set from bariatric surgery.
Journal: Journal of Applied Statistics
Pages: 638-655
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1822303
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1822303
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:638-655
Template-Type: ReDIF-Article 1.0
Author-Name: Man-Suk Oh
Author-X-Name-First: Man-Suk
Author-X-Name-Last: Oh
Author-Name: Chee Kyung Park
Author-X-Name-First: Chee Kyung
Author-X-Name-Last: Park
Title: Regional source apportionment of PM2.5 in Seoul using Bayesian multivariate receptor model
Abstract:
Seoul, the capital city of Korea with over 10 million residents, has been experiencing serious air pollution problems. Previous studies on source apportionment of PM2.5 in Seoul are based on measurements of the chemical composition of PM2.5 from a single monitoring site. In this paper, we analyse PM2.5 concentration data collected from multiple sites in 24 districts of Seoul and estimate regional source profiles using a Bayesian multivariate receptor model. The regional source profiles provide information for the identification of major PM2.5 sources as well as the regions more seriously affected by each source than other regions. These regional characteristics relevant to PM2.5 can help establish effective, customised, region-specific PM2.5 control strategies for each region rather than general strategies that apply to every region of Seoul.
Journal: Journal of Applied Statistics
Pages: 738-751
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1822305
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1822305
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:738-751
Template-Type: ReDIF-Article 1.0
Author-Name: T. Senga Kiessé
Author-X-Name-First: T.
Author-X-Name-Last: Senga Kiessé
Author-Name: Etienne Rivot
Author-X-Name-First: Etienne
Author-X-Name-Last: Rivot
Author-Name: Christophe Jaeger
Author-X-Name-First: Christophe
Author-X-Name-Last: Jaeger
Author-Name: Joël Aubin
Author-X-Name-First: Joël
Author-X-Name-Last: Aubin
Title: Bayesian inference in based-kernel regression: comparison of count data of condition factor of fish in pond systems
Abstract:
The discrete kernel-based regression approach generally provides pointwise estimates of count data that do not account for uncertainty about either the parameters or the resulting estimates. This work aims to provide probabilistic kernel estimates of the count regression function by using a Bayesian approach, which readily allows for a quantification of uncertainty. The Bayesian approach also enables prior knowledge of the parameters used in discrete kernel-based regression to be incorporated. An application is proposed to count data on the condition factor of fish (K) provided by an experimental project that analyzed various pond management strategies. The probabilistic distributions of the estimates were contrasted across discrete kernels, as a support to theoretical results on the performance of kernels. More practically, Bayesian credibility intervals of the K-estimates were evaluated to compare pond management strategies. Similarities were found between the performances of semi-intensive and coupled fishponds, both with formulated feed, in comparison with extensive fishponds, without formulated feed. In particular, fish development was less predictable in the extensive fishpond, which depends on natural resources, than in the two other fishponds, which are supplied with formulated feed.
Journal: Journal of Applied Statistics
Pages: 676-693
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1830953
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1830953
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:676-693
Template-Type: ReDIF-Article 1.0
Author-Name: Maria Anastasopoulou
Author-X-Name-First: Maria
Author-X-Name-Last: Anastasopoulou
Author-Name: Athanasios C. Rakitzis
Author-X-Name-First: Athanasios C.
Author-X-Name-Last: Rakitzis
Title: EWMA control charts for monitoring correlated counts with finite range
Abstract:
In this work, we develop and study upper and lower one-sided EWMA control charts for monitoring correlated counts with finite range. Often in practice, data of this kind can be adequately described by a first-order binomial or beta-binomial autoregressive model; in particular, when there is evidence that the data demonstrate extra-binomial variation, the latter model is preferable to the former. The proposed charts can be used for detecting upward or downward shifts in the process mean level. Practical guidelines concerning the statistical design of the proposed charts are given, and the effect of the extra-binomial variation is investigated as well. Comparisons with existing control charting procedures are also provided. Finally, an illustrative real-data example is given.
Journal: Journal of Applied Statistics
Pages: 553-573
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1820959
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1820959
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:553-573
Template-Type: ReDIF-Article 1.0
Author-Name: Qingguo Tang
Author-X-Name-First: Qingguo
Author-X-Name-Last: Tang
Author-Name: Rohana J. Karunamuni
Author-X-Name-First: Rohana J.
Author-X-Name-Last: Karunamuni
Author-Name: Boxiao Liu
Author-X-Name-First: Boxiao
Author-X-Name-Last: Liu
Title: Regularized robust estimation in binary regression models
Abstract:
In this paper, we investigate robust parameter estimation and variable selection for binary regression models with grouped data. We investigate estimation procedures based on the minimum-distance approach. In particular, we employ minimum Hellinger and minimum symmetric chi-squared distances criteria and propose regularized minimum-distance estimators. These estimators appear to possess a certain degree of automatic robustness against model misspecification and/or for potential outliers. We show that the proposed non-penalized and penalized minimum-distance estimators are efficient under the model and simultaneously have excellent robustness properties. We study their asymptotic properties such as consistency, asymptotic normality and oracle properties. Using Monte Carlo studies, we examine the small-sample and robustness properties of the proposed estimators and compare them with traditional likelihood estimators. We also study two real-data applications to illustrate our methods. The numerical studies indicate the satisfactory finite-sample performance of our procedures.
Journal: Journal of Applied Statistics
Pages: 574-598
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1822304
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1822304
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:574-598
Template-Type: ReDIF-Article 1.0
Author-Name: Ayyub Sheikhi
Author-X-Name-First: Ayyub
Author-X-Name-Last: Sheikhi
Author-Name: Fatemeh Bahador
Author-X-Name-First: Fatemeh
Author-X-Name-Last: Bahador
Author-Name: Mohammad Arashi
Author-X-Name-First: Mohammad
Author-X-Name-Last: Arashi
Title: On a generalization of the test of endogeneity in a two stage least squares estimation
Abstract:
In situations where the predictors are correlated with the error term, we propose a bridge estimator in the two-stage least squares estimation. We apply this estimator to overcome the multicollinearity and sparsity of the explanatory variables when the endogeneity problem is present. The proposed estimator is applied to modify the Durbin-Wu-Hausman (DWH) test of endogeneity in the presence of multicollinearity. To compare our modified test with the existing DWH test for the detection of an endogeneity problem in multi-collinear data, some numerical assessments are carried out. The numerical results show that the proposed estimators and the suggested test perform better for multi-collinear data. Finally, a genetic data set is used to illustrate our results by estimating the coefficient parameters in the presence of endogeneity and multicollinearity.
Journal: Journal of Applied Statistics
Pages: 709-721
Issue: 3
Volume: 49
Year: 2022
Month: 02
X-DOI: 10.1080/02664763.2020.1837084
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1837084
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:3:p:709-721
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Book Reviews
Journal: Journal of Applied Statistics
Pages: 639-643
Issue: 5
Volume: 34
Year: 2007
Month: 7
X-DOI: 10.1080/02664760701390652
File-URL: http://hdl.handle.net/10.1080/02664760701390652
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:34:y:2007:i:5:p:639-643
Template-Type: ReDIF-Article 1.0
Author-Name: Anurag Pathak
Author-X-Name-First: Anurag
Author-X-Name-Last: Pathak
Author-Name: Manoj Kumar
Author-X-Name-First: Manoj
Author-X-Name-Last: Kumar
Author-Name: Sanjay Kumar Singh
Author-X-Name-First: Sanjay Kumar
Author-X-Name-Last: Singh
Author-Name: Umesh Singh
Author-X-Name-First: Umesh
Author-X-Name-Last: Singh
Title: Bayesian inference: Weibull Poisson model for censored data using the expectation–maximization algorithm and its application to bladder cancer data
Abstract:
This article focuses on the parameter estimation of experimental items/units from the Weibull Poisson model under progressive type-II censoring with binomial removals (PT-II CBRs). The expectation–maximization algorithm has been used to obtain the maximum likelihood estimators (MLEs). The MLEs and Bayes estimators have been obtained under symmetric and asymmetric loss functions. The performance of the competing estimators has been studied through their simulated risks. One-sample Bayes prediction and the expected experiment time have also been studied. Furthermore, the suitability of the considered model and the proposed methodology has been illustrated through a real bladder cancer data set.
Journal: Journal of Applied Statistics
Pages: 926-948
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1845626
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1845626
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:926-948
Template-Type: ReDIF-Article 1.0
Author-Name: Zijian Fang
Author-X-Name-First: Zijian
Author-X-Name-Last: Fang
Author-Name: Peng Zhao
Author-X-Name-First: Peng
Author-X-Name-Last: Zhao
Author-Name: Maochao Xu
Author-X-Name-First: Maochao
Author-X-Name-Last: Xu
Author-Name: Shouhuai Xu
Author-X-Name-First: Shouhuai
Author-X-Name-Last: Xu
Author-Name: Taizhong Hu
Author-X-Name-First: Taizhong
Author-X-Name-Last: Hu
Author-Name: Xing Fang
Author-X-Name-First: Xing
Author-X-Name-Last: Fang
Title: Statistical modeling of computer malware propagation dynamics in cyberspace
Abstract:
Modeling cyber threats, such as the computer malicious software (malware) propagation dynamics in cyberspace, is an important research problem because models can deepen our understanding of dynamical cyber threats. In this paper, we study the statistical modeling of the macro-level evolution of dynamical cyber attacks. Specifically, we propose a Bayesian structural time series approach for modeling the computer malware propagation dynamics in cyberspace. Our model not only possesses the parsimony property (i.e. using few model parameters) but also can provide the predictive distribution of the dynamics by accommodating uncertainty. Our simulation study shows that the proposed model can fit and predict the computer malware propagation dynamics accurately, without requiring knowledge of the underlying attack-defense interaction mechanism or the underlying network topology. We use the model to study the propagation of two particular kinds of computer malware, namely the Conficker and Code Red worms, and show that our model has very satisfactory fitting and prediction accuracies.
Journal: Journal of Applied Statistics
Pages: 858-883
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1845621
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1845621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:858-883
Template-Type: ReDIF-Article 1.0
Author-Name: Seungchul Baek
Author-X-Name-First: Seungchul
Author-X-Name-Last: Baek
Author-Name: Yewon Kim
Author-X-Name-First: Yewon
Author-X-Name-Last: Kim
Author-Name: Junyong Park
Author-X-Name-First: Junyong
Author-X-Name-Last: Park
Author-Name: Jong Soo Lee
Author-X-Name-First: Jong Soo
Author-X-Name-Last: Lee
Title: Revisit to functional data analysis of sleeping energy expenditure
Abstract:
In this paper, we consider the classification problem for functional data, including the sleeping energy expenditure (SEE) data, focusing on functional classification. Many existing classification rules are not effective in distinguishing the two classes of SEE data, because the trajectories of the observations have very different patterns in each class. It is often observed that some aspect of the data, such as the variability of the paths, is helpful in the classification of functional data. To reflect this, we introduce a variable measuring the length of each path in the functional data and then propose a logistic model with fused lasso that considers the fluctuation behavior of a path as well as the local correlations within each path. Our proposed model shows a significant improvement in classification accuracy over models used in the existing literature for functional data such as the SEE data. We carry out simulation studies to show the finite-sample performance and the gain made in comparison with fused lasso without considering path length. With two more real datasets studied in the existing literature, we demonstrate that the new model achieves an accuracy rate better than or similar to the best accuracy rates reported in those studies.
Journal: Journal of Applied Statistics
Pages: 988-1002
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1838457
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1838457
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:988-1002
Template-Type: ReDIF-Article 1.0
Author-Name: Armin Hatefi
Author-X-Name-First: Armin
Author-X-Name-Last: Hatefi
Author-Name: Amirhossein Alvandi
Author-X-Name-First: Amirhossein
Author-X-Name-Last: Alvandi
Title: Efficient estimators with categorical ranked set samples: estimation procedures for osteoporosis
Abstract:
The ranked set sampling (RSS) design, as a cost-effective sampling scheme, is a powerful tool in situations where measuring the variable of interest is costly and time-consuming, while ranking information about the sampling units can be obtained easily through inexpensive and easy-to-measure characteristics at little or no cost. In this paper, we study RSS data for the analysis of an ordinal population. First, we examine the problem of non-representative extreme samples under RSS and under the commonly used simple random sampling. Using RSS data with tie information, we propose non-parametric and maximum likelihood estimators for the population parameters. Through extensive numerical studies, we investigate the effect of various factors, including ranking ability, tie-generating mechanisms, the number of categories and the population setting, on the performance of the estimators. Finally, we apply the proposed methods to bone disorder data to estimate the proportions of patients with osteopenia and osteoporosis status.
Journal: Journal of Applied Statistics
Pages: 803-818
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1841742
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1841742
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:803-818
Template-Type: ReDIF-Article 1.0
Author-Name: María José Presno
Author-X-Name-First: María José
Author-X-Name-Last: Presno
Author-Name: Manuel Landajo
Author-X-Name-First: Manuel
Author-X-Name-Last: Landajo
Author-Name: Paula Fernandez-Gonzalez
Author-X-Name-First: Paula
Author-X-Name-Last: Fernandez-Gonzalez
Title: Nonparametric panel stationarity testing with an application to crude oil production
Abstract:
A nonparametric panel stationarity test is proposed which offers the advantage of not requiring prior specification of the trend function for each of the series in the panel. A bootstrap implementation of the test is outlined and its finite sample performance is analyzed via Monte Carlo simulations. An application is also included where the proposed test is used to analyze the stochastic properties of monthly crude oil production for a panel of 20 countries (both OPEC and non-OPEC) from 1973 to 2015. Our analysis detects strong evidence of non-stationarity, both globally and group-wise. The results have implications for the effectiveness of government intervention and stabilization policies.
Journal: Journal of Applied Statistics
Pages: 1033-1048
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1846691
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1846691
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:1033-1048
Template-Type: ReDIF-Article 1.0
Author-Name: Giovanni Boscaino
Author-X-Name-First: Giovanni
Author-X-Name-Last: Boscaino
Author-Name: Gianluca Sottile
Author-X-Name-First: Gianluca
Author-X-Name-Last: Sottile
Author-Name: Giada Adelfio
Author-X-Name-First: Giada
Author-X-Name-Last: Adelfio
Title: Migration and students' performance: detecting geographical differences following a curves clustering approach
Abstract:
Student migration mobility is the new form of migration: students migrate to improve their skills and become more valued in the job market. The data concern the migration of Italian Bachelor graduates who enrolled at the Master Degree level, typically moving from poor to rich areas. This paper investigates the effect of migration and other possible determinants on Master Degree students' performance. The Clustering of Effects approach for Quantile Regression Coefficients Modelling has been used to cluster the effects of some variables on the students' performance for three Italian macro-areas. The results show evidence of similarity between Southern and Central students, with respect to the Northern ones.
Journal: Journal of Applied Statistics
Pages: 1018-1032
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1845624
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1845624
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:1018-1032
Template-Type: ReDIF-Article 1.0
Author-Name: Hande Konşuk Ünlü
Author-X-Name-First: Hande Konşuk
Author-X-Name-Last: Ünlü
Author-Name: Derek S. Young
Author-X-Name-First: Derek S.
Author-X-Name-Last: Young
Author-Name: Ayten Yiğiter
Author-X-Name-First: Ayten
Author-X-Name-Last: Yiğiter
Author-Name: L. Hilal Özcebe
Author-X-Name-First: L.
Author-X-Name-Last: Hilal Özcebe
Title: A mixture model with Poisson and zero-truncated Poisson components to analyze road traffic accidents in Turkey
Abstract:
The analysis of traffic accident data is crucial to address numerous concerns, such as understanding contributing factors in an accident's chain-of-events, identifying hotspots, and informing policy decisions about road safety management. The majority of statistical models employed for analyzing traffic accident data are logically count regression models (commonly Poisson regression), since a count – like the number of accidents – is used as the response. However, features of the observed data frequently do not make the Poisson distribution a tenable assumption. For example, observed data rarely demonstrate an equal mean and variance and often possess excess zeros. Sometimes, data may have a heterogeneous structure consisting of a mixture of populations, rather than a single population. In such data analyses, mixtures-of-Poisson-regression models can be used. In this study, the number of injuries resulting from casualties of traffic accidents registered by the General Directorate of Security (Turkey, 2005–2014) is modeled using a novel mixture distribution with two components: a Poisson and a zero-truncated Poisson distribution. Such a model differs from the existing mixture models in the literature, where the components are either all Poisson distributions or all zero-truncated Poisson distributions. The proposed model is compared with the Poisson regression model via simulation and in the analysis of the traffic data.
Journal: Journal of Applied Statistics
Pages: 1003-1017
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1843610
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1843610
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:1003-1017
Template-Type: ReDIF-Article 1.0
Author-Name: David P. M. Scollnik
Author-X-Name-First: David P. M.
Author-X-Name-Last: Scollnik
Title: Bayesian analyses of an exponential-Poisson and related zero augmented type models
Abstract:
We consider several alternatives to the continuous exponential-Poisson distribution in order to accommodate the occurrence of zeros. Three of these are modifications of the exponential-Poisson model. One of these remains a fully continuous model. The other models we consider are all semi-continuous models, each with a discrete point mass at zero and a continuous density on the positive values. All of the models are applied to two environmental data sets concerning precipitation, and their Bayesian analyses using MCMC are discussed. This discussion covers convergence of the MCMC simulations and model selection procedures and considerations.
Journal: Journal of Applied Statistics
Pages: 949-967
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1846692
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1846692
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:949-967
Template-Type: ReDIF-Article 1.0
Author-Name: Alexis Zavez
Author-X-Name-First: Alexis
Author-X-Name-Last: Zavez
Author-Name: Emeir M. McSorley
Author-X-Name-First: Emeir M.
Author-X-Name-Last: McSorley
Author-Name: Alison J. Yeates
Author-X-Name-First: Alison J.
Author-X-Name-Last: Yeates
Author-Name: Sally W. Thurston
Author-X-Name-First: Sally W.
Author-X-Name-Last: Thurston
Title: Modeling the effects of multiple exposures with unknown group memberships: a Bayesian latent variable approach
Abstract:
We propose a Bayesian latent variable model to allow estimation of the covariate-adjusted relationships between an outcome and a small number of latent exposure variables, using data from multiple observed exposures. Each latent variable is assumed to be represented by multiple exposures, where membership of the observed exposures to latent groups is unknown. Our model assumes that one measured exposure variable can be considered as a sentinel marker for each latent variable, while membership of the other measured exposures is estimated using MCMC sampling based on a classical measurement error model framework. We illustrate our model using data on multiple cytokines and birth weight from the Seychelles Child Development Study, and evaluate the performance of our model in a simulation study. Classification of cytokines into Th1 and Th2 cytokine classes in the Seychelles study revealed some differences from standard Th1/Th2 classifications. In simulations, our model correctly classified measured exposures into latent groups, and estimated model parameters with little bias and with coverage that was similar to the oracle model.
Journal: Journal of Applied Statistics
Pages: 831-857
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1843611
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1843611
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:831-857
Template-Type: ReDIF-Article 1.0
Author-Name: George O. Agogo
Author-X-Name-First: George O.
Author-X-Name-Last: Agogo
Author-Name: Alexander K. Muoka
Author-X-Name-First: Alexander K.
Author-X-Name-Last: Muoka
Title: A three-part regression calibration to handle excess zeroes, skewness and heteroscedasticity in adjusting for measurement error in dietary intake data
Abstract:
Exposure measurement error (ME) biases exposure-outcome associations. Calibration dietary intake data used in the regression calibration (RC) response to adjust for ME are usually right-skewed, heteroscedastic and contain excess zeroes. We propose three-part RC models to handle these distributional complexities simultaneously, while correcting for ME in fish intake. We applied data from the National Health and Nutrition Examination Survey (NHANES), where long-term intake was measured with a food frequency questionnaire (FFQ) in the main study and short-term intake with a 24-hour recall (24HR) in the calibration study. In the three-part RC models, never-consumers were modelled using two approaches: a zero distribution (three-part RC-het-det) and a logistic distribution (three-part RC-het-prob); heteroscedasticity was modelled using an exponential distribution and right-skewness using a generalized gamma distribution. The proposed models were compared with a two-part RC model that ignores never-consumers, and with methods that estimate intakes using the FFQ and 24HR. The models were evaluated in a simulation study. With the NHANES data, the mean increase in the mercury level (in μg/L) was 1.20 using the FFQ method, 0.4 using the 24HR method, 1.87 using the two-part RC method and 2.02 using the three-part RC-het-prob method. The three-part RC estimated the association with the least bias in the simulation study.
Journal: Journal of Applied Statistics
Pages: 884-901
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1845622
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1845622
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:884-901
Template-Type: ReDIF-Article 1.0
Author-Name: Peihan Xiong
Author-X-Name-First: Peihan
Author-X-Name-Last: Xiong
Author-Name: Weiwei Zhuang
Author-X-Name-First: Weiwei
Author-X-Name-Last: Zhuang
Author-Name: Guoxin Qiu
Author-X-Name-First: Guoxin
Author-X-Name-Last: Qiu
Title: Testing exponentiality based on the extropy of record values
Abstract:
In this paper, we first present a characterization of the exponential distribution based on the extropy of record values and then introduce a goodness-of-fit test for exponentiality. Monte Carlo simulation is used to compute the critical values of our proposed test for different sample sizes and significance levels. To show the advantage of the proposed test, we adopt 58 competitor tests and compute the adjusted power against different alternatives with distinct types of hazard function. The power results show that our proposed test has superior adjusted power if the alternatives have increasing failure rates or bathtub (decreasing-increasing) failure rates, especially when the sample size is small. Finally, three real examples are used to illustrate the applicability and robustness of our proposed test by monitoring the p-values of the tests.
Journal: Journal of Applied Statistics
Pages: 782-802
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1840535
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1840535
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:782-802
Template-Type: ReDIF-Article 1.0
Author-Name: Yuexia Zhang
Author-X-Name-First: Yuexia
Author-X-Name-Last: Zhang
Author-Name: Guoyou Qin
Author-X-Name-First: Guoyou
Author-X-Name-Last: Qin
Author-Name: Zhongyi Zhu
Author-X-Name-First: Zhongyi
Author-X-Name-Last: Zhu
Author-Name: Bo Fu
Author-X-Name-First: Bo
Author-X-Name-Last: Fu
Title: Robust estimation of models for longitudinal data with dropouts and outliers
Abstract:
Missing data and outliers usually arise in longitudinal studies. Ignoring the effects of missing data and outliers will make the classical generalized estimating equation approach invalid. The longitudinal cohort study of rheumatoid arthritis patients was designed to investigate whether the Health Assessment Questionnaire score was associated with baseline covariates and changed with time. There exist dropouts and outliers in the data. In order to analyze the data, we develop a robust estimating equation approach. To deal with the responses missing at random, we extend a doubly robust method. To achieve robustness against outliers, we utilize an outlier robust method, which corrects the bias induced by outliers through centralizing the covariate matrix in the estimating equation. The doubly robust method for dropouts is easy to combine with the outlier robust method. The proposed method has the property of robustness in the sense that the proposed estimator is not only doubly robust against model misspecification for dropouts when there is no outlier in the data, but also robust against outliers. Consistency and asymptotic normality of the proposed estimator are established under regularity conditions. A comprehensive simulation study and real data analysis demonstrate that the proposed estimator does have the property of robustness.
Journal: Journal of Applied Statistics
Pages: 902-925
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1845623
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1845623
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:902-925
Template-Type: ReDIF-Article 1.0
Author-Name: Usha Govindarajulu
Author-X-Name-First: Usha
Author-X-Name-Last: Govindarajulu
Author-Name: Thaddeus Tarpey
Author-X-Name-First: Thaddeus
Author-X-Name-Last: Tarpey
Title: Optimal partitioning for the proportional hazards model
Abstract:
This paper discusses methods for clustering a continuous covariate in a survival analysis model. The advantages of using a categorical covariate defined by discretizing a continuous covariate (via clustering) are (i) enhanced interpretability of the covariate's impact on survival and (ii) relaxation of model assumptions that are usually required for survival models, such as the proportional hazards model. Simulations and an example are provided to illustrate the methods.
Journal: Journal of Applied Statistics
Pages: 968-987
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1846690
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1846690
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:968-987
Template-Type: ReDIF-Article 1.0
Author-Name: Lukáš Malec
Author-X-Name-First: Lukáš
Author-X-Name-Last: Malec
Title: On the rank-deficient canonical correlation technique solved by analytic spectral decomposition
Abstract:
Regularization is a well-known and widely used statistical approach, covering individual points or limit approximations. In this study, the canonical correlation analysis (CCA) process of the paths is discussed, with partial least squares (PLS) as the other boundary, covering the transformation to a symmetric eigenvalue (or singular value) problem dependent on a parameter. Two regularizations of the original criterion in the parameterization domain are compared, i.e. using a projection and using the identity matrix. We discuss the existence and uniqueness of the analytic path for the eigenvalues and the corresponding elements of the eigenvectors. Specifically, canonical analysis is applied to an ill-conditioned case of singular within-sets input matrices encompassing tourism accommodation data.
Journal: Journal of Applied Statistics
Pages: 819-830
Issue: 4
Volume: 49
Year: 2022
Month: 03
X-DOI: 10.1080/02664763.2020.1843608
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1843608
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:4:p:819-830
Template-Type: ReDIF-Article 1.0
Author-Name: N. Murugeswari
Author-X-Name-First: N.
Author-X-Name-Last: Murugeswari
Author-Name: P. Jeyadurga
Author-X-Name-First: P.
Author-X-Name-Last: Jeyadurga
Author-Name: S. Balamurali
Author-X-Name-First: S.
Author-X-Name-Last: Balamurali
Title: Optimal designing of two-level skip-lot sampling reinspection plan
Abstract:
The skip-lot sampling plan is often applied in industry to reduce the cost and effort of inspecting products with an excellent quality history; because skip-lot sampling plans reduce the cost of inspection, they are attractive from an economic standpoint. In this paper, we develop a sampling plan by incorporating the idea of resampling into the two-level skip-lot sampling plan; the new plan is designated SkSP-2L.1-R. This paper presents the Markov chain formulation of the proposed plan, along with the derivation of its performance measures. We also provide a design methodology to determine the optimal parameters of the SkSP-2L.1-R plan so as to minimize the average sample number, using the two-points-on-the-operating-characteristic-curve approach. By considering various combinations of producer and consumer quality levels, along with the respective risks, a table is constructed to determine the optimal parameters. An industrial application of the proposed SkSP-2L.1-R plan is discussed. The SkSP-2L.1-R plan with the single sampling plan as a reference plan is compared with the conventional single sampling plan, the SkSP-2 plan and the SkSP-2-R plan, and the proposed SkSP-2L.1-R plan is shown to outperform these plans.
Journal: Journal of Applied Statistics
Pages: 1086-1104
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1849059
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1849059
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1086-1104
Template-Type: ReDIF-Article 1.0
Author-Name: Ufuk Beyaztas
Author-X-Name-First: Ufuk
Author-X-Name-Last: Beyaztas
Author-Name: Han Lin Shang
Author-X-Name-First: Han Lin
Author-X-Name-Last: Shang
Title: Robust bootstrap prediction intervals for univariate and multivariate autoregressive time series models
Abstract:
The bootstrap procedure has emerged as a general framework for constructing prediction intervals for future observations in autoregressive time series models. Outlying data points are common in real data applications of such models, especially in the field of econometrics. These outlying data points tend to produce high forecast errors, which reduce the forecasting performance of existing bootstrap prediction intervals calculated from non-robust estimators. For univariate and multivariate autoregressive time series models, we propose a robust bootstrap algorithm for constructing prediction intervals and forecast regions. The proposed procedure is based on weighted likelihood estimates and weighted residuals. Its finite sample properties are examined via a series of Monte Carlo studies and two empirical data examples.
Journal: Journal of Applied Statistics
Pages: 1179-1202
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1856351
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1856351
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1179-1202
Template-Type: ReDIF-Article 1.0
Author-Name: Kuangnan Fang
Author-X-Name-First: Kuangnan
Author-X-Name-Last: Fang
Author-Name: Peng Wang
Author-X-Name-First: Peng
Author-X-Name-Last: Wang
Author-Name: Xiaochen Zhang
Author-X-Name-First: Xiaochen
Author-X-Name-Last: Zhang
Author-Name: Qingzhao Zhang
Author-X-Name-First: Qingzhao
Author-X-Name-Last: Zhang
Title: Structured sparse support vector machine with ordered features
Abstract:
In the application of high-dimensional data classification, several attempts have been made to achieve variable selection by replacing the $ \ell _{2} $-penalty with other penalties for the support vector machine (SVM). However, these high-dimensional SVM methods usually do not take into account the special structure among the covariates (features). In this article, we consider a classification problem in which the covariates are ordered in some meaningful way and the number of covariates p can be much larger than the sample size n. We propose a structured sparse SVM to tackle this type of problem, which combines a non-convex penalty and a cubic spline estimation procedure (i.e. penalizing the second-order derivatives of the coefficients) with the SVM. From a theoretical point of view, the proposed method satisfies the local oracle property. Simulations show that the method works effectively in terms of both feature selection and classification accuracy. A real application is conducted to illustrate the benefits of the method.
Journal: Journal of Applied Statistics
Pages: 1105-1120
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1849053
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1849053
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1105-1120
Template-Type: ReDIF-Article 1.0
Author-Name: Alexey Koloydenko
Author-X-Name-First: Alexey
Author-X-Name-Last: Koloydenko
Author-Name: Kristi Kuljus
Author-X-Name-First: Kristi
Author-X-Name-Last: Kuljus
Author-Name: Jüri Lember
Author-X-Name-First: Jüri
Author-X-Name-Last: Lember
Title: MAP segmentation in Bayesian hidden Markov models: a case study
Abstract:
We consider the problem of estimating the maximum posterior probability (MAP) state sequence for a finite state and finite emission alphabet hidden Markov model (HMM) in the Bayesian setup, where both the emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data are used to set the prior hyperparameters for Bayesian MAP segmentation. Since the Viterbi algorithm is no longer applicable, there is no simple procedure for finding the MAP path, so several iterative algorithms are considered and compared. The main goal of the paper is to test the Bayesian setup against the frequentist one, in which the parameters of the HMM are estimated using the training data.
Journal: Journal of Applied Statistics
Pages: 1203-1234
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1858273
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1858273
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1203-1234
Template-Type: ReDIF-Article 1.0
Author-Name: Yonghui Liu
Author-X-Name-First: Yonghui
Author-X-Name-Last: Liu
Author-Name: Chaoxuan Mao
Author-X-Name-First: Chaoxuan
Author-X-Name-Last: Mao
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Shuangzhe Liu
Author-X-Name-First: Shuangzhe
Author-X-Name-Last: Liu
Author-Name: Waldemiro A. Silva Neto
Author-X-Name-First: Waldemiro A.
Author-X-Name-Last: Silva Neto
Title: Asymmetric autoregressive models: statistical aspects and a financial application under COVID-19 pandemic
Abstract:
In the present study, we provide a motivating example with a financial application under the COVID-19 pandemic to investigate autoregressive (AR) modeling and its diagnostics based on asymmetric distributions. The objectives of this work are: (i) to formulate asymmetric AR models and their estimation and diagnostics; (ii) to assess the performance of the parameter estimators and of the local influence technique for these models; and (iii) to provide a tool showing how data that follow an asymmetric distribution under an AR structure should be analyzed. We take advantage of the stochastic representation of the skew-normal distribution to estimate the parameters of the corresponding AR model efficiently with the expectation-maximization algorithm. Diagnostic analytics are conducted using the local influence technique with four perturbation schemes. By employing Monte Carlo simulations, we evaluate the statistical behavior of the corresponding estimators and of the local influence technique. An illustration with financial data updated to 2020, analyzed using the methodology introduced in the present work, is presented as an example of effective application, from which it is possible to explain atypical cases arising from the COVID-19 pandemic.
Journal: Journal of Applied Statistics
Pages: 1323-1347
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2021.1913103
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1913103
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1323-1347
Template-Type: ReDIF-Article 1.0
Author-Name: Zahra Barkhordar
Author-X-Name-First: Zahra
Author-X-Name-Last: Barkhordar
Author-Name: Mohsen Maleki
Author-X-Name-First: Mohsen
Author-X-Name-Last: Maleki
Author-Name: Zahra Khodadadi
Author-X-Name-First: Zahra
Author-X-Name-Last: Khodadadi
Author-Name: Darren Wraith
Author-X-Name-First: Darren
Author-X-Name-Last: Wraith
Author-Name: Farajollah Negahdari
Author-X-Name-First: Farajollah
Author-X-Name-Last: Negahdari
Title: A Bayesian approach on the two-piece scale mixtures of normal homoscedastic nonlinear regression models
Abstract:
In this application note, we propose and examine the performance of a Bayesian approach for a homoscedastic nonlinear regression (NLR) model assuming errors with two-piece scale mixtures of normal (TP-SMN) distributions. The TP-SMN is a large family of distributions, covering both symmetrical/asymmetrical as well as light/heavy-tailed distributions, and provides an alternative to another well-known family of distributions, the scale mixtures of skew-normal distributions. The proposed family and Bayesian approach provide considerable flexibility and advantages for NLR modelling in different practical settings. We examine the performance of the approach using simulated and real data.
Journal: Journal of Applied Statistics
Pages: 1305-1322
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1854203
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1854203
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1305-1322
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaojuan Hao
Author-X-Name-First: Xiaojuan
Author-X-Name-Last: Hao
Author-Name: Kent M. Eskridge
Author-X-Name-First: Kent M.
Author-X-Name-Last: Eskridge
Author-Name: Dong Wang
Author-X-Name-First: Dong
Author-X-Name-Last: Wang
Title: Variational Bayesian inference for association over phylogenetic trees for microorganisms
Abstract:
With the advance of next generation sequencing technologies, researchers now routinely obtain collections of microbial sequences with complex phylogenetic relationships. It is often of interest to analyze the association between certain environmental factors and characteristics of the microbial collection. Though methods have been developed to test for association between microbial composition and environmental factors, as well as between coevolving traits, a flexible model that can provide a comprehensive picture of the relationship between microbial community characteristics and environmental variables would be tremendously beneficial. We developed a Bayesian approach for association analysis that incorporates the phylogenetic structure to account for dependence between observations. To overcome the computational difficulty related to the phylogenetic tree, a variational algorithm was developed to evaluate the posterior distribution. As the posterior distribution can be readily obtained for the parameters of interest and any derived variables, the association relationship can be examined comprehensively. With two application examples, we demonstrate that the Bayesian approach can uncover nuanced details of the microbial assemblage with regard to the environmental factor. The proposed Bayesian approach and variational algorithm can be extended to other problems involving dependence over tree-like structures.
Journal: Journal of Applied Statistics
Pages: 1140-1153
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1854200
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1854200
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1140-1153
Template-Type: ReDIF-Article 1.0
Author-Name: T. H. A. Nguyen
Author-X-Name-First: T. H. A.
Author-X-Name-Last: Nguyen
Author-Name: T. Laurent
Author-X-Name-First: T.
Author-X-Name-Last: Laurent
Author-Name: C. Thomas-Agnan
Author-X-Name-First: C.
Author-X-Name-Last: Thomas-Agnan
Author-Name: A. Ruiz-Gazen
Author-X-Name-First: A.
Author-X-Name-Last: Ruiz-Gazen
Title: Analyzing the impacts of socio-economic factors on French departmental elections with CoDa methods
Abstract:
The vote shares by party on a given subdivision of a territory form a vector called a composition (mathematically, a vector belonging to a simplex). It is of interest to model these shares and to study the impact of the characteristics of the territorial units on the outcome of the elections. In the political economy literature, few regression models are adapted to the case of more than two political parties. In the statistical literature, there are regression models adapted to share vectors, including Compositional Data (CoDa) models, Dirichlet models, and others. Our goal is to discuss and illustrate the use of CoDa regression models for political economy models with more than two parties. The models are fitted to French electoral data from the 2015 departmental elections.
Journal: Journal of Applied Statistics
Pages: 1235-1251
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1858274
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1858274
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1235-1251
Template-Type: ReDIF-Article 1.0
Author-Name: M.B. Seitshiro
Author-X-Name-First: M.B.
Author-X-Name-Last: Seitshiro
Author-Name: H.P. Mashele
Author-X-Name-First: H.P.
Author-X-Name-Last: Mashele
Title: Quantification of model risk that is caused by model misspecification
Abstract:
In this paper, we suggest a technique to quantify model risk, particularly model misspecification, for binary response regression problems found in financial risk management, such as credit risk modelling. We choose the probability of default model as one instance of the many credit risk models that may be misspecified in a financial institution. To illustrate model misspecification for the probability of default, we quantify two specific statistical predictive response techniques, namely binary logistic regression and the complementary log–log model. The maximum likelihood estimation technique is employed for parameter estimation. Statistical inference, specifically goodness of fit and model performance measures, is assessed. Using a simulated dataset and the Taiwan credit card default dataset, our findings reveal that, with the same sample size and very few simulation iterations, the two techniques produce similar goodness-of-fit results but completely different performance measures. However, when the number of iterations increases, the binary logistic regression technique for the balanced dataset reveals better goodness of fit and performance measures than the complementary log–log technique for both the simulated and real datasets.
Journal: Journal of Applied Statistics
Pages: 1065-1085
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1849055
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1849055
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1065-1085
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Xavier
Author-X-Name-First: Thomas
Author-X-Name-Last: Xavier
Author-Name: Joby K. Jose
Author-X-Name-First: Joby K.
Author-X-Name-Last: Jose
Title: Stress–strength reliability estimation involving paired observation with ties using bivariate exponentiated half-logistic model
Abstract:
This paper deals with the problem of maximum likelihood and Bayesian estimation of stress–strength reliability involving paired observations with ties, using the bivariate exponentiated half-logistic distribution. This problem is important because, in some real applications, the strength of a component is highly dependent on the stress experienced by it. A bivariate extension of the exponentiated half-logistic distribution is discussed and an expression for the stress–strength reliability is obtained. This model is also useful for analysing data with the unusual feature of a number of pairs with tied scores, even when the scores are continuous. The maximum likelihood estimate and an interval estimate of the stress–strength reliability are developed. The Bayes estimates of the stress–strength reliability under the squared error loss function are obtained using the importance sampling technique. Simulation studies are conducted to evaluate the performance of the maximum likelihood and Bayes estimates. Two real-life data sets are analysed to numerically illustrate the usefulness of the developed method.
Journal: Journal of Applied Statistics
Pages: 1049-1064
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1849054
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1849054
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1049-1064
Template-Type: ReDIF-Article 1.0
Author-Name: Florian Stark
Author-X-Name-First: Florian
Author-X-Name-Last: Stark
Author-Name: Sven Otto
Author-X-Name-First: Sven
Author-X-Name-Last: Otto
Title: Testing and dating structural changes in copula-based dependence measures
Abstract:
This paper is concerned with testing and dating structural breaks in the dependence structure of multivariate time series. We consider a cumulative sum (CUSUM) type test for constant copula-based dependence measures, such as Spearman's rank correlation and quantile dependencies. The asymptotic null distribution is not known in closed form and critical values are estimated by an i.i.d. bootstrap procedure. We analyze size and power properties in a simulation study under different dependence measure settings, such as skewed and fat-tailed distributions. To date breakpoints and to decide whether two estimated break locations belong to the same break event, we propose a pivot confidence interval procedure. Finally, we apply the test to the historical data of 10 large financial firms during the last financial crisis from 2002 to mid-2013.
Journal: Journal of Applied Statistics
Pages: 1121-1139
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1850655
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1850655
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1121-1139
Template-Type: ReDIF-Article 1.0
Author-Name: Maria Ioneris Oliveira
Author-X-Name-First: Maria
Author-X-Name-Last: Ioneris Oliveira
Author-Name: Michelli Barros
Author-X-Name-First: Michelli
Author-X-Name-Last: Barros
Author-Name: Joelson Campos
Author-X-Name-First: Joelson
Author-X-Name-Last: Campos
Author-Name: Francisco José A. Cysneiros
Author-X-Name-First: Francisco José A.
Author-X-Name-Last: Cysneiros
Title: Bivariate Birnbaum-Saunders accelerated lifetime model: estimation and diagnostic analysis
Abstract:
In this paper, we discuss the bivariate Birnbaum-Saunders accelerated lifetime model, in which the dependence structure of bivariate survival data is modeled through frailty models. Specifically, we propose the bivariate Birnbaum-Saunders model with the following frailty distributions: gamma, positive stable and logarithmic series. We present a study of inference and diagnostic analysis for the proposed model; more precisely, we propose a diagnostic analysis based on local influence and residual analysis to assess model fit and to detect influential observations. In this regard, we derive the normal curvatures of local influence under different perturbation schemes, and we perform simulation studies to assess the potential of the residuals to detect misspecification in the systematic component and in the stochastic component of the model, and to detect outliers. Finally, we apply the methodology to a real data set on recurrence times of infection in 38 kidney patients using a portable dialysis machine. We analyze these data both assuming independence within pairs and using the bivariate Birnbaum-Saunders accelerated lifetime model, so that we can compare the two and verify the importance of modeling the dependence between infection times associated with the same patient.
Journal: Journal of Applied Statistics
Pages: 1252-1276
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1859466
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1859466
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1252-1276
Template-Type: ReDIF-Article 1.0
Author-Name: Everestus O. Ossai
Author-X-Name-First: Everestus O.
Author-X-Name-Last: Ossai
Author-Name: Mbanefo S. Madukaife
Author-X-Name-First: Mbanefo S.
Author-X-Name-Last: Madukaife
Author-Name: Abimibola V. Oladugba
Author-X-Name-First: Abimibola V.
Author-X-Name-Last: Oladugba
Title: A review of tests for exponentiality with Monte Carlo comparisons
Abstract:
In this paper, 91 different tests for exponentiality are reviewed. Some of the tests are universally consistent, while others are consistent only against some special classes of life distributions. The power performances of 40 of these tests for exponentiality are compared through extensive Monte Carlo simulations. The comparisons are conducted for sample sizes of 10, 25, 50 and 100, for different groups of distributions classified according to the shape of their hazard functions, at the 5 percent level of significance. The techniques are also applied to two real-world datasets, and a measure of power is employed for the comparison of the tests. The results show that some tests which are very good under one group of alternative distributions are not so under another group. Some tests maintained relatively high power over all the groups of alternative distributions studied, while others maintained poor power performance over all the groups. The results obtained from the real-world datasets agree completely with those of the simulation studies.
Journal: Journal of Applied Statistics
Pages: 1277-1304
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1854202
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1854202
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1277-1304
Template-Type: ReDIF-Article 1.0
Author-Name: Li-Pang Chen
Author-X-Name-First: Li-Pang
Author-X-Name-Last: Chen
Title: Ultrahigh-dimensional sufficient dimension reduction for censored data with measurement error in covariates
Abstract:
In this paper, we consider ultrahigh-dimensional sufficient dimension reduction (SDR) for censored data with measurement error in the covariates. We first propose a feature screening procedure based on censored data with covariates subject to measurement error. With suitable correction of the mismeasurement, the error-contaminated variables detected by the proposed feature screening procedure are the same as the truly important variables. Based on the selected active variables, we develop an SDR method to estimate the central subspace and the structural dimension, with both censored data and measurement error incorporated. Theoretical results for the proposed method are established. Simulation studies are reported to assess the performance of the proposed method. The proposed method is applied to the NKI breast cancer data.
Journal: Journal of Applied Statistics
Pages: 1154-1178
Issue: 5
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1856352
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1856352
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:5:p:1154-1178
Template-Type: ReDIF-Article 1.0
Author-Name: Benoît Lalloué
Author-X-Name-First: Benoît
Author-X-Name-Last: Lalloué
Author-Name: Jean-Marie Monnez
Author-X-Name-First: Jean-Marie
Author-X-Name-Last: Monnez
Author-Name: Eliane Albuisson
Author-X-Name-First: Eliane
Author-X-Name-Last: Albuisson
Title: Streaming constrained binary logistic regression with online standardized data
Abstract:
Online learning is a method for analyzing very large datasets (‘big data’) as well as data streams. In this article, we consider the case of constrained binary logistic regression and show the benefit of using processes with online standardization of the data, in particular to avoid numerical explosions and to allow the use of shrinkage methods. We prove the almost sure convergence of such a process and propose using a piecewise constant step size, chosen so that it does not decrease too quickly and does not reduce the speed of convergence. We compare twenty-four stochastic approximation processes with raw or online standardized data on five real or simulated data sets. The results show that, unlike processes with raw data, processes with online standardized data can prevent numerical explosions and yield the best results.
Journal: Journal of Applied Statistics
Pages: 1519-1539
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1870672
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1870672
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1519-1539
Template-Type: ReDIF-Article 1.0
Author-Name: Xin Liu
Author-X-Name-First: Xin
Author-X-Name-Last: Liu
Author-Name: Wenqing He
Author-X-Name-First: Wenqing
Author-X-Name-Last: He
Title: Adaptive kernel scaling support vector machine with application to a prostate cancer image study
Abstract:
The support vector machine (SVM) is a popular classifier in applications such as pattern recognition, texture mining and image retrieval, owing to its flexibility and interpretability. However, its performance deteriorates when the response classes are imbalanced. To enhance the performance of the SVM classifier in imbalanced cases, we investigate a new two-stage method that adaptively scales the kernel function. Based on the information obtained from the standard SVM in the first stage, we conformally rescale the kernel function in a data-adaptive fashion in the second stage, so that the separation between the two classes can be effectively enlarged while accounting for observation imbalance. The proposed method takes into account the location of the support vectors in the feature space and is therefore especially appealing when the response classes are imbalanced. The resulting algorithm can efficiently improve the classification accuracy, as confirmed by intensive numerical studies as well as a real prostate cancer imaging data application.
Journal: Journal of Applied Statistics
Pages: 1465-1484
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1870669
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1870669
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1465-1484
Template-Type: ReDIF-Article 1.0
Author-Name: B. Curley
Author-X-Name-First: B.
Author-X-Name-Last: Curley
Title: A nonlinear measurement error model and its application to describing the dependency of health outcomes on dietary intake
Abstract:
Many nutritional studies focus on the relationship between individuals' diets and resulting health outcomes. When examining these relationships, researchers are generally interested in individuals' long-term, average intake of nutrients; however, typically only 1–2 days of data are collected. If analyses are performed without accounting for the error in estimating usual intake, estimates will be biased. In this work, we focus on situations where the association between intake and health outcomes is nonlinear. Since we can only obtain noisy measurements of intake, we propose implementing a nonlinear measurement error model which accounts for the nuisance day-to-day variance when estimating long-term average intake. Estimation of the model is performed using maximum likelihood. Properties of the estimators are explored for a model where we assume that the unobservable usual intake is normally distributed. We then propose an extended model where we no longer assume that the distribution for the unobservable predictor is normal, but is instead a finite mixture of discrete distributions. We finish with an application using data from the 2015–2016 National Health and Nutrition Examination Survey (NHANES) where we examine the association between potassium intake and systolic blood pressure.
Journal: Journal of Applied Statistics
Pages: 1485-1518
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1870671
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1870671
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1485-1518
Template-Type: ReDIF-Article 1.0
Author-Name: Jiefei Wang
Author-X-Name-First: Jiefei
Author-X-Name-Last: Wang
Author-Name: Jeffrey C. Miecznikowski
Author-X-Name-First: Jeffrey C.
Author-X-Name-Last: Miecznikowski
Title: High precision implementation of Steck's recursion method for use in goodness-of-fit tests
Abstract:
Classical continuous goodness-of-fit (GOF) testing is employed for examining whether the data come from an assumed parametric model. In many cases, GOF tests assume a uniform null distribution and examine extreme values of the order statistics of the samples. Many of these statistics can be expressed by a function of the order statistics and the p-values amount to a joint probability statement based on the uniform order statistics. In this paper, we utilize Steck's recursion method and propose two high precision computing algorithms to compute the p-values for these GOF statistics. The numerical difficulties in implementing Steck's method are discussed and compared with solutions provided in high precision libraries.
Journal: Journal of Applied Statistics
Pages: 1348-1363
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1861224
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1861224
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1348-1363
Template-Type: ReDIF-Article 1.0
Author-Name: Hongying Jing
Author-X-Name-First: Hongying
Author-X-Name-Last: Jing
Author-Name: Jian Li
Author-X-Name-First: Jian
Author-X-Name-Last: Li
Author-Name: Kaizong Bai
Author-X-Name-First: Kaizong
Author-X-Name-Last: Bai
Title: Directional monitoring and diagnosis for covariance matrices
Abstract:
Statistical surveillance for covariance matrices has attracted increasing attention recently. Many approaches have been developed for monitoring general shifts, which are arbitrary deviations, as well as sparse shifts occurring in only a few elements. This paper considers directional shifts that occur in only one independent parameter, which is common when the process is relatively stable. A directional covariance matrix control chart is proposed, which fully exploits the directional shift information and borrows the strong power of the likelihood ratio test. This chart therefore provides a powerful tool for monitoring covariance matrices. In addition, the proposed chart does not require specifying a regularisation parameter, and it enjoys a concise quadratic form, making it easy to implement. Furthermore, the chart naturally leads to a diagnostic prescription for identifying the shifting element in the covariance matrix. Simulation results demonstrate the efficiency of the suggested control chart and its accompanying diagnostic scheme.
Journal: Journal of Applied Statistics
Pages: 1449-1464
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1867830
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1867830
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1449-1464
Template-Type: ReDIF-Article 1.0
Author-Name: Marcela de Marillac Carvalho
Author-X-Name-First: Marcela de Marillac
Author-X-Name-Last: Carvalho
Author-Name: Thelma Sáfadi
Author-X-Name-First: Thelma
Author-X-Name-Last: Sáfadi
Title: Risk analysis in the brazilian stock market: copula-APARCH modeling for value-at-risk
Abstract:
Risk management of stock portfolios is a fundamental problem in financial analysis, since it indicates the potential losses of an investment at any given time. The objective of this study is to use bivariate static conditional copulas to quantify the dependence structure and to estimate the risk measure Value-at-Risk (VaR). Stocks that have been performing outstandingly on the Brazilian Stock Exchange were selected to compose pairs-trading portfolios (B3, Gerdau, Magazine Luiza, and Petrobras). Owing to the flexibility that this methodology offers in the construction of multivariate distributions and risk aggregation in finance, we used the copula-APARCH approach with the Normal, T-student, and Joe-Clayton copula functions. In most scenarios, the results showed a pattern of dependence at the extremes. Moreover, the copula form seems not to be relevant for VaR estimation, since in most portfolios the appropriate copulas lead to significant VaR estimates. The best fitted models were found to provide conservative risk measures, with estimates at 5% and 1% in a more aggressive scenario.
Journal: Journal of Applied Statistics
Pages: 1598-1610
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1865883
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1865883
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1598-1610
Template-Type: ReDIF-Article 1.0
Author-Name: M. Hemavathi
Author-X-Name-First: M.
Author-X-Name-Last: Hemavathi
Author-Name: Eldho Varghese
Author-X-Name-First: Eldho
Author-X-Name-Last: Varghese
Author-Name: Shashi Shekhar
Author-X-Name-First: Shashi
Author-X-Name-Last: Shekhar
Author-Name: Seema Jaggi
Author-X-Name-First: Seema
Author-X-Name-Last: Jaggi
Title: Sequential asymmetric third order rotatable designs (SATORDs)
Abstract:
Rotatable designs that are available for process/product optimization trials are mostly symmetric in nature. In many practical situations, response surface designs (RSDs) with mixed (unequal) factor levels are more suitable, as these designs explore more regions in the design space, but it is hard to obtain rotatable designs with a given level of asymmetry. When experimenting with unequal factor levels via an asymmetric second order rotatable design (ASORD), the lack of fit of the model may become significant, which ultimately leads to the estimation of parameters based on a higher (third) order model. Experimenting with a new third order rotatable design (TORD) in such a situation would be expensive, as the responses observed from the first-stage runs would remain underutilized. In this paper, we propose a method of constructing asymmetric TORDs by sequentially augmenting some additional points to an ASORD without discarding the runs from the first stage. The proposed designs are more economical for obtaining the optimum response, as the design in the first stage can be used to fit the second order model and, with some additional runs, the third order model can be fitted without discarding the initial design.
Journal: Journal of Applied Statistics
Pages: 1364-1381
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1864817
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864817
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1364-1381
Template-Type: ReDIF-Article 1.0
Author-Name: Milda Maria Burzala
Author-X-Name-First: Milda Maria
Author-X-Name-Last: Burzala
Title: The process of transferring negative impulses in capital markets – a wavelet analysis
Abstract:
The empirical research presented herein deals with the process of transferring negative impulses in capital markets during the subprime crisis (contagion, comovements, crisis transmission and shocks). A significant contribution of the research is the demonstration of how wavelet analysis can be used to examine the various responses of financial markets. The first stage of the research involved an analysis of the response of seven European markets (the CAC40, DAX, FTSE100, IBEX, ATHEX, BUX and WIG20 indexes) to developments in the US market, exemplified by the Dow Jones Industrial Average index. The second stage involved examining the relationships among strong European markets (CAC40, DAX, FTSE100), and then the impact that the strongest market, the German DAX, had on four other, weaker European markets: two from Western Europe (IBEX, ATHEX) and two from Central-Eastern Europe (BUX and WIG20). This article presents a methodological approach to the analysis of impulse transfer in capital markets.
Journal: Journal of Applied Statistics
Pages: 1574-1597
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1864811
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864811
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1574-1597
Template-Type: ReDIF-Article 1.0
Author-Name: Zainab Al-kaabawi
Author-X-Name-First: Zainab
Author-X-Name-Last: Al-kaabawi
Author-Name: Yinghui Wei
Author-X-Name-First: Yinghui
Author-X-Name-Last: Wei
Author-Name: Rana Moyeed
Author-X-Name-First: Rana
Author-X-Name-Last: Moyeed
Title: Bayesian hierarchical models for linear networks
Abstract:
The purpose of this study is to highlight dangerous motorways by estimating the intensity of accidents and studying its pattern across the UK motorway network. Two methods have been developed to achieve this aim. First, the motorway-specific intensity is estimated by using a homogeneous Poisson process. The heterogeneity across motorways is incorporated using two-level hierarchical models. The data structure is multilevel, since each motorway consists of junctions that are joined by grouped segments. In the second method, the segment-specific intensity is estimated. The homogeneous Poisson process is used to model accident data within grouped segments, while heterogeneity across grouped segments is incorporated using three-level hierarchical models. A Bayesian method via Markov chain Monte Carlo is used to estimate the unknown parameters in the models, and the sensitivity to the choice of priors is assessed. The performance of the proposed models is evaluated by a simulation study and an application to traffic accidents in 2016 on the UK motorway network. The deviance information criterion (DIC) and the widely applicable information criterion (WAIC) are employed to choose between models.
Journal: Journal of Applied Statistics
Pages: 1421-1448
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1864814
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864814
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1421-1448
Template-Type: ReDIF-Article 1.0
Author-Name: Gideon Mensah Engmann
Author-X-Name-First: Gideon Mensah
Author-X-Name-Last: Engmann
Author-Name: Dong Han
Author-X-Name-First: Dong
Author-X-Name-Last: Han
Title: The optimized CUSUM and EWMA multi-charts for jointly detecting a range of mean and variance change
Abstract:
This article considers the problem of jointly monitoring the mean and variance of a process by multi-chart schemes. A multi-chart is a combination of several single charts that detects changes in a process quickly. Asymptotic analyses and simulation studies show that the optimized CUSUM multi-chart outperforms the optimized EWMA multi-chart in jointly detecting mean and variance shifts in i.i.d. normal observations. A real example that monitors the changes in IBM's stock returns (mean) and risks (variance) is used to demonstrate the performance of the two multi-charts. The proposed method has been compared to a benchmark and performed better.
Journal: Journal of Applied Statistics
Pages: 1540-1558
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1870670
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1870670
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1540-1558
Template-Type: ReDIF-Article 1.0
Author-Name: Qianya Qi
Author-X-Name-First: Qianya
Author-X-Name-Last: Qi
Author-Name: Li Yan
Author-X-Name-First: Li
Author-X-Name-Last: Yan
Author-Name: Lili Tian
Author-X-Name-First: Lili
Author-X-Name-Last: Tian
Title: Analyzing partially paired data: when can the unpaired portion(s) be safely ignored?
Abstract:
Partially paired data, with incompleteness in one or both arms, are common in practice. For testing the equality of means of two arms, practitioners often use only the portion of data with complete pairs and perform paired tests. Although such tests (referred to as ‘naive paired tests’) are legitimate, their power might be low, as only part of the data is utilized. The recently proposed ‘P-value pooling methods’, based on combining P-values from two tests, use all the data, have reasonable type-I error control and good power properties. While it is generally believed that ‘P-value pooling methods’ are superior to ‘naive paired tests’ in terms of power, as the former use more data than the latter, no detailed power comparison has been done. This paper compares the power of ‘naive paired tests’ and ‘P-value pooling methods’ analytically, and our findings are counterintuitive: the ‘P-value pooling methods’ do not always outperform the naive paired tests in terms of power. Based on these results, we present guidance on how to select the best test for testing equality of means with partially paired data.
Journal: Journal of Applied Statistics
Pages: 1402-1420
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1864813
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864813
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1402-1420
Template-Type: ReDIF-Article 1.0
Author-Name: Parfait Munezero
Author-X-Name-First: Parfait
Author-X-Name-Last: Munezero
Author-Name: Gebrenegus Ghilagaber
Author-X-Name-First: Gebrenegus
Author-X-Name-Last: Ghilagaber
Title: Dynamic Bayesian adjustment of anticipatory covariates in retrospective data: application to the effect of education on divorce risk
Abstract:
We address a problem in inference from retrospective studies where the value of a variable is measured at the date of the survey but is used as a covariate for events that occurred long before the survey. This causes problems because the value of the current-date (anticipatory) covariate does not follow the temporal order of events. We propose a dynamic Bayesian approach for modelling the anticipatory covariate and the event of interest jointly, allowing the effects of the anticipatory covariate to vary over time. The issues are illustrated with data on the effects of education attained by the time of the survey on divorce risks among Swedish men. The overall results show that failure to adjust for the anticipatory nature of education leads to elevated relative risks of divorce across educational levels. The results are partially in accordance with previous findings based on analyses of the same data set. More importantly, our findings provide new insights in that the bias due to anticipatory covariates varies over marriage duration.
Journal: Journal of Applied Statistics
Pages: 1382-1401
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2020.1864812
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1864812
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1382-1401
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Correction Notice
Journal: Journal of Applied Statistics
Pages: 1611-1611
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2022.2049996
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2049996
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1611-1611
Template-Type: ReDIF-Article 1.0
Author-Name: Nipada Papukdee
Author-X-Name-First: Nipada
Author-X-Name-Last: Papukdee
Author-Name: Jeong-Soo Park
Author-X-Name-First: Jeong-Soo
Author-X-Name-Last: Park
Author-Name: Piyapatr Busababodhin
Author-X-Name-First: Piyapatr
Author-X-Name-Last: Busababodhin
Title: Penalized likelihood approach for the four-parameter kappa distribution
Abstract:
The four-parameter kappa distribution (K4D) is a generalized form of some commonly used distributions such as the generalized logistic, generalized Pareto, generalized Gumbel, and generalized extreme value (GEV) distributions. Owing to its flexibility, the K4D is widely applied in modeling in several fields such as hydrology and climate change. For the estimation of the four parameters, the maximum likelihood approach and the method of L-moments are usually employed. The L-moment estimator (LME) method works well for some parameter spaces, with up to a moderate sample size, but it is sometimes not feasible in terms of computing the appropriate estimates. Meanwhile, using the maximum likelihood estimator (MLE) with small sample sizes shows substantially poor performance in terms of a large variance of the estimator. We therefore propose a maximum penalized likelihood estimation (MPLE) of the K4D by adjusting the existing penalty functions that restrict the parameter space. Eighteen combinations of penalties for the two shape parameters are considered and compared. The MPLE retains modeling flexibility and large sample optimality while also improving on small sample properties. The properties of the proposed estimator are verified through a Monte Carlo simulation, and an application case is demonstrated using Thailand’s annual maximum temperature data.
Journal: Journal of Applied Statistics
Pages: 1559-1573
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2021.1871592
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1871592
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1559-1573
Template-Type: ReDIF-Article 1.0
Author-Name: The Editors
Title: Correction Notice
Journal: Journal of Applied Statistics
Pages: 1612-1614
Issue: 6
Volume: 49
Year: 2022
Month: 04
X-DOI: 10.1080/02664763.2022.2051984
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2051984
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:6:p:1612-1614
Template-Type: ReDIF-Article 1.0
Author-Name: Seyede Sedighe Azimi
Author-X-Name-First: Seyede Sedighe
Author-X-Name-Last: Azimi
Author-Name: Ehsan Bahrami Samani
Author-X-Name-First: Ehsan
Author-X-Name-Last: Bahrami Samani
Author-Name: Mojtaba Ganjali
Author-X-Name-First: Mojtaba
Author-X-Name-Last: Ganjali
Title: Analysis of mixed correlated overdispersed binomial and ordinal longitudinal responses: LogLindley-Binomial and ordinal random effects model
Abstract:
We propose a new model, called the LogLindley-Binomial and ordinal joint model with random effects, for analyzing mixed overdispersed binomial and ordinal longitudinal responses. A new distribution, the LogLindley-Binomial, is presented, which is appropriate for the analysis of overdispersed binomial variables. A full likelihood-based approach is used to obtain maximum likelihood estimates. A comparison between the LogLindley-Binomial and Beta-Binomial distributions is given by a simulation study. To illustrate the utility of the proposed model, further simulation studies are conducted, in which the LogLindley-Binomial distribution and the proposed model perform well. The new model's performance in analyzing a real dataset, extracted from the British Household Panel Survey, is also studied; the proposed model performs well in comparison with an alternative model. Finally, the proposed distribution and the new model are found to be applicable for analyzing overdispersed binomial and mixed data.
Journal: Journal of Applied Statistics
Pages: 1742-1768
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1881455
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1881455
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1742-1768
Template-Type: ReDIF-Article 1.0
Author-Name: Xiaolin Chen
Author-X-Name-First: Xiaolin
Author-X-Name-Last: Chen
Author-Name: Chenguang Li
Author-X-Name-First: Chenguang
Author-X-Name-Last: Li
Author-Name: Tao Zhang
Author-X-Name-First: Tao
Author-X-Name-Last: Zhang
Author-Name: Zhenlong Gao
Author-X-Name-First: Zhenlong
Author-X-Name-Last: Gao
Title: On correlation rank screening for ultra-high dimensional competing risks data
Abstract:
In recent years, numerous feature screening schemes have been developed for ultra-high dimensional standard survival data with only one failure event. Nevertheless, existing literature pays little attention to related investigations for competing risks data, in which subjects suffer from multiple mutually exclusive failures. In this article, we develop a new marginal feature screening for ultra-high dimensional time-to-event data to allow for competing risks. The proposed procedure is model-free, and robust against heavy-tailed distributions and potential outliers for time to the type of failure of interest. Apart from this, it is invariant to any monotone transformation of event time of interest. Under rather mild assumptions, it is shown that the newly suggested approach possesses the ranking consistency and sure independence screening properties. Some numerical studies are conducted to evaluate the finite-sample performance of our method and make a comparison with its competitor, while an application to a real data set is provided to serve as an illustration.
Journal: Journal of Applied Statistics
Pages: 1848-1864
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1884209
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1884209
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1848-1864
Template-Type: ReDIF-Article 1.0
Author-Name: Thomas Deregnaucourt
Author-X-Name-First: Thomas
Author-X-Name-Last: Deregnaucourt
Author-Name: Chafik Samir
Author-X-Name-First: Chafik
Author-X-Name-Last: Samir
Author-Name: Sebastian Kurtek
Author-X-Name-First: Sebastian
Author-X-Name-Last: Kurtek
Author-Name: Anne-Francoise Yao
Author-X-Name-First: Anne-Francoise
Author-X-Name-Last: Yao
Title: Shape-constrained Gaussian process regression for surface reconstruction and multimodal, non-rigid image registration
Abstract:
We present a new statistical framework for landmark curve-based image registration and surface reconstruction. The proposed method first elastically aligns geometric features (continuous, parameterized curves) to compute local deformations, and then uses a Gaussian random field model to estimate the full deformation vector field as a spatial stochastic process on the entire surface or image domain. The statistical estimation is performed using two different methods: maximum likelihood and Bayesian inference via Markov Chain Monte Carlo sampling. The resulting deformations accurately match corresponding curve regions while also being sufficiently smooth over the entire domain. We present several qualitative and quantitative evaluations of the proposed method on both synthetic and real data. We apply our approach to two different tasks on real data: (1) multimodal medical image registration, and (2) anatomical and pottery surface reconstruction.
Journal: Journal of Applied Statistics
Pages: 1865-1889
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1897970
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1897970
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1865-1889
Template-Type: ReDIF-Article 1.0
Author-Name: Mohamed Milad
Author-X-Name-First: Mohamed
Author-X-Name-Last: Milad
Author-Name: Gayla R. Olbricht
Author-X-Name-First: Gayla R.
Author-X-Name-Last: Olbricht
Title: Testing differentially methylated regions through functional principal component analysis
Abstract:
DNA methylation is an epigenetic modification that plays an important role in many biological processes and diseases. Several statistical methods have been proposed to test for DNA methylation differences between conditions at individual cytosine sites, followed by a post hoc aggregation procedure to explore regional differences. While there are benefits to analyzing CpGs individually, there are both biological and statistical reasons to test entire genomic regions for differential methylation. Variability in methylation levels measured by Next-Generation Sequencing (NGS) is often observed across CpG sites in a genomic region. Evaluating meaningful changes in regional level methylation profiles between conditions over noisy site-level measurements is often difficult to implement with parametric models. To overcome these limitations, this study develops a nonparametric approach to detect predefined differentially methylated regions (DMR) based on functional principal component analysis (FPCA). The performance of this approach is compared with two alternative methods (GIFT and M3D), using real and simulated data.
Journal: Journal of Applied Statistics
Pages: 1677-1691
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1877636
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1877636
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1677-1691
Template-Type: ReDIF-Article 1.0
Author-Name: Fabrizio Ronchi
Author-X-Name-First: Fabrizio
Author-X-Name-Last: Ronchi
Author-Name: Solomon W. Harrar
Author-X-Name-First: Solomon W.
Author-X-Name-Last: Harrar
Author-Name: Luigi Salmaso
Author-X-Name-First: Luigi
Author-X-Name-Last: Salmaso
Title: Multivariate nonparametric methods in two-way balanced designs: performances and limitations in small samples
Abstract:
Investigations of multivariate populations are common in applied research, and the two-way crossed factorial design is commonly used in the exploratory phase of industrial applications. When assumptions such as multivariate normality and covariance homogeneity are violated, the conventional wisdom is to resort to nonparametric tests for hypothesis testing. In this paper, we compare the performances, and in particular the power, of some nonparametric and semi-parametric methods that have been developed in recent years. Specifically, we examine resampling methods and robust versions of classical multivariate analysis of variance (MANOVA) tests. In a simulation study, we generate data sets with different configurations of factor effects, number of replicates, number of response variables under the null hypothesis, and number of response variables under the alternative hypothesis. The objective is to offer practical advice and guidance to practitioners regarding the sensitivity of the tests in the various configurations, the tradeoff between power and type I error, the strategic impact of increasing the number of response variables, and the favourable performance of one test when the alternative is sparse. A real case study from an industrial engineering experiment in thermoformed packaging production is used to compare and illustrate the application of the various methods.
Journal: Journal of Applied Statistics
Pages: 1714-1741
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1915256
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1915256
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1714-1741
Template-Type: ReDIF-Article 1.0
Author-Name: Tamalika Koley
Author-X-Name-First: Tamalika
Author-X-Name-Last: Koley
Author-Name: Anup Dewanji
Author-X-Name-First: Anup
Author-X-Name-Last: Dewanji
Title: Current status data with two competing risks and missing failure types: a parametric approach
Abstract:
Missing cause of failure is a common problem in competing risks data. Here we consider a general missing pattern in which one observes a set of possible causes containing the true cause. In this work, we focus on the parametric analysis of current status data with two competing risks and the above-mentioned missing pattern. We make some simpler assumptions on the conditional probability of observing a set of possible causes of failure given the true cause and carry out maximum-likelihood estimation of the model parameters. Asymptotic properties of the maximum-likelihood estimators are also discussed. Simulation studies are performed to study the finite sample properties of the estimators and also to investigate the role of the monitoring time distribution. Finally, the method is illustrated through the analysis of a real data set.
Journal: Journal of Applied Statistics
Pages: 1769-1783
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1881453
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1881453
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1769-1783
Template-Type: ReDIF-Article 1.0
Author-Name: M. H. Tahir
Author-X-Name-First: M. H.
Author-X-Name-Last: Tahir
Author-Name: M. Adnan Hussain
Author-X-Name-First: M. Adnan
Author-X-Name-Last: Hussain
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Title: A new flexible generalized family for constructing many families of distributions
Abstract:
We propose a new flexible generalized family (NFGF) for constructing many families of distributions. The importance of the NFGF is that any baseline distribution can be chosen and it does not involve any additional parameters. Some useful statistical properties of the NFGF are determined, such as a linear representation for the family density, analytical shapes of the density and hazard rate, random variable generation, moments and the generating function. Further, the structural properties of a special model, named the new flexible Kumaraswamy (NFKw) distribution, are investigated, and the model parameters are estimated by the maximum-likelihood method. A simulation study is carried out to assess the performance of the estimates. The usefulness of the NFKw model is demonstrated empirically by means of three real-life data sets. In fact, the two-parameter NFKw model performs better than the three-parameter transmuted-Kumaraswamy, the three-parameter exponentiated-Kumaraswamy and the well-known two-parameter Kumaraswamy models.
Journal: Journal of Applied Statistics
Pages: 1615-1635
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1874891
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1874891
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1615-1635
Template-Type: ReDIF-Article 1.0
Author-Name: Sajjad Piradl
Author-X-Name-First: Sajjad
Author-X-Name-Last: Piradl
Author-Name: Ali Shadrokh
Author-X-Name-First: Ali
Author-X-Name-Last: Shadrokh
Author-Name: Masoud Yarmohammadi
Author-X-Name-First: Masoud
Author-X-Name-Last: Yarmohammadi
Title: A robust estimation method for the linear regression model parameters with correlated error terms and outliers
Abstract:
Independence of the error terms in a linear regression model is often not established, so linear regression models with correlated error terms appear in many applications. According to earlier studies, such error terms can fundamentally affect the robustness of the linear regression analysis. It has been shown that the robustness of the parameter estimators of a linear regression model can be preserved using the M-estimator, although it acquires this feature at the cost of its efficiency. In contrast, the minimum Matusita distance estimators have been shown to possess both robustness and efficiency at the same time. Moreover, because the Cochrane-Orcutt adjusted least squares estimators are not affected by the dependence of the error terms, they are efficient estimators. Here we use a non-parametric kernel density estimation method to give a new way of obtaining the minimum Matusita distance estimators for the linear regression model with correlated error terms in the presence of outliers. Simulation and real data studies are both conducted for the introduced estimation method. In each case, the proposed method yields lower biases and mean squared errors than the other two methods.
Journal: Journal of Applied Statistics
Pages: 1663-1676
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1881454
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1881454
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1663-1676
Template-Type: ReDIF-Article 1.0
Author-Name: H. Rehman
Author-X-Name-First: H.
Author-X-Name-Last: Rehman
Author-Name: N. Chandra
Author-X-Name-First: N.
Author-X-Name-Last: Chandra
Author-Name: Fatemeh Sadat Hosseini-Baharanchi
Author-X-Name-First: Fatemeh Sadat
Author-X-Name-Last: Hosseini-Baharanchi
Author-Name: Ahmad Reza Baghestani
Author-X-Name-First: Ahmad Reza
Author-X-Name-Last: Baghestani
Author-Name: Mohamad Amin Pourhoseingholi
Author-X-Name-First: Mohamad Amin
Author-X-Name-Last: Pourhoseingholi
Title: Cause-specific hazard regression estimation for modified Weibull distribution under a class of non-informative priors
Abstract:
In time-to-event analysis, the situation of competing risks arises when an individual (or subject) may experience p mutually exclusive causes of death (failure), and the cause-specific hazard function is of great importance in this framework. For instance, among malignancy-related deaths, colorectal cancer is one of the leading causes of death in the world, with death due to other causes considered as competing causes. We include prognostic variables in the model through a parametric Cox proportional hazards model. In the literature, the exponential, Weibull and similar distributions have mostly been used for parametric modelling of the cause-specific hazard function, but they cannot accommodate non-monotone failure rates. Therefore, in this article, we consider a modified Weibull distribution, which is capable of modelling survival data with non-monotonic hazard rates. For estimating the cumulative cause-specific hazard function, we utilize maximum likelihood and Bayesian methods. A class of non-informative priors (uniform, Jeffreys' and half-t) is introduced for Bayes estimation under squared error (symmetric) as well as LINEX (asymmetric) loss functions. A simulation study is performed for a comprehensive comparison of Bayes and maximum likelihood estimators of the cumulative cause-specific hazard function. Real data on colorectal cancer are used to demonstrate the proposed model.
Journal: Journal of Applied Statistics
Pages: 1784-1801
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1882407
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1882407
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1784-1801
Template-Type: ReDIF-Article 1.0
Author-Name: Francisco Corona
Author-X-Name-First: Francisco
Author-X-Name-Last: Corona
Author-Name: Nelson Muriel
Author-X-Name-First: Nelson
Author-X-Name-Last: Muriel
Author-Name: Graciela González-Farías
Author-X-Name-First: Graciela
Author-X-Name-Last: González-Farías
Title: Dynamic factor structure of team performances in Liga MX
Abstract:
Team performance in the Mexican Football League (Liga MX), measured as the percentage of the total points obtained during each short tournament, is analyzed using Dynamic Factor Models (DFMs). The estimation of the common components is carried out with Principal Components, and the stochastic nature of the DFM is studied through Panel Analysis of Non-stationarity in Idiosyncratic and Common Components. The results reveal that there are two common factors, one being possibly non-stationary. These factors show an interesting dynamic behavior in the league and allow the teams to be split into two groups, namely top competitors and emerging or relegated teams. Some discussion is given in this direction.
Journal: Journal of Applied Statistics
Pages: 1900-1912
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1881946
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1881946
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1900-1912
Template-Type: ReDIF-Article 1.0
Author-Name: Yujie Zhao
Author-X-Name-First: Yujie
Author-X-Name-Last: Zhao
Author-Name: Hao Yan
Author-X-Name-First: Hao
Author-X-Name-Last: Yan
Author-Name: Sarah Holte
Author-X-Name-First: Sarah
Author-X-Name-Last: Holte
Author-Name: Yajun Mei
Author-X-Name-First: Yajun
Author-X-Name-Last: Mei
Title: Rapid detection of hot-spots via tensor decomposition with applications to crime rate data
Abstract:
In many real-world applications of monitoring multivariate spatio-temporal data that are non-stationary over time, one is often interested in detecting hot-spots with spatial sparsity and temporal consistency, instead of detecting system-wide changes as in the traditional statistical process control (SPC) literature. In this paper, we propose an efficient method to detect hot-spots through tensor decomposition; our method has three steps. First, we fit the observed data to a Smooth Sparse Decomposition Tensor (SSD-Tensor) model that serves as a dimension-reduction and de-noising technique: it is an additive model decomposing the original data into a smooth but non-stationary global mean, sparse local anomalies, and random noise. Next, we estimate the model parameters in a penalized framework that includes the Least Absolute Shrinkage and Selection Operator (LASSO) and fused LASSO penalties. An efficient recursive optimization algorithm is developed based on the Fast Iterative Shrinkage Thresholding Algorithm (FISTA). Finally, we apply a Cumulative Sum (CUSUM) control chart to monitor the model residuals after removing the global mean, which helps to detect when and where hot-spots occur. To demonstrate the usefulness of the proposed SSD-Tensor method, we compare it with several other methods, including scan statistics and LASSO-based, PCA-based and T2-based control charts, in extensive numerical simulation studies and on a real crime rate dataset.
Journal: Journal of Applied Statistics
Pages: 1636-1662
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1874892
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1874892
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1636-1662
Template-Type: ReDIF-Article 1.0
Author-Name: Sobom M. Somé
Author-X-Name-First: Sobom M.
Author-X-Name-Last: Somé
Author-Name: Célestin C. Kokonendji
Author-X-Name-First: Célestin C.
Author-X-Name-Last: Kokonendji
Title: Bayesian selector of adaptive bandwidth for multivariate gamma kernel estimator on [0,∞)^d
Abstract:
Bayesian bandwidth selection in multivariate associated kernel estimation of probability density functions is known to improve on classical methods, such as cross-validation techniques, in terms of execution time and smoothing quality. The paper focuses on a basic multivariate gamma kernel appropriate for estimating densities with support $[0,\infty)^d$. For this purpose, we consider a Bayesian adaptive estimation of the bandwidth vector under the usual quadratic loss function. The exact expressions of the posterior distribution and of the vector of bandwidths are obtained. Simulation studies highlight the excellent performance of the proposed approach compared to global cross-validation bandwidth selection, under integrated squared errors. Bivariate and trivariate applications are made to the Old Faithful geyser data and to new data on drinking water pumps in the Sahel, respectively.
Journal: Journal of Applied Statistics
Pages: 1692-1713
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1881456
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1881456
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1692-1713
Template-Type: ReDIF-Article 1.0
Author-Name: Seonghun Cho
Author-X-Name-First: Seonghun
Author-X-Name-Last: Cho
Author-Name: Johan Lim
Author-X-Name-First: Johan
Author-X-Name-Last: Lim
Author-Name: Woncheol Jang
Author-X-Name-First: Woncheol
Author-X-Name-Last: Jang
Title: How many people participated in candlelight protests? Counting the size of a dynamic crowd
Abstract:
The recent controversy about the size of crowds at candlelight protests in Korea raises an interesting question regarding the methods used to estimate crowd size. Protest organizers tend to count all participants in the event from its start to its finish, while the police usually report the crowd size at its peak. While several counting methods are available to estimate the size of a crowd at a given time, counting the total number of participants at a protest is not straightforward. In this paper, we propose a new estimator of the total number of participants, which we call the size of a dynamic crowd. We assume that the arrival and departure times of the crowd are randomly observed and that the number of attendees in the crowd at a specific time is estimable. We estimate the total number of attendees during the entire gathering based on the capture-recapture model. We also propose a bootstrap procedure to construct a confidence interval for the crowd size. We demonstrate the performance of the proposed method with simulation studies and with data from Korea's March for Science, part of a global event on Earth Day, April 22, 2017.
Journal: Journal of Applied Statistics
Pages: 1890-1899
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1871591
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1871591
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1890-1899
Template-Type: ReDIF-Article 1.0
Author-Name: Somayeh Momenyan
Author-X-Name-First: Somayeh
Author-X-Name-Last: Momenyan
Author-Name: Farzane Ahmadi
Author-X-Name-First: Farzane
Author-X-Name-Last: Ahmadi
Author-Name: Jalal Poorolajal
Author-X-Name-First: Jalal
Author-X-Name-Last: Poorolajal
Title: Competing risks model for clustered data based on the subdistribution hazards with spatial random effects
Abstract:
In some applications, clustered survival data are arranged spatially, such as by clinical centers or geographical regions. Incorporating spatial variation in these data not only improves the accuracy and efficiency of the parameter estimation, but also reveals the spatial patterns of survivorship, identifying high-risk areas. Competing risks in survival data concern a situation where there is more than one cause of failure, but only the occurrence of the first one is observable. In this paper, we consider Bayesian subdistribution hazard regression models with spatial random effects for clustered HIV/AIDS data. An intrinsic conditional autoregressive (ICAR) distribution is employed to model the areal spatial random effects. Comparison among competing models is performed by the deviance information criterion. We illustrate the gains of our model through an application to the HIV/AIDS data and through simulation studies.
Journal: Journal of Applied Statistics
Pages: 1802-1820
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1884208
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1884208
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1802-1820
Template-Type: ReDIF-Article 1.0
Author-Name: Cong Li
Author-X-Name-First: Cong
Author-X-Name-Last: Li
Author-Name: Haixiang Zhang
Author-X-Name-First: Haixiang
Author-X-Name-Last: Zhang
Author-Name: Dehui Wang
Author-X-Name-First: Dehui
Author-X-Name-Last: Wang
Title: Modelling and monitoring of INAR(1) process with geometrically inflated Poisson innovations
Abstract:
To analyse count time series data inflated at the r + 1 values $\{0,1,\ldots,r\}$, we propose a new first-order integer-valued autoregressive process with r-geometrically inflated Poisson innovations. Some statistical properties, together with conditional maximum likelihood estimation, are provided. For statistical monitoring, we focus on the cumulative sum chart, the exponentially weighted moving average chart and the combined jumps chart for the proposed process. Numerical simulations indicate that the conditional maximum likelihood estimator is unbiased. Moreover, the cumulative sum chart is the best choice for monitoring our model in practice. Applications to telephone complaints data illustrate the proposed methods.
Journal: Journal of Applied Statistics
Pages: 1821-1847
Issue: 7
Volume: 49
Year: 2022
Month: 05
X-DOI: 10.1080/02664763.2021.1884206
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1884206
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:7:p:1821-1847
Template-Type: ReDIF-Article 1.0
Author-Name: Christian H. Weiß
Author-X-Name-First: Christian H.
Author-X-Name-Last: Weiß
Author-Name: Annika Homburg
Author-X-Name-First: Annika
Author-X-Name-Last: Homburg
Author-Name: Layth C. Alwan
Author-X-Name-First: Layth C.
Author-X-Name-Last: Alwan
Author-Name: Gabriel Frahm
Author-X-Name-First: Gabriel
Author-X-Name-Last: Frahm
Author-Name: Rainer Göb
Author-X-Name-First: Rainer
Author-X-Name-Last: Göb
Title: Efficient accounting for estimation uncertainty in coherent forecasting of count processes
Abstract:
Coherent forecasting techniques for count processes generate forecasts that consist of count values themselves. In practice, forecasting always relies on a fitted model, so the obtained forecast values are affected by estimation uncertainty. Thus, they may differ from the true forecast values as they would have been obtained from the true data-generating process. We propose a computationally efficient resampling scheme that allows one to express the uncertainty in common types of coherent forecasts for count processes. The performance of the resampling scheme, which results in ensembles of forecast values, is investigated in a simulation study. A real-data example demonstrates the application of the proposed approach in practice. It is shown that the obtained ensembles of forecast values can be presented in a visual way that allows for an intuitive interpretation.
Journal: Journal of Applied Statistics
Pages: 1957-1978
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1887104
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1887104
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:1957-1978
Template-Type: ReDIF-Article 1.0
Author-Name: Han Yu
Author-X-Name-First: Han
Author-X-Name-Last: Yu
Author-Name: Shanhe Jiang
Author-X-Name-First: Shanhe
Author-X-Name-Last: Jiang
Author-Name: Hong Huang
Author-X-Name-First: Hong
Author-X-Name-Last: Huang
Title: Spatio-temporal parse network-based trajectory modeling on the dynamics of criminal justice system
Abstract:
We extend existing group-based trajectory modeling by proposing network-based trajectory modeling founded on the judicious design and analysis of a spatio-temporal parse network (STPN) as a representation of a neighborhood structure that evolves in time. The STPN offers a principled qualitative specification for an explicit paradigm framework to deal with complex real-world problems. The framework is completed by developing a quantitative specification of a latent field representation that merges seamlessly with the established STPN via hierarchical modeling. The models adopt spatial random effects to characterize the heterogeneity and autocorrelation over the locations where nonlinear trajectories are observed. The trajectories are then investigated in the presence of the operational constraints of the dependence structure induced by the spatial and temporal dimensions. With this framework, complex developmental trajectory problems can be discerned, communicated, diagnosed and modeled in a relatively simple way whose interpretation is accessible to nontechnical audiences and quickly comprehensible to technically sophisticated ones. The proposed modeling is applied to address the challenges of modeling trajectories of nonlinear dynamics arising from a motivating criminal justice empirical process.
Journal: Journal of Applied Statistics
Pages: 1979-2000
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1887101
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1887101
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:1979-2000
Template-Type: ReDIF-Article 1.0
Author-Name: H. E. Correia
Author-X-Name-First: H. E.
Author-X-Name-Last: Correia
Author-Name: A. Abebe
Author-X-Name-First: A.
Author-X-Name-Last: Abebe
Title: Capturing spatiotemporal dynamics of Alaskan groundfish catch using signed-rank estimation for varying coefficient models
Abstract:
Varying coefficient models (VCMs) are commonly used for their high degree of flexibility in modeling complex systems. Many applications in fisheries utilize VCMs to capture spatial variation in populations of marine fishes. All of these applications use the penalized least squares method for estimation. However, this approach is known to be sensitive to non-normal distributions and outliers, a common feature of ecological data; robust estimation methods are more appropriate for handling noisy and non-normal data. We present the application of a signed-rank-based procedure for obtaining robust estimates in VCMs on a fisheries dataset from the North Pacific Ocean. We demonstrate that the signed-rank-based estimation method provides better fit and improved prediction compared to the classical likelihood VCM fits in both simulations and the real data application, particularly when the distributions are non-normal and may be misspecified. Rank-based estimation of VCMs is therefore valuable for modeling ecological data and obtaining useful inferences where non-normality and outliers are common.
Journal: Journal of Applied Statistics
Pages: 2137-2156
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1889996
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1889996
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2137-2156
Template-Type: ReDIF-Article 1.0
Author-Name: Adewale F. Lukman
Author-X-Name-First: Adewale F.
Author-X-Name-Last: Lukman
Author-Name: Benedicta Aladeitan
Author-X-Name-First: Benedicta
Author-X-Name-Last: Aladeitan
Author-Name: Kayode Ayinde
Author-X-Name-First: Kayode
Author-X-Name-Last: Ayinde
Author-Name: Mohamed R. Abonazel
Author-X-Name-First: Mohamed R.
Author-X-Name-Last: Abonazel
Title: Modified ridge-type for the Poisson regression model: simulation and application
Abstract:
The Poisson regression model (PRM) is employed in modelling the relationship between a count variable (y) and one or more explanatory variables. The parameters of the PRM are popularly estimated using the Poisson maximum likelihood estimator (PMLE). The explanatory variables tend to grow together, which results in the problem of multicollinearity, and the variance of the PMLE becomes inflated in its presence. The Poisson ridge regression estimator (PRRE) and the Poisson Liu estimator (PLE) have been suggested as alternatives to the PMLE. In this study, however, we propose a new estimator of the regression coefficients of the PRM when multicollinearity is a challenge. We perform a simulation study under different specifications to assess the performance of the new estimator and the existing ones, evaluated using the scalar mean squared error (SMSE) criterion and the mean squared prediction error. Aircraft damage data are adopted for the application study, with the estimators' performance again judged by the SMSE and the mean squared prediction error. The theoretical comparison shows that the proposed estimator outperforms the other estimators, which is further supported by the simulation study and the application results.
Journal: Journal of Applied Statistics
Pages: 2124-2136
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1889998
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1889998
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2124-2136
Template-Type: ReDIF-Article 1.0
Author-Name: Sara Ghanbari
Author-X-Name-First: Sara
Author-X-Name-Last: Ghanbari
Author-Name: Abdolhamid Rezaei Roknabadi
Author-X-Name-First: Abdolhamid
Author-X-Name-Last: Rezaei Roknabadi
Author-Name: Mahdi Salehi
Author-X-Name-First: Mahdi
Author-X-Name-Last: Salehi
Title: Estimation of stress–strength reliability for Marshall–Olkin distributions based on progressively Type-II censored samples
Abstract:
We are mainly interested in estimating the stress–strength parameter, say $\mathcal{R}$, when the parent distribution follows the well-known Marshall–Olkin model and the accessible data have the form of progressively Type-II censored samples. In this case, the stress–strength parameter is free of the base distribution employed in the Marshall–Olkin model, so we use the exponential distribution for simplicity. Maximum likelihood methods as well as some Bayesian approaches are used for estimation. The estimators of the latter approach are obtained using Lindley's approximation and Gibbs sampling, since the Bayesian estimator of $\mathcal{R}$ cannot be obtained in explicit form. Moreover, confidence intervals of various types are derived for $\mathcal{R}$ and compared via a Monte Carlo simulation. Finally, the survival times of head and neck cancer patients treated with two therapies are analyzed for illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 1913-1934
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1884207
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1884207
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:1913-1934
Template-Type: ReDIF-Article 1.0
Author-Name: Yasaman Jabbari
Author-X-Name-First: Yasaman
Author-X-Name-Last: Jabbari
Author-Name: Robert Cribbie
Author-X-Name-First: Robert
Author-X-Name-Last: Cribbie
Title: Negligible interaction test for continuous predictors
Abstract:
Behavioral science researchers are often interested in whether there is negligible interaction among continuous predictors of an outcome variable. For example, a researcher might be interested in demonstrating that the effect of perfectionism on depression is very consistent across age. In this case, the researcher is interested in assessing whether the interaction between the predictors is too small to be meaningful. Unfortunately, most researchers address the above research question using a traditional association-based null hypothesis test (e.g. regression) where their goal is to fail to reject the null hypothesis of no interaction. Common problems with traditional tests are their sensitivity to sample size and their opposite (and hence inappropriate) hypothesis setup for finding a negligible interaction effect. In this study, we investigated a method for testing for negligible interaction between continuous predictors using unstandardized and standardized regression-based models and equivalence testing. A Monte Carlo study provides evidence for the effectiveness of the equivalence-based test relative to traditional approaches.
Journal: Journal of Applied Statistics
Pages: 2001-2015
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1887102
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1887102
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2001-2015
Template-Type: ReDIF-Article 1.0
Author-Name: Hyunsung Kim
Author-X-Name-First: Hyunsung
Author-X-Name-Last: Kim
Author-Name: Yaeji Lim
Author-X-Name-First: Yaeji
Author-X-Name-Last: Lim
Title: Bootstrap aggregated classification for sparse functional data
Abstract:
Sparse functional data are commonly observed in real-data analyses. For such data, we propose a new classification method based on functional principal component analysis (FPCA) and bootstrap aggregating. Bootstrap aggregating is known to improve on a single classifier. In this paper, we apply this idea to FPCA-based classification and compare the classification performance with that of single classifiers. The simulation results show that the proposed method performs better than the conventional single classifiers. We then conduct two real-data analyses.
Journal: Journal of Applied Statistics
Pages: 2052-2063
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1889997
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1889997
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2052-2063
Template-Type: ReDIF-Article 1.0
Author-Name: Nileshkumar H. Jadhav
Author-X-Name-First: Nileshkumar H.
Author-X-Name-Last: Jadhav
Title: A new linearized ridge Poisson estimator in the presence of multicollinearity
Abstract:
Poisson regression is a very commonly used technique for modeling count data in applied sciences, in which the model parameters are usually estimated by the maximum likelihood method. However, the presence of multicollinearity inflates the variance of the maximum likelihood (ML) estimator, and the estimated parameters become unstable. In this article, a new linearized ridge Poisson estimator is introduced to deal with the problem of multicollinearity. Based on the asymptotic properties of the ML estimator, the bias, covariance and mean squared error of the proposed estimator are obtained, and the optimal choice of the shrinkage parameter is derived. The performance of the existing estimators and the proposed estimator is evaluated through Monte Carlo simulations and two real data applications. The results clearly reveal that the proposed estimator outperforms the existing estimators in the mean squared error sense.
Journal: Journal of Applied Statistics
Pages: 2016-2034
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1887103
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1887103
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2016-2034
Template-Type: ReDIF-Article 1.0
Author-Name: J. C. S. Vasconcelos
Author-X-Name-First: J. C. S.
Author-X-Name-Last: Vasconcelos
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Author-Name: J. S. Vasconcelos
Author-X-Name-First: J. S.
Author-X-Name-Last: Vasconcelos
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: A. L. Vivan
Author-X-Name-First: A. L.
Author-X-Name-Last: Vivan
Author-Name: M. A. M. Biaggioni
Author-X-Name-First: M. A. M.
Author-X-Name-Last: Biaggioni
Title: A new heteroscedastic regression to analyze mass loss of wood in civil construction in Brazil
Abstract:
A heteroscedastic regression based on the odd log-logistic Marshall–Olkin normal (OLLMON) distribution is defined by extending previous models. Some structural properties of this distribution are presented. The estimation of the parameters is addressed by maximum likelihood. For different parameter settings, sample sizes and scenarios, various simulations investigate the performance of the heteroscedastic OLLMON regression. We use residual analysis to detect influential observations and to check the model assumptions. The new regression explains the mass loss of different wood species in civil construction in Brazil.
Journal: Journal of Applied Statistics
Pages: 2035-2051
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1890001
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1890001
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2035-2051
Template-Type: ReDIF-Article 1.0
Author-Name: Amarnath Nandy
Author-X-Name-First: Amarnath
Author-X-Name-Last: Nandy
Author-Name: Ayanendranath Basu
Author-X-Name-First: Ayanendranath
Author-X-Name-Last: Basu
Author-Name: Abhik Ghosh
Author-X-Name-First: Abhik
Author-X-Name-Last: Ghosh
Title: Robust inference for skewed data in health sciences
Abstract:
Health data are often too asymmetric to be adequately modeled through the usual normal distributions; most exhibit skewed patterns. They can be modeled better through the larger family of skew-normal distributions, which covers both skewed and symmetric cases. Since outliers are not uncommon in complex real-life experimental datasets, a robust methodology that automatically takes care of noise in the data would be of great practical value, producing stable and more precise research insights leading to better policy formulation. In this paper, we develop a class of robust estimators and testing procedures for the family of skew-normal distributions using the minimum density power divergence approach, with application to health data. In particular, a robust procedure for testing symmetry in the presence of outliers is discussed. Two efficient computational algorithms are described. Besides deriving the asymptotic and robustness theory for the proposed methods, their advantages and utilities are illustrated through simulations and a couple of real-life applications to health data of athletes from the Australian Institute of Sport and to AIDS clinical trial data.
Journal: Journal of Applied Statistics
Pages: 2093-2123
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1891527
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1891527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2093-2123
Template-Type: ReDIF-Article 1.0
Author-Name: Emrah Altun
Author-X-Name-First: Emrah
Author-X-Name-Last: Altun
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Miroslav M. Ristić
Author-X-Name-First: Miroslav M.
Author-X-Name-Last: Ristić
Title: A one-parameter compounding discrete distribution
Abstract:
In this study, a new one-parameter discrete distribution obtained by compounding the Poisson and xgamma distributions is proposed. Some statistical properties of the new distribution are obtained, including the moments and the probability and moment generating functions. Two methods are used for estimating the unknown parameter: the maximum likelihood method and the method of moments. Additionally, a count regression model and an integer-valued autoregressive process based on the proposed distribution are introduced. Some possible applications of the introduced models are considered and discussed.
Journal: Journal of Applied Statistics
Pages: 1935-1956
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1884846
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1884846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:1935-1956
Template-Type: ReDIF-Article 1.0
Author-Name: Amulya Kumar Mahto
Author-X-Name-First: Amulya Kumar
Author-X-Name-Last: Mahto
Author-Name: Chandrakant Lodhi
Author-X-Name-First: Chandrakant
Author-X-Name-Last: Lodhi
Author-Name: Yogesh Mani Tripathi
Author-X-Name-First: Yogesh Mani
Author-X-Name-Last: Tripathi
Author-Name: Liang Wang
Author-X-Name-First: Liang
Author-X-Name-Last: Wang
Title: Inference for partially observed competing risks model for Kumaraswamy distribution under generalized progressive hybrid censoring
Abstract:
In this paper, inference for a competing risks model is studied when the latent failure times follow the Kumaraswamy distribution and the causes of failure are partially observed. Under generalized progressive hybrid censoring, the existence and uniqueness of the maximum likelihood estimators of the model parameters are established. Confidence intervals are obtained using asymptotic distribution theory. We further compute Bayes estimators along with credible intervals. In addition, inference is also discussed when the shape parameters are order-restricted. The performance of all estimates is investigated using Monte Carlo simulations. Finally, the analysis of a real data set is presented for illustration purposes.
Journal: Journal of Applied Statistics
Pages: 2064-2092
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1889999
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1889999
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2064-2092
Template-Type: ReDIF-Article 1.0
Author-Name: Marcus Gerardus L. Nascimento
Author-X-Name-First: Marcus Gerardus
Author-X-Name-Last: L. Nascimento
Author-Name: Ralph S. Silva
Author-X-Name-First: Ralph S.
Author-X-Name-Last: Silva
Author-Name: Mario Jorge Mendonça
Author-X-Name-First: Mario Jorge
Author-X-Name-Last: Mendonça
Author-Name: Amaro Olimpio Pereira Jr.
Author-X-Name-First: Amaro Olimpio
Author-X-Name-Last: Pereira Jr.
Title: Estimating the efficiency of Brazilian electricity distribution utilities
Abstract:
This paper proposes a methodology that differs from that of the Brazilian Electricity Regulatory Agency for efficiency estimation in the Brazilian electricity distribution sector. Our proposal combines robust state-space models and stochastic frontier analysis to measure operational cost efficiency in a panel data set of 60 Brazilian electricity distribution utilities. The modeling joins the main literature in energy economics with advanced econometric and statistical techniques to estimate the efficiencies. Moreover, the suggested model is able to deal with changes in the inefficiencies across time, while the Bayesian paradigm, through Markov chain Monte Carlo techniques, facilitates inference on all unknowns. The method enables a significant degree of flexibility in the resulting efficiencies and a complete picture of the distribution sector.
Journal: Journal of Applied Statistics
Pages: 2157-2166
Issue: 8
Volume: 49
Year: 2022
Month: 06
X-DOI: 10.1080/02664763.2021.1890000
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1890000
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:8:p:2157-2166
Template-Type: ReDIF-Article 1.0
Author-Name: Juan Sosa
Author-X-Name-First: Juan
Author-X-Name-Last: Sosa
Author-Name: Lina Buitrago
Author-X-Name-First: Lina
Author-X-Name-Last: Buitrago
Title: Time-varying coefficient model estimation through radial basis functions
Abstract:
In this paper, we estimate the dynamic parameters of a time-varying coefficient model through radial kernel functions in the context of a longitudinal study. Our proposal is based on a linear combination of weighted kernel functions involving a bandwidth, centered around a given set of time points. In addition, we study different alternatives for estimation and inference, including a frequentist approach using weighted least squares along with bootstrap methods, and a Bayesian approach through both Markov chain Monte Carlo and variational methods. We compare the estimation strategies mentioned above with each other, and our radial kernel function proposal with an expansion based on regression splines, by means of an extensive simulation study considering multiple scenarios in terms of sample size, number of repeated measurements, and subject-specific correlation. Our experiments show that the capabilities of our proposal based on radial kernel functions are comparable with, or even better than, those obtained from regression splines. We illustrate our methodology by analyzing data from two AIDS clinical studies.
Journal: Journal of Applied Statistics
Pages: 2510-2534
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1910938
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1910938
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2510-2534
Template-Type: ReDIF-Article 1.0
Author-Name: Rezoanoor Rahman
Author-X-Name-First: Rezoanoor
Author-X-Name-Last: Rahman
Author-Name: M. Iftakhar Alam
Author-X-Name-First: M.
Author-X-Name-Last: Iftakhar Alam
Title: Stopping for efficacy in single-arm phase II clinical trials
Abstract:
Phase II clinical trials investigate whether a new drug or treatment shows sufficient evidence of effectiveness against the disease under study. Two-stage designs are popular for phase II since they can stop in the first stage if the drug is ineffective. Investigators often face difficulties in determining the target response rates, and adaptive designs can help to set the target response rate tested in the second stage based on the number of responses observed in the first stage. Popular adaptive designs consider two alternative response rates, and they generally minimise the expected sample size at the maximum uninteresting response rate. Moreover, these designs consider only futility as the reason for early stopping and have high expected sample sizes if the provided drug is effective. Motivated by this problem, we propose an adaptive design that enables us to terminate the single-arm trial at the first stage for efficacy and to conclude which alternative response rate to choose. Comparing the proposed design with a popular adaptive design from the literature reveals that the expected sample size decreases notably if either of the two target response rates is correct. In contrast, the expected sample size remains almost the same under the null hypothesis.
Journal: Journal of Applied Statistics
Pages: 2447-2466
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1904846
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1904846
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2447-2466
Template-Type: ReDIF-Article 1.0
Author-Name: Yingzi Li
Author-X-Name-First: Yingzi
Author-X-Name-Last: Li
Author-Name: Huinan Liu
Author-X-Name-First: Huinan
Author-X-Name-Last: Liu
Author-Name: Nairanjana Dasgupta
Author-X-Name-First: Nairanjana
Author-X-Name-Last: Dasgupta
Title: Binary Dynamic Logit for Correlated Ordinal: estimation, application and simulation
Abstract:
We evaluate the estimation performance of the Binary Dynamic Logit model for correlated ordinal variables (BDLCO model), and compare it to GEE and ordinal logistic regression in terms of bias and Mean Absolute Percentage Error (MAPE) via Monte Carlo simulation. Our results indicate that when the proportional-odds assumption does not hold, the proposed BDLCO method is superior to existing models in estimating correlated ordinal data. Moreover, this method is flexible in modeling dependence, allows unequal slopes for each category, and can be applied to an apple bloom data set where the proportional-odds assumption is violated. We also provide a function in R to implement BDLCO.
Journal: Journal of Applied Statistics
Pages: 2657-2673
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1906849
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1906849
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2657-2673
Template-Type: ReDIF-Article 1.0
Author-Name: Diego Salmerón
Author-X-Name-First: Diego
Author-X-Name-Last: Salmerón
Title: Bayesian beta nonlinear models with constrained parameters to describe ruminal degradation kinetics
Abstract:
The models used to describe the kinetics of ruminal degradation are usually nonlinear models in which the dependent variable is the proportion of degraded food. The method of least squares is the standard approach used to estimate the unknown parameters, but it can lead to unacceptable predictions. To solve this issue, a beta nonlinear model within the Bayesian perspective is proposed in this article. The application of standard methodologies to obtain prior distributions, such as the Jeffreys prior or reference priors, involves serious difficulties here because this model is a nonlinear non-normal regression model, and the constrained parameters appear in the log-likelihood function through the Gamma function. This paper proposes an objective method to obtain the prior distribution, which can be applied to other models of similar complexity, can be easily implemented in OpenBUGS, and solves the problem of unacceptable predictions. The model is generalized to a larger class of models. The methodology was applied to real data with three models that were compared using the Deviance Information Criterion and the root mean square prediction error. A simulation study was performed to evaluate the coverage of the credible intervals.
Journal: Journal of Applied Statistics
Pages: 2612-2628
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1913105
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1913105
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2612-2628
Template-Type: ReDIF-Article 1.0
Author-Name: S. Arona Diop
Author-X-Name-First: S. Arona
Author-X-Name-Last: Diop
Author-Name: Thierry Duchesne
Author-X-Name-First: Thierry
Author-X-Name-Last: Duchesne
Author-Name: Steven G. Cumming
Author-X-Name-First: Steven
Author-X-Name-Last: G. Cumming
Author-Name: Awa Diop
Author-X-Name-First: Awa
Author-X-Name-Last: Diop
Author-Name: Denis Talbot
Author-X-Name-First: Denis
Author-X-Name-Last: Talbot
Title: Confounding adjustment methods for multi-level treatment comparisons under lack of positivity and unknown model specification
Abstract:
Imbalances in covariates between treatment groups are frequent in observational studies and can lead to biased comparisons. Various adjustment methods can be employed to correct these biases in the context of multi-level treatments (> 2). Analytical challenges, such as positivity violations and incorrect model specification due to unknown functional relationships between covariates and treatment or outcome, may affect their ability to yield unbiased results. Such challenges were expected in a comparison of fire-suppression interventions for preventing fire growth. We identified the overlap weights, augmented overlap weights, bias-corrected matching and targeted maximum likelihood as methods with the best potential to address those challenges. A simple variance estimator for the overlap weight estimators that can naturally be combined with machine learning is proposed. In a simulation study, we investigated the performance of these methods as well as those of simpler alternatives. Adjustment methods that included an outcome modeling component performed better than those that focused on the treatment mechanism in our simulations. Additionally, machine learning implementation was observed to efficiently compensate for the unknown model specification for the former methods, but not the latter. Based on these results, we compared the effectiveness of fire-suppression interventions using the augmented overlap weight estimator.
Journal: Journal of Applied Statistics
Pages: 2570-2592
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1911966
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1911966
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2570-2592
Template-Type: ReDIF-Article 1.0
Author-Name: Zhongnan Jin
Author-X-Name-First: Zhongnan
Author-X-Name-Last: Jin
Author-Name: Lu Lu
Author-X-Name-First: Lu
Author-X-Name-Last: Lu
Author-Name: Khaled Bedair
Author-X-Name-First: Khaled
Author-X-Name-Last: Bedair
Author-Name: Yili Hong
Author-X-Name-First: Yili
Author-X-Name-Last: Hong
Title: Modeling bivariate geyser eruption system with covariate-adjusted recurrent event process
Abstract:
Geyser eruptions are among the most popular signature attractions at Yellowstone National Park. The interdependence of geyser eruptions and the impact of covariates are of interest to researchers in geyser studies. In this paper, we propose a parametric covariate-adjusted recurrent event model for estimating the eruption gap time. We describe a general bivariate recurrent event process, where a bivariate lognormal distribution and a Gumbel copula with different marginal distributions are used to model an interdependent dual-type event system. The maximum likelihood approach is used to estimate model parameters. The proposed method is applied to the Yellowstone geyser eruption data for a bivariate geyser system and offers a deeper understanding of the event occurrence mechanism of individual events as well as the system as a whole. A comprehensive simulation study is conducted to evaluate the performance of the proposed method.
Journal: Journal of Applied Statistics
Pages: 2488-2509
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1910937
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1910937
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2488-2509
Template-Type: ReDIF-Article 1.0
Author-Name: M. S. Eliwa
Author-X-Name-First: M. S.
Author-X-Name-Last: Eliwa
Author-Name: M. El-Morshedy
Author-X-Name-First: M.
Author-X-Name-Last: El-Morshedy
Title: A one-parameter discrete distribution for over-dispersed data: statistical and reliability properties with applications
Abstract:
In the distribution theory literature, a vast proportion is occupied by discrete distributions and their applications to real-world phenomena. However, in a rapidly changing technological era, the data generated are becoming increasingly complex, making it difficult to capture various aspects of real data through existing discrete models. In view of this, we propose a new flexible discrete distribution with one parameter. Some statistical and reliability properties are derived, and these properties can be expressed in closed form. One of the important virtues of this newly evolved model is that it can model not only over-dispersed, positively skewed and leptokurtic data sets, but also increasing, decreasing and unimodal failure rates. Various estimation approaches are utilized to estimate the model parameter. A simulation study is carried out to examine the performance of the estimators for different sample sizes. The flexibility of the new model for analyzing different types of data is illustrated with four real data sets from different fields. Finally, the proposed model can serve as an alternative to other distributions in the existing literature for modeling positive real data in several areas.
Journal: Journal of Applied Statistics
Pages: 2467-2487
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1905787
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1905787
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2467-2487
Template-Type: ReDIF-Article 1.0
Author-Name: Guogen Shan
Author-X-Name-First: Guogen
Author-X-Name-Last: Shan
Title: Conservative confidence intervals for the intraclass correlation coefficient for clustered binary data
Abstract:
Asymptotic approaches are traditionally used to calculate confidence intervals for the intraclass correlation coefficient in a clustered binary study. When the sample size is small to medium, or the correlation or response rate is near the boundary, asymptotic intervals often do not have satisfactory coverage performance. We propose using the importance sampling method to construct profile confidence limits for the intraclass correlation coefficient. Importance sampling is a simulation-based approach to reduce the variance of the estimated parameter. Four existing asymptotic limits are used as statistical quantities for sample-space ordering in the importance sampling method. Simulation studies are performed to evaluate the performance of the proposed accurate intervals with regard to coverage and interval width. Simulation results indicate that the accurate intervals based on the asymptotic limits of Fleiss and Cuzick generally have shorter width than the others in many cases, while the accurate intervals based on the Zou and Donner asymptotic limits outperform the others when the correlation and response rate are close to their boundaries.
Journal: Journal of Applied Statistics
Pages: 2535-2549
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1910939
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1910939
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2535-2549
Template-Type: ReDIF-Article 1.0
Author-Name: Rosa Arboretti
Author-X-Name-First: Rosa
Author-X-Name-Last: Arboretti
Author-Name: Riccardo Ceccato
Author-X-Name-First: Riccardo
Author-X-Name-Last: Ceccato
Author-Name: Luca Pegoraro
Author-X-Name-First: Luca
Author-X-Name-Last: Pegoraro
Author-Name: Luigi Salmaso
Author-X-Name-First: Luigi
Author-X-Name-Last: Salmaso
Author-Name: Chris Housmekerides
Author-X-Name-First: Chris
Author-X-Name-Last: Housmekerides
Author-Name: Luca Spadoni
Author-X-Name-First: Luca
Author-X-Name-Last: Spadoni
Author-Name: Elisabetta Pierangelo
Author-X-Name-First: Elisabetta
Author-X-Name-Last: Pierangelo
Author-Name: Sara Quaggia
Author-X-Name-First: Sara
Author-X-Name-Last: Quaggia
Author-Name: Catherine Tveit
Author-X-Name-First: Catherine
Author-X-Name-Last: Tveit
Author-Name: Sebastiano Vianello
Author-X-Name-First: Sebastiano
Author-X-Name-Last: Vianello
Title: Machine learning and design of experiments with an application to product innovation in the chemical industry
Abstract:
Industrial statistics plays a major role in the areas of both quality management and innovation. However, existing methodologies must be integrated with the latest tools from the field of Artificial Intelligence. To this end, a background on the joint application of Design of Experiments (DOE) and Machine Learning (ML) methodologies in industrial settings is presented here, along with a case study from the chemical industry. A DOE study is used to collect data, and two ML models are applied to predict responses; their performance shows an advantage over the traditional modeling approach. Emphasis is placed on causal investigation and quantification of prediction uncertainty, as these are crucial for assessing the goodness and robustness of the models developed. Within the scope of the case study, the models learned can be implemented in a semi-automatic system that can assist practitioners inexperienced in data analysis in the process of new product development.
Journal: Journal of Applied Statistics
Pages: 2674-2699
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1907840
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1907840
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2674-2699
Template-Type: ReDIF-Article 1.0
Author-Name: Hannes Kazianka
Author-X-Name-First: Hannes
Author-X-Name-Last: Kazianka
Author-Name: Anna Morgenbesser
Author-X-Name-First: Anna
Author-X-Name-Last: Morgenbesser
Author-Name: Thomas Nowak
Author-X-Name-First: Thomas
Author-X-Name-Last: Nowak
Title: Assessing the discriminatory power of loss given default models
Abstract:
For banks using the Advanced Internal Ratings-Based Approach in accordance with Basel III requirements, the amount of required regulatory capital relies on the banks' estimates of the probability of default, the loss given default and the conversion factor for their credit risk portfolio. Therefore, for both model development and validation, assessing the models' predictive and discriminatory abilities is of key importance in order to ensure an adequate quantification of risk. This paper compares different measures of discriminatory power suitable for multi-class target variables such as in loss given default (LGD) models, which are currently used among banks and supervisory authorities. This analysis highlights the disadvantages of using measures that solely rely on pairwise comparisons when applied in a multi-class setting. Thus, for multi-class classification problems, we suggest using a generalisation of the well-known area under the receiver operating characteristic (ROC) curve known as the volume under the ROC surface (VUS). Furthermore, we present the R-package VUROCS, which allows for a time-efficient computation of the VUS as well as associated (co)variance estimates and illustrate its usage based on real-world loss data and validation principles.
Journal: Journal of Applied Statistics
Pages: 2700-2716
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1910936
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1910936
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2700-2716
Template-Type: ReDIF-Article 1.0
Author-Name: Abdullah Mohammed Rashid
Author-X-Name-First: Abdullah
Author-X-Name-Last: Mohammed Rashid
Author-Name: Habshah Midi
Author-X-Name-First: Habshah
Author-X-Name-Last: Midi
Author-Name: Waleed Dhhan
Author-X-Name-First: Waleed
Author-X-Name-Last: Dhhan
Author-Name: Jayanthi Arasan
Author-X-Name-First: Jayanthi
Author-X-Name-Last: Arasan
Title: Detection of outliers in high-dimensional data using nu-support vector regression
Abstract:
Support Vector Regression (SVR) is gaining popularity for outlier detection and classification problems in high-dimensional data (HDD), as this technique does not require the data to be of full rank. In real applications, most data are high dimensional. Classification of high-dimensional data is needed in the applied sciences; in particular, it is important to discriminate cancerous cells from non-cancerous cells. It is also imperative that outliers be identified before constructing a model of the relationship between the dependent and independent variables, to avoid misleading interpretations about the fit of a model. The standard SVR and the μ-ε-SVR are able to detect outliers; however, they are computationally expensive. The fixed parameters support vector regression (FP-ε-SVR) was put forward to remedy this issue. However, the FP-ε-SVR using ε-SVR is not very successful in identifying outliers. In this article, we propose an alternative method to detect outliers, i.e. by employing nu-SVR. The merit of our proposed method is confirmed by three real examples and a Monte Carlo simulation. The results show that our proposed nu-SVR method is very successful in identifying outliers under a variety of situations, and with less computational running time.
Journal: Journal of Applied Statistics
Pages: 2550-2569
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1911965
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1911965
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2550-2569
Template-Type: ReDIF-Article 1.0
Author-Name: Saisai Ding
Author-X-Name-First: Saisai
Author-X-Name-Last: Ding
Author-Name: Hongyan Fang
Author-X-Name-First: Hongyan
Author-X-Name-Last: Fang
Author-Name: Xiang Dong
Author-X-Name-First: Xiang
Author-X-Name-Last: Dong
Author-Name: Wenzhi Yang
Author-X-Name-First: Wenzhi
Author-X-Name-Last: Yang
Title: The CUSUM statistics of change-point models based on dependent sequences
Abstract:
In this paper, we investigate mean change-point models based on associated sequences. Under some weak conditions, we obtain a limit distribution of the CUSUM statistic, which can be used to judge whether the mean change amount $ \delta_n $ satisfies $ n^{1/2}\delta_n=o(1) $ or not. We also study the consistency of sample covariances and change-point location statistics. Based on normal and lognormal data, simulations of empirical sizes, empirical powers and convergence are presented to test our results. As an important application, we use CUSUM statistics to perform a mean change-point analysis of a financial series.
Journal: Journal of Applied Statistics
Pages: 2593-2611
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1913104
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1913104
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2593-2611
Template-Type: ReDIF-Article 1.0
Author-Name: Chi Zhang
Author-X-Name-First: Chi
Author-X-Name-Last: Zhang
Author-Name: Guo-Liang Tian
Author-X-Name-First: Guo-Liang
Author-X-Name-Last: Tian
Author-Name: Kam Chuen Yuen
Author-X-Name-First: Kam Chuen
Author-X-Name-Last: Yuen
Author-Name: Pengyi Liu
Author-X-Name-First: Pengyi
Author-X-Name-Last: Liu
Author-Name: Man-Lai Tang
Author-X-Name-First: Man-Lai
Author-X-Name-Last: Tang
Title: A new multivariate t distribution with variant tail weights and its application in robust regression analysis
Abstract:
In this paper, we propose a new kind of multivariate t distribution by allowing different degrees of freedom for each univariate component. Compared with the classical multivariate t distribution, it is more flexible in the model specification that can be used to deal with the variant amounts of tail weights on marginals in multivariate data modeling. In particular, it could include components following the multivariate normal distribution, and it contains the product of independent t-distributions as a special case. Subsequently, it is extended to the regression model as the joint distribution of the error terms. Important distributional properties are explored and useful statistical methods are developed. The flexibility of the specified structure in better capturing the characteristic of data is exemplified by both simulation studies and real data analyses.
Journal: Journal of Applied Statistics
Pages: 2629-2656
Issue: 10
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1913106
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1913106
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:10:p:2629-2656
Template-Type: ReDIF-Article 1.0
Author-Name: Jingqin Luo
Author-X-Name-First: Jingqin
Author-X-Name-Last: Luo
Author-Name: Feng Gao
Author-X-Name-First: Feng
Author-X-Name-Last: Gao
Author-Name: Jingxia Liu
Author-X-Name-First: Jingxia
Author-X-Name-Last: Liu
Author-Name: Guoqiao Wang
Author-X-Name-First: Guoqiao
Author-X-Name-Last: Wang
Author-Name: Ling Chen
Author-X-Name-First: Ling
Author-X-Name-Last: Chen
Author-Name: Anne M. Fagan
Author-X-Name-First: Anne M.
Author-X-Name-Last: Fagan
Author-Name: Gregory S. Day
Author-X-Name-First: Gregory S.
Author-X-Name-Last: Day
Author-Name: Jonathan Vöglein
Author-X-Name-First: Jonathan
Author-X-Name-Last: Vöglein
Author-Name: Jasmeer P. Chhatwal
Author-X-Name-First: Jasmeer P.
Author-X-Name-Last: Chhatwal
Author-Name: Chengjie Xiong
Author-X-Name-First: Chengjie
Author-X-Name-Last: Xiong
Author-Name:
Author-X-Name-First:
Author-X-Name-Last:
Title: Statistical estimation and comparison of group-specific bivariate correlation coefficients in family-type clustered studies
Abstract:
Bivariate correlation coefficients (BCCs) are often calculated to gauge the relationship between two variables in medical research. In a family-type clustered design where multiple participants from the same units/families are enrolled, BCCs can be defined and estimated at various hierarchical levels (subject-level, family-level and marginal BCC). Heterogeneity usually exists between subject groups and, as a result, subject-level BCCs may differ between subject groups. In the framework of bivariate linear mixed effects modeling, we define and estimate BCCs at various hierarchical levels in a family-type clustered design, accommodating subject group heterogeneity. Simplified and modified asymptotic confidence intervals are constructed for the BCC differences, and Wald-type tests are conducted. A real-world family-type clustered study of Alzheimer's disease (AD) is analyzed to estimate and compare BCCs among well-established AD biomarkers between mutation carriers and non-carriers in autosomal dominant AD asymptomatic individuals. Extensive simulation studies are conducted across a wide range of scenarios to evaluate the performance of the proposed estimators and the type-I error rate and power of the proposed statistical tests. Abbreviations: BCC: bivariate correlation coefficient; BLM: bivariate linear mixed effects model; CI: confidence interval; AD: Alzheimer's disease; DIAN: The Dominantly Inherited Alzheimer Network; SA: simple asymptotic; MA: modified asymptotic
Journal: Journal of Applied Statistics
Pages: 2246-2270
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1899141
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1899141
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2246-2270
Template-Type: ReDIF-Article 1.0
Author-Name: Srimanti Dutta
Author-X-Name-First: Srimanti
Author-X-Name-Last: Dutta
Author-Name: Geert Molenberghs
Author-X-Name-First: Geert
Author-X-Name-Last: Molenberghs
Author-Name: Arindom Chakraborty
Author-X-Name-First: Arindom
Author-X-Name-Last: Chakraborty
Title: Joint modelling of longitudinal response and time-to-event data using conditional distributions: a Bayesian perspective
Abstract:
Over the last 20 or more years, many clinical applications and much methodological development in the area of joint models of longitudinal and time-to-event outcomes have emerged. In these studies, patients are followed until an event, such as death, occurs. In most of this work, the dependency between the two processes has been established using subject-specific random effects as frailty. In this article, we propose a new joint model that consists of a linear mixed-effects model for the longitudinal data and an accelerated failure time model for the time-to-event data. These two sub-models are linked via a latent random process. This model captures the dependency of the time-to-event on the longitudinal measurements more directly. Using standard priors, a Bayesian method has been developed for estimation. All computations are implemented using OpenBUGS. Our proposed method is evaluated by a simulation study, which compares the conditional model with a joint model with local independence by way of calibration. Data on Duchenne muscular dystrophy (DMD) syndrome and a set of data on AIDS patients have been analysed.
Journal: Journal of Applied Statistics
Pages: 2228-2245
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1897971
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1897971
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2228-2245
Template-Type: ReDIF-Article 1.0
Author-Name: Çağlar Sözen
Author-X-Name-First: Çağlar
Author-X-Name-Last: Sözen
Author-Name: Yüksel Öner
Author-X-Name-First: Yüksel
Author-X-Name-Last: Öner
Title: The investigation of temperature data in Turkey’s Black Sea Region using functional data analysis
Abstract:
As a field of study expands, or as the number of observations in a sample increases, data observed at discrete points are generally assumed to be sampled from an underlying real-valued function; the derived data can then be treated as having a functional structure. We analyzed the daily average temperature data at 65 discrete points in 18 cities in Turkey's Black Sea Region. Fourier basis functions were used as the basis function approach because the temperature data had a periodic structure. The data were then transformed into continuous functions using the basis function and roughness penalty approaches, yielding functional data. The generalized cross-validation method was used to determine the smoothing parameter for the temperature variable. Finally, smoothed functional principal components analysis was applied to the functional data. In this way, changes in the temperature functions, which seem hard to tackle otherwise, were evaluated on the same graph using the principal component functions and the mean function.
Journal: Journal of Applied Statistics
Pages: 2403-2415
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1896683
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1896683
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2403-2415
Template-Type: ReDIF-Article 1.0
Author-Name: H. S. Al-Kzzaz
Author-X-Name-First: H. S.
Author-X-Name-Last: Al-Kzzaz
Author-Name: M. M. E. Abd El-Monsef
Author-X-Name-First: M. M. E.
Author-X-Name-Last: Abd El-Monsef
Title: Inverse power Maxwell distribution: statistical properties, estimation and application
Abstract:
In this paper, we introduce a new probability distribution named the inverse power Maxwell distribution. The proposed distribution can be seen as an extension of the Maxwell distribution with more flexibility in modeling upside-down lifetime data. Some statistical properties of this distribution are derived. From an estimation viewpoint, five methods are used to estimate the unknown parameters of the distribution, and their performance is assessed through a simulation study. Finally, two real data sets were analyzed to illustrate the applicability of the proposed distribution, showing that it fits each real data set much better than some other existing distributions.
Journal: Journal of Applied Statistics
Pages: 2287-2306
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1899143
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1899143
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2287-2306
Template-Type: ReDIF-Article 1.0
Author-Name: Ryan Wu
Author-X-Name-First: Ryan
Author-X-Name-Last: Wu
Author-Name: Mihye Ahn
Author-X-Name-First: Mihye
Author-X-Name-Last: Ahn
Author-Name: Hojin Yang
Author-X-Name-First: Hojin
Author-X-Name-Last: Yang
Title: Spike-and-slab type variable selection in the Cox proportional hazards model for high-dimensional features
Abstract:
In this paper, we develop a variable selection framework with the spike-and-slab prior distribution via the hazard function of the Cox model. Specifically, we consider the transformation of the score and information functions for the partial likelihood function evaluated at the given data from the parameter space into the space generated by the logarithm of the hazard ratio. Thereby, we reduce the nonlinear complexity of the estimation equation for the Cox model and allow the utilization of a wider variety of stable variable selection methods. Then, we use a stochastic variable search Gibbs sampling approach via the spike-and-slab prior distribution to obtain the sparsity structure of the covariates associated with the survival outcome. Additionally, we conduct numerical simulations to evaluate the finite-sample performance of our proposed method. Finally, we apply this novel framework on lung adenocarcinoma data to find important genes associated with decreased survival in subjects with the disease.
Journal: Journal of Applied Statistics
Pages: 2189-2207
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1893285
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1893285
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2189-2207
Template-Type: ReDIF-Article 1.0
Author-Name: Şenay Özdemir
Author-X-Name-First: Şenay
Author-X-Name-Last: Özdemir
Author-Name: Yeşim Güney
Author-X-Name-First: Yeşim
Author-X-Name-Last: Güney
Author-Name: Yetkin Tuaç
Author-X-Name-First: Yetkin
Author-X-Name-Last: Tuaç
Author-Name: Olcay Arslan
Author-X-Name-First: Olcay
Author-X-Name-Last: Arslan
Title: Empirical likelihood estimation for linear regression models with AR(p) error terms with numerical examples
Abstract:
Linear regression models are useful statistical tools for analyzing data sets in many fields. There are several methods for estimating the parameters of a linear regression model. These methods usually perform well under normally distributed and uncorrelated errors. If the error terms are correlated, the Conditional Maximum Likelihood (CML) estimation method under the normality assumption is often used to estimate the parameters of interest. The CML estimation method requires a distributional assumption on the error terms. In practice, however, such distributional assumptions may not be plausible. In this paper, we propose to estimate the parameters of a linear regression model with an autoregressive error term using the Empirical Likelihood (EL) method, which is a distribution-free estimation method. A small simulation study is provided to evaluate the performance of the proposed estimation method against the CML method. The results of the simulation study show that the proposed estimators based on the EL method are remarkably better than the estimators obtained from the CML method in terms of mean squared error (MSE) and bias in almost all simulation configurations. These findings are also confirmed by the results of the numerical and real data examples.
Journal: Journal of Applied Statistics
Pages: 2271-2286
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1899142
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1899142
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2271-2286
Template-Type: ReDIF-Article 1.0
Author-Name: Michael Pokojovy
Author-X-Name-First: Michael
Author-X-Name-Last: Pokojovy
Author-Name: J. Marcus Jobe
Author-X-Name-First: J. Marcus
Author-X-Name-Last: Jobe
Title: Univariate fast initial response statistical process control with taut strings
Abstract:
We present a novel real-time univariate monitoring scheme for detecting a sustained departure of a process mean from some given standard assuming a constant variance. Our proposed stopping rule is based on the total variation of a nonparametric taut string estimator of the process mean and is designed to provide a desired average run length for an in-control situation. Compared to the more prominent CUSUM fast initial response (FIR) methodology and allowing for a restart following a false alarm, the proposed two-sided taut string (TS) scheme produces a significant reduction in average run length for a wide range of changes in the mean that occur at or immediately after process monitoring begins. A decision rule for when to choose our proposed TS chart compared to the CUSUM FIR chart that takes into account both false alarm rate and average run length to detect a shift in the mean is proposed and implemented. Supplementary materials are available online.
Journal: Journal of Applied Statistics
Pages: 2326-2348
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1900798
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1900798
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2326-2348
Template-Type: ReDIF-Article 1.0
Author-Name: Susan Gachau
Author-X-Name-First: Susan
Author-X-Name-Last: Gachau
Author-Name: Edmund Njeru Njagi
Author-X-Name-First: Edmund Njeru
Author-X-Name-Last: Njagi
Author-Name: Nelson Owuor
Author-X-Name-First: Nelson
Author-X-Name-Last: Owuor
Author-Name: Paul Mwaniki
Author-X-Name-First: Paul
Author-X-Name-Last: Mwaniki
Author-Name: Matteo Quartagno
Author-X-Name-First: Matteo
Author-X-Name-Last: Quartagno
Author-Name: Rachel Sarguta
Author-X-Name-First: Rachel
Author-X-Name-Last: Sarguta
Author-Name: Mike English
Author-X-Name-First: Mike
Author-X-Name-Last: English
Author-Name: Philip Ayieko
Author-X-Name-First: Philip
Author-X-Name-Last: Ayieko
Title: Handling missing data in a composite outcome with partially observed components: simulation study based on clustered paediatric routine data
Abstract:
Composite scores are useful for providing insights and trends about complex and multidimensional quality-of-care processes. However, missing data in subcomponents may hinder the overall reliability of a composite measure. In this study, strategies for handling missing data in the Paediatric Admission Quality of Care (PAQC) score, an ordinal composite outcome, were explored through a simulation study. Specifically, we assessed the implications of the conventional method of addressing missing PAQC score subcomponents, which consists of scoring missing components with a zero, and of a multiple imputation (MI)-based strategy; the latent normal joint modelling MI approach was used for the latter. Across simulation scenarios, MI of missing PAQC score elements at the item level produced minimally biased estimates compared to the conventional method. Moreover, regression coefficients were more prone to bias than standard errors. The magnitude of bias depended on the proportion of missingness and the missing data generating mechanism. Therefore, incomplete composite outcome subcomponents should be handled carefully to mitigate the potential for biased estimates and misleading inferences. Further research is needed on other strategies, such as imputing at the component and composite outcome level and imputing compatibly with the substantive model in this setting.
Journal: Journal of Applied Statistics
Pages: 2389-2402
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1895087
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1895087
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2389-2402
Template-Type: ReDIF-Article 1.0
Author-Name: Jingyu Liu
Author-X-Name-First: Jingyu
Author-X-Name-Last: Liu
Author-Name: Walter W. Piegorsch
Author-X-Name-First: Walter W.
Author-X-Name-Last: Piegorsch
Author-Name: A. Grant Schissler
Author-X-Name-First: A. Grant
Author-X-Name-Last: Schissler
Author-Name: Rachel R. McCaster
Author-X-Name-First: Rachel R.
Author-X-Name-Last: McCaster
Author-Name: Susan L. Cutter
Author-X-Name-First: Susan L.
Author-X-Name-Last: Cutter
Title: Adjusting statistical benchmark risk analysis to account for non-spatial autocorrelation, with application to natural hazard risk assessment
Abstract:
We develop and study a quantitative, interdisciplinary strategy for conducting statistical risk analyses within the ‘benchmark risk’ paradigm of contemporary risk assessment when potential autocorrelation exists among sample units. We use the methodology to explore information on vulnerability to natural hazards across 3108 counties in the conterminous 48 US states, applying a place-based resilience index to an existing knowledgebase of hazardous incidents and related human casualties. An extension of a centered autologistic regression model is applied to relate local, county-level vulnerability to hazardous outcomes. Adjustments for autocorrelation embedded in the resiliency information are applied via a novel, non-spatial neighborhood structure. Statistical risk-benchmarking techniques are then incorporated into the modeling framework, wherein levels of high and low vulnerability to hazards are identified.
Journal: Journal of Applied Statistics
Pages: 2349-2369
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1904385
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1904385
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2349-2369
Template-Type: ReDIF-Article 1.0
Author-Name: Fiaz Ahmad Bhatti
Author-X-Name-First: Fiaz Ahmad
Author-X-Name-Last: Bhatti
Author-Name: Munir Ahmad
Author-X-Name-First: Munir
Author-X-Name-Last: Ahmad
Title: On the modified Burr IV model for hydrological events: development, properties, characterizations and applications
Abstract:
We introduce a new distribution for modeling extreme events in frequency analysis, called the modified Burr IV (MBIV) distribution. We derive the MBIV distribution on the basis of the generalized Pearson differential equation. The proposed model turns out to be flexible: its density function can be symmetric, right-skewed, left-skewed, J-shaped or bimodal. Its hazard rate can take shapes such as bathtub, modified bathtub, increasing, decreasing, and increasing-decreasing-increasing. To show the importance of the MBIV distribution, we establish various mathematical properties such as a random number generator, sub-models, moment-related properties, inequality measures, reliability measures, uncertainty measures and characterizations. We use the maximum likelihood estimation technique to estimate the model parameters, and we assess the behavior of the maximum likelihood estimators (MLEs) of the MBIV parameters via a simulation study. Five data sets related to frequency analysis are considered to elucidate the significance of the MBIV distribution. We show that the MBIV model is the best of the considered models for analyzing data on hydrological events, demonstrating its high level of adaptability in the applied setting.
Journal: Journal of Applied Statistics
Pages: 2167-2188
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1893284
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1893284
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2167-2188
Template-Type: ReDIF-Article 1.0
Author-Name: Rana Jreich
Author-X-Name-First: Rana
Author-X-Name-Last: Jreich
Author-Name: Christine Hatte
Author-X-Name-First: Christine
Author-X-Name-Last: Hatte
Author-Name: Eric Parent
Author-X-Name-First: Eric
Author-X-Name-Last: Parent
Title: Review of Bayesian selection methods for categorical predictors using JAGS
Abstract:
The formulation of variable selection has been widely developed in the Bayesian literature by linking a random binary indicator to each variable. This Bayesian inference has the advantage of stochastically exploring the set of possible sub-models, whatever their dimension. Bayesian selection approaches appropriate for categorical predictors generally lie beyond the scope of standard Bayesian selection of regressors in the linear model, since all levels of a categorical variable should be handled jointly in the selection procedure. For categorical covariates, new strategies have been developed to detect the effect of grouped covariates rather than the single effect of a quantitative regressor. In this paper, we review three Bayesian selection methods for categorical predictors: Bayesian Group Lasso with Spike and Slab priors, Bayesian Sparse Group Selection and Bayesian Effect Fusion using model-based clustering. The motivation behind this paper is to provide detailed information about the implementation of these three methods using the JAGS software. Selection performance and sensitivity to the tuning of hyperparameters in the prior specifications are assessed under various simulated scenarios. JAGS helps users implement these three Bayesian selection methods for more complex model structures, such as hierarchical ones with latent layers.
Journal: Journal of Applied Statistics
Pages: 2370-2388
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1902955
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1902955
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2370-2388
Template-Type: ReDIF-Article 1.0
Author-Name: Hayala Cristina Cavenague de Souza
Author-X-Name-First: Hayala Cristina
Author-X-Name-Last: Cavenague de Souza
Author-Name: Francisco Louzada
Author-X-Name-First: Francisco
Author-X-Name-Last: Louzada
Author-Name: Mauro Ribeiro de Oliveira
Author-X-Name-First: Mauro Ribeiro
Author-X-Name-Last: de Oliveira
Author-Name: Bukola Fawole
Author-X-Name-First: Bukola
Author-X-Name-Last: Fawole
Author-Name: Adesina Akintan
Author-X-Name-First: Adesina
Author-X-Name-Last: Akintan
Author-Name: Lawal Oyeneyin
Author-X-Name-First: Lawal
Author-X-Name-Last: Oyeneyin
Author-Name: Wilfred Sanni
Author-X-Name-First: Wilfred
Author-X-Name-Last: Sanni
Author-Name: Gleici da Silva Castro Perdoná
Author-X-Name-First: Gleici da
Author-X-Name-Last: Silva Castro Perdoná
Title: The Log-Normal zero-inflated cure regression model for labor time in an African obstetric population
Abstract:
In obstetrics and gynecology, knowledge about how women's features are associated with childbirth is important. It leads to establishing guidelines and can help managers describe the dynamics of pregnant women's hospital stays. Time is therefore a variable of great importance and can be described by survival models. An issue that should be considered in the modeling is the inclusion of women for whom the duration of labor cannot be observed due to fetal death, generating a proportion of times equal to zero. Additionally, another proportion of women's times may be censored due to some intervention. The aim of this paper was to present the Log-Normal zero-inflated cure regression model and to evaluate likelihood-based parameter estimation in a simulation study. In general, the inference procedures performed better for larger samples and low proportions of zero inflation and cure. To exemplify how this model can be an important tool for investigating the course of the childbirth process, we considered the Better Outcomes in Labor Difficulty project dataset and showed that parity and educational level are associated with the main outcomes. We acknowledge the World Health Organization for granting us permission to use the dataset.
Journal: Journal of Applied Statistics
Pages: 2416-2429
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1896684
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1896684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2416-2429
Template-Type: ReDIF-Article 1.0
Author-Name: Tahir Ekin
Author-X-Name-First: Tahir
Author-X-Name-Last: Ekin
Author-Name: R. M. Musal
Author-X-Name-First: R. M.
Author-X-Name-Last: Musal
Title: Integrated statistical and decision models for multi-stage health care audit sampling
Abstract:
Health care audits are crucial in managing government insurance programs, which are estimated to lose billions of dollars every year. Statistical methods such as sampling have long been used to handle their size and complexity. Sampling from health care claims data can benefit from multi-stage approaches, especially when the evaluation of the tradeoffs between precision and cost is important. The use of decision models could help health care auditors and policy makers make the best use of these sampling outputs. This paper proposes an integrated multi-stage sampling and decision-making framework that enables auditors to address the tradeoffs between audit costs and expected overpayment recovery. We illustrate the framework and discuss insights using a variety of overpayment scenarios for payment populations, including U.S. Medicare Part B claims payment data.
Journal: Journal of Applied Statistics
Pages: 2307-2325
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1900797
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1900797
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2307-2325
Template-Type: ReDIF-Article 1.0
Author-Name: Martina Vittorietti
Author-X-Name-First: Martina
Author-X-Name-Last: Vittorietti
Author-Name: Javier Hidalgo
Author-X-Name-First: Javier
Author-X-Name-Last: Hidalgo
Author-Name: Jilt Sietsma
Author-X-Name-First: Jilt
Author-X-Name-Last: Sietsma
Author-Name: Wei Li
Author-X-Name-First: Wei
Author-X-Name-Last: Li
Author-Name: Geurt Jongbloed
Author-X-Name-First: Geurt
Author-X-Name-Last: Jongbloed
Title: Isotonic regression for metallic microstructure data: estimation and testing under order restrictions
Abstract:
Investigating the main determinants of the mechanical performance of metals is not a simple task. Already known, physically inspired qualitative relations between 2D microstructure characteristics and 3D mechanical properties can act as the starting point of the investigation. Isotonic regression makes it possible to take ordering relations into account and leads to more efficient and accurate results when the underlying assumptions actually hold. The main goal of this paper is to test order relations in a model inspired by a materials science application. The statistical estimation procedure is described for three different scenarios according to the knowledge of the variances: known variance ratio, completely unknown variances, and variances under order restrictions. New likelihood ratio tests are developed for the last two cases. Both parametric and non-parametric bootstrap approaches are developed for finding the distribution of the test statistics under the null hypothesis. Finally, an application to the relation between geometrically necessary dislocations and the number of observed microstructure precipitates is shown.
Journal: Journal of Applied Statistics
Pages: 2208-2227
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1896685
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1896685
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2208-2227
Template-Type: ReDIF-Article 1.0
Author-Name: Biviana M. Súarez-Sierra
Author-X-Name-First: Biviana M.
Author-X-Name-Last: Súarez-Sierra
Author-Name: Eliane R. Rodrigues
Author-X-Name-First: Eliane R.
Author-X-Name-Last: Rodrigues
Author-Name: Guadalupe Tzintzun
Author-X-Name-First: Guadalupe
Author-X-Name-Last: Tzintzun
Title: An application of a non-homogeneous Poisson model to study PM2.5 exceedances in Mexico City and Bogota
Abstract:
It is very important to study the occurrence of high levels of particulate matter due to the potential harm to people's health and to the environment. In the present work we use a non-homogeneous Poisson model to analyse the rate of exceedances of particulate matter with diameter smaller than 2.5 microns (PM2.5). Models with and without change-points are considered and applied to data from Bogota, Colombia, and Mexico City, Mexico. Results show that whereas in Bogota larger particles pose a more serious problem, in Mexico City, even though levels are more controlled nowadays, in the recent past PM2.5 particles were the ones causing serious problems.
Journal: Journal of Applied Statistics
Pages: 2430-2445
Issue: 9
Volume: 49
Year: 2022
Month: 07
X-DOI: 10.1080/02664763.2021.1897972
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1897972
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:9:p:2430-2445
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver-2362879763316292882.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: M. Maswadah
Author-X-Name-First: M.
Author-X-Name-Last: Maswadah
Title: Improved maximum likelihood estimation of the shape-scale family based on the generalized progressive hybrid censoring scheme
Abstract:
In parametric estimation, the maximum likelihood method is the most popular method and is widely used in the social sciences and psychology, although it is biased in situations where sample sizes are small or the data are heavily censored. Therefore, the main objective of this research is to improve this estimation method using the Runge–Kutta technique. The improved method was applied to derive estimators of the shape-scale family parameters and to compare them with Bayesian estimators based on informative and kernel priors, via Monte Carlo simulation. The simulation results showed that the improved maximum likelihood estimation method is highly efficient and outperforms the Bayesian method for different sample sizes. Finally, from a future perspective, the proposed model could be important for analyzing real data sets, including data on COVID-19 deaths in Egypt, for potential comparative studies with other countries.
Journal: Journal of Applied Statistics
Pages: 2825-2844
Issue: 11
Volume: 49
Year: 2022
Month: 08
X-DOI: 10.1080/02664763.2021.1924638
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1924638
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:11:p:2825-2844
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver-5461835493060998471.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: Ting Hsiang Lin
Author-X-Name-First: Ting Hsiang
Author-X-Name-Last: Lin
Author-Name: Min-Hsiao Tsai
Author-X-Name-First: Min-Hsiao
Author-X-Name-Last: Tsai
Title: Solving unobserved heterogeneity with latent class inflated Poisson regression model
Abstract:
Inflated data and over-dispersion are two common problems when modeling count data with traditional Poisson regression models. In this study, we propose a latent class inflated Poisson (LCIP) regression model to address the unobserved heterogeneity that leads to inflation and over-dispersion. The performance of the model estimation is evaluated through simulation studies. We illustrate the usefulness of introducing a latent class variable by analyzing the Behavioral Risk Factor Surveillance System (BRFSS) data, which contain several excessive values and are characterized by over-dispersion. As a result, the proposed model displays a better fit than the standard Poisson regression and zero-inflated Poisson regression models for the inflated counts.
Journal: Journal of Applied Statistics
Pages: 2953-2963
Issue: 11
Volume: 49
Year: 2022
Month: 08
X-DOI: 10.1080/02664763.2021.1929875
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1929875
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:11:p:2953-2963
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver233819172648051113.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: J. H. McVittie
Author-X-Name-First: J. H.
Author-X-Name-Last: McVittie
Author-Name: V. Addona
Author-X-Name-First: V.
Author-X-Name-Last: Addona
Title: A risk set adjustment for proportional hazards modeling of combined cohort data
Abstract:
Sporting careers observed over a preset time interval can be partitioned into two distinct subsamples: individuals whose careers had already commenced at the start of the time interval (prevalent subsample) and individuals whose careers began during the time interval (incident subsample), along with the respective individual-level covariate data such as salary, height, weight, performance statistics, and draft position. Under the assumption of a proportional hazards model, we propose a partial likelihood estimator to model the effect of covariates on survival via an adjusted risk set sampling procedure for when the incident cohort data are used in conjunction with the prevalent cohort data. We use simulated failure time data to validate the combined cohort proportional hazards methodology and illustrate our model using an NBA data set of career durations measured between 1990 and 2008.
Journal: Journal of Applied Statistics
Pages: 2913-2927
Issue: 11
Volume: 49
Year: 2022
Month: 08
X-DOI: 10.1080/02664763.2021.1928015
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1928015
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:11:p:2913-2927
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver7568146652705714345.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: Sanku Dey
Author-X-Name-First: Sanku
Author-X-Name-Last: Dey
Author-Name: Liang Wang
Author-X-Name-First: Liang
Author-X-Name-Last: Wang
Author-Name: Mazen Nassar
Author-X-Name-First: Mazen
Author-X-Name-Last: Nassar
Title: Inference on Nadarajah–Haghighi distribution with constant stress partially accelerated life tests under progressive type-II censoring
Abstract:
This paper presents methods for estimating the parameters and acceleration factor of the Nadarajah–Haghighi distribution based on constant-stress partially accelerated life tests. Under progressive Type-II censoring, maximum likelihood and Bayes estimates of the model parameters and acceleration factor are established. In addition, approximate confidence intervals are constructed via the asymptotic variance-covariance matrix, and Bayesian credible intervals are obtained based on an importance sampling procedure. For comparison purposes, alternative bootstrap confidence intervals for the unknown parameters and acceleration factor are also presented. Finally, extensive simulation studies are conducted to investigate the performance of our results, and two data sets are analyzed to show the applicability of the proposed methods.
Journal: Journal of Applied Statistics
Pages: 2891-2912
Issue: 11
Volume: 49
Year: 2022
Month: 08
X-DOI: 10.1080/02664763.2021.1928014
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1928014
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:11:p:2891-2912
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver5766990061207766079.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: Paria Soleimani
Author-X-Name-First: Paria
Author-X-Name-Last: Soleimani
Author-Name: Shervin Asadzadeh
Author-X-Name-First: Shervin
Author-X-Name-Last: Asadzadeh
Title: Effect of non-normality on the monitoring of simple linear profiles in two-stage processes: a remedial measure for gamma-distributed responses
Abstract:
The relationship between the response variable and one or more independent variables serves as the quality characteristic in some statistical quality control applications; this relationship is called a profile. Most research has dealt with monitoring profiles in single-stage processes under a basic assumption of normality. However, some processes are made up of several sub-processes, so the effect of the cascade property in multistage processes should be considered. Moreover, in practice the assumption of normally distributed data sometimes does not hold. This paper first examines the effect of non-normal data on monitoring simple linear profiles in two-stage processes in Phase II. We study non-normal distributions, such as the skewed gamma distribution and the heavy-tailed symmetric t-distribution, to measure the non-normality effect using the average run length criterion. Next, generalized linear models are used and a monitoring approach based on the generalized likelihood ratio (GLR) is developed for gamma-distributed responses as a remedial measure to reduce the detrimental effects of non-normality. The results of simulation studies reveal that the performance of the GLR procedure is satisfactory for multistage non-normal linear profiles. Finally, simulated and real case studies with gamma-distributed data are provided to show the application of the competing monitoring approaches.
Journal: Journal of Applied Statistics
Pages: 2870-2890
Issue: 11
Volume: 49
Year: 2022
Month: 08
X-DOI: 10.1080/02664763.2021.1928013
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1928013
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:11:p:2870-2890
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver1429486588680137146.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: Valdemiro P. Vigas
Author-X-Name-First: Valdemiro P.
Author-X-Name-Last: Vigas
Author-Name: Edwin M. M. Ortega
Author-X-Name-First: Edwin M. M.
Author-X-Name-Last: Ortega
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Adriano K. Suzuki
Author-X-Name-First: Adriano K.
Author-X-Name-Last: Suzuki
Author-Name: Giovana O. Silva
Author-X-Name-First: Giovana O.
Author-X-Name-Last: Silva
Title: The new Neyman type A generalized odd log-logistic-G-family with cure fraction
Abstract:
This work proposes a new family of survival models, called the odd log-logistic generalized Neyman type A long-term family. We consider different activation schemes in which the number of factors M has the Neyman type A distribution and the time of occurrence of an event follows the odd log-logistic generalized family. The parameters are estimated by classical and Bayesian methods. We investigate the mean estimates, biases, and root mean square errors under different activation schemes using Monte Carlo simulations. Residual analysis via the frequentist approach is used to verify the model assumptions. We illustrate the applicability of the proposed model for patients with gastric adenocarcinoma; we chose the adenocarcinoma data because the disease is responsible for most cases of stomach tumors. The estimated cured proportion of patients under chemoradiotherapy is higher than that of patients undergoing only surgery. The estimated hazard function for the chemoradiotherapy level tends to decrease as time increases. More information about the data is given in the application section.
Journal: Journal of Applied Statistics
Pages: 2805-2824
Issue: 11
Volume: 49
Year: 2022
Month: 08
X-DOI: 10.1080/02664763.2021.1922994
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1922994
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:11:p:2805-2824
Template-Type: ReDIF-Article 1.0
# input file: catalog-resolver2666411365973782405.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220713T202513 git hash: 99d3863004
Author-Name: Dongyu Li
Author-X-Name-First: Dongyu
Author-X-Name-Last: Li
Author-Name: Lei Wang
Author-X-Name-First: Lei
Author-X-Name-Last: Wang
Title: Improved kth power expectile regression with nonignorable dropouts
Abstract:
The kth (
$ 1 n. Our simulation results indicate that the proposed method is efficient and robust against outliers and heavy-tailed distributions. Finally, a real dataset from an air pollution mortality study is used to illustrate the proposed method.
Journal: Journal of Applied Statistics
Pages: 3677-3692
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1962259
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1962259
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3677-3692
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1959529_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Timothy Opheim
Author-X-Name-First: Timothy
Author-X-Name-Last: Opheim
Author-Name: Anuradha Roy
Author-X-Name-First: Anuradha
Author-X-Name-Last: Roy
Title: Doubly multivariate linear models with block exchangeable distributed errors and site-dependent covariates
Abstract:
The problem of testing the intercept and slope parameters of doubly multivariate linear models with site-dependent covariates using Rao's score test (RST) is studied. The RST statistic is developed for a block exchangeable covariance structure on the error vector under the assumption of multivariate normality. We compare the developed RST statistic with the likelihood ratio test (LRT) statistic. Monte Carlo simulations indicate that the RST statistic is much more accurate than its LRT counterpart and requires significantly less computation time. The proposed method is illustrated with an example of multiple response variables measured on multiple trees in a single plot in an agricultural study.
Journal: Journal of Applied Statistics
Pages: 3659-3676
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1959529
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1959529
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3659-3676
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1962261_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Zhen Sheng
Author-X-Name-First: Zhen
Author-X-Name-Last: Sheng
Author-Name: Yukun Liu
Author-X-Name-First: Yukun
Author-X-Name-Last: Liu
Author-Name: Pengfei Li
Author-X-Name-First: Pengfei
Author-X-Name-Last: Li
Author-Name: Jing Qin
Author-X-Name-First: Jing
Author-X-Name-Last: Qin
Title: Likelihood ratio test for genetic association study with case–control data under Probit model
Abstract:
Probit and Logit models are the most popular choices for modeling binary disease status in genetic association studies. They are used interchangeably and are nearly exchangeable in the analysis of prospectively collected data. However, little rigorous inference has been based on Probit models for retrospectively collected case–control data, especially in the presence of random effects. This paper systematically investigates the performance of Probit mixed-effects models for case–control data. We find that the retrospective likelihood has a closed form, which motivates the development of likelihood ratio tests for genetic association. Specifically, we develop four likelihood ratio tests according to whether the disease prevalence is completely unavailable, partly available, or completely available. We show that their limiting distribution in the absence of a genetic effect is an equal mixture of two chi-square distributions with 1 and 2 degrees of freedom, respectively. Our simulations indicate that the tests can achieve a remarkable power gain over the popular Logit-model-based score tests, and that disease prevalence information can further enhance their power. In an analysis of Kenyan malaria data, the proposed test produces a significant result on the association of the ABO gene with malaria, whereas the commonly used competitors fail.
Journal: Journal of Applied Statistics
Pages: 3717-3731
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1962261
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1962261
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3717-3731
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1951685_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Jeffrey Williams
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Williams
Author-Name: Raymond R. Hill
Author-X-Name-First: Raymond R.
Author-X-Name-Last: Hill
Author-Name: Joseph J. Pignatiello Jr.
Author-X-Name-First: Joseph J.
Author-X-Name-Last: Pignatiello Jr.
Author-Name: Eric Chicken
Author-X-Name-First: Eric
Author-X-Name-Last: Chicken
Title: Wavelet analysis of variance box plot
Abstract:
Functional box plots satisfy two needs: visualization of functional data and the calculation of important box plot statistics. Data visualization illuminates key characteristics of functional sets that are missed by statistical tests and summary statistics, while box plot statistics for functional sets permit comparisons better suited to functional data. The functional box plot uses a depth method to visualize and rank smooth functional curves in terms of a mean, box, whiskers, and outliers, and it improves upon classic functional data analysis tools such as functional principal components and discriminant analysis for outlier detection. This research adds wavelet analysis, along with depth, as a generating mechanism for functional box plots to visualize functional data and calculate the relevant statistics. The wavelet analysis of variance (WANOVA) box plot gives competitive error rates in Gaussian test cases with magnitude outliers and outperforms the functional box plot in Gaussian test cases with shape outliers. Further, we show that wavelet analysis is well suited to approximating irregular and noisy functional data, and we demonstrate the enhanced capability of WANOVA box plots to classify shape outliers that follow a different pattern from the other functional data in both simulated and real data instances.
Journal: Journal of Applied Statistics
Pages: 3536-3563
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1951685
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1951685
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3536-3563
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1959527_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Guillermo Martínez-Flórez
Author-X-Name-First: Guillermo
Author-X-Name-Last: Martínez-Flórez
Author-Name: Eliseo Martínez
Author-X-Name-First: Eliseo
Author-X-Name-Last: Martínez
Author-Name: Roger Tovar-Falón
Author-X-Name-First: Roger
Author-X-Name-Last: Tovar-Falón
Author-Name: Héctor W. Gómez
Author-X-Name-First: Héctor W.
Author-X-Name-Last: Gómez
Title: A family of bimodal distributions generated by distributions with positive support
Abstract:
Bimodal data sets are very common in different areas of knowledge; crude birth rates, fish lengths, egg diameters, and the eruption and inter-eruption times of the Old Faithful geyser are examples of this type of data. In this paper, a new class of symmetric density functions for modeling such bimodal data is presented. Starting from density functions with support on
$ [ 0,+\infty ) $ [0,+∞), symmetry is obtained by reflecting the density function onto the negative semi-axis, with the corresponding normalization. In this way, if the primitive density function is unimodal, the resulting density is bimodal. We introduce asymmetry parameters and study their behavior, in particular the values of the modes and some other statistical quantities of interest. The cases of densities generated by the Gamma, Weibull, Log-normal, and Birnbaum-Saunders densities, among others, are studied. Statistical inference is performed from a classical perspective. A small simulation study is conducted to evaluate the benefits and limitations of the new proposal. In addition, an application to a data set on fetal weight in grams, obtained through ultrasound in a sample of 500 units, is presented; the results show the great usefulness of the model in practical situations.
Journal: Journal of Applied Statistics
Pages: 3614-3637
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1959527
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1959527
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3614-3637
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1962262_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Yonggang Ji
Author-X-Name-First: Yonggang
Author-X-Name-Last: Ji
Author-Name: Haifang Shi
Author-X-Name-First: Haifang
Author-X-Name-Last: Shi
Title: Shrinkage estimation of fixed and random effects in linear quantile mixed models
Abstract:
This paper presents a Bayesian analysis of linear mixed models for quantile regression using a modified Cholesky decomposition for the covariance matrix of random effects and an asymmetric Laplace distribution for the error distribution. We consider several novel Bayesian shrinkage approaches for both fixed and random effects in a linear mixed quantile model using extended
$ L_1 $ L1 penalties. To improve mixing of the Markov chains, a simple and efficient partially collapsed Gibbs sampling algorithm is developed for posterior inference. We also extend the framework to a Bayesian mixed expectile model and develop a Metropolis–Hastings acceptance–rejection (MHAR) algorithm using proposal densities based on iteratively weighted least squares estimation. The proposed approach is then illustrated via both simulated and real data examples. Results indicate that the proposed approach performs very well in comparison to the other approaches.
Journal: Journal of Applied Statistics
Pages: 3693-3716
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1962262
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1962262
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3693-3716
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1962260_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Jiasheng Huang
Author-X-Name-First: Jiasheng
Author-X-Name-Last: Huang
Author-Name: Yehua Li
Author-X-Name-First: Yehua
Author-X-Name-Last: Li
Author-Name: Angelique G. Brellenthin
Author-X-Name-First: Angelique G.
Author-X-Name-Last: Brellenthin
Author-Name: Duck-chul Lee
Author-X-Name-First: Duck-chul
Author-X-Name-Last: Lee
Author-Name: Xuemei Sui
Author-X-Name-First: Xuemei
Author-X-Name-Last: Sui
Author-Name: Steven N. Blair
Author-X-Name-First: Steven N.
Author-X-Name-Last: Blair
Title: Causal mediation analysis between resistance exercise and reduced risk of cardiovascular disease based on the Aerobics Center Longitudinal Study
Abstract:
Health benefits of resistance exercise (RE), particularly in lowering cardiovascular disease (CVD) risks, are less understood than those of aerobic exercise (AE). Motivated by big data from the Aerobics Center Longitudinal Study (ACLS), we study the direct and indirect effects of RE on CVD risks. The primary outcome in our study, total CVD events (CVD morbidity and mortality combined), is modeled as a survival outcome. To investigate the pathway from RE to the CVD outcome through potential mediators, we first conduct a causal mediation analysis based on marginal structural models (MSMs). To fully account for the information from repeated measurements of the mediators, we also adopt a joint model of the CVD survival outcome and multiple longitudinal trajectories of the mediators. Results show statistically significant direct effects of RE and AE on lowering the risk of total CVD events under each pathway. The causal effects of RE and AE on CVD risk are also studied across different age and gender groups. Furthermore, we produce a ranking of the relative importance of the potential risk factors for CVD, with total cholesterol ranking highest.
Journal: Journal of Applied Statistics
Pages: 3750-3767
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1962260
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1962260
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3750-3767
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1953448_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Garo Panikian
Author-X-Name-First: Garo
Author-X-Name-Last: Panikian
Author-Name: Gabby Colmenares Reverol
Author-X-Name-First: Gabby
Author-X-Name-Last: Colmenares Reverol
Author-Name: Jayne Rhodes
Author-X-Name-First: Jayne
Author-X-Name-Last: Rhodes
Author-Name: Emma McLarnon
Author-X-Name-First: Emma
Author-X-Name-Last: McLarnon
Author-Name: Sarah Keast
Author-X-Name-First: Sarah
Author-X-Name-Last: Keast
Author-Name: Kokouvi Gamado
Author-X-Name-First: Kokouvi
Author-X-Name-Last: Gamado
Title: Time series modelling methods to forecast the volume of self-assessment tax returns in the UK
Abstract:
Her Majesty's Revenue and Customs (HMRC) has the ambitious target of making tax digital for all its customers and collecting tax in a more efficient, effective and accurate manner for both the government and UK taxpayers. Self assessment, the biggest key business event for HMRC, is also one of the most popular digital services, with over 90% of the approximately 12 million self-assessment taxpayers filing their returns online each year. The majority of returns are filed in January, immediately prior to the self-assessment deadline (31st January), putting significant pressure not only on the self-assessment digital service but also on all other HMRC digital services. Hence, understanding and predicting demand for the system is vital to providing a robust and responsive service. We therefore developed mathematical models with Bayesian inference techniques to forecast volumes of self-assessment (SA) returns submitted online during January, providing accurate hourly predictions of traffic on the digital system in the run-up to the SA deadline. Because none of the models being considered is believed to be the true model, we use an ensemble modelling technique that combines forecasts from different models to develop a less risky demand forecast.
Journal: Journal of Applied Statistics
Pages: 3732-3749
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1953448
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1953448
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3732-3749
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1957789_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Amadou Barry
Author-X-Name-First: Amadou
Author-X-Name-Last: Barry
Author-Name: Karim Oualkacha
Author-X-Name-First: Karim
Author-X-Name-Last: Oualkacha
Author-Name: Arthur Charpentier
Author-X-Name-First: Arthur
Author-X-Name-Last: Charpentier
Title: A new GEE method to account for heteroscedasticity using asymmetric least-square regressions
Abstract:
Generalized estimating equations
$ ({\rm GEE}) $ (GEE) are widely used to analyze longitudinal data; however, they are not appropriate for heteroscedastic data, because they only estimate regressor effects on the mean response – and therefore do not account for data heterogeneity. Here, we combine the
$ {\rm GEE} $ GEE with the asymmetric least squares (expectile) regression to derive a new class of estimators, which we call generalized expectile estimating equations
$ ({\rm GEEE}) $ (GEEE). The
$ {\rm GEEE} $ GEEE model estimates regressor effects on the expectiles of the response distribution, which provides a detailed view of regressor effects on the entire response distribution. In addition to capturing data heteroscedasticity, the GEEE extends the various working correlation structures to account for within-subject dependence. We derive the asymptotic properties of the
$ {\rm GEEE} $ GEEE estimators and propose a robust estimator of their covariance matrix for inference (see our R package, github.com/AmBarry/expectgee). Our simulations show that the GEEE estimator is unbiased and efficient, and our real data analysis shows that it captures heteroscedasticity.
Journal: Journal of Applied Statistics
Pages: 3564-3590
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1957789
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1957789
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3564-3590
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1951684_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Seema Zubair
Author-X-Name-First: Seema
Author-X-Name-Last: Zubair
Author-Name: Sanjoy K. Sinha
Author-X-Name-First: Sanjoy K.
Author-X-Name-Last: Sinha
Title: Semiparametric methods for incomplete longitudinal count data with an application to health and retirement study
Abstract:
In this paper, we propose and explore a novel semiparametric approach to analyzing longitudinal count data. We address the issue of missingness in longitudinal data and propose a weighted generalized estimating equations approach to fitting marginal mean response models for count responses with dropouts. We also investigate a spline regression approach to approximating the curvilinear relationship between the mean response and covariates. The asymptotic properties of the proposed estimators are studied in some detail, and their empirical properties are investigated using Monte Carlo simulations. An application is also provided using actual survey data obtained from the Health and Retirement Study (HRS).
Journal: Journal of Applied Statistics
Pages: 3513-3535
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1951684
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1951684
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3513-3535
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1959528_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Babak Zafari
Author-X-Name-First: Babak
Author-X-Name-Last: Zafari
Author-Name: Tahir Ekin
Author-X-Name-First: Tahir
Author-X-Name-Last: Ekin
Author-Name: Fabrizio Ruggeri
Author-X-Name-First: Fabrizio
Author-X-Name-Last: Ruggeri
Title: Multicriteria decision frontiers for prescription anomaly detection over time
Abstract:
Health care prescription fraud and abuse result in major financial losses and adverse health effects. The growing budget deficits of health insurance programs and the recent opioid abuse crisis in the United States have accelerated the use of analytical methods. Unsupervised methods such as clustering and anomaly detection can help health care auditors evaluate billing patterns when embedded into rule-based frameworks, and the resulting decision models can aid policymakers in detecting potentially suspicious activities. This manuscript proposes an unsupervised temporal learning-based decision frontier model using real-world Medicare Part D prescription data collected over 5 years. First, temporal probabilistic hidden groups of drugs are retrieved using a structural topic model with covariates. Next, we construct combined concentration curves and Gini measures that consider the weighted impact of temporal observations on prescription patterns, in addition to the Gini values for cost. The novel decision frontier utilizes this output and enables health care practitioners to assess the trade-offs among different criteria and to identify audit leads.
Journal: Journal of Applied Statistics
Pages: 3638-3658
Issue: 14
Volume: 49
Year: 2022
Month: 10
X-DOI: 10.1080/02664763.2021.1959528
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1959528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:14:p:3638-3658
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1968358_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Qingzhao Yu
Author-X-Name-First: Qingzhao
Author-X-Name-Last: Yu
Author-Name: Lu Zhang
Author-X-Name-First: Lu
Author-X-Name-Last: Zhang
Author-Name: Xiaocheng Wu
Author-X-Name-First: Xiaocheng
Author-X-Name-Last: Wu
Author-Name: Bin Li
Author-X-Name-First: Bin
Author-X-Name-Last: Li
Title: Inference on moderation effect with third-variable effect analysis – application to explore the trend of racial disparity in oncotype dx test for breast cancer treatment
Abstract:
A third-variable effect refers to the effect of a third variable that explains an observed relationship between an exposure and an outcome. Depending on whether there is a causal relationship, a third variable typically takes the form of a mediator or a confounder. A moderation effect is a special case of the third-variable effect in which the moderator and other variables have an interactive effect on the outcome. In this paper, we extend the R package ‘mma’ for moderation analysis so that third-variable effects can be reported at different levels of the moderator. The proposed moderation analysis uses tree-structured models to automatically detect moderation effects and can handle both categorical and numerical moderators. We propose algorithms and graphical methods for making inference on moderation effects and illustrate the method under different scenarios of moderation effects. Finally, we apply the proposed method to explore the trend of racial disparities in the use of Oncotype DX recurrence tests among breast cancer patients and find that the unexplained racial differences in use of the tests decreased from 2010 to 2015.
Journal: Journal of Applied Statistics
Pages: 3958-3975
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1968358
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1968358
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3958-3975
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1967890_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xin Yang
Author-X-Name-First: Xin
Author-X-Name-Last: Yang
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Author-Name: Dongliang Wang
Author-X-Name-First: Dongliang
Author-X-Name-Last: Wang
Title: A generalized BLUE approach for combining location and scale information in a meta-analysis
Abstract:
In systematic reviews and meta-analyses, one is interested in combining information from a variety of sources in order to obtain unbiased and efficient pooled estimates of the mean treatment effect relative to a control group, along with the corresponding standard errors and confidence intervals, particularly when the source data are unavailable. However, many studies do not report the mean and standard deviation, giving instead other descriptive measures such as the median and quartiles. In this note, we provide a theoretically optimal best linear unbiased estimator (BLUE) strategy for combining different types of summary information in order to pool results and estimate the overall treatment effect and the corresponding confidence intervals. Our approach is less biased and much more flexible than past attempts at solving this problem and can accommodate a variety of summary information across studies. We show that confidence intervals based on our methods have the appropriate coverage probabilities. Our proposed methods are theoretically justified and verified by simulation studies, and the BLUE method is illustrated via a real data application.
Journal: Journal of Applied Statistics
Pages: 3846-3867
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1967890
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1967890
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3846-3867
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1967894_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: FuPeng Xie
Author-X-Name-First: FuPeng
Author-X-Name-Last: Xie
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Author-Name: YuLong Qiao
Author-X-Name-First: YuLong
Author-X-Name-Last: Qiao
Author-Name: XueLong Hu
Author-X-Name-First: XueLong
Author-X-Name-Last: Hu
Author-Name: JinSheng Sun
Author-X-Name-First: JinSheng
Author-X-Name-Last: Sun
Title: A one-sided exponentially weighted moving average control chart for time between events
Abstract:
Exponentially weighted moving average (EWMA) control charts for time between events (TBE) are commonly suggested for monitoring high-quality processes to detect process deterioration early. In this study, an enhanced one-sided EWMA TBE scheme is developed for the rapid detection of increases or decreases in the process mean. The use of the truncation method helps to improve the sensitivity of the proposed scheme in detecting process mean shifts. Moreover, by taking the effects of parameter estimation into account, the proposed scheme with estimated parameters is also investigated. Both the average run length (ARL) and the standard deviation of the run length (SDRL) performances of the proposed scheme, with known and with estimated parameters, are studied using the Markov chain method. Furthermore, an ARL-based optimal design procedure is developed for the recommended one-sided EWMA TBE chart. Numerical results show that the proposed optimal one-sided EWMA TBE chart is more sensitive than the existing optimal one-sided exponential EWMA chart in monitoring both upward and downward mean shifts, and it is also more robust to the effect of parameter estimation than the existing comparative scheme. Finally, two illustrative examples show the implementation of the proposed scheme on simulated and real datasets.
Journal: Journal of Applied Statistics
Pages: 3928-3957
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1967894
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1967894
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3928-3957
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1967891_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Piotr Sliwka
Author-X-Name-First: Piotr
Author-X-Name-Last: Sliwka
Title: Markov (Set) chains application to predict mortality rates using extended Milevsky–Promislov generalized mortality models
Abstract:
The mortality rates (
$ \mu _{x,t} $ μx,t) measure the frequency of deaths in a fixed population and time interval. The ability to model and forecast
$ \mu _{x,t} $ μx,t allows one to determine, among other things, fundamental characteristics of life expectancy tables, e.g. those used to set life insurance premiums adequate to the risk of death. The article proposes a new method of modelling and forecasting
$ \mu _{x,t} $ μx,t, using the class of stochastic Milevsky–Promislov switch models with excitations. The excitations are modelled by second-, fourth- and sixth-order polynomials of outputs from the non-Gaussian Linear Scalar Filter (nGLSF) model, taking into account the Markov (Set) chain. The Markov (Set) chain state space is defined based on even orders of the nGLSF polynomial. The model order determines the theoretical values of the death rates. The obtained results usually provide a more precise forecast of the mortality rates than the commonly used Lee–Carter model.
Journal: Journal of Applied Statistics
Pages: 3868-3888
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1967891
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1967891
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3868-3888
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1965966_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Weiwei Zhuang
Author-X-Name-First: Weiwei
Author-X-Name-Last: Zhuang
Author-Name: Yadong Li
Author-X-Name-First: Yadong
Author-X-Name-Last: Li
Author-Name: Guoxin Qiu
Author-X-Name-First: Guoxin
Author-X-Name-Last: Qiu
Title: Statistical inference for a relaxation index of stochastic dominance under density ratio model
Abstract:
Stochastic dominance is commonly used to rank random variables by comparing their distributions and is therefore widely applied in economics and finance. In practice, complete stochastic dominance is too demanding a requirement, so relaxation indexes of stochastic dominance have attracted increasing attention. The π index, the biggest gap between two distributions, can serve as a measure of the degree of deviation from complete dominance. The traditional estimation method uses the empirical distribution functions. Since the populations under comparison are generally of the same nature, we can link them through a density ratio model under certain conditions. Based on this model, we propose a new estimator and establish its statistical inference theory. Simulation results show that the proposed estimator substantially improves estimation efficiency and the power of the tests, and that coverage probabilities satisfactorily match the nominal confidence levels, demonstrating the superiority of the proposed estimator. Finally, we apply our method to a real example of Chinese household incomes.
Journal: Journal of Applied Statistics
Pages: 3804-3822
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1965966
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1965966
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3804-3822
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1969342_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Yi-Ching Lee
Author-X-Name-First: Yi-Ching
Author-X-Name-Last: Lee
Author-Name: Junfeng Shang
Author-X-Name-First: Junfeng
Author-X-Name-Last: Shang
Title: Estimation and selection in linear mixed models with missing data under compound symmetric structure
Abstract:
It is quite appealing to extend existing theories in classical linear models to correlated responses, where linear mixed-effects models are utilized and the dependency in the data is modeled by random effects. In the mixed modeling framework, missing values occur naturally due to dropouts or non-responses, a situation frequently encountered when dealing with real data. Motivated by such problems, we investigate estimation and model selection performance in linear mixed models when missing data are present. Inspired by the property of the indicator function for missingness and its relation to missing rates, we propose an approach that records missingness in an indicator-based matrix, and we derive likelihood-based estimators for all parameters involved in the linear mixed-effects models. Based on the proposed estimation method, we explore the relationship between estimation and selection behavior over missing rates. Simulations and a real data application illustrate the effectiveness of the proposed method in selecting the most appropriate model and in estimating parameters.
Journal: Journal of Applied Statistics
Pages: 4003-4027
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1969342
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1969342
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:4003-4027
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1962264_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Junyi Lu
Author-X-Name-First: Junyi
Author-X-Name-Last: Lu
Author-Name: Sebastian Meyer
Author-X-Name-First: Sebastian
Author-X-Name-Last: Meyer
Title: An endemic–epidemic beta model for time series of infectious disease proportions
Abstract:
Time series of proportions of infected patients or positive specimens are frequently encountered in disease control and prevention. Since proportions are bounded and often asymmetrically distributed, conventional Gaussian time series models only apply to suitably transformed proportions. Here we borrow both from beta regression and from the well-established HHH model for infectious disease counts to propose an endemic–epidemic beta model for proportion time series. It accommodates the asymmetric shape and heteroskedasticity of proportion distributions and is consistent for complementary proportions. Coefficients can be interpreted in terms of odds ratios. A multivariate formulation with spatial power-law weights enables the joint estimation of model parameters from multiple regions. In our application to a flu activity index in the USA, we find that the endemic–epidemic beta model provides a better fit than a seasonal ARIMA model for the logit-transformed proportions. Furthermore, a multivariate approach can improve regional forecasts and reduce model complexity in comparison to univariate beta models stratified by region.
Journal: Journal of Applied Statistics
Pages: 3769-3783
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1962264
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1962264
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3769-3783
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1966397_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xiaohong He
Author-X-Name-First: Xiaohong
Author-X-Name-Last: He
Author-Name: Lei Wang
Author-X-Name-First: Lei
Author-X-Name-Last: Wang
Title: Ensemble and calibration multiply robust estimation for quantile treatment effect
Abstract:
Quantile treatment effects are important causal estimands in the evaluation of biomedical treatments or interventions for health outcomes such as birthweight and medical cost. However, the existing estimators require that either a propensity score model or a conditional density vector model be correctly specified, which is difficult to verify in practice. In this paper, we allow multiple models for the propensity score and the conditional density vector, and construct a class of calibration estimators based on multiple imputation and inverse probability weighting approaches via empirical likelihood. The resulting estimators are multiply robust in the sense that they are consistent if any one of these models is correctly specified. Moreover, we propose another class of ensemble estimators to reduce the computational burden while preserving multiple robustness. Simulations are performed to evaluate the finite-sample performance of the proposed estimators. Two applications, to the birthweight of infants born in the United States and to AIDS Clinical Trials Group Protocol 175 data, are also presented.
Journal: Journal of Applied Statistics
Pages: 3823-3845
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1966397
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1966397
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3823-3845
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1962263_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xun Shen
Author-X-Name-First: Xun
Author-X-Name-Last: Shen
Author-Name: Pongsathorn Raksincharoensak
Author-X-Name-First: Pongsathorn
Author-X-Name-Last: Raksincharoensak
Title: Statistical models of near-accident event and pedestrian behavior at non-signalized intersections
Abstract:
This paper proposes an innovative framework for modeling the statistical properties of near-accident events and pedestrian behavior at non-signalized intersections, based on a Poisson process and logistic regression. The first contribution of this study is a predictive intensity model that treats the near-accident event as a Poisson process on the space of vehicle velocity, distance to the intersection, and lateral distance to the pedestrian at the time the pedestrian appears. In addition, logistic regression is used to build a model that predicts the probability of pedestrian behavior. The two proposed models are validated in a generative simulation: pedestrian behavior data generated by the proposed models are compared with real data. The real data set comes from the drive recorder database of the Smart Mobility Research Center (SMRC) at Tokyo University of Agriculture and Technology, where accident and near-accident data have been collected on city streets with an image-captured drive recorder mounted on a taxi since 2006. The findings of this study are expected to be useful for the construction of traffic simulators and for safety control designs that consider pedestrian-vehicle interaction.
Journal: Journal of Applied Statistics
Pages: 4028-4048
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1962263
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1962263
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:4028-4048
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1967893_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Guglielmo D'Amico
Author-X-Name-First: Guglielmo
Author-X-Name-Last: D'Amico
Author-Name: Riccardo De Blasis
Author-X-Name-First: Riccardo
Author-X-Name-Last: De Blasis
Title: Confidence sets for dynamic poverty indexes
Abstract:
In this study, we consider different poverty indexes in a dynamic framework where individuals change their rate of income randomly in time. The primary objective of this paper is to assess the accuracy of the approximation of the indexes obtained by applying the strong law of large numbers to an economic system composed of an infinite number of agents. The main result is a multivariate central limit theorem for dynamic poverty measures, obtained by applying the theory of U-statistics. We also show how to construct confidence sets for the considered dynamic indexes, which demonstrate the appropriateness of the model. An application to Italian income data from 1998 to 2012 confirms the effectiveness of the considered approach and the possibility of tracking the evolution of poverty and inequality in real economies.
Journal: Journal of Applied Statistics
Pages: 3908-3927
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1967893
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1967893
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3908-3927
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1967892_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: J. C. Poythress
Author-X-Name-First: J. C.
Author-X-Name-Last: Poythress
Author-Name: Cheolwoo Park
Author-X-Name-First: Cheolwoo
Author-X-Name-Last: Park
Author-Name: Jeongyoun Ahn
Author-X-Name-First: Jeongyoun
Author-X-Name-Last: Ahn
Title: Dimension-wise sparse low-rank approximation of a matrix with application to variable selection in high-dimensional integrative analyses of association
Abstract:
Many research proposals involve collecting multiple sources of information from a set of common samples, with the goal of performing an integrative analysis describing the associations between sources. We propose a method that characterizes the dominant modes of co-variation between the variables in two datasets while simultaneously performing variable selection. Our method relies on a sparse, low-rank approximation of a matrix containing pairwise measures of association between the two sets of variables. We show that the proposed method is closely connected with another group of methods for integrative data analysis: sparse canonical correlation analysis (CCA). Under some assumptions, the proposed method and sparse CCA aim to select the same subsets of variables. We show through simulation that the proposed method can achieve better variable selection accuracy than two state-of-the-art sparse CCA algorithms. Empirically, through an analysis of DNA methylation and gene expression data, we demonstrate that the proposed method selects variables with canonical correlation as high as or higher than those selected by sparse CCA methods, a rather surprising finding given that the objective function of the proposed method does not actually maximize the canonical correlation.
Journal: Journal of Applied Statistics
Pages: 3889-3907
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1967892
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1967892
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3889-3907
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1970120_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Qiang Zhao
Author-X-Name-First: Qiang
Author-X-Name-Last: Zhao
Author-Name: Liang Chen
Author-X-Name-First: Liang
Author-X-Name-Last: Chen
Author-Name: Jingjing Wu
Author-X-Name-First: Jingjing
Author-X-Name-Last: Wu
Title: Robust and efficient estimation of GARCH models based on Hellinger distance
Abstract:
It is well known that financial data frequently contain outlying observations. Almost all methods and techniques used to estimate GARCH models are likelihood-based and thus generally non-robust to outliers. The minimum distance method, an important tool for statistical inference and a competitive route to robustness, has surprisingly not been well explored for GARCH models. In this paper, we propose a minimum Hellinger distance estimator (MHDE) and a minimum profile Hellinger distance estimator (MPHDE), depending on whether or not the innovation distribution is specified, for estimating the parameters of GARCH models. The construction and investigation of the two estimators are quite involved due to the non-i.i.d. nature of the data. We prove that the MHDE is a consistent estimator and derive its bias in explicit form. For both proposed estimators, we demonstrate finite-sample performance through simulation studies and compare them with well-established methods, including the MLE, Gaussian quasi-MLE, non-Gaussian quasi-MLE, and the least absolute deviation estimator. Our numerical results show that the MHDE and MPHDE perform much better than MLE-based methods when the data are contaminated, while remaining very competitive when the data are clean, testifying to the robustness and efficiency of the two proposed MHD-type estimators.
Journal: Journal of Applied Statistics
Pages: 3976-4002
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1970120
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1970120
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3976-4002
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1963422_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: João Victor B. de Freitas
Author-X-Name-First: João Victor B.
Author-X-Name-Last: de Freitas
Author-Name: Juvêncio S. Nobre
Author-X-Name-First: Juvêncio S.
Author-X-Name-Last: Nobre
Author-Name: Marcelo Bourguignon
Author-X-Name-First: Marcelo
Author-X-Name-Last: Bourguignon
Author-Name: Manoel Santos-Neto
Author-X-Name-First: Manoel
Author-X-Name-Last: Santos-Neto
Title: A new approach to modeling positive random variables with repeated measures
Abstract:
In many situations, it is common to have more than one observation per experimental unit, generating experiments with repeated measures. In modeling such experiments, it is necessary to account for the intra-unit dependency structure. The literature contains several proposals for modeling positive continuous data with repeated measures. In this paper, we propose another, generalizing the beta prime regression model. We consider the possibility of dependence between observations of the same unit. Residuals and diagnostic tools are also discussed. To evaluate the finite-sample performance of the estimators, using different correlation matrices and distributions, we conduct a Monte Carlo simulation study. The proposed methodology is illustrated with the analysis of a real data set. Finally, we provide an R package that makes the methodology described in this paper publicly available.
Journal: Journal of Applied Statistics
Pages: 3784-3803
Issue: 15
Volume: 49
Year: 2022
Month: 11
X-DOI: 10.1080/02664763.2021.1963422
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1963422
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:15:p:3784-3803
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1973389_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Muhammad Qasim
Author-X-Name-First: Muhammad
Author-X-Name-Last: Qasim
Author-Name: Kristofer Månsson
Author-X-Name-First: Kristofer
Author-X-Name-Last: Månsson
Author-Name: Pär Sjölander
Author-X-Name-First: Pär
Author-X-Name-Last: Sjölander
Author-Name: B. M. Golam Kibria
Author-X-Name-First: B. M. Golam
Author-X-Name-Last: Kibria
Title: A new class of efficient and debiased two-step shrinkage estimators: method and application
Abstract:
This paper introduces a new class of efficient and debiased two-step shrinkage estimators for a linear regression model in the presence of multicollinearity. We derive the proposed estimators’ mean square error and define the necessary and sufficient conditions for superiority over the existing estimators. In addition, we develop an algorithm for selecting the shrinkage parameters of the proposed estimators. The new estimators are compared with the traditional ordinary least squares, ridge regression, Liu, and two-parameter estimators by a matrix mean square error criterion. The Monte Carlo simulation results show the superiority of the proposed estimators under certain conditions. In the presence of high but imperfect multicollinearity, the two-step shrinkage estimators perform relatively better. Finally, two real-world chemical data sets are analyzed to demonstrate the advantages and the empirical relevance of our newly proposed estimators. It is shown that the standard errors and the estimated mean square error decrease substantially for the proposed estimator. Hence, the precision of the estimated parameters is increased, which is one of the main objectives of practitioners.
Journal: Journal of Applied Statistics
Pages: 4181-4205
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1973389
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1973389
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4181-4205
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1976120_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Danúbia R. Cunha
Author-X-Name-First: Danúbia R.
Author-X-Name-Last: Cunha
Author-Name: Jose Angelo Divino
Author-X-Name-First: Jose Angelo
Author-X-Name-Last: Divino
Author-Name: Helton Saulo
Author-X-Name-First: Helton
Author-X-Name-Last: Saulo
Title: On a log-symmetric quantile tobit model applied to female labor supply data
Abstract:
The study of female labor supply has been a topic of relevance in the economic literature. Generally, the data are left-censored, and the classic tobit model has been extensively used in the modeling strategy. This model, however, assumes normality for the error distribution and is not recommended for data with positive skewness, heavy tails, and heteroscedasticity, as is the case for female labor supply data. Moreover, it is well known that the quantile regression approach accounts for the influences of different quantiles on the estimated coefficients. We take all these features into account and propose a parametric quantile tobit regression model based on quantile log-symmetric distributions. The proposed method allows one to model data with positive skewness (which is not suitable for the classic tobit model), to study the influence of the quantiles of interest, and to account for heteroscedasticity. The model parameters are estimated by maximum likelihood, and a Monte Carlo experiment is performed to evaluate alternative estimators. The new method is applied to two distinct female labor supply data sets. The results indicate that the log-symmetric quantile tobit model fits the data better than the classic tobit model.
Journal: Journal of Applied Statistics
Pages: 4225-4253
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1976120
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1976120
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4225-4253
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1971631_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Derya Ersel
Author-X-Name-First: Derya
Author-X-Name-Last: Ersel
Author-Name: Yasemin Kayhan Atılgan
Author-X-Name-First: Yasemin Kayhan
Author-X-Name-Last: Atılgan
Title: An approach for knowledge acquisition from a survey data by conducting Bayesian network modeling, adopting the robust coplot method
Abstract:
This study proposes a methodological approach for extracting useful knowledge from survey data by performing Bayesian network (BN) modeling, adopting robust coplot analysis results as prior knowledge about association patterns hidden in the data. By addressing the issue of BN construction when expert knowledge is limited or unavailable, the proposed approach facilitates the modeling of large data sets describing numerous observed and latent variables. By answering the question of which nodes and links should be retained or discarded from a BN, we aim to determine a compact model of variables while respecting the desired properties of the data. The steps of the proposed method are illustrated on real data extracted from the Turkey Demographic and Health Survey. First, a BN structure is created based solely on the judgment of the analyst. The coplot results are then employed to update the BN structure, and the model parameters are re-estimated from the updated structure and the data. Loss scores of the BNs are used to verify the success of the updated BN that inherits knowledge from the coplot.
Journal: Journal of Applied Statistics
Pages: 4069-4096
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1971631
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1971631
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4069-4096
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1973386_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: A. Beretta
Author-X-Name-First: A.
Author-X-Name-Last: Beretta
Author-Name: C. Heuchenne
Author-X-Name-First: C.
Author-X-Name-Last: Heuchenne
Author-Name: M. Restaino
Author-X-Name-First: M.
Author-X-Name-Last: Restaino
Title: Competing risks proportional-hazards cure model and generalized extreme value regression: an application to bank failures and acquisitions in the United States
Abstract:
Several commercial banks in the United States disappeared during the last decades due to failure or acquisition by another entity. From a survival analysis perspective, however, the high censoring rate suggests that some institutions are likely to be immune to failure and/or acquisition. In this study, we use a competing risks proportional-hazards cure model in order to measure the impact of bank-specific and macroeconomic variables on the probabilities of being susceptible to these events (i.e. incidence) and on the survival time of susceptible banks (i.e. latency). Moreover, we propose to model the incidence distribution using Generalized Extreme Value regression and compare the results with the ones obtained by the usual logistic regression model. The proposed methodology is evaluated by means of a simulation study and then applied to a dataset of more than 4000 United States commercial banks spanning the period 1993–2018.
Journal: Journal of Applied Statistics
Pages: 4162-4180
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1973386
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1973386
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4162-4180
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1970121_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: David G. Sinclair
Author-X-Name-First: David G.
Author-X-Name-Last: Sinclair
Author-Name: Giles Hooker
Author-X-Name-First: Giles
Author-X-Name-Last: Hooker
Title: An expectation maximization algorithm for high-dimensional model selection for the Ising model with misclassified states*
Abstract:
We propose the misclassified Ising model: a framework for analyzing dependent binary data where the binary state is susceptible to error. We extend previous theoretical results for a model selection method based on applying the LASSO to logistic regression at each node, and show that the method still correctly identifies edges in the underlying graphical model under suitable misclassification settings. With knowledge of the misclassification process, an expectation maximization algorithm is developed that accounts for misclassification during model selection. We illustrate the performance gains of the proposed expectation maximization algorithm with simulated data and with data from a functional magnetic resonance imaging analysis.
Journal: Journal of Applied Statistics
Pages: 4049-4068
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1970121
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1970121
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4049-4068
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1971632_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mahendra Saha
Author-X-Name-First: Mahendra
Author-X-Name-Last: Saha
Author-Name: Sanku Dey
Author-X-Name-First: Sanku
Author-X-Name-Last: Dey
Author-Name: Saralees Nadarajah
Author-X-Name-First: Saralees
Author-X-Name-Last: Nadarajah
Title: Parametric inference of the process capability index for exponentiated exponential distribution
Abstract:
Process capability indices (PCIs) are among the most effective techniques used in industry for determining the quality of products and the performance of manufacturing processes. In this article, we consider the PCI Cpc, which is based on the proportion of conformance and is applicable to normally as well as non-normally distributed processes, whether continuous or discrete. To estimate the PCI Cpc when the process follows the exponentiated exponential distribution, we use five classical methods of estimation. The performances of these classical estimators are compared with respect to their biases and mean squared errors (MSEs) through a simulation study. Confidence intervals for the index Cpc are also constructed using five bootstrap confidence interval (BCI) methods. A Monte Carlo simulation study is carried out to compare the performances of these five BCIs in terms of their average widths and coverage probabilities. In addition, a net sensitivity (NS) analysis for the PCI Cpc is considered. We use two data sets from the electronics and food industries and two failure time data sets to illustrate the performance of the proposed estimation methods and BCIs. Finally, we develop the PCI Cpc using the aforementioned methods for the generalized Rayleigh distribution.
Journal: Journal of Applied Statistics
Pages: 4097-4121
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1971632
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1971632
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4097-4121
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1975661_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Jimmy Reyes
Author-X-Name-First: Jimmy
Author-X-Name-Last: Reyes
Author-Name: Jaime Arrué
Author-X-Name-First: Jaime
Author-X-Name-Last: Arrué
Author-Name: Osvaldo Venegas
Author-X-Name-First: Osvaldo
Author-X-Name-Last: Venegas
Author-Name: Héctor W. Gómez
Author-X-Name-First: Héctor W.
Author-X-Name-Last: Gómez
Title: The modified slash Lindley–Weibull distribution with applications to nutrition data
Abstract:
This work presents an extension of the slash Lindley–Weibull distribution, of which it can be considered a modification. The new family is obtained by using the quotient of two independent random variables: a two-parameter Lindley–Weibull distribution divided by a power of the exponential distribution with parameter equal to 2. We present the pdf and cdf of the new distribution, analyzing their risk functions. Some statistical properties are studied and the moments and coefficients of asymmetry and kurtosis are shown. The parameter estimation problem is carried out by the maximum likelihood method. The method is assessed by a Monte Carlo simulation study. We use nutrition data, which are characterized by high kurtosis, to illustrate the usefulness of the proposed model.
Journal: Journal of Applied Statistics
Pages: 4206-4224
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1975661
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1975661
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4206-4224
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1971633_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Zhiwen Fang
Author-X-Name-First: Zhiwen
Author-X-Name-Last: Fang
Author-Name: Wendong Li
Author-X-Name-First: Wendong
Author-X-Name-Last: Li
Author-Name: Xin Liu
Author-X-Name-First: Xin
Author-X-Name-Last: Liu
Author-Name: Xiaolong Pu
Author-X-Name-First: Xiaolong
Author-X-Name-Last: Pu
Author-Name: Dongdong Xiang
Author-X-Name-First: Dongdong
Author-X-Name-Last: Xiang
Title: Online monitoring of high-dimensional binary data streams with application to extreme weather surveillance
Abstract:
With the rapid development of modern sensor technology, high-dimensional data streams appear frequently nowadays, bringing urgent needs for effective statistical process control (SPC) tools. In such a context, the online monitoring problem of high-dimensional and correlated binary data streams is becoming very important. Conventional SPC methods for monitoring multivariate binary processes may fail when facing high-dimensional applications due to high computational complexity and the lack of efficiency. In this paper, motivated by an application in extreme weather surveillance, we propose a novel pairwise approach that considers the most informative pairwise correlation between any two data streams. The information is then integrated into an exponential weighted moving average (EWMA) charting scheme to monitor abnormal mean changes in high-dimensional binary data streams. Extensive simulation study together with a real-data analysis demonstrates the efficiency and applicability of the proposed control chart.
Journal: Journal of Applied Statistics
Pages: 4122-4136
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1971633
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1971633
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4122-4136
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1977260_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Nowrin Nusrat
Author-X-Name-First: Nowrin
Author-X-Name-Last: Nusrat
Author-Name: M. S. Rahman
Author-X-Name-First: M. S.
Author-X-Name-Last: Rahman
Title: Dealing with separation or near-to-separation in the model for multinomial response with application to childhood health seeking behavior data from a complex survey
Abstract:
Separation or monotone likelihood can arise in the fitting process of a multinomial logistic model using maximum likelihood estimation (MLE) when the sample size is small, one of the outcome categories is rare, and/or one or more influential covariates are present, resulting in infinite or biased estimates of at least one regression coefficient of the model. This study empirically identifies the data conditions that define both 'separation' and 'near-to-separation' (partial separation), explores their consequences for the MLE, and provides a solution by applying a penalized likelihood approach proposed in the literature, which adds a Jeffreys prior-based penalty term to the original likelihood function to remove the first-order bias in the MLEs of the multinomial logit model via an equivalent Poisson regression. Furthermore, the penalized estimation approach (PMLE) is extended to a weighted estimating equation allowing for survey weights when analyzing data from a complex survey. The simulation study suggests that the PMLE outperforms the MLE by providing smaller bias and mean squared error and better coverage. The methods are applied to data on the choice of health facility for treatment of childhood diseases.
Journal: Journal of Applied Statistics
Pages: 4254-4277
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1977260
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1977260
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4254-4277
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1977785_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Jinjuan Wang
Author-X-Name-First: Jinjuan
Author-X-Name-Last: Wang
Author-Name: Yunpeng Zhao
Author-X-Name-First: Yunpeng
Author-X-Name-Last: Zhao
Author-Name: Larry L. Tang
Author-X-Name-First: Larry L.
Author-X-Name-Last: Tang
Author-Name: Claudius Mueller
Author-X-Name-First: Claudius
Author-X-Name-Last: Mueller
Author-Name: Qizhai Li
Author-X-Name-First: Qizhai
Author-X-Name-Last: Li
Title: A resample-replace lasso procedure for combining high-dimensional markers with limit of detection
Abstract:
In disease screening, a biomarker combination developed by combining multiple markers tends to have a higher sensitivity than an individual marker. Parametric methods for marker combination rely on the inverse of covariance matrices, which is often a non-trivial problem for high-dimensional data generated by modern high-throughput technologies. Additionally, another common problem in disease diagnosis is the existence of a limit of detection (LOD) for an instrument – that is, when a biomarker's value falls below the limit, it cannot be observed and is assigned an NA value. To handle these two challenges in combining high-dimensional biomarkers in the presence of an LOD, we propose a resample-replace lasso procedure. We first impute the values below the LOD and then use the graphical lasso method to estimate the means and precision matrices for the high-dimensional biomarkers. The simulation results show that our method outperforms alternative methods such as substituting NA values with LOD values or removing observations that have NA values. A real case analysis of a protein profiling study of glioblastoma patients and their survival status indicates that the biomarker combination obtained through the proposed method is more accurate in distinguishing between two groups.
Journal: Journal of Applied Statistics
Pages: 4278-4293
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1977785
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1977785
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4278-4293
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1986685_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: J. M. Thilini Jayasinghe
Author-X-Name-First: J. M. Thilini
Author-X-Name-Last: Jayasinghe
Author-Name: Leif Ellingson
Author-X-Name-First: Leif
Author-X-Name-Last: Ellingson
Author-Name: Chalani Prematilake
Author-X-Name-First: Chalani
Author-X-Name-Last: Prematilake
Title: Regression models using the LINEX loss to predict lower bounds for the number of points for approximating planar contour shapes
Abstract:
Researchers in statistical shape analysis often analyze outlines of objects. Even though these contours are infinite-dimensional in theory, they must be discretized in practice. When discretizing, it is important to reduce the number of sampling points considerably to reduce computational costs, but not to use so few points that the approximation error becomes too large. Unfortunately, determining the minimum number of points needed to sufficiently approximate the contours is computationally expensive. In this paper, we fit regression models to predict these lower bounds using characteristics of the contours that are computationally cheap as predictor variables. However, least squares regression is inadequate for this task because it treats overestimation and underestimation equally, whereas underestimation of lower bounds is far more serious. Instead, to fit the models, we use the LINEX loss function, which allows us to penalize underestimation at an exponential rate while penalizing overestimation only linearly. We present a novel approach to select the shape parameter of the loss function and tools for analyzing how well the model fits the data. Through validation methods, we show that the LINEX models work well for reducing the underestimation of the lower bounds.
Journal: Journal of Applied Statistics
Pages: 4294-4313
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1986685
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1986685
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4294-4313
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1973385_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: F. Prataviera
Author-X-Name-First: F.
Author-X-Name-Last: Prataviera
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Author-Name: E. M. Hashimoto
Author-X-Name-First: E. M.
Author-X-Name-Last: Hashimoto
Author-Name: V. G. Cancho
Author-X-Name-First: V. G.
Author-X-Name-Last: Cancho
Title: A new regression model for rates and proportions data with applications
Abstract:
We propose a new continuous distribution on the interval $ (0,1) $ based on the generalized odd log-logistic-G family, whose density function can be symmetric, asymmetric, unimodal or bimodal. The new model is implemented using the gamlss package in R. We propose an extended regression based on this distribution which includes some important regressions as sub-models. We employ frequentist and Bayesian analyses to estimate the parameters and adopt the non-parametric and parametric bootstrap methods to obtain better efficiency of the estimators. Some simulations are conducted to verify the empirical distribution of the maximum likelihood estimators. We compare the empirical distribution of the quantile residuals with the standard normal distribution. The extended regression can give more realistic fits than other regressions in the analysis of proportional data.
Journal: Journal of Applied Statistics
Pages: 4137-4161
Issue: 16
Volume: 49
Year: 2022
Month: 12
X-DOI: 10.1080/02664763.2021.1973385
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1973385
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:49:y:2022:i:16:p:4137-4161
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2004581_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Ran Bi
Author-X-Name-First: Ran
Author-X-Name-Last: Bi
Author-Name: Peng Liu
Author-X-Name-First: Peng
Author-X-Name-Last: Liu
Title: A semi-parametric Bayesian approach for detection of gene expression heterosis with RNA-seq data
Abstract:
Heterosis refers to the superior performance of a hybrid offspring over its two inbred parents. Although heterosis has been widely observed in agriculture, its molecular mechanism is not well studied. Recent advances in high-throughput genomic technologies such as RNA sequencing (RNA-seq) facilitate the investigation of heterosis at the gene expression level. However, it is challenging to identify genes exhibiting heterosis using RNA-seq data because a high-dimensional set of hypothesis tests is conducted with limited sample size. Furthermore, detecting heterosis genes requires testing composite null hypotheses involving multiple mean expression levels instead of testing simple null hypotheses as in differential expression analysis. In this manuscript, we formulate a statistical model with parameters directly reflecting heterosis status, and develop a powerful test to detect heterosis genes. We employ a Bayesian framework where the RNA-seq count data are modeled through a Poisson-Gamma mixture with Dirichlet processes as priors for the distributions of the parameters of interest, the fold changes between each parent and the hybrid. Markov chain Monte Carlo sampling with the Gibbs algorithm is utilized to provide posterior inference to detect heterosis genes while controlling the false discovery rate. Simulation results demonstrate that our proposed method outperforms other methods utilized to detect gene expression heterosis.
Journal: Journal of Applied Statistics
Pages: 214-230
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.2004581
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2004581
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:214-230
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2025585_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Caner Taniş
Author-X-Name-First: Caner
Author-X-Name-Last: Taniş
Author-Name: Buğra Saraçoğlu
Author-X-Name-First: Buğra
Author-X-Name-Last: Saraçoğlu
Title: Cubic rank transmuted generalized Gompertz distribution: properties and applications
Abstract:
In this paper, we introduce a new lifetime distribution as an alternative to the generalized Gompertz distribution, the Gompertz distribution and its modified versions. This new distribution is a special case of the family of distributions introduced by Granzotto et al. [D.C.T. Granzotto, F. Louzada and N. Balakrishnan, Cubic rank transmuted distributions: inferential issues and applications, J. Stat. Comput. Simul. 87 (2017), pp. 2760–2778]. We obtain some characteristic properties of the suggested distribution, such as the hazard function, ordinary moments, coefficient of skewness, coefficient of kurtosis, moment generating function, quantile function and median. We discuss three different methods of estimation for the parameters of the proposed distribution. A comprehensive Monte Carlo simulation study is performed in order to compare the performances of the estimators according to mean squared errors and biases. Finally, three real data applications are presented to illustrate the usefulness of the suggested distribution.
Journal: Journal of Applied Statistics
Pages: 195-213
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2022.2025585
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2025585
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:195-213
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1981834_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mustafa Ç. Korkmaz
Author-X-Name-First: Mustafa Ç.
Author-X-Name-Last: Korkmaz
Author-Name: Christophe Chesneau
Author-X-Name-First: Christophe
Author-X-Name-Last: Chesneau
Author-Name: Zehra Sedef Korkmaz
Author-X-Name-First: Zehra Sedef
Author-X-Name-Last: Korkmaz
Title: A new alternative quantile regression model for the bounded response with educational measurements applications of OECD countries
Abstract:
This article introduces a new distribution with two tuning parameters specified on the unit interval. It follows from a ‘hyperbolic secant transformation’ of a random variable following the Weibull distribution. The lack of research on the prospect of hyperbolic transformations providing flexible distributions over the unit interval is a motivation for the study. The main distributional structural properties of the new distribution are established. Different estimation methods are derived for the model parameters, and two simulation studies are performed. Subsequently, we develop a related quantile regression model for further statistical perspectives. We consider two real data applications based on educational measurements of OECD member and some non-member countries. Our regression model aims to relate the desire of certain young students in OECD countries to get top grades to elements of their Education and School Life Index, such as reading performance, work environment at home, and paid work experience. It is shown that the elaborated quantile regression model has better fitting power than well-known regression models when the unit response variable possesses a skewed distribution, and that two independent variables are statistically significant at any standard significance level for the median response.
Journal: Journal of Applied Statistics
Pages: 131-154
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1981834
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1981834
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:131-154
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1981256_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Jianghu (James) Dong
Author-X-Name-First: Jianghu (James)
Author-X-Name-Last: Dong
Author-Name: Haolun Shi
Author-X-Name-First: Haolun
Author-X-Name-Last: Shi
Author-Name: Liangliang Wang
Author-X-Name-First: Liangliang
Author-X-Name-Last: Wang
Author-Name: Ying Zhang
Author-X-Name-First: Ying
Author-X-Name-Last: Zhang
Author-Name: Jiguo Cao
Author-X-Name-First: Jiguo
Author-X-Name-Last: Cao
Title: Jointly modelling multiple transplant outcomes by a competing risk model via functional principal component analysis
Abstract:
In many clinical studies, longitudinal biomarkers are often used to monitor the progression of a disease. For example, in a kidney transplant study, the glomerular filtration rate (GFR) is used as a longitudinal biomarker to monitor the progression of the kidney function and the patient's state of survival, which is characterized by multiple time-to-event outcomes, such as kidney transplant failure and death. It is known that the joint modelling of longitudinal and survival data leads to a more accurate and comprehensive estimation of the covariates' effect. While most joint models use the longitudinal outcome as a covariate for predicting survival, very few models consider the further decomposition of the variation within the longitudinal trajectories and its effect on survival. We develop a joint model that uses functional principal component analysis (FPCA) to extract useful features from the longitudinal trajectories and adopts the competing risk model to handle multiple time-to-event outcomes. The longitudinal trajectories and the multiple time-to-event outcomes are linked via the shared functional features. The application of our model to a real kidney transplant data set reveals the significance of these functional features, and a simulation study is carried out to validate the accuracy of the estimation method.
Journal: Journal of Applied Statistics
Pages: 43-59
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1981256
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1981256
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:43-59
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1981257_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Honggang Zhang
Author-X-Name-First: Honggang
Author-X-Name-Last: Zhang
Author-Name: Jingyong Su
Author-X-Name-First: Jingyong
Author-X-Name-Last: Su
Author-Name: Linlin Tang
Author-X-Name-First: Linlin
Author-X-Name-Last: Tang
Author-Name: Anuj Srivastava
Author-X-Name-First: Anuj
Author-X-Name-Last: Srivastava
Title: Elastic statistical analysis of interval-valued time series
Abstract:
We investigate the problem of statistical analysis of interval-valued time series data – two nonintersecting real-valued functions, representing lower and upper limits, over a period of time. Specifically, we pay attention to the two concepts of phase (or horizontal) variability and amplitude (or vertical) variability, and propose a phase-amplitude separation method. We view interval-valued time series as elements of a function (Hilbert) space and impose a Riemannian structure on it. We separate phase and amplitude variability in observed interval functions using a metric-based alignment solution. The key idea is to map an interval to a point in $ {\mathbb {R}}^2 $, view interval-valued time series as parameterized curves in $ {\mathbb {R}}^2 $, and borrow ideas from elastic shape analysis of planar curves, including PCA, to perform registration, summarization, analysis, and modeling of multiple series. The proposed phase-amplitude separation provides a new way of performing PCA and modeling for interval-valued time series, and enables shape clustering of such series. We apply this framework to three different applications in finance, meteorology and physiology, demonstrating the effectiveness of the proposed methods and discovering some underlying patterns in the data. Experimental results on simulated data show that our method also applies to point-valued time series.
Journal: Journal of Applied Statistics
Pages: 60-85
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1981257
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1981257
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:60-85
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1981832_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Shubham Agnihotri
Author-X-Name-First: Shubham
Author-X-Name-Last: Agnihotri
Author-Name: Sanjay Kumar Singh
Author-X-Name-First: Sanjay
Author-X-Name-Last: Kumar Singh
Author-Name: Umesh Singh
Author-X-Name-First: Umesh
Author-X-Name-Last: Singh
Title: Inferences for multiple interval type-I censoring scheme
Abstract:
In this paper, we introduce a new type of censoring scheme, named the multiple interval type-I censoring scheme. Further, we assume that the test units are drawn from a Weibull population. We also propose maximum product of spacing estimators for the unknown parameters under the multiple interval type-I censoring scheme and compare them with the existing maximum likelihood estimators. In addition, Bayes estimators for the shape and scale parameters are obtained under the squared error loss function. Their corresponding asymptotic confidence/credible intervals are also discussed. A real data set containing the breakdown times of insulating fluids is used to demonstrate the appropriateness of the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 86-105
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1981832
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1981832
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:86-105
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1978955_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Vera Liu
Author-X-Name-First: Vera
Author-X-Name-Last: Liu
Author-Name: Stephen Walker
Author-X-Name-First: Stephen
Author-X-Name-Last: Walker
Title: Testing for genetic mutation of seasonal influenza virus
Abstract:
Influenza virus strains undergo genetic mutations every year and these changes in genetic makeup pose difficulties for effective vaccine selection. To better understand the problem it is important to statistically quantify the amount of genetic change between circulating strains from different years. In this paper, we propose the nonparametric crossmatch test applied to phylogenetic trees to assess the level of discrepancy between circulating flu virus strains from two different years, the viruses being represented by phylogenetic trees. The crossmatch test has advantages over parametric tests in that it preserves more information in the data. The outcome of the test indicates whether the circulating influenza virus has mutated sufficiently in the past year to be considered a new population of virus, suggesting the need to consider a new vaccine. We validate the test on simulated phylogenetic tree samples with varying branch lengths, as well as on publicly available virus sequence data from the ‘Global Initiative on Sharing All Influenza Data’ (GISAID: https://www.gisaid.org/).
Journal: Journal of Applied Statistics
Pages: 1-18
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1978955
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1978955
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:1-18
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1981833_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: J. Bröcker
Author-X-Name-First: J.
Author-X-Name-Last: Bröcker
Title: Testing the reliability of forecasting systems
Abstract:
The problem of statistically evaluating forecasting systems is revisited. The forecaster claims the forecasts to exhibit a certain nominal statistical behaviour; for instance, the forecasts provide the expected value (or certain quantiles) of the verification, conditional on the information available at forecast time. Forecasting systems that indeed exhibit the nominal behaviour are referred to as reliable. Statistical tests for reliability are presented (based on an archive of verification–forecast pairs). As noted previously, devising such tests is encumbered by the fact that the dependence structure of the verification–forecast pairs is not known in general. Ignoring this dependence, though, might lead to incorrect tests and too-frequent rejection of forecasting systems that are actually reliable. On the other hand, reliability typically implies that the forecast provides information about the dependence structure, and using this in conjunction with judicious choices of the test statistic, rigorous results on the asymptotic distribution of the test statistic are obtained. These results are used to test for reliability under minimal additional assumptions on the statistical properties of the verification–forecast pairs. Applications to environmental forecasts are discussed. A Python implementation of the discussed methods is available online.
Journal: Journal of Applied Statistics
Pages: 106-130
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1981833
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1981833
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:106-130
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1982879_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Theodoros Perdikis
Author-X-Name-First: Theodoros
Author-X-Name-Last: Perdikis
Author-Name: Stelios Psarakis
Author-X-Name-First: Stelios
Author-X-Name-Last: Psarakis
Author-Name: Philippe Castagliola
Author-X-Name-First: Philippe
Author-X-Name-Last: Castagliola
Author-Name: Athanasios C. Rakitzis
Author-X-Name-First: Athanasios C.
Author-X-Name-Last: Rakitzis
Author-Name: Petros E. Maravelakis
Author-X-Name-First: Petros E.
Author-X-Name-Last: Maravelakis
Title: The EWMA sign chart revisited: performance and alternatives without and with ties
Abstract:
The EWMA Sign control chart is an efficient tool for monitoring shifts in a process regardless of the observations' underlying distribution. Recent studies have shown that, for nonparametric control charts, due to the discrete nature of the statistics being used (such as the Sign statistic), it is impossible to accurately compute their Run Length properties using Markov chain or integral equation methods. In this work, a modified nonparametric Phase II EWMA chart based on the Sign statistic is proposed and its exact Run Length properties are discussed. A continuous transformation of the Sign statistic, combined with the classical Markov chain method, is used to determine the chart's in- and out-of-control Run Length properties. Additionally, we show that when ties occur due to measurement rounding-off errors, the EWMA Sign control chart is no longer distribution-free, and a Bernoulli trial approach is discussed to handle the occurrence of ties and make the proposed chart almost distribution-free. Finally, an illustrative example is provided to show the practical implementation of the proposed chart.
Journal: Journal of Applied Statistics
Pages: 170-194
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1982879
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1982879
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:170-194
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1982877_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Chun-Chao Wang
Author-X-Name-First: Chun-Chao
Author-X-Name-Last: Wang
Author-Name: Yi-Ting Hwang
Author-X-Name-First: Yi-Ting
Author-X-Name-Last: Hwang
Author-Name: Chung-Chuan Chou
Author-X-Name-First: Chung-Chuan
Author-X-Name-Last: Chou
Author-Name: Hui-Ling Lee
Author-X-Name-First: Hui-Ling
Author-X-Name-Last: Lee
Title: Misspecification of a binary dependent variable in the logistic model controlling for the repeated longitudinal measures
Abstract:
Many medical applications are concerned with determining disease status, which can be related to multiple serial measurements. Nevertheless, owing to various reasons, the binary outcome can be measured incorrectly, and estimators derived from the misspecified outcome can be biased. This paper derives the complete-data likelihood function to incorporate both the multiple serial measurements and the misspecified outcome. Owing to the latent variables, the EM algorithm is used to derive the maximum-likelihood estimators. Monte Carlo simulations are conducted to assess the impact of misspecification on the estimates. Retrospective data on the recurrence of atrial fibrillation are used to illustrate the usage of the proposed model.
Journal: Journal of Applied Statistics
Pages: 155-169
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1982877
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1982877
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:155-169
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1980506_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Kashinath Chatterjee
Author-X-Name-First: Kashinath
Author-X-Name-Last: Chatterjee
Author-Name: Christos Koukouvinos
Author-X-Name-First: Christos
Author-X-Name-Last: Koukouvinos
Author-Name: Angeliki Lappa
Author-X-Name-First: Angeliki
Author-X-Name-Last: Lappa
Title: Monitoring process mean and dispersion with one double generally weighted moving average control chart
Abstract:
Control charts are widely known quality tools used to detect and control industrial process deviations in Statistical Process Control. In the current paper, we propose a new single memory-type control chart, called the maximum double generally weighted moving average chart (referred to as Max-DGWMA), that simultaneously detects shifts in the process mean and/or process dispersion. The run length performance of the proposed Max-DGWMA chart is compared with that of the Max-EWMA, Max-DEWMA, Max-GWMA and SS-DGWMA charts, using time-varying control limits, through Monte Carlo simulations. The comparisons reveal that the proposed chart is more efficient than the Max-EWMA, Max-DEWMA and Max-GWMA charts, while it is comparable with the SS-DGWMA chart. An automotive industry application is presented in order to implement the Max-DGWMA chart. The goal is to establish statistical control of the manufacturing process of automobile engine piston rings. The source of the out-of-control signals is interpreted, and the efficiency of the proposed chart in detecting shifts faster is evident.
Journal: Journal of Applied Statistics
Pages: 19-42
Issue: 1
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1980506
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1980506
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:1:p:19-42
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1990225_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xinyi Wang
Author-X-Name-First: Xinyi
Author-X-Name-Last: Wang
Author-Name: Zhenghui Feng
Author-X-Name-First: Zhenghui
Author-X-Name-Last: Feng
Title: Component selection for exponential power mixture models
Abstract:
The Exponential Power (EP) family is a flexible distribution family that includes the Gaussian family as a sub-family. In this article, we study component selection and estimation for EP mixture models and regressions. The assumption of zero component means in [X. Cao, Q. Zhao, D. Meng, Y. Chen, and Z. Xu, Robust low-rank matrix factorization under general mixture noise distributions, IEEE Trans. Image Process. 25 (2016), pp. 4677–4690] is relaxed. To select components and estimate parameters simultaneously, we propose a penalized likelihood method, which can shrink mixing proportions to zero to achieve component selection. Modified EM algorithms are proposed, and the consistency of the estimated component number is obtained. Simulation studies show the advantages of the proposed methods in the accuracy of component number selection, parameter estimation, and density estimation. Analyses of the value at risk of SHIBOR and a climate change data set are given as illustrations.
Journal: Journal of Applied Statistics
Pages: 291-314
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1990225
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1990225
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:291-314
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2096875_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Yasin Altinisik
Author-X-Name-First: Yasin
Author-X-Name-Last: Altinisik
Title: Addressing overdispersion and zero-inflation for clustered count data via new multilevel heterogenous hurdle models
Abstract:
Unobserved heterogeneity causing overdispersion and an excessive number of zeros take a prominent place in methodological development on count modeling. An insight into the mechanisms that induce heterogeneity is required for a better understanding of the phenomenon of overdispersion. When the heterogeneity is sourced by the stochastic component of the model, the use of a heterogeneous Poisson distribution for this part emerges as an elegant solution. The hierarchical design of the study is also responsible for heterogeneity, as unobservable effects at various levels also contribute to the overdispersion. Zero-inflation, heterogeneity and the multilevel nature of count data present special challenges in their own respect, but the presence of all three in one study adds further challenges to the modeling strategies. This study is therefore designed to merge the attractive features of the separate strands of solutions in order to face such a comprehensive challenge. It differs from previous attempts in its choice of two recently developed heterogeneous distributions, namely Poisson–Lindley (PL) and Poisson–Ailamujia (PA), for the truncated part. Using generalized linear mixed modeling settings, the predictive performances of the multilevel PL and PA models and their hurdle counterparts were assessed within a comprehensive simulation study in terms of bias, precision and accuracy measures. The multilevel models were applied to two separate real-world examples to assess the practical implications of the new models proposed in this study.
Journal: Journal of Applied Statistics
Pages: 408-433
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2022.2096875
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2096875
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:408-433
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1985091_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Edward L. Boone
Author-X-Name-First: Edward L.
Author-X-Name-Last: Boone
Author-Name: Abdel-Salam G. Abdel-Salam
Author-X-Name-First: Abdel-Salam G.
Author-X-Name-Last: Abdel-Salam
Author-Name: Indranil Sahoo
Author-X-Name-First: Indranil
Author-X-Name-Last: Sahoo
Author-Name: Ryad Ghanam
Author-X-Name-First: Ryad
Author-X-Name-Last: Ghanam
Author-Name: Xi Chen
Author-X-Name-First: Xi
Author-X-Name-Last: Chen
Author-Name: Aiman Hanif
Author-X-Name-First: Aiman
Author-X-Name-Last: Hanif
Title: Monitoring SEIRD model parameters using MEWMA for the COVID-19 pandemic with application to the state of Qatar
Abstract:
During the COVID-19 pandemic, decision-makers are tasked with implementing and evaluating strategies for both treatment and disease prevention. In order to make effective decisions, they need to simultaneously monitor various attributes of the pandemic, such as the transmission rate and infection rate for disease prevention, the recovery rate, which indicates treatment effectiveness, as well as the mortality rate and others. This work presents a technique for monitoring the pandemic by employing a Susceptible, Exposed, Infected, Recovered, Death (SEIRD) model regularly estimated by an augmented particle Markov chain Monte Carlo scheme, in which the posterior distribution samples are monitored via Multivariate Exponentially Weighted Moving Average (MEWMA) process monitoring. The approach is illustrated on COVID-19 data for the State of Qatar.
Journal: Journal of Applied Statistics
Pages: 231-246
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1985091
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1985091
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:231-246
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1992360_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Joris Mulder
Author-X-Name-First: Joris
Author-X-Name-Last: Mulder
Author-Name: John P. T. M. Gelissen
Author-X-Name-First: John P. T. M.
Author-X-Name-Last: Gelissen
Title: Bayes factor testing of equality and order constraints on measures of association in social research
Abstract:
Measures of association play a central role in the social sciences for quantifying the strength of a linear relationship between variables of interest. In many applications, researchers can translate scientific expectations into hypotheses with equality and/or order constraints on these measures of association. In this paper, a Bayes factor test is proposed for testing multiple hypotheses with constraints on the measures of association between ordinal and/or continuous variables, possibly after correcting for certain covariates. This test can be used to obtain a direct answer to the research question of how much evidence there is in the data for a social science theory relative to competing theories. The stand-alone software package 'BCT' allows users to apply the methodology in an easy manner. The methodology will also be available in the R package 'BFpack'. An empirical application from leisure studies about the associations between life, leisure and relationship satisfaction, and an application about differences in egalitarian justice beliefs across countries, are used to illustrate the methodology.
Journal: Journal of Applied Statistics
Pages: 315-351
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1992360
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1992360
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:315-351
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1990224_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Takeshi Emura
Author-X-Name-First: Takeshi
Author-X-Name-Last: Emura
Author-Name: Wei-Chern Hsu
Author-X-Name-First: Wei-Chern
Author-X-Name-Last: Hsu
Author-Name: Wen-Chi Chou
Author-X-Name-First: Wen-Chi
Author-X-Name-Last: Chou
Title: A survival tree based on stabilized score tests for high-dimensional covariates
Abstract:
A survival tree can classify subjects into different survival prognostic groups. However, when data contain high-dimensional covariates, the two popular classification trees exhibit fatal drawbacks: the logrank tree is unstable and tends to have false nodes, while for the conditional inference tree the adjusted P-value is difficult to interpret for high-dimensional tests. Motivated by these problems, we propose a new survival tree based on stabilized score tests. We propose a novel matrix-based algorithm in order to test a number of nodes simultaneously via stabilized score tests. We propose a recursive partitioning algorithm to construct a survival tree and develop our original R package uni.survival.tree (https://cran.r-project.org/package=uni.survival.tree) for implementation. Simulations are performed to demonstrate the superiority of the proposed method over existing methods. A lung cancer data analysis demonstrates the usefulness of the proposed method.
Journal: Journal of Applied Statistics
Pages: 264-290
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1990224
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1990224
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:264-290
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1994529_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Zihang Lu
Author-X-Name-First: Zihang
Author-X-Name-Last: Lu
Author-Name: Wendy Lou
Author-X-Name-First: Wendy
Author-X-Name-Last: Lou
Title: Bayesian approaches to variable selection in mixture models with application to disease clustering
Abstract:
In biomedical research, cluster analysis is often performed to identify patient subgroups based on patients' characteristics or traits. In model-based clustering for identifying patient subgroups, mixture models have played a fundamental role. While there is increasing interest in using mixture modeling for identifying patient subgroups, little work has been done on selecting the predictors that are associated with the class assignment. In this study, we develop and compare two approaches to perform variable selection in the context of a mixture model to identify important predictors that are associated with the class assignment: a one-step approach and a stepwise approach. The former refers to an approach in which clustering and variable selection are performed simultaneously in one overall model, whereas the latter refers to an approach in which clustering and variable selection are performed in two sequential steps. We consider both shrinkage and spike-and-slab priors for selecting important variables. Markov chain Monte Carlo algorithms are developed to estimate the posterior distribution of the model parameters. Practical applications and simulation studies are carried out to evaluate the clustering and variable selection performance of the proposed models.
Journal: Journal of Applied Statistics
Pages: 387-407
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1994529
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1994529
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:387-407
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1993798_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Zezhun Chen
Author-X-Name-First: Zezhun
Author-X-Name-Last: Chen
Author-Name: Angelos Dassios
Author-X-Name-First: Angelos
Author-X-Name-Last: Dassios
Author-Name: George Tzougas
Author-X-Name-First: George
Author-X-Name-Last: Tzougas
Title: A first-order binomial-mixed Poisson integer-valued autoregressive model with serially dependent innovations
Abstract:
Motivated by the extended Poisson INAR(1), which allows innovations to be serially dependent, we develop a new family of binomial-mixed Poisson INAR(1) (BMP INAR(1)) processes by adding a mixed Poisson component to the innovations of the classical Poisson INAR(1) process. Due to the flexibility of the mixed Poisson component, the model includes a large class of INAR(1) processes with different transition probabilities. Moreover, it can capture some overdispersion features coming from the data while keeping the innovations serially dependent. We discuss its statistical properties, stationarity conditions and transition probabilities for different mixing densities (Exponential, Lindley). Then, we derive the maximum likelihood estimation method and its asymptotic properties for this model. Finally, we demonstrate our approach using a real data example of iceberg count data from a financial system.
Journal: Journal of Applied Statistics
Pages: 352-369
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1993798
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1993798
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:352-369
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2008328_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Robert A. Bettinger
Author-X-Name-First: Robert A.
Author-X-Name-Last: Bettinger
Title: Bimodality in reentry latitude predictions for spacecraft in prograde orbits
Abstract:
Probability distribution functions (PDFs) of atmospheric reentry latitude predictions are shown to be bimodal for spacecraft in low-eccentricity, prograde low Earth orbits at altitudes of 300 km and lower. Using two-line element (TLE) data for initial orbit conditions, coupled with coarse estimates for spacecraft aerodynamic characteristics, parametric simulations produce bimodal distributions that suggest a greater likelihood of reentry near the latitudinal maxima of a given spacecraft's ground track. Various computational measures are used to test for and quantify bimodality in the reentry latitude data sets. Also, a method for approximating bandwidth is introduced for the kernel estimation of reentry latitude probability density. Overall, statistical analysis indicates that actual reentry latitudes are generally within 1-σ of observed hemisphere means as demonstrated by six historical reentry cases.
Journal: Journal of Applied Statistics
Pages: 434-450
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.2008328
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2008328
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:434-450
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1987400_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Hao Ding
Author-X-Name-First: Hao
Author-X-Name-Last: Ding
Author-Name: Yan Zhang
Author-X-Name-First: Yan
Author-X-Name-Last: Zhang
Author-Name: Yuehua Wu
Author-X-Name-First: Yuehua
Author-X-Name-Last: Wu
Title: A novel group VIF regression for group variable selection with application to multiple change-point detection
Abstract:
In this paper, we propose a novel group variance inflation factor (VIF) regression model for tackling large data sets in which the data follow a grouped structure. Unlike classical penalized methods, this approach can perform group variable selection in a sparse model. We further adapt the proposed method, combined with a two-stage procedure, for detecting multiple change-points in linear models. We carry out extensive simulation studies to show that the proposed group variable selection and change-point detection methods are stable and efficient. Finally, we provide two real data examples, a body fat data set and an air pollution data set, to illustrate the performance of our algorithms in group selection and change-point detection.
Journal: Journal of Applied Statistics
Pages: 247-263
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1987400
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1987400
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:247-263
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1993799_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Tao Wang
Author-X-Name-First: Tao
Author-X-Name-Last: Wang
Author-Name: Xiaona Yang
Author-X-Name-First: Xiaona
Author-X-Name-Last: Yang
Author-Name: Yunfei Guo
Author-X-Name-First: Yunfei
Author-X-Name-Last: Guo
Author-Name: Zhonghua Li
Author-X-Name-First: Zhonghua
Author-X-Name-Last: Li
Title: Identification of outlying observations for large-dimensional data
Abstract:
This work proposes a two-stage procedure for identifying outlying observations in a large-dimensional data set. In the first stage, an outlier identification measure is defined using a max-normal statistic, and a clean subset that contains non-outliers is obtained. The identification of outliers can be deemed a multiple hypothesis testing problem; thus, in the second stage, we explore the asymptotic distribution of the proposed measure and obtain a threshold for the outlying observations. Furthermore, in order to improve the identification power and better control the misjudgment rate, a one-step refined algorithm is proposed. Simulation results and two real data analysis examples show that, compared with other methods, the proposed procedure has great advantages in identifying outliers in various data situations.
Journal: Journal of Applied Statistics
Pages: 370-386
Issue: 2
Volume: 50
Year: 2023
Month: 01
X-DOI: 10.1080/02664763.2021.1993799
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1993799
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:2:p:370-386
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2068514_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Sixia Chen
Author-X-Name-First: Sixia
Author-X-Name-Last: Chen
Author-Name: Chao Xu
Author-X-Name-First: Chao
Author-X-Name-Last: Xu
Title: Handling high-dimensional data with missing values by modern machine learning techniques
Abstract:
High-dimensional data have been regarded as one of the most important types of big data in practice, arising frequently in areas such as genetic studies, financial studies, and geographical studies. Missing data in high-dimensional data analysis should be handled properly to reduce nonresponse bias. We discuss some modern machine learning techniques, including penalized regression approaches, tree-based approaches, and deep learning (DL), for handling missing data with high dimensionality. Specifically, our proposed methods can be used for estimating general parameters of interest, including population means and percentiles, with imputation-based estimators, propensity score estimators, and doubly robust estimators. We compare those methods through limited simulation studies and a real application. Both the simulation studies and the real application show the benefits of the DL and XGBoost approaches compared with other methods in terms of balancing bias and variance.
Journal: Journal of Applied Statistics
Pages: 786-804
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2022.2068514
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2068514
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:786-804
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1937581_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Wei Zhang
Author-X-Name-First: Wei
Author-X-Name-Last: Zhang
Author-Name: Colin O. Wu
Author-X-Name-First: Colin O.
Author-X-Name-Last: Wu
Author-Name: Xiaoyang Ma
Author-X-Name-First: Xiaoyang
Author-X-Name-Last: Ma
Author-Name: Xin Tian
Author-X-Name-First: Xin
Author-X-Name-Last: Tian
Author-Name: Qizhai Li
Author-X-Name-First: Qizhai
Author-X-Name-Last: Li
Title: Analysis of multivariate longitudinal data using dynamic lasso-regularized copula models with application to large pediatric cardiovascular studies
Abstract:
The National Heart, Lung and Blood Institute Growth and Health Study (NGHS) is a large longitudinal study of childhood health. A main objective of the study is to estimate the joint distributions of cardiovascular risk outcomes at any two time points conditioning on a large number of covariates. Existing multivariate longitudinal methods are not suitable for outcomes at multiple time points. We present a dynamic copula approach for estimating an outcome's joint distributions at two time points given a large number of time-varying covariates. Our models depend on the outcome's time-varying distributions at one time point, the bivariate copula densities and the functional copula parameters. We develop a three-step procedure for variable selection and estimation, which selects the influential covariates using a machine learning procedure based on spline Lasso-regularized least squares, computes the outcome's single-time distribution using splines, and estimates the functional copula parameter of the dynamic copula models. Pointwise confidence intervals are constructed through the resampling-subject bootstrap. We apply our procedure to the NGHS cardiovascular risk data and illustrate the clinical interpretations of the conditional distributions of a set of risk outcomes. We demonstrate the statistical properties of the dynamic models and estimation procedure through a simulation study.
Journal: Journal of Applied Statistics
Pages: 631-658
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1937581
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1937581
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:631-658
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1834516_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Liang Zhang
Author-X-Name-First: Liang
Author-X-Name-Last: Zhang
Author-Name: Tianming Zhu
Author-X-Name-First: Tianming
Author-X-Name-Last: Zhu
Author-Name: Jin-Ting Zhang
Author-X-Name-First: Jin-Ting
Author-X-Name-Last: Zhang
Title: Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference scale-invariant test
Abstract:
For high-dimensional two-sample Behrens–Fisher problems, several non-scale-invariant and scale-invariant tests have been proposed. Most of them impose strong assumptions on the underlying group covariance matrices so that their test statistics are asymptotically normal. However, in practice, these assumptions may not be satisfied or hardly be checked so that these tests may not be able to maintain the nominal size well in practice. To overcome this difficulty, in this paper, a normal reference scale-invariant test is proposed and studied. It works well by neither imposing strong assumptions on the underlying group covariance matrices nor assuming their equality. It is shown that under some regularity conditions and the null hypothesis, the proposed test and a chi-square-type mixture have the same normal and non-normal limiting distributions. It is then justifiable to approximate the null distribution of the proposed test using that of the chi-square-type mixture. The distribution of the chi-square type mixture can be well approximated by the Welch–Satterthwaite chi-square-approximation with the approximation parameter consistently estimated from the data. The asymptotic power of the proposed test is established. Numerical results demonstrate that the proposed test has much better size control and power than several well-known non-scale-invariant and scale-invariant tests.
Journal: Journal of Applied Statistics
Pages: 456-476
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2020.1834516
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1834516
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:456-476
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1975662_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: X. Zhi
Author-X-Name-First: X.
Author-X-Name-Last: Zhi
Author-Name: J. Liu
Author-X-Name-First: J.
Author-X-Name-Last: Liu
Author-Name: S. Wu
Author-X-Name-First: S.
Author-X-Name-Last: Wu
Author-Name: C. Niu
Author-X-Name-First: C.
Author-X-Name-Last: Niu
Title: A generalized l2,p-norm regression based feature selection algorithm
Abstract:
Feature selection is an important data dimension reduction method, and it has been used widely in applications involving high-dimensional data such as genetic data analysis and image processing. In order to achieve robust feature selection, recent works apply the l2,1- or l2,p-norm of a matrix to the loss function and regularization terms in regression, and have achieved encouraging results. However, these existing works rigidly set the matrix norms used in the loss function and the regularization terms to the same l2,1- or l2,p-norm, which limits their applications. In addition, the algorithms they present for obtaining solutions either have high computational complexity and are not suitable for large data sets, or cannot provide satisfying performance due to approximate calculation. To address these problems, we present a generalized l2,p-norm regression based feature selection (l2,p-RFS) method based on a new optimization criterion. The criterion extends the optimization criterion of l2,p-RFS to the case where the loss function and the regularization terms in regression use different matrix norms. We cast the new optimization criterion in a regression framework without regularization. In this framework, the new optimization criterion can be solved using an iterative re-weighted least squares (IRLS) procedure, in which the least squares problem can be solved efficiently by the least squares QR decomposition (LSQR) algorithm. We have conducted extensive experiments to evaluate the proposed algorithm on various well-known gene expression and image data sets, and compare it with other related feature selection methods.
Journal: Journal of Applied Statistics
Pages: 703-723
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1975662
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1975662
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:703-723
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1937583_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xiaobin Zhi
Author-X-Name-First: Xiaobin
Author-X-Name-Last: Zhi
Author-Name: Tongjun Yu
Author-X-Name-First: Tongjun
Author-X-Name-Last: Yu
Author-Name: Longtao Bi
Author-X-Name-First: Longtao
Author-X-Name-Last: Bi
Author-Name: Yalan Li
Author-X-Name-First: Yalan
Author-X-Name-Last: Li
Title: Noise-insensitive discriminative subspace fuzzy clustering
Abstract:
Discriminative subspace clustering (DSC) can make full use of linear discriminant analysis (LDA) to reduce the dimension of data, achieving effective clustering of high-dimensional data by clustering low-dimensional data in the discriminant subspace. However, most existing DSC algorithms do not consider the noise and outliers that may be contained in data sets, and when applied to data sets with noise or outliers, they often perform poorly due to their influence. In this paper, we address the sensitivity of DSC to noise and outliers. By replacing the Euclidean distance in the objective function of LDA with an exponential non-Euclidean distance, we first develop a noise-insensitive LDA (NILDA) algorithm. Then, combining the proposed NILDA with a noise-insensitive fuzzy clustering algorithm, AFKM, we propose a noise-insensitive discriminative subspace fuzzy clustering (NIDSFC) algorithm. Experiments on some benchmark data sets show the effectiveness of the proposed NIDSFC algorithm.
Journal: Journal of Applied Statistics
Pages: 659-674
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1937583
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1937583
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:659-674
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1936468_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mingyue Zhang Wu
Author-X-Name-First: Mingyue
Author-X-Name-Last: Zhang Wu
Author-Name: Jinzhu Luo
Author-X-Name-First: Jinzhu
Author-X-Name-Last: Luo
Author-Name: Xing Fang
Author-X-Name-First: Xing
Author-X-Name-Last: Fang
Author-Name: Maochao Xu
Author-X-Name-First: Maochao
Author-X-Name-Last: Xu
Author-Name: Peng Zhao
Author-X-Name-First: Peng
Author-X-Name-Last: Zhao
Title: Modeling multivariate cyber risks: deep learning dating extreme value theory
Abstract:
Modeling cyber risks has been an important but challenging task in the domain of cyber security, mainly because of the high dimensionality and heavy tails of risk patterns. These obstacles have hindered the development of statistical modeling of multivariate cyber risks. In this work, we propose a novel approach for modeling multivariate cyber risks that relies on deep learning and extreme value theory. The proposed model not only enjoys highly accurate point predictions via deep learning but can also provide satisfactory high-quantile predictions via extreme value theory. Both simulation and empirical studies show that the proposed approach models multivariate cyber risks very well and provides satisfactory prediction performance.
Journal: Journal of Applied Statistics
Pages: 610-630
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1936468
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1936468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:610-630
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1999398_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mi Zhou
Author-X-Name-First: Mi
Author-X-Name-Last: Zhou
Author-Name: Weixin Yao
Author-X-Name-First: Weixin
Author-X-Name-Last: Yao
Title: Sensitivity analysis of unmeasured confounding in causal inference based on exponential tilting and super learner
Abstract:
Causal inference under the potential outcome framework relies on the strongly ignorable treatment assumption. This assumption is usually questionable in observational studies, and unmeasured confounding is one of the fundamental challenges in causal inference. To this end, we propose a new sensitivity analysis method to evaluate the impact of an unmeasured confounder by leveraging ideas from doubly robust estimators, the exponential tilt method, and the super learner algorithm. Compared to other existing sensitivity analysis methods that parameterize the unmeasured confounder as a latent variable in the working models, the exponential tilting method does not impose any restrictions on the structure or models of the unmeasured confounders. In addition, in order to reduce the modeling bias of traditional parametric methods, we propose incorporating the super learner machine learning algorithm to perform nonparametric model estimation and the corresponding sensitivity analysis. Furthermore, most existing sensitivity analysis methods require multivariate sensitivity parameters, which makes their choice difficult and subjective in practice. In comparison, the new method has a univariate sensitivity parameter with a nice and simple interpretation as a log-odds ratio for binary outcomes, which makes its choice and the application of the new sensitivity analysis method very easy for practitioners.
Journal: Journal of Applied Statistics
Pages: 744-760
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1999398
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1999398
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:744-760
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1904847_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mingtao Zhao
Author-X-Name-First: Mingtao
Author-X-Name-Last: Zhao
Author-Name: Xiaoli Xu
Author-X-Name-First: Xiaoli
Author-X-Name-Last: Xu
Author-Name: Yanling Zhu
Author-X-Name-First: Yanling
Author-X-Name-Last: Zhu
Author-Name: Kongsheng Zhang
Author-X-Name-First: Kongsheng
Author-X-Name-Last: Zhang
Author-Name: Yan Zhou
Author-X-Name-First: Yan
Author-X-Name-Last: Zhou
Title: Model estimation and selection for partial linear varying coefficient EV models with longitudinal data
Abstract:
In this paper, we consider estimation and model selection for longitudinal partial linear varying coefficient errors-in-variables (EV) models when the covariates are measured with additive errors. A bias-corrected penalized quadratic inference functions method is proposed, based on quadratic inference functions with two penalty terms. The proposed method can not only handle the measurement errors of covariates and within-subject correlations but also estimate and select significant non-zero parametric and nonparametric components simultaneously. Under some regularity conditions, the resulting estimators of the parameters are asymptotically normal and the estimators of the nonparametric varying coefficients achieve the optimal convergence rate. Furthermore, we present simulation studies and a real example analysis to evaluate the finite sample performance of the proposed method.
Journal: Journal of Applied Statistics
Pages: 512-534
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1904847
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1904847
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:512-534
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1884847_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mingao Yuan
Author-X-Name-First: Mingao
Author-X-Name-Last: Yuan
Author-Name: Qian Wen
Author-X-Name-First: Qian
Author-X-Name-Last: Wen
Title: A practical two-sample test for weighted random graphs
Abstract:
Network (graph) data analysis is a popular research topic in statistics and machine learning. In applications, one is frequently confronted with graph two-sample hypothesis testing, where the goal is to test for a difference between two graph populations. Several statistical tests have been devised for this purpose in the context of binary graphs. However, many practical networks are weighted, and existing procedures cannot be directly applied to weighted graphs. In this paper, we study the weighted-graph two-sample hypothesis testing problem and propose a practical test statistic. We prove that the proposed test statistic converges in distribution to the standard normal distribution under the null hypothesis and analyze its power theoretically. A simulation study shows that the proposed test performs satisfactorily and substantially outperforms the existing counterpart in the binary graph case. A real-data application is provided to illustrate the method.
Journal: Journal of Applied Statistics
Pages: 495-511
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1884847
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1884847
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:495-511
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2162471_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Yichuan Zhao
Author-X-Name-First: Yichuan
Author-X-Name-Last: Zhao
Author-Name: Chi-Hua Chen
Author-X-Name-First: Chi-Hua
Author-X-Name-Last: Chen
Author-Name: Feng Feng
Author-X-Name-First: Feng
Author-X-Name-Last: Feng
Author-Name: Dragan Pamucar
Author-X-Name-First: Dragan
Author-X-Name-Last: Pamucar
Title: Editorial to the special issue: Statistical Approaches for Big Data and Machine Learning
Journal: Journal of Applied Statistics
Pages: 451-455
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2023.2162471
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2162471
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:451-455
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1947996_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Ilsuk Kang
Author-X-Name-First: Ilsuk
Author-X-Name-Last: Kang
Author-Name: Cheolwoo Park
Author-X-Name-First: Cheolwoo
Author-X-Name-Last: Park
Author-Name: Young Joo Yoon
Author-X-Name-First: Young Joo
Author-X-Name-Last: Yoon
Author-Name: Changyi Park
Author-X-Name-First: Changyi
Author-X-Name-Last: Park
Author-Name: Soon-Sun Kwon
Author-X-Name-First: Soon-Sun
Author-X-Name-Last: Kwon
Author-Name: Hosik Choi
Author-X-Name-First: Hosik
Author-X-Name-Last: Choi
Title: Classification of histogram-valued data with support histogram machines
Abstract:
The current large amounts of data and advanced technologies have produced new types of complex data, such as histogram-valued data. The paper focuses on classification problems when predictors are observed as or aggregated into histograms. Because conventional classification methods take vectors as input, a natural approach converts histograms into vector-valued data using summary values, such as the mean or median. However, this approach forgoes the distributional information available in histograms. To address this issue, we propose a margin-based classifier called support histogram machine (SHM) for histogram-valued data. We adopt the support vector machine framework and the Wasserstein-Kantorovich metric to measure distances between histograms. The proposed optimization problem is solved by a dual approach. We then test the proposed SHM via simulated and real examples and demonstrate its superior performance to summary-value-based methods.
Journal: Journal of Applied Statistics
Pages: 675-690
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1947996
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1947996
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:675-690
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1985090_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Yujia Cheng
Author-X-Name-First: Yujia
Author-X-Name-Last: Cheng
Author-Name: Yang Li
Author-X-Name-First: Yang
Author-X-Name-Last: Li
Author-Name: Matthew Lee Smith
Author-X-Name-First: Matthew
Author-X-Name-Last: Lee Smith
Author-Name: Changwei Li
Author-X-Name-First: Changwei
Author-X-Name-Last: Li
Author-Name: Ye Shen
Author-X-Name-First: Ye
Author-X-Name-Last: Shen
Title: Analyzing evidence-based falls prevention data with significant missing information using variable selection after multiple imputation
Abstract:
Falls are the leading cause of fatal and non-fatal injuries among older adults. Evidence-based fall prevention programs are delivered nationwide, largely supported by funding from the Administration for Community Living (ACL), to mitigate fall-related risk. This study utilizes data from 39 ACL grantees in 22 states from 2014 to 2017. The large amount of missing values for falls efficacy in this national database may lead to potentially biased statistical results and make it challenging to implement reliable variable selection. Multiple imputation is used to deal with the missing values. To obtain a consistent result of variable selection in multiply-imputed datasets, multiple imputation-stepwise regression (MI-stepwise) and multiple imputation-least absolute shrinkage and selection operator (MI-LASSO) methods are used. Simulation studies were conducted to compare the performances of MI-stepwise and MI-LASSO. In particular, we extended prior work by considering several circumstances not covered in previous studies, including an extensive investigation of data with different signal-to-noise ratios and various missing data patterns across predictors, as well as a data structure that allowed the missingness mechanism to be missing not at random (MNAR). In addition, we evaluated the performance of the MI-LASSO method with varying tuning parameters to address the overselection issue in cross-validation (CV)-based LASSO.
Journal: Journal of Applied Statistics
Pages: 724-743
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1985090
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1985090
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:724-743
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1911967_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xie Xiaoyue
Author-X-Name-First: Xie
Author-X-Name-Last: Xiaoyue
Author-Name: Jian Shi
Author-X-Name-First: Jian
Author-X-Name-Last: Shi
Author-Name: Kai Song
Author-X-Name-First: Kai
Author-X-Name-Last: Song
Title: A distributed multiple sample testing for massive data
Abstract:
When data are stored in a distributed manner, directly applying traditional hypothesis testing procedures is often prohibitive due to communication costs and privacy concerns. This paper develops and investigates a distributed two-node Kolmogorov–Smirnov hypothesis testing scheme, implemented with a divide-and-conquer strategy. Based on the proposed testing scheme, the paper also provides a distributed fraud-detection method and a distribution-based classification method for multi-node machines. The fraud-detection method identifies which node in a multi-node machine stores fraudulent data, while the distribution-based classification method determines whether the distributions across nodes differ and classifies the different distributions. These methods can improve the accuracy of statistical inference in a distributed storage architecture. Furthermore, the paper verifies the feasibility of the proposed methods through simulation and real-example studies.
Journal: Journal of Applied Statistics
Pages: 555-573
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1911967
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1911967
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:555-573
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1929089_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Fang Fan
Author-X-Name-First: Fang
Author-X-Name-Last: Fan
Author-Name: Shu-Chuan Chu
Author-X-Name-First: Shu-Chuan
Author-X-Name-Last: Chu
Author-Name: Jeng-Shyang Pan
Author-X-Name-First: Jeng-Shyang
Author-X-Name-Last: Pan
Author-Name: Chuang Lin
Author-X-Name-First: Chuang
Author-X-Name-Last: Lin
Author-Name: Huiqi Zhao
Author-X-Name-First: Huiqi
Author-X-Name-Last: Zhao
Title: An optimized machine learning technology scheme and its application in fault detection in wireless sensor networks
Abstract:
To address the problem of fault detection in data collection in wireless sensor networks (WSNs), this paper combines evolutionary computation and machine learning into a productive technical solution. We take the classical particle swarm optimization (PSO) algorithm and improve it by introducing a biological population model to control the population size and adding a parallel mechanism for further tuning. The resulting RS-PPSO algorithm is successfully used to optimize the initial weights and biases of a back-propagation neural network (BPNN), shortening the training time and improving the prediction accuracy. WSNs have become the key supporting platform of the Internet of Things (IoT), and the correctness of the data collected by sensor nodes has a great influence on the reliability, real-time performance and energy efficiency of the entire network. The optimized machine learning scheme given in this paper can effectively identify faulty data and thus help ensure the effective operation of a WSN.
Journal: Journal of Applied Statistics
Pages: 592-609
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1929089
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1929089
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:592-609
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1849057_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Denis A. Pustokhin
Author-X-Name-First: Denis A.
Author-X-Name-Last: Pustokhin
Author-Name: Irina V. Pustokhina
Author-X-Name-First: Irina V.
Author-X-Name-Last: Pustokhina
Author-Name: Phuoc Nguyen Dinh
Author-X-Name-First: Phuoc Nguyen
Author-X-Name-Last: Dinh
Author-Name: Son Van Phan
Author-X-Name-First: Son Van
Author-X-Name-Last: Phan
Author-Name: Gia Nhu Nguyen
Author-X-Name-First: Gia Nhu
Author-X-Name-Last: Nguyen
Author-Name: Gyanendra Prasad Joshi
Author-X-Name-First: Gyanendra Prasad
Author-X-Name-Last: Joshi
Author-Name: Shankar K.
Author-X-Name-First: Shankar
Author-X-Name-Last: K.
Title: An effective deep residual network based class attention layer with bidirectional LSTM for diagnosis and classification of COVID-19
Abstract:
In recent times, the COVID-19 pandemic has affected many people's lives globally and necessitates a massive number of screening tests to detect the presence of the coronavirus. At the same time, the rise of deep learning (DL) helps in developing effective COVID-19 diagnosis models that attain a maximum detection rate with minimum computation time. This paper presents a new Residual Network (ResNet) based Class Attention Layer with Bidirectional LSTM, called RCAL-BiLSTM, for COVID-19 diagnosis. The proposed RCAL-BiLSTM model involves a series of processes, namely bilateral filtering (BF) based preprocessing, RCAL-BiLSTM based feature extraction, and softmax (SM) based classification. Once the BF technique produces the preprocessed image, RCAL-BiLSTM based feature extraction takes place using three modules, namely the ResNet based feature extraction, CAL, and Bi-LSTM modules. Finally, the SM layer is applied to categorize the feature vectors into the corresponding feature maps. The presented RCAL-BiLSTM model is experimentally validated on a chest X-ray dataset and the results are assessed under several aspects. The experimental outcomes point to the superiority of the RCAL-BiLSTM model, which attains a maximum sensitivity of 93.28%, specificity of 94.61%, precision of 94.90%, accuracy of 94.88%, F-score of 93.10% and kappa value of 91.40%.
Journal: Journal of Applied Statistics
Pages: 477-494
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2020.1849057
File-URL: http://hdl.handle.net/10.1080/02664763.2020.1849057
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:477-494
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2017411_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Jin Jin
Author-X-Name-First: Jin
Author-X-Name-Last: Jin
Author-Name: Lin Zhang
Author-X-Name-First: Lin
Author-X-Name-Last: Zhang
Author-Name: Ethan Leng
Author-X-Name-First: Ethan
Author-X-Name-Last: Leng
Author-Name: Gregory J. Metzger
Author-X-Name-First: Gregory J.
Author-X-Name-Last: Metzger
Author-Name: Joseph S. Koopmeiners
Author-X-Name-First: Joseph S.
Author-X-Name-Last: Koopmeiners
Title: Multi-resolution super learner for voxel-wise classification of prostate cancer using multi-parametric MRI
Abstract:
Multi-parametric MRI (mpMRI) is a critical tool in prostate cancer (PCa) diagnosis and management. To further advance the use of mpMRI in patient care, computer aided diagnostic methods are under continuous development for supporting/supplanting standard radiological interpretation. While voxel-wise PCa classification models are the gold standard, few if any approaches have incorporated the inherent structure of the mpMRI data, such as spatial heterogeneity and between-voxel correlation, into PCa classification. We propose a machine learning-based method to fill in this gap. Our method uses an ensemble learning approach to capture regional heterogeneity in the data, where classifiers are developed at multiple resolutions and combined using the super learner algorithm, and further account for between-voxel correlation through a Gaussian kernel smoother. It allows any type of classifier to be the base learner and can be extended to further classify PCa sub-categories. We introduce the algorithms for binary PCa classification, as well as for classifying the ordinal clinical significance of PCa for which a weighted likelihood approach is implemented to improve the detection of less prevalent cancer categories. The proposed method has shown important advantages over conventional modeling and machine learning approaches in simulations and application to our motivating patient data.
Journal: Journal of Applied Statistics
Pages: 805-826
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.2017411
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2017411
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:805-826
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1919063_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Anton Thielmann
Author-X-Name-First: Anton
Author-X-Name-Last: Thielmann
Author-Name: Christoph Weisser
Author-X-Name-First: Christoph
Author-X-Name-Last: Weisser
Author-Name: Astrid Krenz
Author-X-Name-First: Astrid
Author-X-Name-Last: Krenz
Author-Name: Benjamin Säfken
Author-X-Name-First: Benjamin
Author-X-Name-Last: Säfken
Title: Unsupervised document classification integrating web scraping, one-class SVM and LDA topic modelling
Abstract:
Unsupervised document classification for imbalanced data sets poses a major challenge. To obtain accurate classification results, training data sets are often created manually by humans which requires expert knowledge, time and money. Depending on the imbalance of the data set, this approach also either requires human labelling of all of the data or it fails to adequately recognize underrepresented categories. We propose an integration of web scraping, one-class Support Vector Machines (SVM) and Latent Dirichlet Allocation (LDA) topic modelling as a multi-step classification rule that circumvents manual labelling. Unsupervised one-class document classification with the integration of out-of-domain training data is achieved and >80% of the target data is correctly classified. The proposed method thus even outperforms common machine learning classifiers and is validated on multiple data sets.
Journal: Journal of Applied Statistics
Pages: 574-591
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1919063
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1919063
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:574-591
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1982878_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Matthias Weber
Author-X-Name-First: Matthias
Author-X-Name-Last: Weber
Author-Name: Jonas Striaukas
Author-X-Name-First: Jonas
Author-X-Name-Last: Striaukas
Author-Name: Martin Schumacher
Author-X-Name-First: Martin
Author-X-Name-Last: Schumacher
Author-Name: Harald Binder
Author-X-Name-First: Harald
Author-X-Name-Last: Binder
Title: Regularized regression when covariates are linked on a network: the 3CoSE algorithm
Abstract:
Covariates in regressions may be linked to each other on a network. Knowledge of the network structure can be incorporated into regularized regression settings via a network penalty term. However, when it is unknown whether the connection signs in the network are positive (connected covariates reinforce each other) or negative (connected covariates repress each other), the connection signs have to be estimated jointly with the covariate coefficients. This can be done with an algorithm iterating a connection sign estimation step and a covariate coefficient estimation step. We develop such an algorithm, called 3CoSE, and show detailed simulation results and an application forecasting event times. The algorithm performs well in a variety of settings. We also briefly describe the publicly available R-package developed for this purpose.
Journal: Journal of Applied Statistics
Pages: 535-554
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1982878
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1982878
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:535-554
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2047905_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Mehdi Dagdoug
Author-X-Name-First: Mehdi
Author-X-Name-Last: Dagdoug
Author-Name: Camelia Goga
Author-X-Name-First: Camelia
Author-X-Name-Last: Goga
Author-Name: David Haziza
Author-X-Name-First: David
Author-X-Name-Last: Haziza
Title: Model-assisted estimation in high-dimensional settings for survey data
Abstract:
Model-assisted estimators have attracted a lot of attention in the last three decades. These estimators attempt to make efficient use of auxiliary information available at the estimation stage. A working model linking the survey variable to the auxiliary variables is specified and fitted on the sample data to obtain a set of predictions, which are then incorporated into the estimation procedures. A nice feature of model-assisted procedures is that they maintain important design properties, such as consistency and asymptotic unbiasedness, irrespective of whether or not the working model is correctly specified. In this article, we examine several model-assisted estimators, including linear regression and penalized estimators, from a design-based point of view and in a high-dimensional setting. We conduct an extensive simulation study using data from the Irish Commission for Energy Regulation Smart Metering Project to assess the performance of several model-assisted estimators in terms of bias and efficiency in this high-dimensional data set.
Journal: Journal of Applied Statistics
Pages: 761-785
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2022.2047905
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2047905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:761-785
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1973387_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20220907T060133 git hash: 85d61bd949
Author-Name: Xiaotong Liu
Author-X-Name-First: Xiaotong
Author-X-Name-Last: Liu
Author-Name: Guoliang Tian
Author-X-Name-First: Guoliang
Author-X-Name-Last: Tian
Author-Name: Zhenqiu Liu
Author-X-Name-First: Zhenqiu
Author-X-Name-Last: Liu
Title: Identification of novel genes for triple-negative breast cancer with semiparametric gene-based analysis
Abstract:
Triple-negative breast cancer (TNBC) is generally considered an aggressive breast cancer subtype associated with poor prognostic outcomes. Up to now, the molecular and cellular mechanisms underlying TNBC pathology have not been fully understood. In this manuscript, we propose a novel semiparametric model with kernel for gene-based analysis with breast cancer GWAS data. The SPMGBA (semiparametric method for gene-based analysis) software, written in MATLAB, is available at GitHub (https://github.com/zliu3/SPMGBA). Genetic signatures associated with breast cancer are discovered. We further validate the prognostic power of the identified genes with a large cohort of expression data from the European Genome-Phenome Archive, and discover that SEL1L is associated with the overall survival of TNBC with a p-value of .0002. We conclude that the gene SEL1L is down-regulated in TNBC and that the expression of SEL1L is positively associated with patient survival.
Journal: Journal of Applied Statistics
Pages: 691-702
Issue: 3
Volume: 50
Year: 2023
Month: 02
X-DOI: 10.1080/02664763.2021.1973387
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1973387
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:3:p:691-702
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1994530_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Akira Suzuki
Author-X-Name-First: Akira
Author-X-Name-Last: Suzuki
Author-Name: Hidetoshi Murakami
Author-X-Name-First: Hidetoshi
Author-X-Name-Last: Murakami
Author-Name: Amitava Mukherjee
Author-X-Name-First: Amitava
Author-X-Name-Last: Mukherjee
Title: Distribution-free Phase-I scheme for location, scale and skewness shifts with an application in monitoring customers' waiting time
Abstract:
Phase-I analysis of historical data from a statistical process is a strategic problem in Statistical Process Monitoring and control. Before the establishment of process stability, it is challenging to model historical data; consequently, a distribution-free approach is a natural choice in Phase-I monitoring. Existing distribution-free Phase-I control charts are suitable for detecting instability in location and scale parameters only and are often insensitive in complex processes involving skewness or shape parameters. A new Phase-I control chart is proposed to identify more general shifts, including location, scale and skewness, and the proposed scheme is efficient in such situations. The proposed Phase-I scheme uses subsamples, and the plotting statistic is based on the omnibus multi-sample linear rank statistic corresponding to the location, scale and skewness shifts. The new scheme can identify subsamples that are not in control, and it can also indicate one or more process parameters where a deviation has occurred. The encouraging performance of the proposed scheme is established with a large-scale Monte Carlo study in detecting shifts of various kinds in a comprehensive class of situations. An illustration based on monitoring waiting time data from a customer service centre is given. Some concluding remarks and some future research problems are also offered.
Journal: Journal of Applied Statistics
Pages: 827-847
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.1994530
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1994530
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:827-847
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1998391_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Erina Paul
Author-X-Name-First: Erina
Author-X-Name-Last: Paul
Author-Name: Ram C. Tiwari
Author-X-Name-First: Ram C.
Author-X-Name-Last: Tiwari
Author-Name: Shrabanti Chowdhury
Author-X-Name-First: Shrabanti
Author-X-Name-Last: Chowdhury
Author-Name: Samiran Ghosh
Author-X-Name-First: Samiran
Author-X-Name-Last: Ghosh
Title: A more powerful test for three-arm non-inferiority via risk difference: Frequentist and Bayesian approaches
Abstract:
Finding improved interventions in many legacy therapeutic areas is a high priority, with the potential to decrease the expense of medical care and poor outcomes for many patients. Typically, clinical efficacy is the primary criterion for evaluating any beneficial effect of a treatment. However, there could be situations when several other factors (e.g. side effects, cost burden, less debilitating, less intensive, etc.) can make a slightly less efficacious treatment option favorable to a subgroup of patients. This often leads to non-inferiority (NI) testing. NI trials may or may not include a placebo arm due to ethical reasons. However, when included, the resulting three-arm trial is more prudent, since it requires less stringent assumptions than a two-arm placebo-free trial. In this article, we consider both frequentist and Bayesian procedures for testing NI in the three-arm trial with binary outcomes when the functional of interest is the risk difference. An improved frequentist approach is proposed first, followed by a Bayesian counterpart. Bayesian methods have a natural advantage in many active-control trials, including NI trials, as they can seamlessly integrate substantial prior information. In addition, we discuss sample size calculation and draw an interesting connection between the two paradigms.
Journal: Journal of Applied Statistics
Pages: 848-870
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.1998391
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1998391
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:848-870
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2001442_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Mustafa Ç. Korkmaz
Author-X-Name-First: Mustafa Ç.
Author-X-Name-Last: Korkmaz
Author-Name: Zehra Sedef Korkmaz
Author-X-Name-First: Zehra Sedef
Author-X-Name-Last: Korkmaz
Title: The unit log–log distribution: a new unit distribution with alternative quantile regression modeling and educational measurements applications
Abstract:
In this paper, we propose a new distribution, named the unit log–log distribution, defined on the bounded interval (0,1). Basic distributional properties of the newly defined unit distribution, such as model shapes, stochastic ordering, the quantile function, moments, and order statistics, are studied. The maximum likelihood method is used to estimate the model parameters. A new quantile regression model based on the proposed distribution is introduced, and estimation of its model parameters is also derived. Monte Carlo simulation studies are presented to assess the performance of the estimation method for the new unit distribution and its regression modeling. Applications of the newly defined distribution and its quantile regression model to real data sets show that the proposed models have better modeling abilities than competitive models. The proposed unit quantile regression model aims to explain the linear relation between educational measurements of both OECD (Organization for Economic Co-operation and Development) countries and some non-member countries and their Better Life Index. Significant covariates are observed in the real-data applications for the unit median response.
Journal: Journal of Applied Statistics
Pages: 889-908
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2001442
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2001442
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:889-908
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2003760_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Rusul Mohsin Moharib Alsarray
Author-X-Name-First: Rusul Mohsin
Author-X-Name-Last: Moharib Alsarray
Author-Name: Jaber Kazempoor
Author-X-Name-First: Jaber
Author-X-Name-Last: Kazempoor
Author-Name: Adel Ahmadi Nadi
Author-X-Name-First: Adel
Author-X-Name-Last: Ahmadi Nadi
Title: Monitoring the Weibull shape parameter under progressive censoring in presence of independent competing risks
Abstract:
In this paper, monitoring the Weibull shape parameter arising from progressively censored competing risks data is investigated. The competing risks are assumed to be independent and not identically distributed from the Weibull distributions with different shape and scale parameters. Both the shape parameters can be monitored separately by the proposed control charts using censored and predicted observations. We also introduced a control chart for monitoring both shape parameters simultaneously to detect possible shifts in both opposite and the same directions. In addition, the problem of mask data is discussed and an efficient prediction method is proposed. The behavior of the average run length with and without mask data is investigated through extensive simulations. Furthermore, the effects of sample size, number of failures due to each risk, and censoring scheme on the charts' performance are also studied. Finally, an illustrative example is presented to demonstrate the application of the proposed control charts by investigating a real data set of the failure times of two-component ARC-1 VHF communication transmitter receivers of a single commercial airline. Although this data set has been widely investigated in reliability analysis studies, this is the first time it has been analyzed in a statistical process monitoring setting.
Journal: Journal of Applied Statistics
Pages: 945-962
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2003760
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2003760
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:945-962
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1998392_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Rosineide da Paz
Author-X-Name-First: Rosineide
Author-X-Name-Last: da Paz
Author-Name: Jorge Luis Bazán
Author-X-Name-First: Jorge Luis
Author-X-Name-Last: Bazán
Author-Name: Victor Hugo Lachos
Author-X-Name-First: Victor Hugo
Author-X-Name-Last: Lachos
Author-Name: Dipak Dey
Author-X-Name-First: Dipak
Author-X-Name-Last: Dey
Title: A finite mixture mixed proportion regression model for classification problems in longitudinal voting data
Abstract:
Continuous clustered proportion data often arise in various areas of the social and political sciences where the response variable of interest is a proportion (or percentage). An example is the behavior of the proportion of voters favorable to a political party in municipalities (or cities) of a country over time. This behavior can be different depending on the region of the country, giving rise to groups (or clusters) with similar profiles. For this kind of data, we propose a finite mixture of a random effects regression model based on the L-Logistic distribution. A Markov chain Monte Carlo algorithm is tailored to obtain posterior distributions of the unknown quantities of interest through a Bayesian approach. To illustrate the proposed method, with emphasis on analysis of clusters, we analyze the proportion of votes for a political party in presidential elections in different municipalities observed over time, and then identify groups according to electoral behavior at different levels of favorable votes.
Journal: Journal of Applied Statistics
Pages: 871-888
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.1998392
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1998392
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:871-888
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2004580_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Qianyi Li
Author-X-Name-First: Qianyi
Author-X-Name-Last: Li
Author-Name: Jianbo Li
Author-X-Name-First: Jianbo
Author-X-Name-Last: Li
Author-Name: Yongran Cheng
Author-X-Name-First: Yongran
Author-X-Name-Last: Cheng
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Title: Curve fitting and jump detection on nonparametric regression with missing data
Abstract:
In this paper, by virtue of the inverse probability weighted technique, we considered jump-preserving estimation of nonparametric regression models with missing data in the response variable. First, we used local piecewise-linear expansions with left and right kernels, respectively, to approximate the unknown regression function. Second, we obtained the left- and right-limit estimates of the regression function at each observed point and then determined the final estimators by residual sums of squares. Third, we presented the convergence rate of the estimators and the residual sums of squares. Finally, we illustrated the performance of our proposed method through some simulation studies and a conjunctivitis example from The Affiliated Hospital of Hangzhou Normal University.
Journal: Journal of Applied Statistics
Pages: 963-983
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2004580
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2004580
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:963-983
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2001444_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Guanfu Liu
Author-X-Name-First: Guanfu
Author-X-Name-Last: Liu
Author-Name: Zongliang Hu
Author-X-Name-First: Zongliang
Author-X-Name-Last: Hu
Title: Testing quantitative trait locus effects in genetic backcross studies with double recombination occurring
Abstract:
Testing the existence of quantitative trait locus (QTL) effects is an important task in QTL mapping studies. In this paper, we assume the phenotype distributions come from a location-scale distribution family and consider testing the QTL effects in both location and scale in backcross studies with double recombination occurring. Without the equal-scale assumption, the log-likelihood function is unbounded, which renders the traditional likelihood ratio test invalid. To deal with this problem, we propose a penalized likelihood ratio test (PLRT) for testing the QTL effects. The null limiting distribution of the PLRT is shown to be the supremum of a chi-square process. As a complement, we also investigate the null limiting distribution of the likelihood ratio test for the case with the equal-scale assumption. The limiting distributions of the two tests under local alternatives are also studied. Simulation studies are performed to evaluate the asymptotic results and a real-data example is given for illustration.
Journal: Journal of Applied Statistics
Pages: 927-944
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2001444
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2001444
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:927-944
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2006613_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: V. Mohtashami-Borzadaran
Author-X-Name-First: V.
Author-X-Name-Last: Mohtashami-Borzadaran
Author-Name: M. Amini
Author-X-Name-First: M.
Author-X-Name-Last: Amini
Author-Name: J. Ahmadi
Author-X-Name-First: J.
Author-X-Name-Last: Ahmadi
Title: Estimating the parameters of a dependent model and applying it to environmental data set
Abstract:
In this paper, a new dependent model is introduced. The model is motivated by the structure of series-parallel systems, consisting of two series-parallel systems with a random number of parallel sub-systems that have fixed components connected in series. The dependence properties of the proposed model are studied. Two estimation methods, namely the moment method and the maximum likelihood method, are applied to estimate the parameters of the distributions of the components based on observing the system's lifetime data. A Monte Carlo simulation study is used to evaluate the performance of the estimators. Two real data sets are used to illustrate the proposed method. The results are useful for researchers and practitioners interested in analyzing bivariate data related to extreme events.
Journal: Journal of Applied Statistics
Pages: 984-1016
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2006613
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2006613
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:984-1016
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2008882_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Muhammad Atif
Author-X-Name-First: Muhammad
Author-X-Name-Last: Atif
Author-Name: Muhammad Shafiq
Author-X-Name-First: Muhammad
Author-X-Name-Last: Shafiq
Author-Name: Friedrich Leisch
Author-X-Name-First: Friedrich
Author-X-Name-Last: Leisch
Title: Applications of monitoring and tracing the evolution of clustering solutions in dynamic datasets
Abstract:
The clustering approach is widely accepted as the most prominent unsupervised learning problem in data mining techniques. This procedure deals with the identification of notable structures in unlabeled datasets. In modern days, clustering of dynamic data streams plays a vital role in policy-making, and researchers are paying particular attention to monitoring the evolution of clustering solutions over time. The data streams evolve continually, and different sources generate data items over time. The clustering solution over such a stream is not stationary and changes with the influx of new data items. This paper presents a comprehensive study of algorithms related to tracing the evolution of clusters over time in cumulative datasets. To demonstrate the applications and significance of tracing cluster evolution, we implement the MONIC algorithm in the R software. This article illustrates how the data segmentation of dynamic streams is done and shows the applications of monitoring changes in clustering solutions with the help of real-life published datasets.
Journal: Journal of Applied Statistics
Pages: 1017-1035
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2008882
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2008882
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:1017-1035
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2001443_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: A. Venkatasubramaniam
Author-X-Name-First: A.
Author-X-Name-Last: Venkatasubramaniam
Author-Name: L. Evers
Author-X-Name-First: L.
Author-X-Name-Last: Evers
Author-Name: P. Thakuriah
Author-X-Name-First: P.
Author-X-Name-Last: Thakuriah
Author-Name: K. Ampountolas
Author-X-Name-First: K.
Author-X-Name-Last: Ampountolas
Title: Functional distributional clustering using spatio-temporal data
Abstract:
This paper presents a new method called the functional distributional clustering algorithm (FDCA) that seeks to identify spatially contiguous clusters and incorporate changes in temporal patterns across overcrowded networks. This method is motivated by a graph-based network composed of sensors arranged over space, where recorded observations for each sensor represent a multi-modal distribution. The proposed method is fully non-parametric and generates clusters within an agglomerative hierarchical clustering approach based on a measure of distance that defines a cumulative distribution function over temporal changes for different locations in space. Traditional hierarchical clustering algorithms that are spatially adapted do not typically accommodate the temporal characteristics of the underlying data. The effectiveness of the FDCA is illustrated using an application to both empirical and simulated data from about 400 sensors in a 2.5-square-mile network area in downtown San Francisco, California. The results demonstrate the superior ability of the FDCA to identify true clusters compared to functional-only and distributional-only algorithms, and similar performance to a model-based clustering algorithm.
Journal: Journal of Applied Statistics
Pages: 909-926
Issue: 4
Volume: 50
Year: 2023
Month: 03
X-DOI: 10.1080/02664763.2021.2001443
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2001443
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:4:p:909-926
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2017412_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Ke-Yi Mou
Author-X-Name-First: Ke-Yi
Author-X-Name-Last: Mou
Author-Name: Chang-Xing Ma
Author-X-Name-First: Chang-Xing
Author-X-Name-Last: Ma
Author-Name: Zhi-Ming Li
Author-X-Name-First: Zhi-Ming
Author-X-Name-Last: Li
Title: Homogeneity test of relative risk ratios for stratified bilateral data under different algorithms
Abstract:
Medical clinical studies about paired body parts often involve stratified bilateral data. The correlation between responses from paired parts should be taken into account to avoid biased or misleading results. This paper aims to test whether the relative risk ratios across strata are equal under the optimal algorithms. Based on different algorithms, we obtain the desired global and constrained maximum likelihood estimations (MLEs). Three asymptotic test statistics (i.e. $T_{L}$, $T_{SC}$ and $T_{W}$) are proposed. Monte Carlo simulations are conducted to evaluate the performance of these algorithms with respect to mean square errors of the MLEs and convergence rate. The empirical results show that the Fisher scoring algorithm is usually better than the other methods, since it has an effective convergence rate for global MLEs and yields lower mean square errors for constrained MLEs. The three test statistics are compared in terms of type I error rate (TIE) and power. Among these statistics, $T_{SC}$ is recommended for its robust TIEs and satisfactory power.
Journal: Journal of Applied Statistics
Pages: 1060-1077
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2017412
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2017412
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1060-1077
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2023116_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Kenneth Flagg
Author-X-Name-First: Kenneth
Author-X-Name-Last: Flagg
Author-Name: Andrew Hoegh
Author-X-Name-First: Andrew
Author-X-Name-Last: Hoegh
Title: The integrated nested Laplace approximation applied to spatial log-Gaussian Cox process models
Abstract:
Spatial point process models are theoretically useful for mapping discrete events, such as plant or animal presence, across space; however, the computational complexity of fitting these models is often a barrier to their practical use. The log-Gaussian Cox process (LGCP) is a point process driven by a latent Gaussian field, and recent advances have made it possible to fit Bayesian LGCP models using approximate methods that facilitate rapid computation. These advances include the integrated nested Laplace approximation (INLA) with a stochastic partial differential equations (SPDE) approach to sparsely approximate the Gaussian field, and an extension using pseudodata with a Poisson response. To help link the theoretical results to statistical practice, we provide an overview of INLA for point process data and then illustrate its implementation using freely available data. The analyzed datasets include both a completely observed spatial field and an incomplete data situation. Our well-commented R code is shared in the online supplement. Our intent is to make these methods accessible to the practitioner of spatial statistics without requiring deep knowledge of point process theory.
Journal: Journal of Applied Statistics
Pages: 1128-1151
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2023116
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2023116
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1128-1151
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2023118_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Lin Tang
Author-X-Name-First: Lin
Author-X-Name-Last: Tang
Author-Name: Pengcheng Zeng
Author-X-Name-First: Pengcheng
Author-X-Name-Last: Zeng
Author-Name: Jian Qing Shi
Author-X-Name-First: Jian
Author-X-Name-Last: Qing Shi
Author-Name: Won-Seok Kim
Author-X-Name-First: Won-Seok
Author-X-Name-Last: Kim
Title: Model-based joint curve registration and classification
Abstract:
In this paper, we consider the problem of classification of misaligned multivariate functional data. We propose a model-based approach for the joint registration and classification of such data. The observed functional inputs are modeled by a functional nonlinear mixed effects model containing a nonlinear functional fixed effect, constructed upon warping functions to account for curve alignment, and a nonlinear functional random effects component to address the variability among subjects. The warping functions are also modeled to accommodate a common effect within groups and the variability between subjects. Then, a functional logistic regression model defined upon the representation of the aligned curves and scalar inputs is used to account for curve classification. EM-based algorithms are developed to perform maximum likelihood inference of the proposed models. The identifiability of the registration model and the asymptotic properties of the proposed method are established. The performance of the proposed procedure is illustrated via simulation studies and an analysis of a hyoid bone movement data application. Although the statistical developments proposed in this paper were motivated by the hyoid bone movement study, the methodology is designed and presented in generality and can be applied to numerous areas of scientific research.
Journal: Journal of Applied Statistics
Pages: 1178-1198
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2023118
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2023118
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1178-1198
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2017414_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Pengcheng Ren
Author-X-Name-First: Pengcheng
Author-X-Name-Last: Ren
Author-Name: Guanfu Liu
Author-X-Name-First: Guanfu
Author-X-Name-Last: Liu
Author-Name: Xiaolong Pu
Author-X-Name-First: Xiaolong
Author-X-Name-Last: Pu
Title: Generalized fiducial methods for testing the homogeneity of a three-sample problem with a mixture structure
Abstract:
Recently, the likelihood ratio (LR) test was proposed to test the homogeneity of a three-sample model with a mixture structure. Because of the presence of the mixture structure, the null limiting distribution of the LR test has a complicated supremum form, which leads to challenges in determining p-values. In addition, the LR test cannot control type-I errors well under small to moderate sample size. In this paper, we propose seven generalized fiducial methods to test the homogeneity of the three-sample model. Via simulation studies, we find that our methods perform significantly better than the LR test method in controlling the type-I errors under small to moderate sample size, while they have comparable powers in most cases. A halibut data example is used to illustrate the proposed methods.
Journal: Journal of Applied Statistics
Pages: 1094-1114
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2017414
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2017414
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1094-1114
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2012563_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Reiko Aoki
Author-X-Name-First: Reiko
Author-X-Name-Last: Aoki
Author-Name: Dorival Leão
Author-X-Name-First: Dorival
Author-X-Name-Last: Leão
Author-Name: Juan P. Mamani Bustamante
Author-X-Name-First: Juan P. Mamani
Author-X-Name-Last: Bustamante
Author-Name: Filidor Vilca
Author-X-Name-First: Filidor
Author-X-Name-Last: Vilca
Title: Ultrastructural calibration model for proficiency testing
Abstract:
Proficiency testing (PT) determines the performance of individual laboratories for specific tests or measurements, and it is used to monitor the reliability of laboratories' measurements. PT plays a highly valuable role as it provides objective evidence of the competence of the participating laboratories. In this paper, we propose a multivariate calibration model to assess equivalence among laboratories' measurements in PT. Our method allows us to deal with multivariate data, where the item under test is measured at different levels. Although intuitive, the proposed model is nonergodic, which means that the asymptotic Fisher information matrix is random. As a consequence, a detailed asymptotic analysis was carried out to establish the strategy for comparing the results of the participating laboratories. To illustrate, we apply our method to analyze data from the Brazilian engine test group PT program, where the power of an engine was measured by eight laboratories at several levels of rotation.
Journal: Journal of Applied Statistics
Pages: 1037-1059
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2012563
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2012563
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1037-1059
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2024515_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: J. C. S. Vasconcelos
Author-X-Name-First: J. C. S.
Author-X-Name-Last: Vasconcelos
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Author-Name: G. O. Silva
Author-X-Name-First: G. O.
Author-X-Name-Last: Silva
Title: A random effect regression based on the odd log-logistic generalized inverse Gaussian distribution
Abstract:
In recent decades, the use of regression models with random effects has made great progress. Among these models' attractions is the flexibility to analyze correlated data. In various situations, the distribution of the response variable presents asymmetry or bimodality. In these cases, it is possible to use the normal regression with a random effect at the intercept. In light of these contexts, i.e. the desire to analyze correlated data in the presence of bimodality or asymmetry, in this paper we propose a regression model with a random effect at the intercept based on the generalized inverse Gaussian distribution model with correlated data. Maximum likelihood is adopted to estimate the parameters and various simulations are performed for correlated data. A type of residuals for the new regression is proposed whose empirical distribution is close to normal. The versatility of the new regression is demonstrated by estimating the average price per hectare of bare land in 10 municipalities in the state of São Paulo (Brazil). In this context, various databases are constantly emerging, requiring flexible modeling. Thus, it is likely to be of interest to data analysts, and can make a good contribution to the statistical literature.
Journal: Journal of Applied Statistics
Pages: 1199-1214
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2024515
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2024515
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1199-1214
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2019689_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: K. B. Kulasekera
Author-X-Name-First: K. B.
Author-X-Name-Last: Kulasekera
Author-Name: Sudaraka Tholkage
Author-X-Name-First: Sudaraka
Author-X-Name-Last: Tholkage
Author-Name: Maiying Kong
Author-X-Name-First: Maiying
Author-X-Name-Last: Kong
Title: Personalized treatment selection using observational data
Abstract:
Estimating the optimal treatment regime based on individual patient characteristics has been a topic of discussion in many forums. Advanced computational power has added momentum to this discussion over the last two decades, and practitioners have been advocating the use of new methods in determining the best treatment. Treatments that are geared toward the 'best' outcome for a patient based on his/her genetic markers and characteristics are of high importance. In this article, we develop an approach to predict the optimal personalized treatment based on observational data. We use inverse probability of treatment weighted machine learning methods to obtain score functions to predict the optimal treatment. Extensive simulation studies showed that our proposed method has desirable performance in selecting the optimal treatment. We provide a case study examining the effect of statin use on cognitive function to illustrate the use of our proposed method.
Journal: Journal of Applied Statistics
Pages: 1115-1127
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2019689
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2019689
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1115-1127
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2023117_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: O. Kharazmi
Author-X-Name-First: O.
Author-X-Name-Last: Kharazmi
Author-Name: G. G. Hamedani
Author-X-Name-First: G. G.
Author-X-Name-Last: Hamedani
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Title: Log-mean distribution: applications to medical data, survival regression, Bayesian and non-Bayesian discussion with MCMC algorithm
Abstract:
We introduce a new family via the log mean of an underlying distribution, with the proportional hazards model as baseline, and derive some important properties. A special model is proposed by taking the Weibull for the baseline. We derive several properties of the sub-model such as moments, order statistics, hazard function, survival regression and certain characterization results. We estimate the parameters using frequentist and Bayesian approaches. Further, Bayes estimators, posterior risks, credible intervals and highest posterior density intervals are obtained under different symmetric and asymmetric loss functions. A Monte Carlo simulation study examines the biases and mean square errors of the maximum likelihood estimators. For illustrative purposes, we consider heart transplant and bladder cancer data sets and investigate the efficiency of the proposed model.
Journal: Journal of Applied Statistics
Pages: 1152-1177
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2023117
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2023117
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1152-1177
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2017413_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Shuhua Chang
Author-X-Name-First: Shuhua
Author-X-Name-Last: Chang
Author-Name: Deli Li
Author-X-Name-First: Deli
Author-X-Name-Last: Li
Author-Name: Yongcheng Qi
Author-X-Name-First: Yongcheng
Author-X-Name-Last: Qi
Title: Pearson's goodness-of-fit tests for sparse distributions
Abstract:
Pearson's chi-squared test is widely used to test the goodness of fit between categorical data and a given discrete distribution function. When the number of sets of the categorical data, say k, is a fixed integer, Pearson's chi-squared test statistic converges in distribution to a chi-squared distribution with k−1 degrees of freedom as the sample size n goes to infinity. In real applications, the number k often changes with n and may be even much larger than n. By using martingale techniques, we prove that Pearson's chi-squared test statistic converges to the normal distribution under quite general conditions. We also propose a new test statistic which, based on our simulation study, is more powerful than the chi-squared test statistic. A real application to lottery data is provided to illustrate our methodology.
Journal: Journal of Applied Statistics
Pages: 1078-1093
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2017413
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2017413
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1078-1093
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2031128_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: David M. Ruth
Author-X-Name-First: David M.
Author-X-Name-Last: Ruth
Author-Name: Nicholas L. Wood
Author-X-Name-First: Nicholas L.
Author-X-Name-Last: Wood
Author-Name: Douglas N. VanDerwerken
Author-X-Name-First: Douglas N.
Author-X-Name-Last: VanDerwerken
Title: Fully nonparametric survival analysis in the presence of time-dependent covariates and dependent censoring
Abstract:
In the presence of informative right censoring and time-dependent covariates, we estimate the survival function in a fully nonparametric fashion. We introduce a novel method for incorporating multiple observations per subject when estimating the survival function at different covariate values and compare several competing methods via simulation. The proposed method is applied to survival data from people awaiting liver transplant.
Journal: Journal of Applied Statistics
Pages: 1215-1229
Issue: 5
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2031128
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2031128
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:5:p:1215-1229
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2041568_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Richard F. Potthoff
Author-X-Name-First: Richard F.
Author-X-Name-Last: Potthoff
Title: Spot It! and balanced block designs: keys to better debate architecture for a plethora of candidates in presidential primaries?
Abstract:
U.S. presidential primary debates are influential but under-researched. Before 2015, all of these debates, both Democratic and Republican, had 10 candidates or fewer. The first Republican debate in 2015, however, accommodated 17 candidates. They were split into two segments, with the 10 best-polling candidates in the main (prime-time) segment and the others in an 'undercard' session. A comparable pattern applied for the next six Republican debates. Concern arose not only because many candidates were crowded into a session but also because the undercard candidates were seen as receiving inferior exposure. The Democratic presidential primary debates that started four years later encountered similar difficulty. Official policy limited the candidates in each of the first two debates to 20, randomly divided into two groups of 10 appearing on successive nights. For remedy, this paper examines innovative debate plans, for different numbers of candidates, that feature symmetry among all candidates and entail many short segments with relatively few candidates in each. We apply combinatorial designs: balanced incomplete block designs and regular pairwise balanced designs, which are analogous to the games Spot It Jr.! Animals and (full-fledged) Spot It!, respectively.
Journal: Journal of Applied Statistics
Pages: 1435-1454
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2041568
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2041568
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1435-1454
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2028745_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jie Liu
Author-X-Name-First: Jie
Author-X-Name-Last: Liu
Author-Name: Haojie Chen
Author-X-Name-First: Haojie
Author-X-Name-Last: Chen
Author-Name: Yang Yang
Author-X-Name-First: Yang
Author-X-Name-Last: Yang
Title: Prediction models with graph kernel regularization for network data
Abstract:
Traditional regression methods typically consider only covariate information and assume that the observations are mutually independent samples. However, in many modern applications, samples come from individuals connected by a network. We present a risk minimization formulation for learning from both covariates and network structure in the context of graph kernel regularization. The formulation involves a loss function with a penalty term. This penalty can be used not only to encourage similarity between linked nodes but also to improve on traditional regression models. Furthermore, the penalty can be used with many loss-based predictive methods, such as linear regression with squared loss and logistic regression with log-likelihood loss. Simulations evaluating the performance of this model in both low and high dimensions show that our proposed approach outperforms all other benchmarks. We verify this for uniform-graph, nonuniform-graph, balanced-sample, and unbalanced-sample datasets. The approach was applied to predicting the response values on a ‘follow’ social network of Tencent Weibo users and on two citation networks (Cora and CiteSeer). Each instance verifies that the proposed method, combining covariate information and link structure with graph kernel regularization, can improve predictive performance.
Journal: Journal of Applied Statistics
Pages: 1400-1417
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2028745
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2028745
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1400-1417
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2028130_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: A. D. C. Nascimento
Author-X-Name-First: A. D. C.
Author-X-Name-Last: Nascimento
Author-Name: Leandro C. Rêgo
Author-X-Name-First: Leandro C.
Author-X-Name-Last: Rêgo
Author-Name: Jonas W. A. Silva
Author-X-Name-First: Jonas W. A.
Author-X-Name-Last: Silva
Title: Compound truncated Poisson gamma distribution for understanding multimodal SAR intensities
Abstract:
In recent years, many works have proposed new probability models for both theoretical and applied reasons. In particular, mixture models have been recommended for describing phenomena whose data demand high flexibility. One drawback of these tools is the large number of parameters involved, which makes inference difficult. To overcome this drawback, we propose a new model that is able to describe multimodal behaviors with only three parameters, called the compound truncated Poisson gamma (CTrPGa) distribution. Some properties of the CTrPGa law are derived and discussed: the hazard, characteristic and cumulative functions and the ordinary moments. In addition, moment estimation, maximum likelihood estimation (via the expectation-maximization algorithm) and empirical characteristic function methods for the CTrPGa parameters are provided. The first of these reduces to solving a single nonlinear equation, which facilitates its use. We perform a simulation analysis to compare the performance of the three estimation methods studied. Moreover, since the gamma distribution and its mixture versions are commonly used to characterize synthetic aperture radar (SAR) intensities, we perform some real experiments with SAR imagery. The results present evidence that our model is a reasonable assumption that can be taken into account in the pre-processing step of such images.
Journal: Journal of Applied Statistics
Pages: 1358-1377
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2028130
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2028130
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1358-1377
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2041565_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: ShengLi Tzeng
Author-X-Name-First: ShengLi
Author-X-Name-Last: Tzeng
Author-Name: Chun-Shu Chen
Author-X-Name-First: Chun-Shu
Author-X-Name-Last: Chen
Author-Name: Yu-Fen Li
Author-X-Name-First: Yu-Fen
Author-X-Name-Last: Li
Author-Name: Jin-Hua Chen
Author-X-Name-First: Jin-Hua
Author-X-Name-Last: Chen
Title: On summary ROC curve for dichotomous diagnostic studies: an application to meta-analysis of COVID-19
Abstract:
In a systematic review of diagnostic performance, summarizing performance metrics is crucial. There are various summary models in the literature, and hence model selection becomes inevitable. However, most existing large-sample-based model selection approaches may not fit a meta-analysis of diagnostic studies, which typically has a rather small sample size. Researchers need to effectively determine the final model for further inference, which motivates this article to investigate existing methods and to suggest a more robust method for this need. We considered models covering several widely-used methods for bivariate summary of sensitivity and specificity. Simulation studies were conducted for different numbers of studies and different population sensitivities and specificities. Final models were then selected using several existing criteria, and we compared the summary receiver operating characteristic (sROC) curves to the theoretical ROC curve given the generating model. Even though parametric likelihood-based criteria are often applied in practice for their asymptotic properties, they fail to consistently choose appropriate models when the number of studies is limited. When the number of studies is as small as 10 or 5, our suggested method performs best across different scenarios. An example of summary ROC curves for the chemiluminescence immunoassay (CLIA) used in COVID-19 diagnosis is also illustrated.
Journal: Journal of Applied Statistics
Pages: 1418-1434
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2041565
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2041565
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1418-1434
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2026896_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Panayotis Papoutsis
Author-X-Name-First: Panayotis
Author-X-Name-Last: Papoutsis
Author-Name: Tarn Duong
Author-X-Name-First: Tarn
Author-X-Name-Last: Duong
Author-Name: Bertrand Michel
Author-X-Name-First: Bertrand
Author-X-Name-Last: Michel
Author-Name: Anne Philippe
Author-X-Name-First: Anne
Author-X-Name-Last: Philippe
Title: Bayesian hierarchical models for the prediction of the driver flow and passenger waiting times in a stochastic carpooling service
Abstract:
Carpooling is an integral component in smart carbon-neutral cities, in particular to facilitate home-work commuting. We study an innovative carpooling service which offers stochastic passenger-driver matching. Stochastic matching is when a passenger makes a carpooling request, and then waits for the first driver from a population of drivers who are already en route. Crucially, a designated driver is not assigned as in a traditional carpooling service. For this new form of stochastic carpooling, we propose a two-stage Bayesian hierarchical model to predict the driver flow and the passenger waiting times. The first stage focuses on prediction of the aggregated daily driver flows, and the second stage processes these daily driver flows into hourly predictions of the passenger waiting times. We demonstrate, for an operational carpooling service, that the predictions from our Bayesian hierarchical model outperform the predictions from a frequentist model and a Bayesian non-hierarchical model. The inferences from our proposed model provide insights for the service operator in their evidence-based decision making.
Journal: Journal of Applied Statistics
Pages: 1310-1333
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2026896
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2026896
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1310-1333
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2028131_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jinwen Liang
Author-X-Name-First: Jinwen
Author-X-Name-Last: Liang
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Title: Sparse regression for low-dimensional time-dynamic varying coefficient models with application to air quality data
Abstract:
Time-dynamic varying coefficient models play an important role in applications in biology, medicine, environmental science, finance, etc. Traditional methods, such as kernel smoothing and spline smoothing, are popular, but explicit expressions are unavailable with these methods, and the convergence rate of the coefficient function estimators is slow. To address these problems, we expand the varying component in appropriate basis functions and then solve a sparse regression problem via a sequential thresholded least-squares estimator. This ‘parameterization’ leads to explicit expressions and fast computation. Convergence of the sequential thresholded least-squares algorithm is guaranteed. The asymptotic distribution of the coefficient function estimator is derived under certain assumptions. Our simulations show that the proposed method has higher precision and computing speed. Finally, our proposed method is applied to the study of $ PM_{2.5} $ concentration in Beijing. We analyze the relationship between $ PM_{2.5} $ and other impact factors.
Journal: Journal of Applied Statistics
Pages: 1378-1399
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2028131
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2028131
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1378-1399
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2024154_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: F. Delgado-Vences
Author-X-Name-First: F.
Author-X-Name-Last: Delgado-Vences
Author-Name: F. Baltazar-Larios
Author-X-Name-First: F.
Author-X-Name-Last: Baltazar-Larios
Author-Name: A. Ornelas Vargas
Author-X-Name-First: A. Ornelas
Author-X-Name-Last: Vargas
Author-Name: E. Morales-Bojórquez
Author-X-Name-First: E.
Author-X-Name-Last: Morales-Bojórquez
Author-Name: V. H. Cruz-Escalona
Author-X-Name-First: V. H.
Author-X-Name-Last: Cruz-Escalona
Author-Name: C. Salomón Aguilar
Author-X-Name-First: C.
Author-X-Name-Last: Salomón Aguilar
Title: Inference for a discretized stochastic logistic differential equation and its application to biological growth
Abstract:
In this paper, we present a method to fit a stochastic logistic differential equation (SLDE) to a set of highly sparse real data. We assume that the SLDE has two unknown parameters to be estimated. We compute the maximum likelihood estimator (MLE) of the intrinsic growth rate and prove that the MLE is strongly consistent and asymptotically normal. For estimating the diffusion parameter, the quadratic variation of the data is used. We validate our method with several types of simulated data. For more realistic cases in which we observe discretizations of the solution, we use diffusion bridges and the stochastic expectation-maximization algorithm to estimate the parameters. Furthermore, when we observe only one point of each path for a given number of trajectories, we are still able to estimate the parameters of the SLDE. As far as we know, this is the first attempt to fit stochastic differential equations (SDEs) to these types of data. Finally, we apply our method to real data coming from a fishery. The proposed adjustment method can be applied to other examples of SDEs and is highly applicable in several areas of science, especially in situations of sparse data.
Journal: Journal of Applied Statistics
Pages: 1231-1254
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2024154
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2024154
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1231-1254
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2026898_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Diana Q. Chen
Author-X-Name-First: Diana Q.
Author-X-Name-Last: Chen
Author-Name: Si-Qi Mao
Author-X-Name-First: Si-Qi
Author-X-Name-Last: Mao
Author-Name: Xu-Feng Niu
Author-X-Name-First: Xu-Feng
Author-X-Name-Last: Niu
Title: Tests and classification methods in adaptive designs with applications
Abstract:
Statistical tests for biomarker identification and classification methods for patient grouping are two important topics in adaptive designs of clinical trials related to genomic studies. In this article, we evaluate four test methods for biomarker identification in the first stage of an adaptive design: a model-based identification method, the popular two-sided t-test, the nonparametric Wilcoxon rank-sum test (two-sided), and regularized generalized linear models. For patient grouping in the second stage, we examine classification methods such as Random Forest, Elastic-net Regularized Generalized Linear Models, Support Vector Machine (SVM), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost). Simulation studies are carried out to assess the performance of the different methods. The best identification methods are chosen based on the well-known $ F_1 $ score, while the best classification techniques are selected based on the area under a receiver operating characteristic curve (AUC). The chosen methods are then applied to the Adaptive Signature Design (ASD) with a real data set from breast cancer patients for the purpose of evaluating the performance of ASD in different situations.
Journal: Journal of Applied Statistics
Pages: 1334-1357
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2026898
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2026898
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1334-1357
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2026895_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: O. Eguasa
Author-X-Name-First: O.
Author-X-Name-Last: Eguasa
Author-Name: E. Edionwe
Author-X-Name-First: E.
Author-X-Name-Last: Edionwe
Author-Name: J. I. Mbegbu
Author-X-Name-First: J. I.
Author-X-Name-Last: Mbegbu
Title: Local Linear Regression and the problem of dimensionality: a remedial strategy via a new locally adaptive bandwidths selector
Abstract:
Local Linear Regression (LLR) is a nonparametric regression model applied in the modeling phase of Response Surface Methodology (RSM). LLR does not make reference to any fixed parametric model; hence, it is flexible and can capture local trends in the data that might be too complicated for the OLS. However, besides the small sample sizes and sparse data which characterize RSM, the performance of the LLR model deteriorates as the number of explanatory variables considered in the study increases. This phenomenon, popularly referred to as the curse of dimensionality, results in the scanty application of LLR in RSM. In this paper, we propose a novel locally adaptive bandwidths selector which, unlike the fixed bandwidths and existing locally adaptive bandwidths selectors, takes into account both the number of explanatory variables in the study and their individual values at each data point. Single- and multiple-response problems from the literature and simulated data were used to compare the performance of the $ LLR_{PAB} $ with those of the OLS, $ LLR_{FB} $ and $ LLR_{AB} $. Neural network activation functions such as ReLU, Leaky-ReLU, SELU and SPOCU were also considered and gave a remarkable improvement in the loss function (mean squared error) over the regression models utilized in the three data sets.
Journal: Journal of Applied Statistics
Pages: 1283-1309
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2022.2026895
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2026895
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1283-1309
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2024798_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Clécio S. Ferreira
Author-X-Name-First: Clécio S.
Author-X-Name-Last: Ferreira
Author-Name: Camila Borelli Zeller
Author-X-Name-First: Camila
Author-X-Name-Last: Borelli Zeller
Author-Name: Rafael R. de Oliveira Garcia
Author-X-Name-First: Rafael R.
Author-X-Name-Last: de Oliveira Garcia
Title: Heteroscedastic partially linear model under skew-normal distribution with application in ragweed pollen concentration
Abstract:
We introduce a new class of heteroscedastic partially linear models (PLMs) with skew-normal distribution. Maximum likelihood estimation of the model parameters by the ECM (Expectation/Conditional Maximization) algorithm as well as influence diagnostics for the new model are investigated. In addition, a likelihood ratio test for assessing the homogeneity of the scale parameter is presented. Simulation studies are developed to assess the performance of the ECM algorithm and of the likelihood ratio test statistic for homogeneity of variance. A study of misspecification of the structure function is also considered. Finally, an application of the new heteroscedastic PLM to a real data set on ragweed pollen concentration is presented to show that it provides a better fit than the classic homoscedastic PLM. We hope that the proposed model may attract applications in different areas of knowledge.
Journal: Journal of Applied Statistics
Pages: 1255-1282
Issue: 6
Volume: 50
Year: 2023
Month: 04
X-DOI: 10.1080/02664763.2021.2024798
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2024798
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:6:p:1255-1282
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2034760_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Qihuang Zhang
Author-X-Name-First: Qihuang
Author-X-Name-Last: Zhang
Author-Name: Grace Y. Yi
Author-X-Name-First: Grace Y.
Author-X-Name-Last: Yi
Title: Sensitivity analysis of error-contaminated time series data under autoregressive models with the application of COVID-19 data
Abstract:
Autoregressive (AR) models are useful in time series analysis. Inferences under such models are distorted in the presence of measurement error, a common feature in applications. In this article, we establish analytical results for quantifying the biases of the parameter estimation in AR models if the measurement error effects are neglected. We consider two measurement error models to describe different data contamination scenarios. We propose an estimating equation approach to estimate the AR model parameters with measurement error effects accounted for. We further discuss forecasting using the proposed method. Our work is inspired by COVID-19 data, which are error-contaminated due to multiple reasons including those related to asymptomatic cases and varying incubation periods. We implement the proposed method by conducting sensitivity analyses and forecasting the fatality rate of COVID-19 over time for the four most populated provinces in Canada. The results suggest that incorporating or not incorporating measurement error effects may yield rather different results for parameter estimation and forecasting.
Journal: Journal of Applied Statistics
Pages: 1611-1634
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2034760
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2034760
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1611-1634
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2032621_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Shubham Saini
Author-X-Name-First: Shubham
Author-X-Name-Last: Saini
Author-Name: Sachin Tomer
Author-X-Name-First: Sachin
Author-X-Name-Last: Tomer
Author-Name: Renu Garg
Author-X-Name-First: Renu
Author-X-Name-Last: Garg
Title: Inference of multicomponent stress-strength reliability following Topp-Leone distribution using progressively censored data
Abstract:
In this paper, inference for multicomponent stress-strength reliability is derived using progressively censored samples from the Topp-Leone distribution. Both stress and strength variables are assumed to follow Topp-Leone distributions with different shape parameters. The maximum likelihood estimate along with the asymptotic confidence interval is developed. Boot-p and Boot-t confidence intervals are also constructed. The Bayes estimates under the generalized entropy loss function based on gamma priors are derived using Lindley's approximation, Tierney-Kadane's approximation and Markov chain Monte Carlo methods. A simulation study is conducted to assess the performance of the various estimation methods under different censoring schemes. A real data study shows the applicability of the proposed estimation methods.
Journal: Journal of Applied Statistics
Pages: 1538-1567
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2032621
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2032621
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1538-1567
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2032619_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: T. S. McElroy
Author-X-Name-First: T. S.
Author-X-Name-Last: McElroy
Author-Name: Thomas Trimbur
Author-X-Name-First: Thomas
Author-X-Name-Last: Trimbur
Title: Variable targeting and reduction in large vector autoregressions with applications to workforce indicators
Abstract:
We develop statistical tools for time series analysis of large multivariate datasets, when a few core series are of principal interest and there are many potential auxiliary predictive variables. The methodology, based on Vector Autoregressions (VAR), handles the case where unrestricted fitting is precluded by a large number of series and a huge parameter space. In particular, we adopt a forecast error criterion and use Granger-causality tests in a sequential manner to build a VAR model that targets the main variables. This approach effects variable reduction (or equivalently, sparsity restrictions) in a computationally fast way that remains feasible for large dimensions. The search for the best model results in a VAR, fitted with a selection of supporting series, that has the best possible forecast performance with respect to the core variables. We apply the statistical methodology to model real Gross Domestic Product and the national Unemployment Rate, two time series widely monitored by economists and policy-makers, based on a large set of Quarterly Workforce Indicators comprising various major sectors of the economy and different measures of labor market conditions.
Journal: Journal of Applied Statistics
Pages: 1515-1537
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2032619
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2032619
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1515-1537
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2034759_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Brent Burch
Author-X-Name-First: Brent
Author-X-Name-Last: Burch
Author-Name: Jesse Egbert
Author-X-Name-First: Jesse
Author-X-Name-Last: Egbert
Title: Confidence intervals for ratios of means applied to corpus-based word frequency classes
Abstract:
The words we choose when we communicate with one another convey meaning and information. In written or spoken language, we tend to employ a relatively small number of words repeatedly whereas a large number of words in the lexicon are seldom used. By considering a ratio of means of the most prevalent word in a body of texts (or corpus) compared to that of the word in question, one can quantify the prevalence of the word in question. Furthermore, the concept of word classes or grouping words having similar measures of prevalence enables the investigator to compare the words. Using a sample of texts having varying lengths from a corpus, the sample mean relative frequency of a word and the maximum likelihood estimator using the zero-inflated beta distribution serve as two measures of the prevalence of a word. We construct and then compare asymptotic confidence intervals involving ratios of means for a number of words in the British National Corpus, a 100 million-word collection of written and spoken language of a wide range of British English. We also examine the sample sizes required to meet specific objectives regarding word classes and ratios of means.
Journal: Journal of Applied Statistics
Pages: 1592-1610
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2034759
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2034759
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1592-1610
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2031125_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Youssef Anzarmou
Author-X-Name-First: Youssef
Author-X-Name-Last: Anzarmou
Author-Name: Abdallah Mkhadri
Author-X-Name-First: Abdallah
Author-X-Name-Last: Mkhadri
Author-Name: Karim Oualkacha
Author-X-Name-First: Karim
Author-X-Name-Last: Oualkacha
Title: The Kendall interaction filter for variable interaction screening in high dimensional classification problems
Abstract:
Accounting for important interaction effects can improve the prediction of many statistical learning models. Identification of relevant interactions, however, is a challenging issue owing to their ultrahigh-dimensional nature. Interaction screening strategies can alleviate such issues. However, due to the heavier-tailed distribution and complex dependence structure of interaction effects, innovative robust and/or model-free methods for screening interactions are required to better scale analysis of complex and high-throughput data. In this work, we develop a new model-free interaction screening method, termed the Kendall Interaction Filter (KIF), for classification in high-dimensional settings. The KIF method uses a weighted-sum measure, which compares the overall Kendall's τ of pairs of predictors to the within-class version, to select interactive couples of features. The proposed KIF measure captures interactions relevant to the classes of the response variable, handles continuous, categorical or mixed continuous-categorical features, and is invariant under monotonic transformations. The KIF measure enjoys the sure screening property in the high-dimensional setting under mild conditions, without imposing sub-exponential moment assumptions on the features' distribution. We illustrate the favorable behavior of the proposed methodology compared to methods in the same category using simulation studies, and we conduct real data analyses to demonstrate its utility.
Journal: Journal of Applied Statistics
Pages: 1496-1514
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2031125
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2031125
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1496-1514
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2034141_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: José Clelto Barros Gomes
Author-X-Name-First: José Clelto Barros
Author-X-Name-Last: Gomes
Author-Name: Reiko Aoki
Author-X-Name-First: Reiko
Author-X-Name-Last: Aoki
Author-Name: Victor Hugo Lachos
Author-X-Name-First: Victor Hugo
Author-X-Name-Last: Lachos
Author-Name: Gilberto Alvarenga Paula
Author-X-Name-First: Gilberto Alvarenga
Author-X-Name-Last: Paula
Author-Name: Cibele Maria Russo
Author-X-Name-First: Cibele Maria
Author-X-Name-Last: Russo
Title: Fast inference for robust nonlinear mixed-effects models
Abstract:
Interest in nonlinear mixed-effects models comes from application areas such as pharmacokinetics, growth curves and HIV viral dynamics. However, the modeling procedure usually leads to many difficulties, such as the inclusion of random effects, the estimation process and the model's sensitivity to atypical or nonnormal data. The scale mixtures of normal distributions include heavy-tailed models, such as the Student-t, slash and contaminated normal distributions, and provide competitive alternatives to the usual models, enabling robust estimation in the presence of outlying observations. Our proposal is to compare two estimation methods in nonlinear mixed-effects models where the random components follow a multivariate scale mixture of normal distributions. For this purpose, a Monte Carlo expectation-maximization (MCEM) algorithm and an efficient likelihood-based approximate method are developed. Results show that the approximate method is much faster and enables fairly efficient likelihood maximization, although a slightly larger bias may be produced, especially for the fixed-effects parameters. A discussion of the robustness aspects of the proposed models is also provided. Two real nonlinear applications are discussed and a brief simulation study is presented.
Journal: Journal of Applied Statistics
Pages: 1568-1591
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2034141
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2034141
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1568-1591
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2064439_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Masao Ueki
Author-X-Name-First: Masao
Author-X-Name-Last: Ueki
Title: Beta-negative binomial nonlinear spatio-temporal random effects modeling of COVID-19 case counts in Japan
Abstract:
Coronavirus disease 2019 (COVID-19) caused by the SARS-CoV-2 virus has spread seriously throughout the world. Predicting the spread, or the number of cases, in the future can facilitate preparation for, and prevention of, a worst-case scenario. To achieve these purposes, statistical modeling using past data is one feasible approach. This paper describes spatio-temporal modeling of COVID-19 case counts in 47 prefectures of Japan using a nonlinear random effects model, where random effects are introduced to capture the heterogeneity of a number of model parameters associated with the prefectures. The negative binomial distribution is frequently used with the Paul-Held random effects model to account for overdispersion in count data; however, the negative binomial distribution is known to be incapable of accommodating extreme observations such as those found in the COVID-19 case count data. We therefore propose use of the beta-negative binomial distribution with the Paul-Held model. This distribution is a generalization of the negative binomial distribution that has attracted much attention in recent years because it can model extreme observations with analytical tractability. The proposed beta-negative binomial model was applied to multivariate count time series data of COVID-19 cases in the 47 prefectures of Japan. Evaluation by one-step-ahead prediction showed that the proposed model can accommodate extreme observations without sacrificing predictive performance.
Journal: Journal of Applied Statistics
Pages: 1650-1663
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2064439
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2064439
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1650-1663
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2026897_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Woojoo Lee
Author-X-Name-First: Woojoo
Author-X-Name-Last: Lee
Author-Name: Jeonghwan Kim
Author-X-Name-First: Jeonghwan
Author-X-Name-Last: Kim
Author-Name: Donghwan Lee
Author-X-Name-First: Donghwan
Author-X-Name-Last: Lee
Title: Revisiting the analysis pipeline for overdispersed Poisson and binomial data
Abstract:
Overdispersion is a common feature in categorical data analysis and several methods have been developed for detecting and handling it in generalized linear models. The first aim of this study is to clarify the relationships among various score statistics for testing overdispersion and to compare their performances. In addition, we investigate a principled way to correct finite sample bias in the score statistic caused by estimating regression parameters with restricted likelihood. The second aim is to reconsider the current practice for handling overdispersed categorical data. Although the conventional models are based on substantially different mechanisms for generating overdispersion, model selection in practice has not been well studied. We perform an intensive numerical study for determining which method is more robust to various overdispersion mechanisms. In addition, we provide some graphical tools for identifying the better model. The last aim is to reconsider the key assumption for deriving the score statistics. We study the meaning of testing overdispersion when this assumption is violated, and we analytically show the conditions for which it is not appropriate to employ the current statistical practices for analyzing overdispersed data.
Journal: Journal of Applied Statistics
Pages: 1455-1476
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2026897
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2026897
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1455-1476
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2055749_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Suzanna-Maria Paleologou
Author-X-Name-First: Suzanna-Maria
Author-X-Name-Last: Paleologou
Title: Income and democracy: a bivariate copula approach
Abstract:
We propose a new approach for exploring the relationship between income and democracy by modeling the two most popular discrete democracy indexes, Polity IV and Freedom House, as a joint random variable by means of a copula function. Joint modeling is crucial for eliciting complementarity and/or substitutability amongst these indexes claiming to measure similar things, i.e. a country’s degree of democratization. We find strong evidence supporting both the existence of the relationship and the positive dependence between the two democracy indexes, suggesting that they are complements to each other. Our findings are robust to different samples and model specifications.
Journal: Journal of Applied Statistics
Pages: 1635-1649
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2055749
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2055749
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1635-1649
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2031123_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Chunjie Wu
Author-X-Name-First: Chunjie
Author-X-Name-Last: Wu
Author-Name: Zhijun Wang
Author-X-Name-First: Zhijun
Author-X-Name-Last: Wang
Author-Name: Steven MacEachern
Author-X-Name-First: Steven
Author-X-Name-Last: MacEachern
Author-Name: Jingjing Schneider
Author-X-Name-First: Jingjing
Author-X-Name-Last: Schneider
Title: A robust latent CUSUM chart for monitoring customer attrition
Abstract:
In competitive businesses, such as insurance and telecommunications, customers can easily switch from one provider to another, which leads to customer attrition. Keeping the customer attrition rate low is crucial for companies, since retaining a customer is more profitable than recruiting a new one. As a main statistical process control (SPC) method, the CUSUM scheme is able to detect small and persistent shifts in customer attrition. However, customer attrition summaries are typically available on an uneven time scale (e.g. 4-week and 5-week ‘business months’), which may not satisfy the assumptions of traditional CUSUM designs. This paper develops a latent CUSUM chart based on an exponential model for monitoring ‘monthly’ customer attrition under varying time scales. Both maximum likelihood and least squares methods are studied; the former mostly performs better, while the latter is advantageous for very small shifts. We apply a Markov chain algorithm to obtain the average run length (ARL), make calibrations for different combinations of parameters, and present reference tables of cutoffs. Three more complicated models are considered to test robustness to deviations from the initial model. Furthermore, a real example of monitoring monthly customer attrition from a Chinese insurance company is used to illustrate the scheme.
Journal: Journal of Applied Statistics
Pages: 1477-1495
Issue: 7
Volume: 50
Year: 2023
Month: 05
X-DOI: 10.1080/02664763.2022.2031123
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2031123
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:7:p:1477-1495
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2036707_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: E. M. Hashimoto
Author-X-Name-First: E. M.
Author-X-Name-Last: Hashimoto
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: V. G. Cancho
Author-X-Name-First: V. G.
Author-X-Name-Last: Cancho
Author-Name: I. Silva
Author-X-Name-First: I.
Author-X-Name-Last: Silva
Title: The re-parameterized inverse Gaussian regression to model length of stay of COVID-19 patients in the public health care system of Piracicaba, Brazil
Abstract:
Among the models applied to analyze survival data, a standout is the inverse Gaussian distribution, which belongs to the class of models to analyze positive asymmetric data. However, the variance of this distribution depends on two parameters, which prevents establishing a functional relation with a linear predictor when the assumption of constant variance does not hold. In this context, the aim of this paper is to re-parameterize the inverse Gaussian distribution to enable establishing an association between a linear predictor and the variance. We propose deviance residuals to verify the model assumptions. Some simulations indicate that the distribution of these residuals approaches the standard normal distribution and the mean squared errors of the estimators are small for large samples. Further, we fit the new model to hospitalization times of COVID-19 patients in Piracicaba (Brazil), which indicates that men spend more time hospitalized than women, and this pattern is more pronounced for individuals older than 60 years. The re-parameterized inverse Gaussian model proved to be a good alternative to analyze censored data with non-constant variance.
Journal: Journal of Applied Statistics
Pages: 1665-1685
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2036707
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2036707
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1665-1685
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2046713_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Brian Neelon
Author-X-Name-First: Brian
Author-X-Name-Last: Neelon
Author-Name: Chun-Che Wen
Author-X-Name-First: Chun-Che
Author-X-Name-Last: Wen
Author-Name: Sara E. Benjamin-Neelon
Author-X-Name-First: Sara E.
Author-X-Name-Last: Benjamin-Neelon
Title: A multivariate spatiotemporal model for tracking COVID-19 incidence and death rates in socially vulnerable populations
Abstract:
Recent studies have produced inconsistent findings regarding the association between community social vulnerability and COVID-19 incidence and death rates. This inconsistency may be due, in part, to the fact that these studies modeled cases and deaths separately, ignoring their inherent association and thus yielding imprecise estimates. To improve inferences, we develop a Bayesian multivariate negative binomial model for exploring joint spatial and temporal trends in COVID-19 infections and deaths. The model introduces smooth functions that capture long-term temporal trends, while maintaining enough flexibility to detect local outbreaks in areas with vulnerable populations. Using multivariate autoregressive priors, we jointly model COVID-19 cases and deaths over time, taking advantage of convenient conditional representations to improve posterior computation. As such, the proposed model provides a general framework for multivariate spatiotemporal modeling of counts and rates. We adopt a fully Bayesian approach and develop an efficient posterior Markov chain Monte Carlo algorithm that relies on easily sampled Gibbs steps. We use the model to examine incidence and death rates among counties with high and low social vulnerability in the state of Georgia, USA, from 15 March to 15 December 2020.
Journal: Journal of Applied Statistics
Pages: 1812-1835
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2046713
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2046713
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1812-1835
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2043254_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Ruofei Du
Author-X-Name-First: Ruofei
Author-X-Name-Last: Du
Author-Name: Li Luo
Author-X-Name-First: Li
Author-X-Name-Last: Luo
Author-Name: Laurie G. Hudson
Author-X-Name-First: Laurie G.
Author-X-Name-Last: Hudson
Author-Name: Sara Nozadi
Author-X-Name-First: Sara
Author-X-Name-Last: Nozadi
Author-Name: Johnnye Lewis
Author-X-Name-First: Johnnye
Author-X-Name-Last: Lewis
Title: An adjusted partial least squares regression framework to utilize additional exposure information in environmental mixture data analysis
Abstract:
In a large-scale environmental health population study that is composed of subprojects, often different fractions of participants out of the total enrolled have measures of specific outcomes. It is conceptually reasonable to assume that the association study would benefit from utilizing additional exposure information from participants for whom a specific outcome was not measured. Partial least squares regression is a practical approach to determine the exposure-outcome associations for mixture data. Like a typical regression approach, however, partial least squares regression requires that each data observation have both complete covariates and outcomes for model fitting. In this paper, we propose novel adjustments to the general partial least squares regression to estimate and examine the association effects of individual environmental exposures on an outcome within a more complete context of the study population's environmental mixture exposures. The proposed framework takes advantage of the bilinear model structure. It allows information from all participants, with or without the outcome values, to contribute to the model fitting and the assessment of association effects. Using this proposed framework, incorporation of additional information leads to smaller root mean square errors in the estimation of association effects, and improves the ability to assess the significance of the effects.
Journal: Journal of Applied Statistics
Pages: 1790-1811
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2043254
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2043254
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1790-1811
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2043255_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Tsirizani M. Kaombe
Author-X-Name-First: Tsirizani M.
Author-X-Name-Last: Kaombe
Author-Name: Samuel O. M. Manda
Author-X-Name-First: Samuel O. M.
Author-X-Name-Last: Manda
Title: A novel outlier statistic in multivariate survival models and its application to identify unusual under-five mortality sub-districts in Malawi
Abstract:
Although under-five mortality (U5M) rates have declined worldwide, many countries in sub-Saharan Africa still have much higher rates. Detection of subnational areas with unusually higher U5M rates could support targeted high impact child health interventions. We propose a novel group outlier detection statistic for identifying areas with extreme U5M rates under a multivariate survival data model. The performance of the proposed statistic was evaluated through a simulation study. We applied the proposed method to an analysis of child survival data in Malawi to identify sub-districts with unusually higher or lower U5M rates. The simulation study showed that the proposed outlier statistic can detect unusual high or low mortality groups with a high accuracy of at least 90%, for datasets with at least 50 clusters of size 80 or more. In the application, at most 7 U5M outlier sub-districts were identified, based on the best fitting model as measured by the Akaike information criterion (AIC).
Journal: Journal of Applied Statistics
Pages: 1836-1852
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2043255
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2043255
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1836-1852
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2036953_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Mingming Liu
Author-X-Name-First: Mingming
Author-X-Name-Last: Liu
Author-Name: Jing Yang
Author-X-Name-First: Jing
Author-X-Name-Last: Yang
Author-Name: Yushi Liu
Author-X-Name-First: Yushi
Author-X-Name-Last: Liu
Author-Name: Bochao Jia
Author-X-Name-First: Bochao
Author-X-Name-Last: Jia
Author-Name: Yun-Fei Chen
Author-X-Name-First: Yun-Fei
Author-X-Name-Last: Chen
Author-Name: Luna Sun
Author-X-Name-First: Luna
Author-X-Name-Last: Sun
Author-Name: Shujie Ma
Author-X-Name-First: Shujie
Author-X-Name-Last: Ma
Title: A fusion learning method to subgroup analysis of Alzheimer's disease
Abstract:
Uncovering the heterogeneity in the disease progression of Alzheimer's is a key factor in disease understanding and treatment development, so that interventions can be tailored to target the subgroups that will benefit most from treatment, which is an important goal of precision medicine. However, in practice, one top methodological challenge hindering the heterogeneity investigation is that the true subgroup membership of each individual is often unknown. In this article, we aim to identify latent subgroups of individuals who share a common disorder progress over time, to predict latent subgroup memberships, and to estimate and infer the heterogeneous trajectories among the subgroups. To achieve these goals, we apply a concave fusion learning method to conduct subgroup analysis for longitudinal trajectories of the Alzheimer's disease data. The heterogeneous trajectories are represented by subject-specific unknown functions which are approximated by B-splines. The concave fusion method can simultaneously estimate the spline coefficients and merge them together for the subjects belonging to the same subgroup to automatically identify subgroups and recover the heterogeneous trajectories. The resulting estimator of the disease trajectory of each subgroup is supported by an asymptotic distribution. It provides a sound theoretical basis for further conducting statistical inference in subgroup analysis.
Journal: Journal of Applied Statistics
Pages: 1686-1708
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2036953
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2036953
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1686-1708
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2037528_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Kristopher Attwood
Author-X-Name-First: Kristopher
Author-X-Name-Last: Attwood
Author-Name: Surui Hou
Author-X-Name-First: Surui
Author-X-Name-Last: Hou
Author-Name: Alan Hutson
Author-X-Name-First: Alan
Author-X-Name-Last: Hutson
Title: Application of the skew exponential power distribution to ROC curves
Abstract:
The bi-Normal ROC model and corresponding metrics are commonly used in medical studies to evaluate the discriminatory ability of a biomarker. However, in practice, many clinical biomarkers tend to have skewed or other non-Normal distributions. And while the bi-Normal ROC model’s AUC tends to be unbiased in this setting, providing a reasonable measure of global performance, the corresponding decision thresholds tend to be biased. To correct this bias, we propose using an ROC model based on the skew exponential power (SEP) distribution, whose additional parameters can accommodate skewed, heavy-tailed, or other non-Normal distributions. Additionally, the SEP distribution can be used to evaluate whether the bi-Normal model would be appropriate. The performance of these ROC models and the non-parametric approach are evaluated via a simulation study and applied to a real data set involving infections from Klebsiella pneumoniae. The SEP-based ROC model provides some efficiency gains with respect to estimation of the AUC and provides cut-points with improved classification rates. As such, in the presence of non-Normal data, we suggest using the proposed SEP ROC model.
Journal: Journal of Applied Statistics
Pages: 1709-1724
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2037528
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2037528
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1709-1724
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2064440_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Wantanee Poonvoralak
Author-X-Name-First: Wantanee
Author-X-Name-Last: Poonvoralak
Title: Bayesian Markov Chain Monte Carlo for reparameterized Stochastic volatility models using Asian FX rates during Covid-19
Abstract:
In this paper, reparameterization and the Student-t distribution are applied to the Stochastic Volatility (SV) model. We aim to reduce the autocorrelation of the SV parameters and to introduce a heavy-tailed model via Bayesian computation with Markov Chain Monte Carlo (MCMC) samplers. This paper helps support better MCMC estimation of the SV model for volatile Asian FX series during Covid-19.
Journal: Journal of Applied Statistics
Pages: 1853-1875
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2064440
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2064440
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1853-1875
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2041567_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jing Kersey
Author-X-Name-First: Jing
Author-X-Name-Last: Kersey
Author-Name: Hani Samawi
Author-X-Name-First: Hani
Author-X-Name-Last: Samawi
Author-Name: Jingjing Yin
Author-X-Name-First: Jingjing
Author-X-Name-Last: Yin
Author-Name: Haresh Rochani
Author-X-Name-First: Haresh
Author-X-Name-Last: Rochani
Author-Name: Xinyan Zhang
Author-X-Name-First: Xinyan
Author-X-Name-Last: Zhang
Title: On diagnostic accuracy measure with cut-points criterion for ordinal disease classification based on concordance and discordance
Abstract:
The accuracy of a diagnostic test has always been essential in detecting disease staging. Many diagnostic accuracy measures are used in binary diagnostic tests, and some measures apply to multi-stage diagnosis. Yet, there are limitations to their implementation, and the performance depends highly on the distribution of diagnostic outcomes. Another essential aspect of medical diagnostic testing using biomarkers is to find an optimal cut-point that categorizes a patient as diseased or healthy. This aspect has been extended to diseases with more than two stages. We propose a diagnostic accuracy measure and optimal cut-point selection (CD), using concordance and discordance for k-stage diseases. The CD measure uses the classification agreement and disagreement between test outcomes and disease stages. Simulations for power studies suggest that CD can detect differences between the null and alternative hypotheses that other methods cannot in some scenarios. Simulation results indicate that using CD measures to select optimal cut-points can provide higher correct classification rates than the existing measures and more balanced accurate classification rates than the generalized Youden Index (GYI). An illustration is provided using the ANDI data to choose biomarkers for diagnosing Alzheimer's Disease (AD) and to select optimal cut-points for the chosen biomarkers.
Journal: Journal of Applied Statistics
Pages: 1772-1789
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2041567
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2041567
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1772-1789
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2041566_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Angkana Kokaew
Author-X-Name-First: Angkana
Author-X-Name-Last: Kokaew
Author-Name: Winai Bodhisuwan
Author-X-Name-First: Winai
Author-X-Name-Last: Bodhisuwan
Author-Name: Su-Fen Yang
Author-X-Name-First: Su-Fen
Author-X-Name-Last: Yang
Author-Name: Andrei Volodin
Author-X-Name-First: Andrei
Author-X-Name-Last: Volodin
Title: Logarithmic confidence estimation of a ratio of binomial proportions for dependent populations
Abstract:
This article investigates the logarithmic interval estimation of a ratio of two binomial proportions in dependent samples. Previous studies suggest that the confidence intervals of the difference between two correlated proportions and their ratio typically do not possess closed-form solutions. Moreover, the computation process is complex and often based on a maximum likelihood estimator, which is a biased estimator of the ratio. We look at the data from two dependent samples and explore the general problem of estimating the ratio of two proportions. Each sample is obtained in the framework of direct binomial sampling. Our goal is to demonstrate that the normal approximation for the estimation of the ratio is reliable for the construction of a confidence interval. The main characteristics of confidence estimators will be investigated by a Monte Carlo simulation. We also provide recommendations for applying the asymptotic logarithmic interval. The estimations of the coverage probability, average width, standard deviation of interval width, and index H are presented as the criteria of our judgment. The simulation studies indicate that the proposed interval performs well based on the aforementioned criteria. Finally, the confidence intervals are illustrated with three real data examples.
Journal: Journal of Applied Statistics
Pages: 1750-1771
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2041566
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2041566
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1750-1771
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2038546_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Melissa C. Key
Author-X-Name-First: Melissa C.
Author-X-Name-Last: Key
Author-Name: Susanne Ragg
Author-X-Name-First: Susanne
Author-X-Name-Last: Ragg
Author-Name: Benzion Boukai
Author-X-Name-First: Benzion
Author-X-Name-Last: Boukai
Title: A statistical testing procedure for validating class labels
Abstract:
Motivated by an open problem of validating protein identities in label-free shotgun proteomics work-flows, we present a testing procedure to validate class (protein) labels using available measurements across N instances (peptides). More generally, we present a non-parametric solution to the problem of identifying instances that are deemed as outliers relative to the subset of instances assigned to the same class. The primary assumption is that measured distances between instances within the same class are stochastically smaller than measured distances between instances from different classes. We show that the overall type I error probability across all instances within a class can be controlled by some fixed value (say α). We also demonstrate conditions where similar results on type II error probability hold. The theoretical results are supplemented by an extensive numerical study illustrating the applicability and viability of our method. Even with up to 25% of instances initially mislabeled, our testing procedure maintains a high specificity and greatly reduces the proportion of mislabeled instances. The applicability and effectiveness of our testing procedure is further illustrated by a detailed example on a proteomics data set from children with sickle cell disease where five spike-in proteins acted as contrasting controls.
Journal: Journal of Applied Statistics
Pages: 1725-1749
Issue: 8
Volume: 50
Year: 2023
Month: 06
X-DOI: 10.1080/02664763.2022.2038546
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2038546
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:8:p:1725-1749
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2060952_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jung Wun Lee
Author-X-Name-First: Jung Wun
Author-X-Name-Last: Lee
Author-Name: Ofer Harel
Author-X-Name-First: Ofer
Author-X-Name-Last: Harel
Title: Incomplete clustering analysis via multiple imputation
Abstract:
Clustering analysis is a prevalent statistical method which divides populations into several subgroups of similar units. However, most existing clustering methods require complete data. One general method that addresses incomplete data is multiple imputation (MI), which avoids many limitations found in other single imputation-based methods and complete-case analyses. Nevertheless, adapting the MI framework to clustering analysis can be challenging, since each imputed dataset might consist of a different number of clusters and there is not a unique parameter for clustering analysis. In response to this problem, we have developed MICA: Multiply Imputed Cluster Analysis. MICA is a framework for clustering incomplete data consisting of two clustering stages. We assess the properties of MICA and its superiority over other existing incomplete clustering strategies based on a simulation study under various data structures. In addition, we demonstrate the usage of MICA by applying it to the Youth Risk Behavior Surveillance System (YRBSS) 2019 data.
Journal: Journal of Applied Statistics
Pages: 1962-1979
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2060952
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2060952
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1962-1979
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2105827_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Shu Wang
Author-X-Name-First: Shu
Author-X-Name-Last: Wang
Author-Name: Elias Sayour
Author-X-Name-First: Elias
Author-X-Name-Last: Sayour
Author-Name: Ji-Hyun Lee
Author-X-Name-First: Ji-Hyun
Author-X-Name-Last: Lee
Title: Evaluation of phase I clinical trial designs for combinational agents along with guidance based on simulation studies
Abstract:
Combinational therapy that combines two or more therapeutic agents is very common in cancer treatment. Currently, many clinical trials aim to assess the feasibility, safety and activity of combinational therapeutics to achieve a synergistic response. Dose-finding for combinational agents is considerably more complex than for a single agent, because only a partial order of dose combinations' toxicity is known. Prototypical phase I designs may not adequately capture this complexity, thus limiting identification of the maximum tolerated dose (MTD) of combinational agents. In response, novel phase I clinical trial designs for combinational agents have been extensively proposed. However, with so many available designs, studies that compare their performance, explore the impact of design parameters and provide recommendations are limited. We evaluate available phase I designs that identify a single MTD for combinational agents using simulation studies under various conditions. We also explore the influence of different design parameters and summarize the risks/benefits of each design to provide general guidance in design selection.
Journal: Journal of Applied Statistics
Pages: 2055-2078
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2105827
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2105827
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:2055-2078
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2053949_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Guangbao Guo
Author-X-Name-First: Guangbao
Author-X-Name-Last: Guo
Author-Name: Yue Sun
Author-X-Name-First: Yue
Author-X-Name-Last: Sun
Author-Name: Guoqi Qian
Author-X-Name-First: Guoqi
Author-X-Name-Last: Qian
Author-Name: Qian Wang
Author-X-Name-First: Qian
Author-X-Name-Last: Wang
Title: LIC criterion for optimal subset selection in distributed interval estimation
Abstract:
Distributed interval estimation in linear regression may be computationally infeasible in the presence of big data that are normally stored in different computer servers or in the cloud. An additional challenge is that the results from distributed estimation may still contain redundant information about the population characteristics of the data. To tackle this computing challenge, we develop an optimization procedure to select the best subset from the collection of data subsets, based on which we perform interval estimation in the context of linear regression. The procedure is derived by minimizing the length of the final interval estimator and maximizing the information retained in the selected data subset, and is thus named the LIC criterion. The theoretical performance of the LIC criterion is studied in this paper, together with a simulation study and real data analysis.
Journal: Journal of Applied Statistics
Pages: 1900-1920
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2053949
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2053949
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1900-1920
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2054962_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Kwan-Young Bak
Author-X-Name-First: Kwan-Young
Author-X-Name-Last: Bak
Author-Name: Jae-Kyung Shin
Author-X-Name-First: Jae-Kyung
Author-X-Name-Last: Shin
Author-Name: Ja-Yong Koo
Author-X-Name-First: Ja-Yong
Author-X-Name-Last: Koo
Title: Intrinsic spherical smoothing method based on generalized Bézier curves and sparsity inducing penalization
Abstract:
This study examines an intrinsic penalized smoothing method on the 2-sphere. We propose a method based on the spherical Bézier curves obtained using a generalized de Casteljau algorithm to provide a degree-based regularity constraint to the spherical smoothing problem. A smooth Bézier curve is found by minimizing the least squares criterion under the regularization constraint. The de Casteljau algorithm constructs higher-order Bézier curves in a recursive manner using linear Bézier curves. We introduce a local penalization scheme based on a penalty function that regularizes the velocity differences in consecutive linear Bézier curves. The imposed penalty induces sparsity on the control points so that the proposed method determines the number of control points, or equivalently the order of the Bézier curve, in a data-adaptive way. An efficient Riemannian block coordinate descent algorithm is devised to implement the proposed method. Numerical studies based on real and simulated data are provided to illustrate the performance and properties of the proposed method. The results show that the penalized Bézier curve adapts well to local data trends without compromising overall smoothness.
Journal: Journal of Applied Statistics
Pages: 1942-1961
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2054962
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2054962
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1942-1961
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2063266_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Xun Li
Author-X-Name-First: Xun
Author-X-Name-Last: Li
Author-Name: Joyee Ghosh
Author-X-Name-First: Joyee
Author-X-Name-Last: Ghosh
Author-Name: Gabriele Villarini
Author-X-Name-First: Gabriele
Author-X-Name-Last: Villarini
Title: Bayesian negative binomial regression model with unobserved covariates for predicting the frequency of North Atlantic tropical storms
Abstract:
Predicting the annual frequency of tropical storms is of interest because it can provide basic information for improved preparation against these storms. Sea surface temperatures (SSTs) averaged over the hurricane season can predict annual tropical cyclone activity well, but predictions need to be made before the hurricane season, when the predictors are not yet observed. Several climate models issue forecasts of the SSTs, which can be used instead: existing approaches use these forecasts as surrogates for the true SSTs. We develop a Bayesian negative binomial regression model that makes a distinction between the true SSTs and their forecasts, both of which are included in the model. For prediction, the true SSTs may be regarded as unobserved predictors and sampled from their posterior predictive distribution. We also have a small fraction of missing data for the SST forecasts from the climate models. We therefore propose a model that can simultaneously handle missing predictors and variable selection uncertainty. If the main goal is prediction, an interesting question is whether we should include predictors in the model that are missing at the time of prediction. We attempt to answer this question and demonstrate that our model can provide gains in prediction.
Journal: Journal of Applied Statistics
Pages: 2014-2035
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2063266
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2063266
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:2014-2035
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2061430_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Xinhua Liu
Author-X-Name-First: Xinhua
Author-X-Name-Last: Liu
Author-Name: Zhezhen Jin
Author-X-Name-First: Zhezhen
Author-X-Name-Last: Jin
Title: On detecting the effect of exposure mixture
Abstract:
To study the effect of an exposure mixture on continuous health outcomes, one can use a linear model with a weighted sum of multiple standardized exposure variables as an index predictor and its coefficient as the overall effect. The unknown weights typically range between zero and one, indicating the contributions of individual exposures to the overall effect. Because the weight parameters are present only when the parameter for the overall effect is non-zero, testing hypotheses on the overall effect can be challenging, especially when the number of exposure variables is above two. This paper presents a working-model-based approach to estimate the parameter for the overall effect and to test specific hypotheses, including two tests for detecting the overall effect and one test for detecting unequal weights when the overall effect is evident. The statistics are computationally easy, and one can apply existing statistical software to perform the analysis. A simulation study shows that the proposed estimators for the parameters of interest may have better finite sample performance than some other estimators.
Journal: Journal of Applied Statistics
Pages: 1980-1991
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2061430
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2061430
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1980-1991
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2063265_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Hyune-Ju Kim
Author-X-Name-First: Hyune-Ju
Author-X-Name-Last: Kim
Author-Name: Huann-Sheng Chen
Author-X-Name-First: Huann-Sheng
Author-X-Name-Last: Chen
Author-Name: Douglas Midthune
Author-X-Name-First: Douglas
Author-X-Name-Last: Midthune
Author-Name: Bill Wheeler
Author-X-Name-First: Bill
Author-X-Name-Last: Wheeler
Author-Name: Dennis W. Buckman
Author-X-Name-First: Dennis W.
Author-X-Name-Last: Buckman
Author-Name: Donald Green
Author-X-Name-First: Donald
Author-X-Name-Last: Green
Author-Name: Jeffrey Byrne
Author-X-Name-First: Jeffrey
Author-X-Name-Last: Byrne
Author-Name: Jun Luo
Author-X-Name-First: Jun
Author-X-Name-Last: Luo
Author-Name: Eric J. Feuer
Author-X-Name-First: Eric J.
Author-X-Name-Last: Feuer
Title: Data-driven choice of a model selection method in joinpoint regression
Abstract:
Selecting the number of change points in segmented line regression is an important problem in trend analysis, and various approaches have been proposed in the literature. We first study the empirical properties of several model selection procedures and then propose a new method based on two Schwarz-type criteria: the classical Bayes Information Criterion (BIC) and a version with a harsher penalty than BIC (BIC3). The proposed rule is designed to use the former when effect sizes are small and the latter when effect sizes are large, and it employs the partial R-squared to determine the weight between BIC and BIC3. The proposed method is computationally much more efficient than the permutation test procedure that has been the default method of the Joinpoint software developed for cancer trend analysis, and its satisfactory performance is observed in our simulation study. Simulations indicate that the proposed method keeps the probability of correct selection at least as large as that of BIC3, whose performance is comparable to that of the permutation test procedure, and improves on BIC3 when it performs worse than BIC. The proposed method is applied to U.S. prostate cancer incidence and mortality rates.
Journal: Journal of Applied Statistics
Pages: 1992-2013
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2063265
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2063265
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1992-2013
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2064976_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Xin Zhao
Author-X-Name-First: Xin
Author-X-Name-Last: Zhao
Author-Name: Stuart Barber
Author-X-Name-First: Stuart
Author-X-Name-Last: Barber
Author-Name: Charles C Taylor
Author-X-Name-First: Charles C
Author-X-Name-Last: Taylor
Author-Name: Xiaokai Nie
Author-X-Name-First: Xiaokai
Author-X-Name-Last: Nie
Author-Name: Wenqian Shen
Author-X-Name-First: Wenqian
Author-X-Name-Last: Shen
Title: Spatio-temporal forecasting using wavelet transform-based decision trees with application to air quality and COVID-19 forecasting
Abstract:
We develop a new method that combines a decision tree with a wavelet transform to forecast time series data with spatial spillover effects. The method can not only improve prediction but also give good interpretability of the time series mechanism. As a feature exploration method, the wavelet transform represents information at different resolution levels, which may improve the performance of decision trees. The method is applied to simulated data and to air pollution and COVID-19 time series data sets. In the simulation, the Haar, LA8, D4 and D6 wavelets are compared, with the Haar wavelet having the best performance. In the air pollution application, wavelet transform-based decision trees describe the temporal effects of the air quality index, including autoregressive and seasonal effects, as well as the spatial correlation effect. To describe the spillover effect in contiguous regions, a spatial weight is constructed to improve the modeling performance. The results show that the air quality index has autoregressive, seasonal and spatial spillover effects. The wavelet-transformed variables have better forecasting performance and enhanced interpretability compared with the original variables. For the COVID-19 time series of cumulative cases, spatially weighted variables are not selected, which suggests that the lock-down policies were effective.
Journal: Journal of Applied Statistics
Pages: 2036-2054
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2064976
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2064976
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:2036-2054
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2052821_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: C. P. Yadav
Author-X-Name-First: C. P.
Author-X-Name-Last: Yadav
Author-Name: Sanjeev K. Tomer
Author-X-Name-First: Sanjeev K.
Author-X-Name-Last: Tomer
Author-Name: M. S. Panwar
Author-X-Name-First: M. S.
Author-X-Name-Last: Panwar
Title: A competing risk study of menarcheal age distribution based on non-recall current status data
Abstract:
In many cross-sectional studies, the chance that an individual will be able to recall an event exactly is very low. The possibility of recalling the exact time, as well as the cause, of an event usually decreases as the gap between the event and the monitoring time increases. This gives rise to non-recall current status data. In this article, an efficient approach to dealing with such non-recall current status data is established in a competing risks setup. In the classical method, a nested Expectation-Maximization technique is worked out for estimation, and the information matrix is evaluated using the missing information principle. In the Bayesian paradigm, point and interval estimates are obtained using the Gibbs sampling algorithm. Data from a recent anthropometric study containing the menarcheal status of girls and age at menarche are analyzed using the considered methodology.
Journal: Journal of Applied Statistics
Pages: 1877-1899
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2052821
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2052821
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1877-1899
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2053950_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yanqin Feng
Author-X-Name-First: Yanqin
Author-X-Name-Last: Feng
Author-Name: Jie Wang
Author-X-Name-First: Jie
Author-X-Name-Last: Wang
Author-Name: Yang Li
Author-X-Name-First: Yang
Author-X-Name-Last: Li
Title: Goodness-of-fit inference for the additive hazards regression model with clustered current status data
Abstract:
Clustered current status data are frequently encountered in biomedical research and other areas that require survival analysis. This paper proposes graphical and formal model assessment procedures to evaluate the goodness of fit of the additive hazards model to clustered current status data. The test statistics proposed are based on sums of martingale-based residuals. Relevant asymptotic properties are established, and empirical distributions of the test statistics can be simulated utilizing Gaussian multipliers. Extensive simulation studies confirmed that the proposed test procedures work well for practical scenarios. This proposed method applies when failure times within the same cluster are correlated, and in particular, when cluster sizes can be informative about intra-cluster correlations. The method is applied to analyze clustered current status data from a lung tumorigenicity study.
Journal: Journal of Applied Statistics
Pages: 1921-1941
Issue: 9
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2053950
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2053950
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:9:p:1921-1941
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2064977_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Vasileios Alevizakos
Author-X-Name-First: Vasileios
Author-X-Name-Last: Alevizakos
Author-Name: Kashinath Chatterjee
Author-X-Name-First: Kashinath
Author-X-Name-Last: Chatterjee
Author-Name: Christos Koukouvinos
Author-X-Name-First: Christos
Author-X-Name-Last: Koukouvinos
Author-Name: Angeliki Lappa
Author-X-Name-First: Angeliki
Author-X-Name-Last: Lappa
Title: A double generally weighted moving average control chart for monitoring the process variability
Abstract:
In the present article, a double generally weighted moving average (DGWMA) control chart based on a three-parameter logarithmic transformation, namely the S2-DGWMA chart, is proposed for monitoring the process variability. Monte Carlo simulations are utilized to evaluate the run-length performance of the S2-DGWMA chart. In addition, a detailed comparative study is conducted to compare the performance of the S2-DGWMA chart with that of several well-known memory-type control charts in the literature. The comparisons indicate that the proposed chart is more efficient in detecting small shifts and more sensitive in identifying upward shifts in the process variability. A real data example is given to illustrate the implementation of the new S2-DGWMA chart.
Journal: Journal of Applied Statistics
Pages: 2079-2107
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2064977
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2064977
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2079-2107
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2064978_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Fidel Ernesto Castro Morales
Author-X-Name-First: Fidel Ernesto Castro
Author-X-Name-Last: Morales
Author-Name: Daniele Torres Rodrigues
Author-X-Name-First: Daniele Torres
Author-X-Name-Last: Rodrigues
Title: Spatiotemporal nonhomogeneous Poisson model with a seasonal component applied to the analysis of extreme rainfall
Abstract:
This paper develops an extension of spatiotemporal models that handle count data using nonhomogeneous Poisson processes. In this new proposal, we incorporate a seasonal cycle component in the definition of the intensity function to control for possible effects produced by the occurrence of the event of interest in regular periods. The seasonal cycle can cause problems in estimating the shape parameter of the Weibull and generalized Goel intensity functions. This shape parameter serves to confront the research hypothesis that seeks to identify a trend in the occurrence rate of an event of interest. In the case of the Weibull intensity function, a shape parameter not significantly different from one indicates a constant occurrence rate, a value less than one indicates a decreasing rate, and a value greater than one indicates an increasing rate. In the case of the Goel intensity function, parameter values less than or equal to one indicate a decreasing occurrence rate, and values greater than one indicate the presence of a change point. We also built a spatial model using the Musa-Okumoto intensity function as an alternative for approximating counting processes in which there is a decreasing trend in the occurrence rate of the event of interest. We estimated the parameters of the proposed method from a Bayesian perspective. Finally, we fitted the proposed model and compared it with other approximations to analyze the frequency of extreme rainfall in the northern region of the states of Maranhão and Piauí in northeastern Brazil over ten years. Among the main results, we found that (1) the proposed method proved superior to the other models in fit and prediction performance, and (2) unlike the other approximations, the proposed model does not detect changes in the rate of extreme rainfall occurrences.
Journal: Journal of Applied Statistics
Pages: 2108-2126
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2064978
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2064978
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2108-2126
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2065468_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: N Vipin
Author-X-Name-First: N
Author-X-Name-Last: Vipin
Author-Name: Indranil Ghosh
Author-X-Name-First: Indranil
Author-X-Name-Last: Ghosh
Author-Name: S. M. Sunoj
Author-X-Name-First: S. M.
Author-X-Name-Last: Sunoj
Title: Some properties of stop-loss moments under biased sampling
Abstract:
Stop-loss moments have generally been used as summary measures for analyzing data that exceed specific threshold levels. In many scientific studies, the investigator cannot record the sampling units with equal probability; in such a scenario, the selected sample units appear with unequal probabilities, in other words with different weights, which leads to biased or weighted sampling. In the present study, we examine the usefulness of stop-loss moments under biased sampling. The application of weighted stop-loss moments in analyzing biased data is investigated and compared using different empirical estimators through simulated and real data sets.
Journal: Journal of Applied Statistics
Pages: 2127-2150
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2065468
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2065468
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2127-2150
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2068512_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Pei-Fang Su
Author-X-Name-First: Pei-Fang
Author-X-Name-Last: Su
Author-Name: Junjiang Zhong
Author-X-Name-First: Junjiang
Author-X-Name-Last: Zhong
Author-Name: Yi-Chia Liu
Author-X-Name-First: Yi-Chia
Author-X-Name-Last: Liu
Author-Name: Tzu-Hsuan Lin
Author-X-Name-First: Tzu-Hsuan
Author-X-Name-Last: Lin
Author-Name: Huang-Tz Ou
Author-X-Name-First: Huang-Tz
Author-X-Name-Last: Ou
Title: Efficient estimation of a Cox model when integrating the subgroup incidence rate information
Abstract:
Incidence rates for diseases are widely used in medical research because they lead to clear and simple physical and clinical interpretations. In this study, we propose an efficient estimation method that incorporates auxiliary subgroup information related to the incidence rate into the estimation of the Cox proportional hazards model. The results show that utilizing the incidence rate information improves the efficiency of the estimation of regression parameters, based on the double empirical likelihood method, compared to conventional models that do not incorporate such information. We show that the estimators of the regression parameters asymptotically follow a multivariate normal distribution with a variance-covariance matrix that can be consistently estimated. Simulation results indicate that the proposed estimators significantly increase efficiency. Finally, an example of the effects of type 2 diabetes on stroke is used to demonstrate the proposed method.
Journal: Journal of Applied Statistics
Pages: 2151-2170
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2068512
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2068512
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2151-2170
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2068513_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yulan B. van Oppen
Author-X-Name-First: Yulan B.
Author-X-Name-Last: van Oppen
Author-Name: Gabi Milder-Mulderij
Author-X-Name-First: Gabi
Author-X-Name-Last: Milder-Mulderij
Author-Name: Christophe Brochard
Author-X-Name-First: Christophe
Author-X-Name-Last: Brochard
Author-Name: Rink Wiggers
Author-X-Name-First: Rink
Author-X-Name-Last: Wiggers
Author-Name: Saskia de Vries
Author-X-Name-First: Saskia
Author-X-Name-Last: de Vries
Author-Name: Wim P. Krijnen
Author-X-Name-First: Wim P.
Author-X-Name-Last: Krijnen
Author-Name: Marco A. Grzegorczyk
Author-X-Name-First: Marco A.
Author-X-Name-Last: Grzegorczyk
Title: Modeling dragonfly population data with a Bayesian bivariate geometric mixed-effects model
Abstract:
We develop a generalized linear mixed model (GLMM) for bivariate count responses for statistically analyzing dragonfly population data from the Northern Netherlands. The populations of the threatened dragonfly species Aeshna viridis were counted in the years 2015–2018 at 17 different locations (ponds and ditches). Two widely applied population size measures were used to quantify the population sizes, namely the number of exoskeletons (‘exuviae’) found and the number of egg-laying females spotted. Since both measures (responses) lead to many zero counts but also feature very large counts, our GLMM builds on a zero-inflated bivariate geometric (ZIBGe) distribution, which we show can be easily parameterized in terms of a correlation parameter and its two marginal medians. We model the medians with linear combinations of fixed (environmental covariates) and random (location-specific intercepts) effects. Modeling the medians yields a decreased sensitivity to overly large counts, in particular in light of growing marginal zero inflation rates. Because of the relatively small sample size (n = 114), we follow a Bayesian modeling approach and use Metropolis-Hastings Markov chain Monte Carlo (MCMC) simulations for generating posterior samples.
Journal: Journal of Applied Statistics
Pages: 2171-2193
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2068513
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2068513
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2171-2193
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2070136_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: E. F. Saraiva
Author-X-Name-First: E. F.
Author-X-Name-Last: Saraiva
Author-Name: L. Sauer
Author-X-Name-First: L.
Author-X-Name-Last: Sauer
Author-Name: C. A. B. Pereira
Author-X-Name-First: C. A. B.
Author-X-Name-Last: Pereira
Title: A hierarchical Bayesian approach for modeling the evolution of the 7-day moving average of the number of deaths by COVID-19
Abstract:
In this paper, we propose a hierarchical Bayesian approach for modeling the evolution of the 7-day moving average of the number of deaths due to COVID-19 in a country, state or city. The proposed approach is based on a Gaussian process regression model. The main advantage of this model is that it assumes that the nonlinear function f used for modeling the observed data is an unknown random parameter, in contrast to usual approaches that set up f as a known mathematical function. This assumption allows the development of a Bayesian approach with a Gaussian process prior over f. In order to estimate the parameters of interest, we develop an MCMC algorithm based on the Metropolis-within-Gibbs sampling algorithm. We also present a procedure for making predictions. The proposed method is illustrated in a case study in which we model the 7-day moving average of the number of deaths recorded in the state of São Paulo, Brazil. The results show that the proposed method is very effective in modeling and predicting the values of the 7-day moving average.
Journal: Journal of Applied Statistics
Pages: 2194-2208
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2070136
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2070136
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2194-2208
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2070137_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Simon Boge Brant
Author-X-Name-First: Simon Boge
Author-X-Name-Last: Brant
Author-Name: Ingrid Hobæk Haff
Author-X-Name-First: Ingrid
Author-X-Name-Last: Hobæk Haff
Title: The fraud loss for selecting the model complexity in fraud detection
Abstract:
Statistical fraud detection consists of building a system that automatically selects a subset of all cases (insurance claims, financial transactions, etc.) that are the most interesting for further investigation. Such a system is needed because the total number of cases is typically much higher than one could realistically investigate manually and because fraud tends to be quite rare. Further, the investigator is typically limited to controlling a restricted number k of cases, due to limited resources. The most efficient manner of allocating these resources is then to try to select the k cases with the highest probability of being fraudulent. The prediction model used for this purpose must normally be regularised to avoid overfitting and, consequently, bad prediction performance. A loss function, denoted the fraud loss, is proposed for selecting the model complexity via a tuning parameter. A simulation study is performed to find the optimal settings for validation. Further, the performance of the proposed procedure is compared to that of the most relevant competing procedure, based on the area under the receiver operating characteristic curve (AUC), in a set of simulations, as well as on a credit card default dataset. Choosing the complexity of the model by the fraud loss resulted in either comparable or better results, in terms of the fraud loss, than choosing it according to the AUC.
Journal: Journal of Applied Statistics
Pages: 2209-2227
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2070137
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2070137
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2209-2227
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2071419_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Ana F. Best
Author-X-Name-First: Ana F.
Author-X-Name-Last: Best
Author-Name: Yaakov Malinovsky
Author-X-Name-First: Yaakov
Author-X-Name-Last: Malinovsky
Author-Name: Paul S. Albert
Author-X-Name-First: Paul S.
Author-X-Name-Last: Albert
Title: The efficient design of Nested Group Testing algorithms for disease identification in clustered data
Abstract:
Group testing study designs have been used since the 1940s to reduce screening costs for uncommon diseases; for rare diseases, all cases are identifiable with substantially fewer tests than the population size. Substantial research has identified efficient designs under this paradigm. However, little work has focused on the important problem of disease screening among clustered data, such as geographic heterogeneity in HIV prevalence. We evaluated designs where we first estimate disease prevalence and then apply efficient group testing algorithms using these estimates. Specifically, we evaluate prevalence using individual testing on a fixed-size subset of each cluster and use these prevalence estimates to choose group sizes that minimize the corresponding estimated average number of tests per subject. We compare designs where we estimate cluster-specific prevalences as well as a common prevalence across clusters, use different group testing algorithms, construct groups from individuals within and in different clusters, and consider misclassification. For diseases with low prevalence, our results suggest that accounting for clustering is unnecessary. However, for diseases with higher prevalence and sizeable between-cluster heterogeneity, accounting for clustering in study design and implementation improves efficiency. We consider the practical aspects of our design recommendations with two examples with strong clustering effects: (1) Identification of HIV carriers in the US population and (2) Laboratory screening of anti-cancer compounds using cell lines.
Journal: Journal of Applied Statistics
Pages: 2228-2245
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2071419
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2071419
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2228-2245
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2073336_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Daniel Ries
Author-X-Name-First: Daniel
Author-X-Name-Last: Ries
Author-Name: Alicia Carriquiry
Author-X-Name-First: Alicia
Author-X-Name-Last: Carriquiry
Title: The relationship between moderate to vigorous physical activity and metabolic syndrome: a Bayesian measurement error approach
Abstract:
Metabolic Syndrome (MetS) is a serious condition that can be an early warning sign of heart disease and Type 2 diabetes. MetS is characterized by elevated levels of blood pressure, cholesterol, waist circumference, and fasting glucose. There are many articles in the literature exploring the relationship between physical activity and MetS, but most consider neither the measurement error in the physical activity measurements nor the correlations among the MetS risk factors. Furthermore, previous work has generally treated MetS as binary, rather than directly modeling the risk factors on their measured, continuous scale. Using data from the National Health and Nutrition Examination Survey (NHANES), we explore the relationship between minutes of moderate to vigorous physical activity (MVPA) and MetS risk factors. We construct a measurement error model for the accelerometry data, and then model its relationship with MetS risk factors using nonlinear seemingly unrelated regressions, incorporating dependence among the MetS risk factors. The novel features of this model give the medical research community a new way to understand relationships between MVPA and MetS. The results of this approach present the field with a different modeling perspective than previously taken and suggest future avenues of scientific discovery.
Journal: Journal of Applied Statistics
Pages: 2246-2266
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2073336
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2073336
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2246-2266
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2108773_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: A. Elayouty
Author-X-Name-First: A.
Author-X-Name-Last: Elayouty
Author-Name: H. Abou-Ali
Author-X-Name-First: H.
Author-X-Name-Last: Abou-Ali
Title: Functional data analysis of the relationship between electricity consumption and climate change drivers
Abstract:
Climate change has become increasingly important in recent years. It is the outcome of the burning of fossil fuels that increased the concentration of atmospheric carbon dioxide (CO2) over the last century. Mitigating the impacts of climate change requires a better understanding and assessment of the countries' economic decisions on the amount of CO2 emissions. This paper assesses the variability between the different countries in the trends of CO2 emissions and electricity consumption from 1975 to 2014, while identifying clusters of countries with similar trends over time. The novel methodology applied in this paper enables us to assess long-debated issues in the climate literature. The temporal dynamic effects of electricity consumption and economic growth on CO2 emissions across countries are studied using functional data analysis (FDA) methods. The latter have proven to be useful tools for visualising similarities and differences in the non-linear trends of CO2 emissions without forcing linear trends and stationary relationships, which can be unrealistic and misleading. The results indicate the possibility of identifying changes in the trends of CO2 emissions and electricity consumption for a wide range of heterogeneous countries over the study period. The findings also reveal that economic growth puts a strain on the environment, with many high-income countries still far from attaining economic-energy sustainability.
Journal: Journal of Applied Statistics
Pages: 2267-2285
Issue: 10
Volume: 50
Year: 2023
Month: 07
X-DOI: 10.1080/02664763.2022.2108773
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2108773
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:10:p:2267-2285
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1937584_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: S. Bacci
Author-X-Name-First: S.
Author-X-Name-Last: Bacci
Author-Name: R. Fabbricatore
Author-X-Name-First: R.
Author-X-Name-Last: Fabbricatore
Author-Name: Maria Iannario
Author-X-Name-First: Maria
Author-X-Name-Last: Iannario
Title: Latent trait models for perceived risk assessment using a Covid-19 data survey
Abstract:
The aim of this contribution is to analyze potential events that may negatively impact individuals, assets, and/or the environment, and to make judgments about the perceived personal and social riskiness of Covid-19 compared to other hazards belonging to the health (AIDS, cancer, infarction), environmental (climate change), behavioral (serious car accidents), and technological (nuclear weapons) domains. The comparative risk analysis was performed on survey data collected during the first Italian Covid-19 lockdown. An item response theory model for polytomously scored items was implemented to analyze the positioning of Covid-19 relative to the other hazards in terms of perceived risk. Among the attributes determining a hazard's perceived risk, Covid-19 stands out for the knowledge of its risks, the media attention it receives, and the fear it causes among peers. In addition, through a latent regression analysis, the role of some individual characteristics on the perceived risk of Covid-19 was examined. Our contribution allows us to disentangle several aspects of hazards and to describe the main factors affecting perceived risk. It also helps to determine whether existing control measures are perceived as adequate, and to gauge the interest in new media and its impact on a person's reaction.
Journal: Journal of Applied Statistics
Pages: 2575-2598
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1937584
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1937584
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2575-2598
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1953449_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Peiyi Zhang
Author-X-Name-First: Peiyi
Author-X-Name-Last: Zhang
Author-Name: Tianning Dong
Author-X-Name-First: Tianning
Author-X-Name-Last: Dong
Author-Name: Ninghui Li
Author-X-Name-First: Ninghui
Author-X-Name-Last: Li
Author-Name: Faming Liang
Author-X-Name-First: Faming
Author-X-Name-Last: Liang
Title: Identification of factors impacting on the transmission and mortality of COVID-19
Abstract:
This paper proposes a dynamic infectious disease model for COVID-19 daily count data and estimates the model using the Langevinized EnKF algorithm, which is scalable for large-scale spatio-temporal data, converges to the right filtering distribution, and is thus suitable for performing statistical inference and quantifying uncertainty for the underlying dynamic system. Under the framework of the proposed dynamic infectious disease model, we tested the impact of temperature, precipitation, state emergency orders and stay-home orders on the spread of COVID-19 based on United States county-wise daily count data. Our numerical results show that warm and humid weather can significantly slow the spread of COVID-19, and that state emergency and stay-home orders also help to slow it. This finding provides guidance and support for future policies or acts aimed at mitigating the community transmission and lowering the mortality rate of COVID-19.
Journal: Journal of Applied Statistics
Pages: 2624-2647
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1953449
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1953449
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2624-2647
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1941806_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Claire Donnat
Author-X-Name-First: Claire
Author-X-Name-Last: Donnat
Author-Name: Susan Holmes
Author-X-Name-First: Susan
Author-X-Name-Last: Holmes
Title: Modeling the heterogeneity in COVID-19's reproductive number and its impact on predictive scenarios
Abstract:
The correct evaluation of the reproductive number R for COVID-19 is central to quantifying the potential scope of the pandemic and selecting an appropriate course of action. In most models, R is modeled as a constant - effectively averaging out the inherent variability of the transmission process due to varying individual contact rates, population densities, or temporal factors, among many others. Yet, due to the exponential nature of epidemic growth, the error due to this simplification can be rapidly amplified, and its extent remains unknown. How can this intrinsic variability be percolated into epidemic models, and its impact better quantified? We study this question here through a Bayesian perspective that captures at scale the heterogeneity of a population and environmental conditions, creating a bridge between the traditional agent-based and compartmental approaches. We use our model to simulate the spread as well as the impact of different social distancing strategies on real COVID-19 data, and highlight the significant impact of the heterogeneity. We emphasize that the contribution of this paper focuses on discussing the importance of the impact of R's heterogeneity on uncertainty quantification from a statistical viewpoint, rather than on developing new predictive models.
Journal: Journal of Applied Statistics
Pages: 2518-2546
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1941806
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1941806
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2518-2546
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1895089_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Ting Tian
Author-X-Name-First: Ting
Author-X-Name-Last: Tian
Author-Name: Jingwen Zhang
Author-X-Name-First: Jingwen
Author-X-Name-Last: Zhang
Author-Name: Shiyun Lin
Author-X-Name-First: Shiyun
Author-X-Name-Last: Lin
Author-Name: Yukang Jiang
Author-X-Name-First: Yukang
Author-X-Name-Last: Jiang
Author-Name: Jianbin Tan
Author-X-Name-First: Jianbin
Author-X-Name-Last: Tan
Author-Name: Zhongfei Li
Author-X-Name-First: Zhongfei
Author-X-Name-Last: Li
Author-Name: Xueqin Wang
Author-X-Name-First: Xueqin
Author-X-Name-Last: Wang
Title: Data-driven analysis of the simulations of the spread of COVID-19 under different interventions of China
Abstract:
Since February 2020, COVID-19 has spread rapidly to more than 200 countries in the world. During the pandemic, local governments in China implemented different interventions to efficiently control the spread of the epidemic. Characterizing the transmission of COVID-19 under some typical interventions is essential to help countries develop appropriate interventions. Based on the pre-symptomatic transmission patterns of COVID-19, we established a novel compartmental model, the Susceptible-Infectious-Confirmed-Removed (SICR) model, which allows the effective reproduction number to change over time, so that the effects of policies can be reasonably estimated. Using the epidemic data of Wuhan, Wenzhou, and Shenzhen, we migrated the corresponding estimated policy modes to South Korea, Italy, and the United States and simulated the potential outcomes for these countries had they adopted similar policy strategies to China. We found that the mild interventions implemented in Shenzhen were effective in controlling the epidemic in its early stage, while the more stringent policies implemented in Wuhan and Wenzhou were necessary if the epidemic became severe and needed to be controlled in a short time.
Journal: Journal of Applied Statistics
Pages: 2547-2560
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1895089
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1895089
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2547-2560
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1907839_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Álvaro Gajardo
Author-X-Name-First: Álvaro
Author-X-Name-Last: Gajardo
Author-Name: Hans-Georg Müller
Author-X-Name-First: Hans-Georg
Author-X-Name-Last: Müller
Title: Point process models for COVID-19 cases and deaths
Abstract:
The study of events distributed over time which can be quantified as point processes has attracted much interest over the years due to its wide range of applications. It has recently gained new relevance due to the COVID-19 case and death processes associated with SARS-CoV-2 that characterize the COVID-19 pandemic and are observed across different countries. It is of interest to study the behavior of these point processes and how they may be related to covariates such as mobility restrictions, gross domestic product per capita, and fraction of population of older age. As infections and deaths in a region are intrinsically events that arrive at random times, a point process approach is natural for this setting. We adopt techniques for conditional functional point processes that target point processes as responses with vector covariates as predictors, to study the interaction and optimal transport between case and death processes and doubling times conditional on covariates.
Journal: Journal of Applied Statistics
Pages: 2294-2309
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1907839
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1907839
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2294-2309
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2064975_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Teng Chen
Author-X-Name-First: Teng
Author-X-Name-Last: Chen
Author-Name: Paweł Polak
Author-X-Name-First: Paweł
Author-X-Name-Last: Polak
Author-Name: Stanislav Uryasev
Author-X-Name-First: Stanislav
Author-X-Name-Last: Uryasev
Title: Classification and severity progression measure of COVID-19 patients using pairs of multi-omic factors
Abstract:
Early detection and effective treatment of severe COVID-19 patients remain two major challenges during the current pandemic. Analysis of molecular changes in blood samples of severe patients is one of the promising approaches to this problem. From thousands of proteomic, metabolomic, lipidomic, and transcriptomic biomarkers selected in other research, we identify several pairs of biomarkers that, after an additional nonlinear spline transformation, are highly effective in classifying and predicting severe COVID-19 cases. The performance of these pairs is evaluated in-sample, in a cross-validation exercise, and in an out-of-sample analysis on two independent datasets. We further improve our classifier by identifying complementary pairs using hierarchical clustering. As a result, we achieve 96–98% AUC on the validation data. Our findings can help medical experts identify small groups of biomarkers that, after nonlinear transformation, can be used to construct a cost-effective test for patient screening and prediction of severity progression.
Journal: Journal of Applied Statistics
Pages: 2473-2503
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2022.2064975
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2064975
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2473-2503
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2019687_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Bryan Cai
Author-X-Name-First: Bryan
Author-X-Name-Last: Cai
Author-Name: John P. A. Ioannidis
Author-X-Name-First: John P. A.
Author-X-Name-Last: Ioannidis
Author-Name: Eran Bendavid
Author-X-Name-First: Eran
Author-X-Name-Last: Bendavid
Author-Name: Lu Tian
Author-X-Name-First: Lu
Author-X-Name-Last: Tian
Title: Exact inference for disease prevalence based on a test with unknown specificity and sensitivity
Abstract:
To make informative public policy decisions in battling the ongoing COVID-19 pandemic, it is important to know the disease prevalence in a population. There are two intertwined difficulties in estimating this prevalence based on testing results from a group of subjects. First, the test is prone to measurement error with unknown sensitivity and specificity. Second, the prevalence tends to be low at the initial stage of the pandemic, and we may not be able to determine whether a positive test result is a false positive due to the imperfect test specificity. The statistical inference based on a large-sample approximation or conventional bootstrap may not be valid in such cases. In this paper, we propose a set of confidence intervals whose validity does not depend on the sample size in the unweighted setting. For the weighted setting, the proposed inference is equivalent to hybrid bootstrap methods, whose performance is also more robust than those based on asymptotic approximations. The methods are used to reanalyze data from a study investigating the antibody prevalence in Santa Clara County, California, in addition to several other seroprevalence studies. Simulation studies have been conducted to examine the finite-sample performance of the proposed method.
Journal: Journal of Applied Statistics
Pages: 2599-2623
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.2019687
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2019687
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2599-2623
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2228597_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Arnold Stromberg
Author-X-Name-First: Arnold
Author-X-Name-Last: Stromberg
Author-Name: Jie Chen
Author-X-Name-First: Jie
Author-X-Name-Last: Chen
Author-Name: Teresa Paula Costa Azinheira Oliveira
Author-X-Name-First: Teresa Paula Costa Azinheira
Author-X-Name-Last: Oliveira
Author-Name: Yichuan Zhao
Author-X-Name-First: Yichuan
Author-X-Name-Last: Zhao
Author-Name: Ramin Moghaddass
Author-X-Name-First: Ramin
Author-X-Name-Last: Moghaddass
Author-Name: Milan Stehlik
Author-X-Name-First: Milan
Author-X-Name-Last: Stehlik
Title: Editorial to the special issue: statistical perspectives on analytics for COVID-19 data
Journal: Journal of Applied Statistics
Pages: 2287-2293
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2023.2228597
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2228597
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2287-2293
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1976119_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Xiaolei Zhang
Author-X-Name-First: Xiaolei
Author-X-Name-Last: Zhang
Author-Name: Renjun Ma
Author-X-Name-First: Renjun
Author-X-Name-Last: Ma
Title: Forecasting waved daily COVID-19 death count series with a novel combination of segmented Poisson model and ARIMA models
Abstract:
Autoregressive Integrated Moving Average (ARIMA) models have been widely used to forecast and model the development of various infectious diseases, including COVID-19 outbreaks; however, such use of ARIMA models does not respect the count nature of pandemic development data. For example, the daily COVID-19 death count series for Canada and the United States (USA) are generally skewed, with many low counts. In addition, there are generally waved patterns with turning points influenced by major government interventions against the spread of COVID-19 during different periods and seasons. In this study, we propose a novel combination of a segmented Poisson model and ARIMA models to handle these features and correlation structures in a two-stage process. The first stage of this process is a generalization of trend analysis of time series data. Our approach is illustrated with forecasting and modeling of the daily COVID-19 death count series for Canada and the USA.
Journal: Journal of Applied Statistics
Pages: 2561-2574
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1976119
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1976119
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2561-2574
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2019688_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Mohsen Maleki
Author-X-Name-First: Mohsen
Author-X-Name-Last: Maleki
Author-Name: Hamid Bidram
Author-X-Name-First: Hamid
Author-X-Name-Last: Bidram
Author-Name: Darren Wraith
Author-X-Name-First: Darren
Author-X-Name-Last: Wraith
Title: Robust clustering of COVID-19 cases across U.S. counties using mixtures of asymmetric time series models with time varying and freely indexed covariates
Abstract:
In this paper, we develop a mixture of autoregressive (MoAR) process model with time-varying and freely indexed covariates under the flexible class of two-piece distributions using the scale mixtures of normal (TP-SMN) family. This novel family of time series (TP-SMN-MoAR) models is used to examine flexible and robust clustering of reported cases of Covid-19 across 313 counties in the U.S. The TP-SMN distributions allow for symmetrical/asymmetrical as well as heavy-tailed distributions, providing the flexibility to handle outliers and complex data. Developing a suitable hierarchical representation of the TP-SMN family enabled the construction of a pseudo-likelihood function to derive the maximum pseudo-likelihood estimates via an EM-type algorithm.
Journal: Journal of Applied Statistics
Pages: 2648-2662
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.2019688
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2019688
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2648-2662
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2177625_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: A. Tchorbadjieff
Author-X-Name-First: A.
Author-X-Name-Last: Tchorbadjieff
Author-Name: L. P. Tomov
Author-X-Name-First: L. P.
Author-X-Name-Last: Tomov
Author-Name: V. Velev
Author-X-Name-First: V.
Author-X-Name-Last: Velev
Author-Name: G. Dezhov
Author-X-Name-First: G.
Author-X-Name-Last: Dezhov
Author-Name: V. Manev
Author-X-Name-First: V.
Author-X-Name-Last: Manev
Author-Name: P. Mayster
Author-X-Name-First: P.
Author-X-Name-Last: Mayster
Title: On regime changes of COVID-19 outbreak
Abstract:
The COVID-19 pandemic has had a very serious impact on societies, causing large-scale economic changes and a substantial death toll worldwide. The first cases were detected in China, but the virus soon spread quickly worldwide, and the intensity of newly reported infections grew high during this initial period almost everywhere. Later, despite all imposed measures, the intensity shifted abruptly multiple times during the two-year period between 2020 and 2022, causing waves of very high infection rates in almost every part of the world. To address this problem, we model the data heterogeneity as multiple consecutive regime changes. The study develops a model based on automatic regime change detection, combined with a linear birth-death process for long-run data fits. The results are empirically verified on data for 38 countries and US states for the period from February 2020 to April 2022. Finally, the initial phase (conditions) properties of infection development are studied.
Journal: Journal of Applied Statistics
Pages: 2343-2359
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2023.2177625
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2177625
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2343-2359
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2006153_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Estate Khmaladze
Author-X-Name-First: Estate
Author-X-Name-Last: Khmaladze
Author-Name: Giorgi Kvizhinadze
Author-X-Name-First: Giorgi
Author-X-Name-Last: Kvizhinadze
Title: On evolution model for SARS-Cov-2-infected population: the case of New Zealand
Abstract:
This work proposes a mathematical model of the COVID-19 epidemic as it evolved in New Zealand. The model uses a system of differential equations that emanate from natural assumptions on some probability measure and the evolution of this measure on an evolving family of simplexes. The authors sought to create a model that, on the one hand, is simple and easy to follow and, on the other hand, reflects the observed epidemic process correctly. The practical aim was to arrive at justifiable estimates of important parameters such as the rate of infection as a function of time, thus quantifying the effectiveness of the Government measures. Other parameters estimated were the probability distributions of detection times and recovery times.
Journal: Journal of Applied Statistics
Pages: 2435-2449
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.2006153
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2006153
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2435-2449
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2145459_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Fatemeh Elhambakhsh
Author-X-Name-First: Fatemeh
Author-X-Name-Last: Elhambakhsh
Author-Name: Kamyar Sabri-Laghaie
Author-X-Name-First: Kamyar
Author-X-Name-Last: Sabri-Laghaie
Author-Name: Rassoul Noorossana
Author-X-Name-First: Rassoul
Author-X-Name-Last: Noorossana
Title: A latent space model and Hotelling's T2 control chart to monitor the networks of Covid-19 symptoms
Abstract:
In the COVID-19 coronavirus pandemic, potential patients suffering from different symptoms can be diagnosed with COVID-19. At the early stages of the pandemic, patients were mainly diagnosed on the basis of fever and respiratory symptoms. More recently, patients with new symptoms, such as gastrointestinal symptoms or loss of senses, have also been diagnosed with COVID-19. Monitoring these symptoms can help the healthcare system become aware of new symptoms that may be related to the COVID-19 coronavirus. This article focuses on monitoring the behavior of COVID-19 symptoms over time. In this regard, a latent space model (LSM) and a generalized linear model (GLM) are introduced to model the networks of symptoms. We apply Hotelling's T2 control chart to the estimated parameters of the LSM and GLM to identify significant changes and detect anomalies in the networks. The performance of the proposed methods is evaluated using simulation and by calculating the average run length (ARL). Then, dynamic networks are generated from a COVID-19 epidemic survey dataset.
Journal: Journal of Applied Statistics
Pages: 2450-2472
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2022.2145459
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2145459
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2450-2472
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1947995_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Peipei Du
Author-X-Name-First: Peipei
Author-X-Name-Last: Du
Author-Name: Peihua Cao
Author-X-Name-First: Peihua
Author-X-Name-Last: Cao
Author-Name: Xiaodong Yan
Author-X-Name-First: Xiaodong
Author-X-Name-Last: Yan
Author-Name: Daihai He
Author-X-Name-First: Daihai
Author-X-Name-Last: He
Author-Name: Xiaotong Zhang
Author-X-Name-First: Xiaotong
Author-X-Name-Last: Zhang
Author-Name: Weixiang Chen
Author-X-Name-First: Weixiang
Author-X-Name-Last: Chen
Author-Name: Jiawei Luo
Author-X-Name-First: Jiawei
Author-X-Name-Last: Luo
Author-Name: Ziqian Zeng
Author-X-Name-First: Ziqian
Author-X-Name-Last: Zeng
Author-Name: Yaolong Chen
Author-X-Name-First: Yaolong
Author-X-Name-Last: Chen
Author-Name: Lin Yang
Author-X-Name-First: Lin
Author-X-Name-Last: Yang
Author-Name: Shu Yang
Author-X-Name-First: Shu
Author-X-Name-Last: Yang
Author-Name: Xixi Feng
Author-X-Name-First: Xixi
Author-X-Name-Last: Feng
Title: A continuous age-specific standardized mortality ratio for estimating the unascertained rates in the early epidemic of COVID-19 in different regions
Abstract:
Differences in age structure and in the level of population aging were important factors behind the differences in COVID-19's case fatality rate (CFR) across regions. To eliminate the age effect when estimating the CFR of COVID-19, our study applied a nonlinear logistic model and the maximum likelihood method to fit the age-fatality curves of COVID-19 in different countries and regions. We further computed the standardized mortality ratio from these age-fatality curves and found that the risk of COVID-19 death in Wuhan was of a moderate level, while that in the non-Hubei region was even lower, compared with other regions. Regarding the disparity of CFRs among different regions within a country, we believe there might be an unascertainment phenomenon in high-endemic regions. Based on the age-fatality rate curves, we estimated unascertained rates in cities with severe epidemics such as Wuhan and New York, and found that the total unascertained rates in Wuhan and New York were 81.6% and 81.2%, respectively. We also found that the unascertained rates varied greatly with age.
Journal: Journal of Applied Statistics
Pages: 2504-2517
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1947995
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1947995
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2504-2517
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1970122_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Siddharth Rawat
Author-X-Name-First: Siddharth
Author-X-Name-Last: Rawat
Author-Name: Soudeep Deb
Author-X-Name-First: Soudeep
Author-X-Name-Last: Deb
Title: A spatio-temporal statistical model to analyze COVID-19 spread in the USA
Abstract:
The coronavirus pandemic has affected the whole world extensively and it is of immense importance to understand how the disease is spreading. In this work, we provide evidence of spatial dependence in the pandemic data and accordingly develop a new statistical technique that appropriately captures the spatio-temporal dependence pattern of the COVID-19 spread. The proposed model uses a separable Gaussian spatio-temporal process, in conjunction with an additive mean structure and a random error process. The model is implemented through a Bayesian framework, thereby providing a computational advantage over the classical way. We use state-level data from the United States of America in this study. We show that a quadratic trend pattern is most appropriate in this context. Interestingly, the population is found not to affect the numbers significantly, whereas the number of deaths in the previous week positively affects the spread of the disease. Residual diagnostics establish that the model is adequate for understanding the spatio-temporal dependence pattern in the data. It is also shown to have better predictive power than other spatial and temporal models. In fact, we show that the proposed approach can predict well for both the short term (1 week) and the long term (up to three months).
Journal: Journal of Applied Statistics
Pages: 2310-2329
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1970122
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1970122
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2310-2329
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2022607_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: E. Skamnia
Author-X-Name-First: E.
Author-X-Name-Last: Skamnia
Author-Name: P. Economou
Author-X-Name-First: P.
Author-X-Name-Last: Economou
Author-Name: S. Bersimis
Author-X-Name-First: S.
Author-X-Name-Last: Bersimis
Author-Name: M. Frouda
Author-X-Name-First: M.
Author-X-Name-Last: Frouda
Author-Name: A. Politis
Author-X-Name-First: A.
Author-X-Name-Last: Politis
Author-Name: P. Alexopoulos
Author-X-Name-First: P.
Author-X-Name-Last: Alexopoulos
Title: Hot spot identification method based on Andrews curves: an application on the COVID-19 crisis effects on caregiver distress in neurocognitive disorder
Abstract:
Identifying and locating areas – hot spots – that present a high concentration of observations in a high-dimensional data set is crucial in many data processing and analysis methods and techniques, since observations that belong to the same hot spot share information and behave in a similar way. A useful tool towards that aim is the reduction of the data dimensionality and their graphical representation. In the present paper, a new method to identify and locate hot spots is proposed, based on Andrews curves. Simulation results demonstrate the performance of the proposed method, which is also applied to a high-dimensional data set regarding caregiver distress related to symptoms of people with neurocognitive disorder and to the mental effects of the recent outbreak of the COVID-19 pandemic.
Journal: Journal of Applied Statistics
Pages: 2388-2407
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.2022607
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2022607
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2388-2407
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2069232_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Kevin D. Dayaratna
Author-X-Name-First: Kevin D.
Author-X-Name-Last: Dayaratna
Author-Name: Drew Gonshorowski
Author-X-Name-First: Drew
Author-X-Name-Last: Gonshorowski
Author-Name: Mary Kolesar
Author-X-Name-First: Mary
Author-X-Name-Last: Kolesar
Title: Hierarchical Bayesian spatio-temporal modeling of COVID-19 in the United States
Abstract:
We examine the impact that economic, demographic, and mobility-related factors have had on the transmission of COVID-19 in 2020. While many models in the academic literature employ linear/generalized linear models, few contributions exist that incorporate spatial analysis, which is useful for understanding factors influencing the proliferation of the disease before the introduction of vaccines. We utilize a Poisson generalized linear model coupled with a spatial autoregressive structure to do so. Our analysis yields a number of insights, including the counterintuitive result that, in some areas of the country, staying at home can lead to increased disease proliferation. Additionally, we find some positive effects from increased gathering at grocery stores, negative effects of visiting retail stores and workplaces, and even small effects of visiting parks, highlighting the complexities that travel and migration have on the transmission of diseases.
Journal: Journal of Applied Statistics
Pages: 2663-2680
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2022.2069232
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2069232
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2663-2680
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2028744_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yılmaz Akdi
Author-X-Name-First: Yılmaz
Author-X-Name-Last: Akdi
Author-Name: Yunus Emre Karamanoğlu
Author-X-Name-First: Yunus
Author-X-Name-Last: Emre Karamanoğlu
Author-Name: Kamil Demirberk Ünlü
Author-X-Name-First: Kamil Demirberk
Author-X-Name-Last: Ünlü
Author-Name: Cem Baş
Author-X-Name-First: Cem
Author-X-Name-Last: Baş
Title: Identifying the cycles in COVID-19 infection: the case of Turkey
Abstract:
The new coronavirus disease, called COVID-19, has spread extremely quickly to more than 200 countries since its detection in December 2019 in China. COVID-19 marks the return of a very old and familiar enemy. Throughout human history, disasters such as earthquakes, volcanic eruptions and even wars have not caused more human losses than lethal diseases, which are caused by viruses, bacteria and parasites. The first COVID-19 case was detected in Turkey on 12 March 2020 and researchers have since then attempted to examine periodicity in the number of daily new cases. One of the most curious questions in the pandemic process that affects the whole world is whether there will be a second wave. Such questions can be answered by examining any periodicities in the series of daily cases. Periodic series are frequently seen in many disciplines. An important method based on harmonic regression is the focus of the study. The main aim of this study is to identify the hidden periodic structure of the daily infected cases. The daily infected cases in Turkey are analyzed using a periodogram-based methodology. Our results reveal that there are 4-, 5- and 62-day cycles in the daily new cases in Turkey.
Journal: Journal of Applied Statistics
Pages: 2360-2372
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2022.2028744
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2028744
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2360-2372
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1936467_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Hou-Cheng Yang
Author-X-Name-First: Hou-Cheng
Author-X-Name-Last: Yang
Author-Name: Yishu Xue
Author-X-Name-First: Yishu
Author-X-Name-Last: Xue
Author-Name: Yuqing Pan
Author-X-Name-First: Yuqing
Author-X-Name-Last: Pan
Author-Name: Qingyang Liu
Author-X-Name-First: Qingyang
Author-X-Name-Last: Liu
Author-Name: Guanyu Hu
Author-X-Name-First: Guanyu
Author-X-Name-Last: Hu
Title: Time fused coefficient SIR model with application to COVID-19 epidemic in the United States
Abstract:
In this paper, we propose a Susceptible–Infected–Removal (SIR) model with time fused coefficients. In particular, our proposed model discovers the underlying time homogeneity pattern for the SIR model's transmission rate and removal rate via Bayesian shrinkage priors. MCMC sampling for the proposed method is facilitated by the nimble package in R. Extensive simulation studies are carried out to examine the empirical performance of the proposed methods. We further apply the proposed methodology to analyze different levels of COVID-19 data in the United States.
Journal: Journal of Applied Statistics
Pages: 2373-2387
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1936467
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1936467
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2373-2387
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2006154_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: D. Atanasov
Author-X-Name-First: D.
Author-X-Name-Last: Atanasov
Author-Name: Vessela Stoimenova
Author-X-Name-First: Vessela
Author-X-Name-Last: Stoimenova
Author-Name: Nikolay M. Yanev
Author-X-Name-First: Nikolay M.
Author-X-Name-Last: Yanev
Title: Statistical modelling of COVID-19 pandemic development applying branching processes
Abstract:
In this paper, a statistical model for COVID-19 infection dynamics is described, using only the observed daily statistics of infected individuals. For this purpose, two special classes of branching processes without or with an immigration component are considered. These models are intended to estimate the main parameter of the infection and to give a prediction of the mean value of the non-observed population of the infected individuals. This is a serious advantage in comparison with other more complicated models where the officially reported data are not sufficient for estimation of the model parameters. The model is applied for different regions in the world and the corresponding parameters of the infection dynamics are estimated.
Journal: Journal of Applied Statistics
Pages: 2330-2342
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.2006154
File-URL: http://hdl.handle.net/10.1080/02664763.2021.2006154
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2330-2342
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_1928016_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Guannan Wang
Author-X-Name-First: Guannan
Author-X-Name-Last: Wang
Author-Name: Zhiling Gu
Author-X-Name-First: Zhiling
Author-X-Name-Last: Gu
Author-Name: Xinyi Li
Author-X-Name-First: Xinyi
Author-X-Name-Last: Li
Author-Name: Shan Yu
Author-X-Name-First: Shan
Author-X-Name-Last: Yu
Author-Name: Myungjin Kim
Author-X-Name-First: Myungjin
Author-X-Name-Last: Kim
Author-Name: Yueying Wang
Author-X-Name-First: Yueying
Author-X-Name-Last: Wang
Author-Name: Lei Gao
Author-X-Name-First: Lei
Author-X-Name-Last: Gao
Author-Name: Li Wang
Author-X-Name-First: Li
Author-X-Name-Last: Wang
Title: Comparing and integrating US COVID-19 data from multiple sources with anomaly detection and repairing
Abstract:
Over the past few months, the outbreak of Coronavirus disease (COVID-19) has been expanding over the world. A reliable and accurate dataset of the cases is vital for scientists to conduct related research and policy-makers to make better decisions. We collect the United States COVID-19 daily reported data from four open sources: the New York Times, the COVID-19 Data Repository by Johns Hopkins University, the COVID Tracking Project at the Atlantic, and the USAFacts, then compare the similarities and differences among them. To obtain reliable data for further analysis, we first examine the cyclical pattern and the following anomalies, which frequently occur in the reported cases: (1) the order dependencies violation, (2) the point or period anomalies, and (3) the issue of reporting delay. To address these detected issues, we propose the corresponding repairing methods and procedures if corrections are necessary. In addition, we integrate the COVID-19 reported cases with the county-level auxiliary information of the local features from official sources, such as health infrastructure, demographic, socioeconomic, and environmental information, which are also essential for understanding the spread of the virus.
Journal: Journal of Applied Statistics
Pages: 2408-2434
Issue: 11-12
Volume: 50
Year: 2023
Month: 09
X-DOI: 10.1080/02664763.2021.1928016
File-URL: http://hdl.handle.net/10.1080/02664763.2021.1928016
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:11-12:p:2408-2434
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2095362_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jasper Velthoen
Author-X-Name-First: Jasper
Author-X-Name-Last: Velthoen
Author-Name: Juan-Juan Cai
Author-X-Name-First: Juan-Juan
Author-X-Name-Last: Cai
Author-Name: Geurt Jongbloed
Author-X-Name-First: Geurt
Author-X-Name-Last: Jongbloed
Title: Forward variable selection for random forest models
Abstract:
Random forest is a popular prediction approach for handling high dimensional covariates. However, it often becomes infeasible to interpret the obtained high dimensional and non-parametric model. Aiming for an interpretable predictive model, we develop a forward variable selection method using the continuous ranked probability score (CRPS) as the loss function. Our stepwise procedure selects at each step a variable that minimizes the CRPS risk, and a stopping criterion for selection is designed based on an estimate of the CRPS risk difference of two consecutive steps. We provide mathematical motivation for our method by proving that in a population sense, the method attains the optimal set. In a simulation study, we compare the performance of our method with an existing variable selection method, for different sample sizes and correlation strengths of covariates. Our method is observed to have a much lower false positive rate. We also demonstrate an application of our method to statistical post-processing of daily maximum temperature forecasts in the Netherlands. Our method selects about 10% of the covariates while retaining the same predictive power.
Journal: Journal of Applied Statistics
Pages: 2836-2856
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2095362
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2095362
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2836-2856
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2073585_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Debasis Kundu
Author-X-Name-First: Debasis
Author-X-Name-Last: Kundu
Title: A stationary Weibull process and its applications
Abstract:
In this paper we introduce a discrete-time and continuous state-space Markov stationary process $ \{X_n; n = 1, 2, \ldots \} $, where $ X_n $ has a two-parameter Weibull distribution, the $ X_n $'s are dependent, and there is a positive probability that $ X_n = X_{n+1} $. The motivation came from the gold price data, where there are several instances for which $ X_n = X_{n+1} $; hence, the existing methods cannot be used to analyze these data. We derive different properties of the proposed Weibull process. It is observed that the joint cumulative distribution function of $ X_n $ and $ X_{n+1} $ has a very convenient copula structure, so different dependence properties and dependence measures can be obtained. Since the maximum likelihood estimators cannot be obtained in explicit form, we propose a simple profile likelihood method to compute them. We have used this model to analyze two synthetic data sets and one gold price data set of the Indian market, and it is observed that the proposed model fits the data quite well.
Journal: Journal of Applied Statistics
Pages: 2681-2700
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2073585
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2073585
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2681-2700
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2081965_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Zili Zhang
Author-X-Name-First: Zili
Author-X-Name-Last: Zhang
Author-Name: Christiana Charalambous
Author-X-Name-First: Christiana
Author-X-Name-Last: Charalambous
Author-Name: Peter Foster
Author-X-Name-First: Peter
Author-X-Name-Last: Foster
Title: Joint modelling of longitudinal measurements and survival times via a multivariate copula approach
Abstract:
Joint modelling of longitudinal and time-to-event data is usually described by a joint model which uses shared or correlated latent effects to capture associations between the two processes. Under this framework, the joint distribution of the two processes can be derived straightforwardly by assuming conditional independence given the random effects. Alternative approaches to induce interdependency into sub-models have also been considered in the literature and one such approach is using copulas to introduce non-linear correlation between the marginal distributions of the longitudinal and time-to-event processes. The multivariate Gaussian copula joint model has been proposed in the literature to fit joint data by applying a Monte Carlo expectation-maximisation algorithm. In this paper, we propose an exact likelihood estimation approach to replace the more computationally expensive Monte Carlo expectation-maximisation algorithm and we consider results based on using both the multivariate Gaussian and t copula functions. We also provide a straightforward way to compute dynamic predictions of survival probabilities, showing that our proposed model is comparable in prediction performance to the shared random effects joint model.
Journal: Journal of Applied Statistics
Pages: 2739-2759
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2081965
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2081965
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2739-2759
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2078289_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Kai Qu
Author-X-Name-First: Kai
Author-X-Name-Last: Qu
Author-Name: Jonathan R. Bradley
Author-X-Name-First: Jonathan R.
Author-X-Name-Last: Bradley
Title: Bayesian models for spatial count data with informative finite populations with application to the American community survey
Abstract:
The American Community Survey (ACS) is an ongoing program conducted by the US Census Bureau that publishes estimates of important demographic statistics over pre-specified administrative areas. ACS provides spatially referenced count-valued outcomes that are paired with finite populations. For example, the number of people below the poverty line and the total population for each county are estimated by ACS. One common assumption is that the spatially referenced count-valued outcome given the finite population is binomial distributed. This conditionally specified (CS) model does not define the joint relationship between the count-valued outcome and the finite population. Thus, we consider a joint model for the count-valued outcome and the finite population. When cross-dependence in our joint model can be leveraged to ‘improve spatial prediction’ we say that the finite population is ‘informative.’ We model the count given the finite population as binomial and the finite population as negative binomial and use multivariate logit-beta prior distributions. This leads to closed-form expressions of the full-conditional distributions for an efficient Gibbs sampler. We illustrate our model through simulations and our motivating application of ACS poverty estimates. These empirical analyses show the benefits of using our proposed model over the more traditional CS binomial model.
Journal: Journal of Applied Statistics
Pages: 2701-2716
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2078289
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2078289
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2701-2716
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2084719_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: M. Martel
Author-X-Name-First: M.
Author-X-Name-Last: Martel
Author-Name: M. A. Negrín
Author-X-Name-First: M. A.
Author-X-Name-Last: Negrín
Author-Name: F. J. Vázquez–Polo
Author-X-Name-First: F. J.
Author-X-Name-Last: Vázquez–Polo
Title: Bayesian heterogeneity in a meta–analysis with two studies and binary data
Abstract:
The meta–analysis of two trials is valuable in many practical situations, such as studies of rare and/or orphan diseases focussed on a single intervention. In this context, additional concerns, like small sample size and/or heterogeneity in the results obtained, might make standard frequentist and Bayesian techniques inappropriate. In a meta–analysis, moreover, the presence of between–sample heterogeneity adds model uncertainty, which must be taken into consideration when drawing inferences. We suggest that the most appropriate way to measure this heterogeneity is by clustering the samples and then determining the posterior probability of the cluster models. The meta–inference is obtained as a mixture of all the meta–inferences for the cluster models, where the mixing distribution is the posterior model probability. We present a simple two–component form of Bayesian model averaging that is unaffected by characteristics such as small study size or zero–cell counts, and which is capable of incorporating uncertainties into the estimation process. Illustrative examples are given and analysed, using real sparse binomial data.
Journal: Journal of Applied Statistics
Pages: 2760-2776
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2084719
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2084719
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2760-2776
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2093842_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Siti Zahariah
Author-X-Name-First: Siti
Author-X-Name-Last: Zahariah
Author-Name: Habshah Midi
Author-X-Name-First: Habshah
Author-X-Name-Last: Midi
Title: Minimum regularized covariance determinant and principal component analysis-based method for the identification of high leverage points in high dimensional sparse data
Abstract:
The main aim of this paper is to propose a novel method (RMD-MRCD-PCA) for the identification of High Leverage Points (HLPs) in high-dimensional sparse data. It addresses a weakness of the Robust Mahalanobis Distance method based on the Minimum Regularized Covariance Determinant (RMD-MRCD), whose performance decreases as the number of independent variables (p) increases. The RMD-MRCD-PCA is developed by incorporating Principal Component Analysis (PCA) in the MRCD algorithm, whereby this robust approach shrinks the covariance matrix to make it invertible and thus can be employed to compute the RMD for high dimensional data. A simulation study and two real data sets are used to illustrate the merit of our proposed method compared to the RMD-MRCD and Robust PCA (ROBPCA) methods. Findings show that the performance of the RMD-MRCD is similar to that of the RMD-MRCD-PCA for p close to 200. However, its performance tends to decrease when p exceeds 200 and worsens when p is 700 or larger. On the other hand, the ROBPCA is not effective for less than 20% contamination as it suffers from serious swamping problems.
Journal: Journal of Applied Statistics
Pages: 2817-2835
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2093842
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2093842
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2817-2835
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2078798_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: S. M. Patil
Author-X-Name-First: S. M.
Author-X-Name-Last: Patil
Author-Name: H. V. Kulkarni
Author-X-Name-First: H. V.
Author-X-Name-Last: Kulkarni
Title: Analysis of medians under two-way model with and without interaction for Birnbaum–Saunders distributed response
Abstract:
The Birnbaum–Saunders (BS) distribution, well-known as the fatigue-life distribution, has been used in numerous disciplines ranging from engineering to medical sciences. In this article, we develop a test for the analysis of medians for a BS distributed response to assess the impact of two interacting factors on the median, where no test is presently available. The proposed integrated likelihood ratio test (ILRT) eliminates the nuisance shape parameters by integrating them out. The second-order accurate asymptotic chi-square distribution of the ILRT is derived. An in-depth simulation study strongly supports its excellent performance even under small group sizes. Furthermore, the ILRT developed under the one-way model is found to be uniformly superior to its peers, is directly extendable to the general multiway setup, and has the potential to be extended to other non-normal response variables. Its genuine need in industry, where non-normal responses are commonly encountered, is highlighted through the analysis of three real data sets: the ILRT strongly picked out the deposition time as an influential factor in the epitaxial layer experiment, revealed a significant impact of spools on fiber life for the failure times of the Kevlar 49 fiber data, and gave more accurate parameter estimates in the delivery time data experiment, as assessed by various model adequacy tools, where its competitors failed to deliver the desired results.
Journal: Journal of Applied Statistics
Pages: 2717-2738
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2078798
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2078798
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2717-2738
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2088706_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Daniel Ries
Author-X-Name-First: Daniel
Author-X-Name-Last: Ries
Author-Name: Alicia Carriquiry
Author-X-Name-First: Alicia
Author-X-Name-Last: Carriquiry
Title: Assessing adult physical activity and compliance with 2008 CDC guidelines using a Bayesian two-part measurement error model
Abstract:
While there is wide agreement that physical activity is an important component of a healthy lifestyle, it is unclear how many people adhere to public health recommendations on physical activity. The Physical Activity Guidelines (PAG), published by the CDC, provides guidelines to American adults, but it is difficult to assess compliance with these guidelines. The PAG further complicates adherence assessment by recommending activity to occur in at least 10 min bouts. To better understand the measurement capabilities of various instruments to quantify activity, and to propose an approach to evaluate activity relative to the PAG, researchers at Iowa State University administered the Physical Activity Measurement Survey (PAMS) to over 1000 participants in four different Iowa counties. In this paper, we develop a two-part Bayesian measurement error model and apply it to the PAMS data in order to assess compliance with the PAG in the Iowa adult population. The model accurately accounts for the 10 min bout requirement put forth in the PAG. The measurement error model corrects biased estimates and accounts for day-to-day variation in activity. The model is also applied to the nationally representative National Health and Nutrition Examination Survey.
Journal: Journal of Applied Statistics
Pages: 2777-2795
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2088706
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2088706
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2777-2795
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2091526_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jeyadurga Periyasamypandian
Author-X-Name-First: Jeyadurga
Author-X-Name-Last: Periyasamypandian
Author-Name: Saminathan Balamurali
Author-X-Name-First: Saminathan
Author-X-Name-Last: Balamurali
Title: Determination of new multiple deferred state sampling plan with economic perspective under Weibull distribution
Abstract:
This study focuses on designing a new multiple deferred state sampling plan to ensure products' mean lifetime under the Weibull distribution. The parameters that characterize the proposed plan are determined by considering two specified points on the operating characteristic curve. Practical applications of the proposed plan for assuring the mean lifetimes of electrical appliances as well as lithium-ion batteries are explained by using real data and simulated data, respectively. A sensitivity analysis on the testing time of the life test is carried out, and the theoretical average sample number is compared with that obtained by simulation. By comparing the proposed plan with other existing sampling plans based on discriminating power and the number of units required for lot sentencing, it is observed that the new multiple deferred state sampling plan provides quality assurance for the products with low inspection costs compared to the other existing sampling plans. Besides, this study investigates the economic design of the new multiple deferred state sampling plan and compares the total cost needed in the proposed plan with that required for some other existing sampling plans.
Journal: Journal of Applied Statistics
Pages: 2796-2816
Issue: 13
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2091526
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2091526
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:13:p:2796-2816
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2112557_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yujie Zhao
Author-X-Name-First: Yujie
Author-X-Name-Last: Zhao
Author-Name: Xiaoming Huo
Author-X-Name-First: Xiaoming
Author-X-Name-Last: Huo
Author-Name: Yajun Mei
Author-X-Name-First: Yajun
Author-X-Name-Last: Mei
Title: Hot-spots detection in count data by Poisson assisted smooth sparse tensor decomposition
Abstract:
Count data occur widely in many bio-surveillance and healthcare applications, e.g. the numbers of new patients of different types of infectious diseases from different cities/counties/states repeatedly over time, say, daily/weekly/monthly. For this type of count data, one important task is the quick detection and localization of hot-spots in terms of unusual infectious rates so that we can respond appropriately. In this paper, we develop a method called Poisson assisted Smooth Sparse Tensor Decomposition (PoSSTenD), which not only detects when hot-spots occur but also localizes where hot-spots occur. The main idea of our proposed PoSSTenD method is articulated as follows. First, we represent the observed count data as a three-dimensional tensor including (1) a spatial dimension for location patterns, e.g. different cities/counties/states; (2) a temporal dimension for time patterns, e.g. daily/weekly/monthly; (3) a categorical dimension for different types of data sources, e.g. different types of diseases. Second, we fit this tensor into a Poisson regression model, and then we further decompose the infectious rate into two components: a smooth global trend and local hot-spots. Third, we detect when hot-spots occur by building a cumulative sum (CUSUM) control chart and localize where hot-spots occur by their LASSO-type sparse estimation. The usefulness of our proposed methodology is validated through numerical simulation studies and a real-world dataset, which records the annual number of 10 different infectious diseases from 1993 to 2018 for 49 mainland states in the United States.
Journal: Journal of Applied Statistics
Pages: 2999-3029
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2112557
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2112557
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2999-3029
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2150753_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yinglun Zhan
Author-X-Name-First: Yinglun
Author-X-Name-Last: Zhan
Author-Name: Ruizhi Zhang
Author-X-Name-First: Ruizhi
Author-X-Name-Last: Zhang
Author-Name: Yuzhen Zhou
Author-X-Name-First: Yuzhen
Author-X-Name-Last: Zhou
Author-Name: Vincent Stoerger
Author-X-Name-First: Vincent
Author-X-Name-Last: Stoerger
Author-Name: Jeremy Hiller
Author-X-Name-First: Jeremy
Author-X-Name-Last: Hiller
Author-Name: Tala Awada
Author-X-Name-First: Tala
Author-X-Name-Last: Awada
Author-Name: Yufeng Ge
Author-X-Name-First: Yufeng
Author-X-Name-Last: Ge
Title: Rapid online plant leaf area change detection with high-throughput plant image data
Abstract:
High-throughput plant phenotyping (HTPP) has become an emerging technique to study plant traits due to its fast, labor-saving, accurate and non-destructive nature. It has wide applications in plant breeding and crop management. However, the resulting massive image data have raised a challenge associated with efficient plant trait prediction and anomaly detection. In this paper, we propose a two-step image-based online detection framework for monitoring and quick change detection of the individual plant leaf area via real-time imaging data. Our proposed method is able to achieve a smaller detection delay than some baseline methods under a predefined false-alarm rate constraint. Moreover, it does not need to store all past image information and can be implemented in real time. The efficiency of the proposed framework is validated by a real data analysis.
Journal: Journal of Applied Statistics
Pages: 2984-2998
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2150753
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2150753
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2984-2998
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2137115_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: David Lun
Author-X-Name-First: David
Author-X-Name-Last: Lun
Author-Name: Svenja Fischer
Author-X-Name-First: Svenja
Author-X-Name-Last: Fischer
Author-Name: Alberto Viglione
Author-X-Name-First: Alberto
Author-X-Name-Last: Viglione
Author-Name: Günter Blöschl
Author-X-Name-First: Günter
Author-X-Name-Last: Blöschl
Title: Significance testing of rank cross-correlations between autocorrelated time series with short-range dependence
Abstract:
Statistical dependency measures such as Kendall’s Tau or Spearman’s Rho are frequently used to analyse the coherence between time series in environmental data analyses. Autocorrelation of the data can, however, result in spurious cross correlations if not accounted for. Here, we present the asymptotic distribution of the estimators of Spearman’s Rho and Kendall’s Tau, which can be used for statistical hypothesis testing of cross-correlations between autocorrelated observations. The results are derived using U-statistics under the assumption of absolutely regular (or β-mixing) processes. These comprise many short-range dependent processes, such as ARMA-, GARCH- and some copula-based models relevant in the environmental sciences. We show that while the assumption of absolute regularity is required, the specific type of model does not have to be specified for the hypothesis test. Simulations show the improved performance of the modified hypothesis test for some common stochastic models and small to moderate sample sizes under autocorrelation. The methodology is applied to observed climatological time series of flood discharges and temperatures in Europe. While the standard test results in spurious correlations between floods and temperatures, this is not the case for the proposed test, which is more consistent with the literature on flood regime changes in Europe.
Journal: Journal of Applied Statistics
Pages: 2934-2950
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2137115
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2137115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2934-2950
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2147150_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Heng-Hui Lue
Author-X-Name-First: Heng-Hui
Author-X-Name-Last: Lue
Author-Name: ShengLi Tzeng
Author-X-Name-First: ShengLi
Author-X-Name-Last: Tzeng
Title: Interpretable, predictive spatio-temporal models via enhanced pairwise directions estimation
Abstract:
This article concerns predictive modeling for spatio-temporal data as well as model interpretation using data information in space and time. We develop a novel approach based on supervised dimension reduction for such data in order to capture nonlinear mean structures without requiring a prespecified parametric model. In addition to prediction as a common interest, this approach emphasizes the exploration of geometric information from the data. The method of Pairwise Directions Estimation (PDE) is implemented in our approach as a data-driven function searching for spatial patterns and temporal trends. The benefit of using geometric information from the method of PDE is highlighted, which aids effectively in exploring data structures. We further enhance PDE, referring to it as PDE+, by incorporating kriging to estimate the random effects not explained in the mean functions. Our proposal can not only increase prediction accuracy but also improve the interpretability of the model. Two simulation examples are conducted and comparisons are made with several existing methods. The results demonstrate that the proposed PDE+ method is very useful for exploring and interpreting the patterns and trends in spatio-temporal data. Illustrative applications to two real datasets are also presented.
Journal: Journal of Applied Statistics
Pages: 2914-2933
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2147150
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2147150
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2914-2933
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2174257_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Michael Baron
Author-X-Name-First: Michael
Author-X-Name-Last: Baron
Author-Name: Sergey V. Malov
Author-X-Name-First: Sergey V.
Author-X-Name-Last: Malov
Title: Detection and estimation of multiple transient changes
Abstract:
Change-point detection methods are proposed for the case of temporary failures, or transient changes, when an unexpected disorder is ultimately followed by a re-adjustment and return to the initial state. A base distribution of the ‘in-control’ state changes to an ‘out-of-control’ distribution for unknown periods of time. Likelihood-based sequential and retrospective tools are proposed for the detection and estimation of each pair of change-points. The accuracy of the obtained change-point estimates is assessed. The proposed methods offer simultaneous control of the familywise false alarm and false re-adjustment rates at pre-chosen levels.
Journal: Journal of Applied Statistics
Pages: 2862-2888
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2023.2174257
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2174257
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2862-2888
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2117288_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jiuyun Hu
Author-X-Name-First: Jiuyun
Author-X-Name-Last: Hu
Author-Name: Yajun Mei
Author-X-Name-First: Yajun
Author-X-Name-Last: Mei
Author-Name: Sarah Holte
Author-X-Name-First: Sarah
Author-X-Name-Last: Holte
Author-Name: Hao Yan
Author-X-Name-First: Hao
Author-X-Name-Last: Yan
Title: Adaptive resources allocation CUSUM for binomial count data monitoring with application to COVID-19 hotspot detection
Abstract:
In this paper, we present an efficient statistical method (denoted as ‘Adaptive Resources Allocation CUSUM’) to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted update is used to update the posterior distribution of the infection rate. Then, the upper confidence bound (UCB) is used for resource allocation and planning. Finally, CUSUM monitoring statistics are used to detect the change point as well as the change location. For performance evaluation, we compare the performance of the proposed method with several benchmark methods in the literature and show that the proposed algorithm is able to achieve a lower detection delay and higher detection precision. Finally, this method is applied to hotspot detection in a real case study of county-level daily positive COVID-19 cases in Washington State (WA) and demonstrates its effectiveness with very limited distributed samples.
Journal: Journal of Applied Statistics
Pages: 2889-2913
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2022.2117288
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2117288
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2889-2913
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2200496_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Ka Wai Tsang
Author-X-Name-First: Ka Wai
Author-X-Name-Last: Tsang
Author-Name: Fugee Tsung
Author-X-Name-First: Fugee
Author-X-Name-Last: Tsung
Author-Name: Zhihao Xu
Author-X-Name-First: Zhihao
Author-X-Name-Last: Xu
Title: Knockoff procedure for false discovery rate control in high-dimensional data streams
Abstract:
Motivated by applications to root-cause identification of faults in high-dimensional data streams that may have very limited samples after faults are detected, we consider multiple testing in models for multivariate statistical process control (SPC). With quick fault detection, it can be assumed that only a small portion of the data streams are out-of-control (OC). It is a long-standing problem to identify those OC data streams while controlling the number of false discoveries, and it is challenging due to the limited number of OC samples after the termination of the process when faults are detected. Although several false discovery rate (FDR) controlling methods have been proposed, practitioners may prefer other methods for quick detection. Building on a recently developed method called knockoff filtering, we propose a knockoff procedure that can be combined with other fault detection methods, in the sense that the knockoff procedure does not change the stopping time but may identify another set of faults to control the FDR. A theorem for the FDR control of the proposed procedure is provided. Simulation studies show that the proposed procedure can control the FDR while maintaining high power. We also illustrate the performance in an application to the semiconductor manufacturing processes that motivated this development.
Journal: Journal of Applied Statistics
Pages: 2970-2983
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2023.2200496
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2200496
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2970-2983
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2164885_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Hongzhen Tian
Author-X-Name-First: Hongzhen
Author-X-Name-Last: Tian
Author-Name: Reuven Zev Cohen
Author-X-Name-First: Reuven Zev
Author-X-Name-Last: Cohen
Author-Name: Chuck Zhang
Author-X-Name-First: Chuck
Author-X-Name-Last: Zhang
Author-Name: Yajun Mei
Author-X-Name-First: Yajun
Author-X-Name-Last: Mei
Title: Active learning-based multistage sequential decision-making model with application on common bile duct stone evaluation
Abstract:
Multistage sequential decision-making occurs in many real-world applications, such as healthcare diagnosis and treatment. One concrete example is when doctors need to decide which kind of information to collect from subjects so as to make good medical decisions cost-effectively. In this paper, an active learning-based method is developed to model the doctors' decision-making process that actively collects necessary information from each subject in a sequential manner. The effectiveness of the proposed model, especially its two-stage version, is validated in both simulation studies and a case study of common bile duct stone evaluation for pediatric patients.
Journal: Journal of Applied Statistics
Pages: 2951-2969
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2023.2164885
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2164885
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2951-2969
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2247646_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yajun Mei
Author-X-Name-First: Yajun
Author-X-Name-Last: Mei
Author-Name: Jay Bartroff
Author-X-Name-First: Jay
Author-X-Name-Last: Bartroff
Author-Name: Jie Chen
Author-X-Name-First: Jie
Author-X-Name-Last: Chen
Author-Name: Georgios Fellouris
Author-X-Name-First: Georgios
Author-X-Name-Last: Fellouris
Author-Name: Ruizhi Zhang
Author-X-Name-First: Ruizhi
Author-X-Name-Last: Zhang
Title: Editorial to the special issue: modern streaming data analytics
Journal: Journal of Applied Statistics
Pages: 2857-2861
Issue: 14
Volume: 50
Year: 2023
Month: 10
X-DOI: 10.1080/02664763.2023.2247646
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2247646
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:14:p:2857-2861
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2097204_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: R. N. Montgomery
Author-X-Name-First: R. N.
Author-X-Name-Last: Montgomery
Author-Name: L.T. Ptomey
Author-X-Name-First: L.T.
Author-X-Name-Last: Ptomey
Author-Name: J. D. Mahnken
Author-X-Name-First: J. D.
Author-X-Name-Last: Mahnken
Title: A flexible test for early-stage studies with multiple endpoints
Abstract:
This paper builds on the recently proposed prediction test for multiple endpoints. The prediction test combines information across multiple endpoints while accounting for the correlation between them. The test performs well with small samples relative to the number of endpoints of interest and is flexible in the hypotheses across the individual endpoints that can be combined. The prediction test addresses a global hypothesis that is of particular interest in early-stage studies and can be used as justification for continuing on to a larger trial. However, the prediction test has several limitations which we seek to address. First, the prediction test is overly conservative when both the effect sizes across all endpoints and the number of endpoints are small. By using a parametric bootstrap to estimate the null distribution, we show that the test achieves the nominal error rate in this situation, which increases the power of the test. Second, we provide a framework to allow for predictions of a difference on one or more endpoints. Finally, we extend the test with a composite null hypothesis that allows for different null hypothesized predictive abilities across the endpoints, which can be especially useful if the study contains both familiar and novel endpoints. We use an example from a physical activity trial to illustrate these extensions.
Journal: Journal of Applied Statistics
Pages: 3048-3061
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2097204
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2097204
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3048-3061
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2099816_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Termeh Shafie
Author-X-Name-First: Termeh
Author-X-Name-Last: Shafie
Title: Goodness of fit tests for random multigraph models
Abstract:
Goodness of fit tests for two probabilistic multigraph models are presented. The first model is random stub matching given fixed degrees (RSM), so that edge assignments to vertex pair sites are dependent, and the second is independent edge assignments (IEA) according to a common probability distribution. Tests are performed using goodness of fit measures between the edge multiplicity sequence of an observed multigraph and the expected one according to a simple or composite hypothesis. Test statistics of Pearson type and of likelihood ratio type are used, and the expected values of the Pearson statistic under the different models are derived. Test performances based on simulations indicate that, even for a small number of edges, the null distributions of both statistics are well approximated by their asymptotic $\chi^{2}$-distribution. The non-null distributions of the test statistics can be well approximated by proposed adjusted $\chi^{2}$-distributions used for power approximations. The influence of RSM on both test statistics is substantial for a small number of edges and implies a shift of their distributions towards smaller values compared to what holds true for the null distributions under IEA. Two applications on social networks are included to illustrate how the tests can guide the analysis of social structure.
Journal: Journal of Applied Statistics
Pages: 3062-3087
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2099816
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2099816
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3062-3087
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2101045_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: İsmet Birbiçer
Author-X-Name-First: İsmet
Author-X-Name-Last: Birbiçer
Author-Name: Ali İ. Genç
Author-X-Name-First: Ali İ.
Author-X-Name-Last: Genç
Title: On parameter estimation of the standard omega distribution
Abstract:
The standard omega distribution is defined on the unit interval, so that it is a probabilistic model for observations in rates and percentages. It is, in fact, the unit form of the exponentiated half logistic distribution. In this work, we first give a detailed shape analysis, from which we observe that it is another flexible beta-like distribution. It can be J-shaped, reverse J-shaped, U-shaped or unimodal, and it shows left or right skewness according to the values of its shape parameters. Contrary to the ordinary beta distribution, it has the advantage of a closed-form distribution function. We then discuss the existence and uniqueness of the maximum likelihood estimators and the Bayesian estimate of the parameters. The existence and uniqueness of the maximum likelihood estimators of the parameters is a great advantage for practitioners of this model, since it removes the possibility of finding a spurious solution to the likelihood equations. The comparison of these estimators with the existing ones for the general omega distribution is made with the help of a simulation study. Two real data fitting demonstrations show its usefulness among other beta-like distributions such as the Kumaraswamy, log-Lindley and Topp–Leone.
Journal: Journal of Applied Statistics
Pages: 3108-3124
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2101045
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2101045
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3108-3124
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2096209_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jody Krahn
Author-X-Name-First: Jody
Author-X-Name-Last: Krahn
Author-Name: Shakhawat Hossain
Author-X-Name-First: Shakhawat
Author-X-Name-Last: Hossain
Author-Name: Shahedul Khan
Author-X-Name-First: Shahedul
Author-X-Name-Last: Khan
Title: An efficient estimation approach to joint modeling of longitudinal and survival data
Abstract:
Joint models for longitudinal and survival data have recently received significant attention in medical and epidemiological studies. Joint models typically combine linear mixed effects models for repeated measurement data and Cox models for survival time. When jointly modeling longitudinal and survival data, variable selection and efficient estimation of parameters are especially important for performing reliable statistical analyses, both of which are currently lacking in the literature. In this paper, we discuss pretest and shrinkage estimation methods for jointly modeling longitudinal data and survival time data when some of the covariates in both the longitudinal and survival components may not be relevant for predicting survival times. In this situation, we fit two models: the full model that contains all the covariates and the subset model that contains a reduced number of covariates. We combine the full model estimators and the estimators that are restricted by a linear hypothesis to define pretest and shrinkage estimators. We provide their numerical mean squared errors (MSE) and relative MSE. We show that if the shrinkage dimension exceeds two, the risk of the shrinkage estimators is strictly less than that of the full model estimators. Our proposed methods are illustrated by extensive simulation studies and a real-data example.
Journal: Journal of Applied Statistics
Pages: 3031-3047
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2096209
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2096209
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3031-3047
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2102158_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Jan Kalina
Author-X-Name-First: Jan
Author-X-Name-Last: Kalina
Author-Name: Patrik Janáček
Author-X-Name-First: Patrik
Author-X-Name-Last: Janáček
Title: Testing exchangeability of multivariate distributions
Abstract:
Although a number of tests of bivariate exchangeability, i.e. bivariate symmetry for bivariate distributions, are available, the literature is void of tests of whether a multivariate distribution with more than two dimensions is exchangeable. In this paper, multivariate permutation tests of exchangeability of multivariate distributions are proposed, based on the non-parametric combination methodology, i.e. on combining non-parametric bivariate exchangeability tests. Numerical experiments on real as well as simulated multivariate data with more than two dimensions are presented. The multivariate permutation test turns out to be typically more powerful than a bivariate exchangeability test performed only over a single pair of variables, and also more suitable than tests exploiting the approaches of Benjamini–Yekutieli or Bonferroni.
Journal: Journal of Applied Statistics
Pages: 3142-3156
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2102158
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2102158
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3142-3156
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2103101_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Minh Thu Bui
Author-X-Name-First: Minh Thu
Author-X-Name-Last: Bui
Author-Name: Cornelis J. Potgieter
Author-X-Name-First: Cornelis J.
Author-X-Name-Last: Potgieter
Author-Name: Akihito Kamata
Author-X-Name-First: Akihito
Author-X-Name-Last: Kamata
Title: Penalized likelihood methods for modeling count data
Abstract:
The paper considers parameter estimation in count data models using penalized likelihood methods. The motivating data consists of multiple independent count variables with a moderate sample size per variable. The data were collected during the assessment of oral reading fluency (ORF) in school-aged children. A sample of fourth-grade students were given one of ten available passages to read with these differing in length and difficulty. The observed number of words read incorrectly (WRI) is used to measure ORF. Three models are considered for WRI scores, namely the binomial, the zero-inflated binomial, and the beta-binomial. We aim to efficiently estimate passage difficulty, a quantity expressed as a function of the underlying model parameters. Two types of penalty functions are considered for penalized likelihood with respective goals of shrinking parameter estimates closer to zero or closer to one another. A simulation study evaluates the efficacy of the shrinkage estimates using Mean Square Error (MSE) as metric. Big reductions in MSE relative to unpenalized maximum likelihood are observed. The paper concludes with an analysis of the motivating ORF data.
Journal: Journal of Applied Statistics
Pages: 3157-3176
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2103101
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2103101
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3157-3176
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2104228_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Ka Kin Lam
Author-X-Name-First: Ka Kin
Author-X-Name-Last: Lam
Author-Name: Bo Wang
Author-X-Name-First: Bo
Author-X-Name-Last: Wang
Title: Multipopulation mortality modelling and forecasting: the weighted multivariate functional principal component approaches
Abstract:
Human mortality patterns and trajectories in closely related populations are likely linked together and share similarities. It is always desirable to model them simultaneously while taking their heterogeneity into account. This article introduces two new models for jointly modelling and forecasting the mortality of multiple subpopulations using multivariate functional principal component analysis techniques. The first model extends the independent functional data model to a multipopulation modelling setting. In the second one, we propose a novel multivariate functional principal component method for coherent modelling. Its design primarily fulfils the idea that when several subpopulation groups have similar socio-economic conditions or common biological characteristics, such close connections are expected to evolve in a non-diverging fashion. We demonstrate the proposed methods using sex-specific mortality data. Their forecast performances are further compared with several existing models, including the independent functional data model and the Product-Ratio model, through comparisons with mortality data of ten developed countries. The numerical examples show that the first proposed model maintains a comparable forecast ability with the existing methods. In contrast, the second proposed model outperforms the first model as well as the existing models in terms of forecast accuracy.
Journal: Journal of Applied Statistics
Pages: 3177-3198
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2104228
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2104228
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3177-3198
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2101631_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Yamin Sayyari
Author-X-Name-First: Yamin
Author-X-Name-Last: Sayyari
Author-Name: Mohammad Reza Molaei
Author-X-Name-First: Mohammad Reza
Author-X-Name-Last: Molaei
Author-Name: Adel Mehrpooya
Author-X-Name-First: Adel
Author-X-Name-Last: Mehrpooya
Title: Complexities of information sources
Abstract:
Calculating the entropy of complex systems is a significant problem in science and engineering. However, this calculation is usually computationally expensive when the entropy is computed directly. This paper introduces three classes of information sources such that, for all members of each class, the entropy value is the same. These classes are characterized according to special dynamics created by three kinds of self-mappings on Ω and A, where Ω is a probability space and A is a finite set. An approximation of the rank variables of the product of information sources is made, and it is proved that the topological entropy of the product of two information sources is equal to the sum of their topological entropies.
Journal: Journal of Applied Statistics
Pages: 3125-3141
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2101631
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2101631
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3125-3141
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2101044_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Mattia Cefis
Author-X-Name-First: Mattia
Author-X-Name-Last: Cefis
Title: Observed heterogeneity in players' football performance analysis using PLS-PM
Abstract:
Nowadays, data science is applied in several areas of daily life, with many applications to sports. In this context, attention is focused on football (i.e. ‘soccer’ for Americans), where strategic choices, whether by a club's scouting department, its technical staff, or its management, are crucial. We measured and monitored football players' performance in the 2018/2019 season for the top five European leagues, using data provided by Electronic Arts (EA) experts and available on the Kaggle data science platform. For this purpose, with the help of football experts, a third-order partial least-squares path model (PLS-PM) approach was applied to the sofifa key performance indices in order to compute a composite indicator differentiated by role and to compare it with the well-known overall indicator from EA Sports. Players' observed heterogeneity (i.e. roles and leagues) was taken into account, since experts often refer to differences in these features, and the objective was to verify their importance scientifically. The results are consistent with this view: they show that some sub-areas of performance carry significantly different weights depending on the role.
Journal: Journal of Applied Statistics
Pages: 3088-3107
Issue: 15
Volume: 50
Year: 2023
Month: 11
X-DOI: 10.1080/02664763.2022.2101044
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2101044
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:15:p:3088-3107
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2112556_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Nirosha Rathnayake
Author-X-Name-First: Nirosha
Author-X-Name-Last: Rathnayake
Author-Name: Hongying Daisy Dai
Author-X-Name-First: Hongying Daisy
Author-X-Name-Last: Dai
Author-Name: Richard Charnigo
Author-X-Name-First: Richard
Author-X-Name-Last: Charnigo
Author-Name: Kendra Schmid
Author-X-Name-First: Kendra
Author-X-Name-Last: Schmid
Author-Name: Jane Meza
Author-X-Name-First: Jane
Author-X-Name-Last: Meza
Title: A general class of small area estimation using calibrated hierarchical likelihood approach with applications to COVID-19 data
Abstract:
The direct estimation techniques in small area estimation (SAE) models require sufficiently large sample sizes to provide accurate estimates. Hence, indirect model-based methodologies are developed to incorporate auxiliary information. The most commonly used SAE models, including the Fay-Herriot (FH) model and its extended models, are estimated using marginal likelihood estimation and Bayesian methods, which rely heavily on the computationally intensive integration of the likelihood function. In this article, we propose a Calibrated Hierarchical (CH) likelihood approach to obtain small area estimates through hierarchical estimation of fixed effects and random effects with the regression calibration method for bias correction. The latent random variables at the domain level are treated as ‘parameters’ and estimated jointly with other parameters of interest. Then the dispersion parameters are estimated iteratively based on the Laplace approximation of the profile likelihood. The proposed method avoids the intractable integration required to estimate the marginal distribution. Hence, it can be applied to a wide class of distributions, including generalized linear mixed models, survival analysis, and joint modeling with distinct distributions. We demonstrate our method using an area-level analysis of publicly available count data on novel coronavirus (COVID-19) positive cases.
Journal: Journal of Applied Statistics
Pages: 3384-3404
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2112556
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2112556
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3384-3404
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2111678_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Carolina Luque
Author-X-Name-First: Carolina
Author-X-Name-Last: Luque
Author-Name: Juan Sosa
Author-X-Name-First: Juan
Author-X-Name-Last: Sosa
Title: A Bayesian spatial voting model to characterize the legislative behavior of the Colombian Senate 2010–2014
Abstract:
This paper characterizes legislators' voting behavior in the Colombian Senate 2010–2014 by implementing a one-dimensional standard Bayesian ideal point estimator via Markov chain Monte Carlo algorithms. Our main goal is to retrieve the political preferences of legislators from their roll-call voting records, which individualizes the electoral behavior of the legislative chamber. Furthermore, we draw conclusions about the nature of the latent trait underlying the deputies' voting decisions and the legislators' locations in political space. Finally, we also offer several methodological and theoretical tools to guide the analysis of nominal voting data in the context of unbalanced parliaments (multi-party systems), taking the particular case of the Colombian Senate as a reference.
Journal: Journal of Applied Statistics
Pages: 3362-3383
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2111678
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2111678
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3362-3383
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2110860_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Onur Camli
Author-X-Name-First: Onur
Author-X-Name-Last: Camli
Author-Name: Zeynep Kalaylioglu
Author-X-Name-First: Zeynep
Author-X-Name-Last: Kalaylioglu
Author-Name: Ashis SenGupta
Author-X-Name-First: Ashis
Author-X-Name-Last: SenGupta
Title: Variable selection in linear-circular regression models
Abstract:
Applications of circular regression models are ubiquitous in many disciplines, particularly in meteorology, biology and geology. In circular regression models, the variable selection problem remains a notable open question. In this paper, we address variable selection in linear-circular regression models, where a univariate linear dependent variable and a mixed set of circular and linear independent variables constitute the data set. We consider the Bayesian lasso, which is a popular choice for variable selection in classical linear regression models. We show that the Bayesian lasso in linear-circular regression models does not produce robust inference, as the coefficient estimates are sensitive to the choice of hyper-prior setting for the tuning parameter. To address this problem, we propose a robustified Bayesian lasso based on an empirical Bayes (EB) type methodology to construct a hyper-prior for the tuning parameter while using Gibbs sampling. This hyper-prior construction is computationally more feasible than hyper-priors based on correlation measures. We show in a comprehensive simulation study that the Bayesian lasso with the EB-GS hyper-prior leads to more robust inference. Overall, the method offers an efficient Bayesian lasso for variable selection in linear-circular regression while reducing model complexity.
Journal: Journal of Applied Statistics
Pages: 3337-3361
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2110860
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2110860
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3337-3361
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2107187_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Zhan Liu
Author-X-Name-First: Zhan
Author-X-Name-Last: Liu
Author-Name: Junbo Zheng
Author-X-Name-First: Junbo
Author-X-Name-Last: Zheng
Author-Name: Chaofeng Tu
Author-X-Name-First: Chaofeng
Author-X-Name-Last: Tu
Author-Name: Yingli Pan
Author-X-Name-First: Yingli
Author-X-Name-Last: Pan
Title: Estimation for volunteer web survey samples using a model-averaging approach
Abstract:
The propensity score approach is a popular technique for estimating population quantities from volunteer web survey samples. Various models have been used to estimate propensity scores, producing different population estimates. To obtain more accurate population estimators, we propose a model-averaging estimation approach based on propensity score estimates from a parametric logistic regression model and a nonparametric generalized boosted model. Consistency and asymptotic normality of the proposed estimators are established. A computation algorithm is also developed to implement the proposed method. Simulation studies are conducted to compare the performance of the proposed method with that of the other methods. Survey data from the Netizen Social Awareness Survey (NSAS) are used to illustrate the proposed methodology.
Journal: Journal of Applied Statistics
Pages: 3251-3271
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2107187
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2107187
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3251-3271
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2108386_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Soyun Park
Author-X-Name-First: Soyun
Author-X-Name-Last: Park
Author-Name: Jihnhee Yu
Author-X-Name-First: Jihnhee
Author-X-Name-Last: Yu
Author-Name: Hwa-Hyoung Woo
Author-X-Name-First: Hwa-Hyoung
Author-X-Name-Last: Woo
Author-Name: Chun Gun Park
Author-X-Name-First: Chun Gun
Author-X-Name-Last: Park
Title: A novel network architecture combining central-peripheral deviation with image-based convolutional neural networks for diffusion tensor imaging studies
Abstract:
Brain imaging research is very challenging due to the complex structure of the brain and the lack of explicitly identifiable features in the images. With the advancement of magnetic resonance imaging (MRI) technologies, such as diffusion tensor imaging (DTI), developing classification methods to improve clinical diagnosis is crucial. This paper proposes a classification method for DTI data based on a novel neural network strategy that combines a convolutional neural network (CNN) with a multilayer neural network using central-peripheral deviation (CPD), which reflects diffusion dynamics in the white matter by spatially evaluating the deviation of diffusion coefficients between the inner and outer parts of the brain. In our method, a multilayer perceptron (MLP) using CPD is combined with the final layers for classification after the dimensions of the images are reduced in the convolutional layers of the neural network architecture. In terms of training loss and classification error, the proposed method improves on existing CNN-based image classification. For real data analysis, we demonstrate how to process raw DTI image data sets obtained from a traumatic brain injury study (MagNeTS) and a brain atlas construction study (ICBM), and apply the proposed approach to the data, successfully improving classification performance between two age groups.
Journal: Journal of Applied Statistics
Pages: 3294-3311
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2108386
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2108386
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3294-3311
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2109129_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Wei Xiong
Author-X-Name-First: Wei
Author-X-Name-Last: Xiong
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Author-Name: Manlai Tang
Author-X-Name-First: Manlai
Author-X-Name-Last: Tang
Author-Name: Han Pan
Author-X-Name-First: Han
Author-X-Name-Last: Pan
Title: Robust and sparse learning of varying coefficient models with high-dimensional features
Abstract:
The varying coefficient model (VCM) is extensively used in various scientific fields due to its capability of capturing the changing structure of predictors. Classical mean regression analysis is often complicated by the presence of skewed, heterogeneous and heavy-tailed data. For this purpose, this work employs the idea of model averaging and introduces a novel comprehensive approach that incorporates quantile-adaptive weights across different quantile levels to further improve both least squares (LS) and quantile regression (QR) methods. The proposed procedure, which adaptively takes advantage of the heterogeneous and sparse nature of the input data, can gain more efficiency and is well adapted to extreme-event cases and high-dimensional settings. Motivated by its nice properties, we develop several robust methods to reveal the dynamic close-to-truth structure of the VCM and consistently uncover the zero and nonzero patterns in high-dimensional scientific discoveries. We provide a new iterative algorithm that is proven to be asymptotically consistent and can attain the optimal nonparametric convergence rate under regularity conditions. The introduced procedures are illustrated with extensive simulation examples and several real data analyses, which further show their stronger predictive power compared with the LS, composite quantile regression (CQR) and QR methods.
Journal: Journal of Applied Statistics
Pages: 3312-3336
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2109129
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2109129
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3312-3336
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2104822_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Somnath Chaudhuri
Author-X-Name-First: Somnath
Author-X-Name-Last: Chaudhuri
Author-Name: Pablo Juan
Author-X-Name-First: Pablo
Author-X-Name-Last: Juan
Author-Name: Jorge Mateu
Author-X-Name-First: Jorge
Author-X-Name-Last: Mateu
Title: Spatio-temporal modeling of traffic accidents incidence on urban road networks based on an explicit network triangulation
Abstract:
Traffic deaths and injuries are among the major global public health concerns. The present study considers accident records in an urban environment to explore and analyze spatial and temporal patterns in the incidence of road traffic accidents. We propose a spatio-temporal model to predict the number of traffic collisions on any given road segment and to further generate a risk map of the entire road network. A Bayesian methodology using integrated nested Laplace approximations with stochastic partial differential equations (SPDE) has been applied in the modeling process. As a novelty, we have introduced an SPDE network triangulation to estimate the spatial autocorrelation restricted to the linear network. The resulting risk maps provide information to identify safe routes between source and destination points, and can be useful for accident prevention and multi-disciplinary road safety measures.
Journal: Journal of Applied Statistics
Pages: 3229-3250
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2104822
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2104822
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3229-3250
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2108007_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Germà Coenders
Author-X-Name-First: Germà
Author-X-Name-Last: Coenders
Author-Name: Michael Greenacre
Author-X-Name-First: Michael
Author-X-Name-Last: Greenacre
Title: Three approaches to supervised learning for compositional data with pairwise logratios
Abstract:
Logratios between pairs of compositional parts (pairwise logratios) are the easiest to interpret in compositional data analysis, and include the well-known additive logratios as particular cases. When the number of parts is large (sometimes even larger than the number of cases), some form of logratio selection is needed. In this article, we present three alternative stepwise supervised learning methods to select the pairwise logratios that best explain a dependent variable in a generalized linear model, each geared for a specific problem. The first method features unrestricted search, where any pairwise logratio can be selected. This method has a complex interpretation if some pairs of parts in the logratios overlap, but it leads to the most accurate predictions. The second method restricts parts to occur only once, which makes the corresponding logratios intuitively interpretable. The third method uses additive logratios, so that K−1 selected logratios involve a K-part subcomposition. Our approach allows logratios or non-compositional covariates to be forced into the models based on theoretical knowledge, and various stopping criteria are available based on information measures or statistical significance with the Bonferroni correction. We present an application on a dataset from a study predicting Crohn's disease.
Journal: Journal of Applied Statistics
Pages: 3272-3293
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2108007
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2108007
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3272-3293
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2104230_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20230119T200553 git hash: 724830af20
Author-Name: Fatemeh Hassantabar Darzi
Author-X-Name-First: Fatemeh Hassantabar
Author-X-Name-Last: Darzi
Author-Name: Samaneh Eftekhari Mahabadi
Author-X-Name-First: Samaneh Eftekhari
Author-X-Name-Last: Mahabadi
Author-Name: Firoozeh Haghighi
Author-X-Name-First: Firoozeh
Author-X-Name-Last: Haghighi
Title: Type-II progressive censoring with GLM-based random removal mechanism dependent on the experimental conditions
Abstract:
This article presents a novel stochastic removal mechanism under Type-II progressive random censoring in which removal probabilities are allowed to be dependent on the lifetime conditions through Generalized Linear Models (GLM). These conditions potentially include failure distances (the time required to observe the next failure) or other covariate information available in the experiment. The proposed GLM-based random removal mechanism includes a set of tuning parameters that are determined by the researcher according to the possible failure distance category. These parameters allow flexible determination of the removal probabilities leading to necessary experimental cost and time reductions. To establish the proposed mechanism, the Proportional Hazard Rate (PHR) family of distributions is considered. Also, the maximum likelihood estimators of parameters and their asymptotic variances are derived for the Weibull distributed lifetime data. A simple simulation algorithm for generating Type-II progressive censoring samples with GLM-based dependent removal probabilities is also presented. The expected experiment time required to complete the life test under this censoring scheme is also investigated using the Monte Carlo integration method. Several simulation studies are conducted to evaluate and compare the performance of the proposed mechanism. A sensitivity analysis is also considered to study the effect of misspecification of removal mechanism coefficients. Finally, two real data sets are analyzed for illustrative purposes.
Journal: Journal of Applied Statistics
Pages: 3199-3228
Issue: 16
Volume: 50
Year: 2023
Month: 12
X-DOI: 10.1080/02664763.2022.2104230
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2104230
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:50:y:2023:i:16:p:3199-3228
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2116409_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: A. Fanjul-Hevia
Author-X-Name-First: A.
Author-X-Name-Last: Fanjul-Hevia
Author-Name: J. C. Pardo-Fernández
Author-X-Name-First: J. C.
Author-X-Name-Last: Pardo-Fernández
Author-Name: I. Van Keilegom
Author-X-Name-First: I.
Author-X-Name-Last: Van Keilegom
Author-Name: W. González-Manteiga
Author-X-Name-First: W.
Author-X-Name-Last: González-Manteiga
Title: A test for comparing conditional ROC curves with multidimensional covariates
Abstract:
The comparison of Receiver Operating Characteristic (ROC) curves is frequently used in the literature to compare the discriminatory capability of different classification procedures based on diagnostic variables. The performance of these variables can sometimes be influenced by the presence of other covariates, which should therefore be taken into account when making the comparison. A new non-parametric test is proposed here for testing the equality of two or more dependent ROC curves conditioned on the value of a multidimensional covariate. Projections are used to transform the problem into a one-dimensional one that is easier to handle. Simulations are carried out to study the practical performance of the new methodology. The procedure is then used to analyse a real data set of patients with pleural effusion to compare the diagnostic capability of different markers.
Journal: Journal of Applied Statistics
Pages: 87-113
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2116409
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2116409
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:87-113
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2114431_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Wenchao Xu
Author-X-Name-First: Wenchao
Author-X-Name-Last: Xu
Author-Name: Hongmei Lin
Author-X-Name-First: Hongmei
Author-X-Name-Last: Lin
Author-Name: Tiejun Tong
Author-X-Name-First: Tiejun
Author-X-Name-Last: Tong
Author-Name: Riquan Zhang
Author-X-Name-First: Riquan
Author-X-Name-Last: Zhang
Title: A new method for estimating Sharpe ratio function via local maximum likelihood
Abstract:
The Sharpe ratio function is a commonly used risk/return measure in financial econometrics. To estimate this function, most existing methods take a two-step procedure that first estimates the mean and volatility functions separately and then applies the plug-in method. In this paper, we propose a direct method via local maximum likelihood to simultaneously estimate the Sharpe ratio function and the negative log-volatility function as well as their derivatives. We establish the joint limiting distribution of the proposed estimators, and moreover extend the proposed method to estimate the multivariate Sharpe ratio function. We also evaluate the numerical performance of the proposed estimators through simulation studies, and compare them with existing methods. Finally, we apply the proposed method to the three-month US Treasury bill data, which captures a well-known covariate-dependent effect on the Sharpe ratio.
Journal: Journal of Applied Statistics
Pages: 34-52
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2114431
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2114431
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:34-52
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2125502_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Jianguo Sun
Author-X-Name-First: Jianguo
Author-X-Name-Last: Sun
Title: New advances in statistics and data science
Journal: Journal of Applied Statistics
Pages: 193-195
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2125502
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2125502
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:193-195
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2118245_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Peter Chang
Author-X-Name-First: Peter
Author-X-Name-Last: Chang
Author-Name: Rongzi Liu
Author-X-Name-First: Rongzi
Author-X-Name-Last: Liu
Author-Name: Tingting Hou
Author-X-Name-First: Tingting
Author-X-Name-Last: Hou
Author-Name: Xinyu Yan
Author-X-Name-First: Xinyu
Author-X-Name-Last: Yan
Author-Name: Guogen Shan
Author-X-Name-First: Guogen
Author-X-Name-Last: Shan
Title: Continuity corrected score confidence interval for the difference in proportions in paired data
Abstract:
For paired binary data, the hybrid method and the score method are often recommended for calculating the confidence interval for the risk difference. These asymptotic intervals do not control the coverage probability. We propose a new score interval with continuity correction to further improve the performance of the existing intervals. The traditional correction value may be too large, which leads to a wide interval. For that reason, we propose three different correction values to identify the optimal corrected interval with balanced coverage probability and interval width. From simulation studies, we find that a small correction value for the score interval performs well. In addition, we derive non-iterative solutions for the developed continuity-corrected score intervals.
Journal: Journal of Applied Statistics
Pages: 139-152
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2118245
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2118245
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:139-152
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2138838_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Cindy Feng
Author-X-Name-First: Cindy
Author-X-Name-Last: Feng
Author-Name: Xi Chen
Author-X-Name-First: Xi
Author-X-Name-Last: Chen
Title: A two-stage latent factor regression method to model the common and unique effects of multiple highly correlated exposure variables
Abstract:
In many epidemiological and environmental health studies, developing an accurate assessment of the effects of multiple exposures on a health outcome is often of interest. However, the problem is challenging in the presence of multicollinearity, which can lead to biased estimates of regression coefficients and inflated variance estimators. Selecting one exposure variable as a surrogate for multiple highly correlated exposure variables is often suggested in the literature as a solution to the multicollinearity problem. However, this may lead to loss of information, since exposure variables that are highly correlated tend to have not only common but also additional effects on the outcome variable. In this study, a two-stage latent factor regression method is proposed. The key idea is to regress the dependent variable not only on the common latent factor(s) of the explanatory variables, but also on the residual terms from the factor analysis as explanatory variables. The proposed method is compared with traditional latent factor regression and principal component regression in terms of their performance in handling multicollinearity. Two case studies are presented. Simulation studies are performed to assess their performances in terms of the epidemiological interpretation and stability of parameter estimates.
Journal: Journal of Applied Statistics
Pages: 168-192
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2138838
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2138838
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:168-192
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2113865_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Subhankar Dutta
Author-X-Name-First: Subhankar
Author-X-Name-Last: Dutta
Author-Name: Suchandan Kayal
Author-X-Name-First: Suchandan
Author-X-Name-Last: Kayal
Title: Estimation and prediction for Burr type III distribution based on unified progressive hybrid censoring scheme
Abstract:
The present communication develops tools for estimation and prediction for the Burr type III distribution under the unified progressive hybrid censoring scheme. The maximum likelihood estimates of the model parameters are obtained, and it is shown that they exist uniquely. Expectation maximization and stochastic expectation maximization methods are employed to compute the point estimates of the unknown parameters. Based on the asymptotic distribution of the maximum likelihood estimators, approximate confidence intervals are proposed. In addition, bootstrap confidence intervals are constructed. Furthermore, the Bayes estimates are derived with respect to squared error and LINEX loss functions. To compute the approximate Bayes estimates, the Metropolis–Hastings algorithm is adopted. The highest posterior density credible intervals are obtained. Further, maximum a posteriori estimates of the model parameters are computed. Bayesian predictive point and interval estimates are also proposed. A Monte Carlo simulation study is employed to evaluate the performance of the proposed statistical procedures. Finally, two real data sets are considered and analysed to illustrate the methodologies established in this paper.
Journal: Journal of Applied Statistics
Pages: 1-33
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2113865
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2113865
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:1-33
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2112937_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Carlos E. Rodríguez
Author-X-Name-First: Carlos E.
Author-X-Name-Last: Rodríguez
Author-Name: Luis E. Nieto-Barajas
Author-X-Name-First: Luis E.
Author-X-Name-Last: Nieto-Barajas
Author-Name: Carlos S. Pérez-Pérez
Author-X-Name-First: Carlos S.
Author-X-Name-Last: Pérez-Pérez
Title: Dealing with missing data under stratified sampling designs where strata are study domains
Abstract:
A quick count seeks to estimate the voting trends of an election and communicate them to the population on the evening of election day. In quick counts, the sampling is based on a stratified design of polling stations. Voting information is gathered gradually, often with no guarantee of obtaining the complete sample or even information from all the strata. However, accurate interval estimates must be obtained from partial information, and this becomes more challenging if the strata are also study domains. To produce partial estimates, two strategies are proposed: (1) a Bayesian model using a dynamic post-stratification strategy and a single imputation process defined after a thorough analysis of historic voting information, with a credibility-level correction included to address the underestimation of the variance; and (2) a frequentist alternative that combines standard multiple imputation ideas with classic sampling techniques to obtain estimates under a missing-information framework. Both solutions are illustrated and compared using information from the 2021 quick count, where the aim was to estimate the composition of the Chamber of Deputies in Mexico.
Journal: Journal of Applied Statistics
Pages: 153-167
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2112937
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2112937
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:153-167
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2116746_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Marco Molinari
Author-X-Name-First: Marco
Author-X-Name-Last: Molinari
Author-Name: Andrea Cremaschi
Author-X-Name-First: Andrea
Author-X-Name-Last: Cremaschi
Author-Name: Maria De Iorio
Author-X-Name-First: Maria
Author-X-Name-Last: De Iorio
Author-Name: Nishi Chaturvedi
Author-X-Name-First: Nishi
Author-X-Name-Last: Chaturvedi
Author-Name: Alun Hughes
Author-X-Name-First: Alun
Author-X-Name-Last: Hughes
Author-Name: Therese Tillin
Author-X-Name-First: Therese
Author-X-Name-Last: Tillin
Title: Bayesian dynamic network modelling: an application to metabolic associations in cardiovascular diseases
Abstract:
We propose a novel approach to the estimation of multiple Graphical Models to analyse temporal patterns of association among a set of metabolites over different groups of patients. Our motivating application is the Southall And Brent REvisited (SABRE) study, a tri-ethnic cohort study conducted in the UK. We are interested in identifying potential ethnic differences in metabolite levels and associations as well as their evolution over time, with the aim of gaining a better understanding of the differing risks of cardio-metabolic disorders across ethnicities. Within a Bayesian framework, we employ a nodewise regression approach to infer the structure of the graphs, borrowing information across time as well as across ethnicities. The response variables of interest are metabolite levels measured at two time points and for two ethnic groups, Europeans and South-Asians. We use nodewise regression to estimate the high-dimensional precision matrices of the metabolites, imposing sparsity on the regression coefficients through the dynamic horseshoe prior, thus favouring sparser graphs. We provide the code to fit the proposed model using the software Stan, which performs posterior inference using Hamiltonian Monte Carlo sampling, as well as a detailed description of a block Gibbs sampling scheme.
Journal: Journal of Applied Statistics
Pages: 114-138
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2116746
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2116746
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:114-138
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2115985_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Ryan H.L. Ip
Author-X-Name-First: Ryan H.L.
Author-X-Name-Last: Ip
Author-Name: K.Y.K. Wu
Author-X-Name-First: K.Y.K.
Author-X-Name-Last: Wu
Title: A Markov random field model with cumulative logistic functions for spatially dependent ordinal data
Abstract:
This paper presents a class of regression models with cumulative logistic functions that are chiefly designed to analyse spatially dependent ordinal data. In contrast to previous works, the proposed model requires neither the sites to be regularly spaced nor the assumption of an underlying continuous variable. It belongs to a more general class of Markov random field models, and can be considered an extension of the ordinal regression model with the proportional odds link function. Our proposed model allows practitioners to interpret the model parameters using odds ratios. Apart from the theoretical developments, this work also highlights the practical aspects of model fitting, including parameterisation, selection of neighbourhood, and calculation of standard errors. Simulation studies with regularly and irregularly spaced sites were conducted. Modelling strategies including pseudo-likelihood methods were found to be useful in both settings. The proposed model and the non-spatial counterpart were applied to the daily air quality index measured in the United Kingdom. The results indicate the presence of spatial effects and the incorporation of spatial effects led to better model performance in terms of various goodness-of-fit measures.
Journal: Journal of Applied Statistics
Pages: 70-86
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2115985
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2115985
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:70-86
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2114432_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Hao Hao
Author-X-Name-First: Hao
Author-X-Name-Last: Hao
Author-Name: Bai Huang
Author-X-Name-First: Bai
Author-X-Name-Last: Huang
Author-Name: Tae-hwy Lee
Author-X-Name-First: Tae-hwy
Author-X-Name-Last: Lee
Title: Model averaging estimation of panel data models with many instruments and boosting
Abstract:
Applied researchers often confront two issues when using the fixed effect-two-stage least squares (FE-2SLS) estimator for panel data models. One is that it may lose its consistency due to too many instruments. The other is that the gain of using FE-2SLS may not exceed its loss when the endogeneity is weak. In this paper, an $L_{2}$-Boosting regularization procedure for panel data models is proposed to tackle the many-instruments issue. We then construct a Stein-like model-averaging estimator to take advantage of the FE and FE-2SLS-Boosting estimators. Finite sample properties are examined in Monte Carlo simulations and an empirical application is presented.
Journal: Journal of Applied Statistics
Pages: 53-69
Issue: 1
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2114432
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2114432
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:1:p:53-69
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2154329_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Shengxiao Vincent Feng
Author-X-Name-First: Shengxiao Vincent
Author-X-Name-Last: Feng
Author-Name: Willem van den Boom
Author-X-Name-First: Willem
Author-X-Name-Last: van den Boom
Author-Name: Maria De Iorio
Author-X-Name-First: Maria
Author-X-Name-Last: De Iorio
Author-Name: Gladi J. Thng
Author-X-Name-First: Gladi J.
Author-X-Name-Last: Thng
Author-Name: Jerry K. Y. Chan
Author-X-Name-First: Jerry K. Y.
Author-X-Name-Last: Chan
Author-Name: Helen Y. Chen
Author-X-Name-First: Helen Y.
Author-X-Name-Last: Chen
Author-Name: Kok Hian Tan
Author-X-Name-First: Kok Hian
Author-X-Name-Last: Tan
Author-Name: Michelle Z. L. Kee
Author-X-Name-First: Michelle Z. L.
Author-X-Name-Last: Kee
Title: Joint modelling of mental health markers through pregnancy: a Bayesian semi-parametric approach
Abstract:
Maternal depression and anxiety through pregnancy have lasting societal impacts. It is thus crucial to understand the trajectories of their progression from preconception to the postnatal period, and the associated risk factors. Within the Bayesian framework, we propose to jointly model seven outcomes over time, of which two are physiological and five are non-physiological indicators of maternal depression and anxiety. We model the former two by a Gaussian process and the latter by an autoregressive model, while imposing a multidimensional Dirichlet process prior on the subject-specific random effects to account for subject heterogeneity and induce clustering. The model allows for the inclusion of covariates through a regression term. Our findings reveal four distinct clusters of trajectories of the seven health outcomes, characterising women's mental health progression from before to after pregnancy. Importantly, our results caution against the loose use of hair corticosteroids as a biomarker, or even a causal factor, for pregnancy mental health progression. Additionally, the regression analysis reveals a range of preconception determinants and risk factors for depressive and anxiety symptoms during pregnancy.
Journal: Journal of Applied Statistics
Pages: 388-405
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2154329
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2154329
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:388-405
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2123460_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Qian Xu
Author-X-Name-First: Qian
Author-X-Name-Last: Xu
Author-Name: Demetra Antimisiaris
Author-X-Name-First: Demetra
Author-X-Name-Last: Antimisiaris
Author-Name: Maiying Kong
Author-X-Name-First: Maiying
Author-X-Name-Last: Kong
Title: Statistical methods for assessing drug interactions using observational data
Abstract:
With advances in medicine, many drugs and treatments become available. On the one hand, polydrug use (i.e. using more than one drug at a time) has been used to treat patients with multiple morbid conditions, but it may cause severe side effects. On the other hand, combination treatments have been successfully developed to treat severe diseases such as cancer and chronic diseases. Observational data, such as electronic health record data, may provide useful information for assessing drug interactions. In this article, we propose using marginal structural models to assess the average treatment effect and causal interaction of two drugs while controlling for confounding variables. The causal effect and the interaction of the two drugs are assessed using a weighted likelihood approach, with weights being the inverse probability of the assigned treatment. Simulation studies were conducted to examine the performance of the proposed method and showed that it estimates the causal parameters consistently. Case studies examine the joint effect of metformin and glyburide use on reducing hospital readmission for type 2 diabetic patients, and the joint effect of antecedent statin and opioid use on immune and inflammatory biomarkers for hospitalized COVID-19 patients.
Journal: Journal of Applied Statistics
Pages: 298-323
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2123460
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2123460
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:298-323
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2120973_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Muhammad Mohsin
Author-X-Name-First: Muhammad
Author-X-Name-Last: Mohsin
Author-Name: Albrecht Gebhardt
Author-X-Name-First: Albrecht
Author-X-Name-Last: Gebhardt
Title: A stochastic model for NFL games and point spread assessment
Abstract:
Statistical modelling of sports data is indispensable for analysing sporting behaviour and drawing significant inferences that help in adopting decisive strategies before or during sports events. This paper introduces a stochastic model based on the distribution of the difference derived from the Bivariate Affine-Linear Exponential distribution. This distribution of differences is used for the first time to model the margin of victory, and it provides an adequate fit to the observed data. A simulation study is carried out to observe the stability of the model parameters through their average estimated values, biases, standard errors, root mean square errors and confidence intervals. The performance of the proposed model is examined by applying it to real data from the National Football League and comparing the results with those of existing models. Finally, the quantile function of the proposed distribution is used to assess the possible range of point spreads for winning a bet on a particular game.
Journal: Journal of Applied Statistics
Pages: 216-229
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2120973
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2120973
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:216-229
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2122947_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Deiby Tineke Salaki
Author-X-Name-First: Deiby Tineke
Author-X-Name-Last: Salaki
Author-Name: Anang Kurnia
Author-X-Name-First: Anang
Author-X-Name-Last: Kurnia
Author-Name: Bagus Sartono
Author-X-Name-First: Bagus
Author-X-Name-Last: Sartono
Author-Name: I Wayan Mangku
Author-X-Name-First: I Wayan
Author-X-Name-Last: Mangku
Author-Name: Arief Gusnanto
Author-X-Name-First: Arief
Author-X-Name-Last: Gusnanto
Title: Model averaging in calibration of near-infrared instruments with correlated high-dimensional data
Abstract:
Model averaging (MA) is a modelling strategy in which the uncertainty in the configuration of selected variables is taken into account by weight-combining the estimate from each so-called ‘candidate model’. Some studies have shown that MA enables better prediction, even in high-dimensional cases. However, little is known about model prediction performance under different types of multicollinearity in high-dimensional data. Motivated by the calibration of near-infrared (NIR) instruments, we focus on MA prediction performance in such data. The weighting schemes that we consider are based on Akaike's information criterion (AIC), Mallows' Cp, and cross-validation. For estimating the model parameters, we consider the standard least squares and ridge regression methods. The results indicate that MA outperforms model selection methods such as LASSO and SCAD in high-correlation data. The use of Mallows' Cp and cross-validation for the weights tends to yield similar results in all correlation structures, although the former is generally preferred. We also find that ridge model averaging outperforms least-squares model averaging. This research therefore suggests ridge model averaging for building a better-performing NIR calibration model.
Journal: Journal of Applied Statistics
Pages: 279-297
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2122947
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2122947
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:279-297
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2121384_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Esam Mahdi
Author-X-Name-First: Esam
Author-X-Name-Last: Mahdi
Author-Name: Thomas J. Fisher
Author-X-Name-First: Thomas J.
Author-X-Name-Last: Fisher
Title: Bootstrapping a powerful mixed portmanteau test for time series
Abstract:
A new portmanteau test statistic is proposed for detecting nonlinearity in time series data. The new portmanteau statistic is calculated from the log of the determinant of a matrix comprised of the autocorrelations and cross-correlations of the residuals and squared residuals of a fitted time series. The asymptotic distribution of the proposed test statistic is derived as a linear combination of chi-square distributed random variables and can be approximated by a gamma distribution. A bootstrapping approach is shown to be robust when distributional assumptions are relaxed. The efficacy of the statistic is studied against linear and nonlinear dependency structures of some stationary time series models. It is shown that the new test can provide higher power than other tests in many situations. We demonstrate the advantages of the proposed test by investigating linear and nonlinear effects in an economic series and two environmental time series.
Journal: Journal of Applied Statistics
Pages: 230-255
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2121384
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2121384
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:230-255
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2151576_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Semhar B. Ogbagaber
Author-X-Name-First: Semhar B.
Author-X-Name-Last: Ogbagaber
Author-Name: Yifan Cui
Author-X-Name-First: Yifan
Author-X-Name-Last: Cui
Author-Name: Kaigang Li
Author-X-Name-First: Kaigang
Author-X-Name-Last: Li
Author-Name: Ronald J. Iannotti
Author-X-Name-First: Ronald J.
Author-X-Name-Last: Iannotti
Author-Name: Paul S. Albert
Author-X-Name-First: Paul S.
Author-X-Name-Last: Albert
Title: A hidden Markov modeling approach combining objective measure of activity and subjective measure of self-reported sleep to estimate the sleep-wake cycle
Abstract:
Characterizing the sleep-wake cycle in adolescents is an important prerequisite to better understand the association of abnormal sleep patterns with subsequent clinical and behavioral outcomes. The aim of this research was to develop hidden Markov models (HMMs) that incorporate both objective (actigraphy) and subjective (sleep log) measures to estimate the sleep-wake cycle using data from the NEXT longitudinal study, a large population-based cohort study. The model was estimated with a negative binomial distribution for the activity counts (1-minute epochs) to account for overdispersion relative to a Poisson process. Furthermore, the self-reported measures were dichotomized (for each one-minute interval) and subject to misclassification. We assumed that the unobserved sleep-wake cycle follows a two-state Markov chain with transition probabilities varying according to a circadian rhythm. Maximum-likelihood estimation using a backward-forward algorithm was applied to fit the longitudinal data on a subject-by-subject basis, and the algorithm was used to reconstruct the sleep-wake cycle from sequences of self-reported sleep and activity data. Finally, we conducted simulations to examine the properties of this approach under different observational patterns, including both complete and partially observed measurements on each individual.
Journal: Journal of Applied Statistics
Pages: 370-387
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2151576
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2151576
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:370-387
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2118678_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Juan Sosa
Author-X-Name-First: Juan
Author-X-Name-Last: Sosa
Author-Name: Abel Rodríguez
Author-X-Name-First: Abel
Author-X-Name-Last: Rodríguez
Title: A Bayesian approach for de-duplication in the presence of relational data
Abstract:
In this paper, we study the impact of combining profile and network data in solving record de-duplication problems. We also assess the influence of a range of prior distributions on the linkage structure, and explore the use of stochastic gradient Hamiltonian Monte Carlo methods as a faster alternative to obtain samples from the posterior distribution for network parameters. Our methodology is evaluated using the RLdata500 data, which is a popular dataset in the record linkage literature.
Journal: Journal of Applied Statistics
Pages: 197-215
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2118678
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2118678
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:197-215
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2122027_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Faryal Ibrar
Author-X-Name-First: Faryal
Author-X-Name-Last: Ibrar
Author-Name: Sajid Ali
Author-X-Name-First: Sajid
Author-X-Name-Last: Ali
Author-Name: Ismail Shah
Author-X-Name-First: Ismail
Author-X-Name-Last: Shah
Title: A comparison of single- and double-threshold ROC plots for mixture distributions
Abstract:
Receiver operating characteristic (ROC) analysis is commonly used in clinical settings to check the performance of a single threshold for distinguishing population-wise bimodally distributed test results. However, for population-wise three-modal distributed test results, single-threshold ROC (stROC) analysis shows poor discriminative performance. The purpose of this study is to use a double-threshold ROC (dtROC) analysis for three-modal distributed test results to provide better discriminative performance than stROC analysis. A dtROC plot is constructed by replacing the single threshold with a double threshold; the sensitivity and specificity coordinates are chosen to maximize sensitivity for a given specificity value. Besides a simulation study assuming a mixture of lognormal, Poisson, and Weibull distributions, a clinical application is examined through a secondary analysis of palpation test results of the C7 spinous process using the modified thorax–rib static technique. For the assumed mixture models, the discrimination performance of dtROC analysis outperforms stROC analysis (the area under the ROC curve (AUROC) increased from 0.436 to 0.983 for lognormally distributed test results, from 0.676 to 0.752 for the Poisson distribution, and from 0.674 to 0.804 for the Weibull distribution).
Journal: Journal of Applied Statistics
Pages: 256-278
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2122027
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2122027
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:256-278
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2125936_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Heba Soltan Mohamed
Author-X-Name-First: Heba Soltan
Author-X-Name-Last: Mohamed
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: R. Minkah
Author-X-Name-First: R.
Author-X-Name-Last: Minkah
Author-Name: Haitham M. Yousof
Author-X-Name-First: Haitham M.
Author-X-Name-Last: Yousof
Author-Name: Mohamed Ibrahim
Author-X-Name-First: Mohamed
Author-X-Name-Last: Ibrahim
Title: A size-of-loss model for the negatively skewed insurance claims data: applications, risk analysis using different methods and statistical forecasting
Abstract:
The future values of expected claims are very important for insurance companies seeking to avoid large losses under the uncertainty produced by future claims. In this paper, we define a new size-of-loss distribution for negatively skewed insurance claims data. Four key risk indicators are defined and analysed under four estimation methods: maximum likelihood, ordinary least squares, weighted least squares, and Anderson–Darling. The insurance claims data are modelled using many competing models, and a comprehensive comparison is performed under nine statistical tests. An autoregressive model is proposed to analyse the insurance claims data and estimate the future values of the expected claims. Value-at-risk estimation and the peaks-over-random-threshold mean-of-order-p methodology are also considered.
Journal: Journal of Applied Statistics
Pages: 348-369
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2125936
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2125936
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:348-369
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2125935_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: A. D. C. Nascimento
Author-X-Name-First: A. D. C.
Author-X-Name-Last: Nascimento
Author-Name: P. M. Almeida-Junior
Author-X-Name-First: P. M.
Author-X-Name-Last: Almeida-Junior
Author-Name: J. M. Vasconcelos
Author-X-Name-First: J. M.
Author-X-Name-Last: Vasconcelos
Author-Name: A. P. M. Borges-Junior
Author-X-Name-First: A. P. M.
Author-X-Name-Last: Borges-Junior
Title: K-Bessel regression model for speckled data
Abstract:
Synthetic aperture radar (SAR) provides an efficient way to monitor the Earth's surface, but the speckle noise that the SAR system generates when acquiring images makes it difficult to understand and interpret SAR intensity features. To analyse SAR images automatically, this paper presents a K-Bessel regression (KBR) model in which a function of the mean intensity response is explained by other features (or covariates) determined in parallel. Some mathematical properties of this regression are derived and discussed in the context of the physical origin of the SAR image. A maximum likelihood estimation procedure is developed and its performance is quantified by Monte Carlo experiments. An application to real data obtained from a polarimetric SAR image of San Francisco Bay is presented. Results show both that KBR-based processing is more informative than the unconditional approach for describing SAR intensity and that our proposal can outperform the normal and gamma regression models. Finally, it is shown that the KBR model is useful for reproducing the relief signal of one channel from the intensity values of the other.
Journal: Journal of Applied Statistics
Pages: 324-347
Issue: 2
Volume: 51
Year: 2024
Month: 01
X-DOI: 10.1080/02664763.2022.2125935
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2125935
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:2:p:324-347
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2166905_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: A. Skripnikov
Author-X-Name-First: A.
Author-X-Name-Last: Skripnikov
Title: Partially constrained group variable selection to adjust for complementary unit performance in American college football
Abstract:
Given the importance of accurate team rankings in American college football (CFB), due to heavy title and playoff implications, strides have been made to improve metrics for team performance evaluation, going from basic averages (e.g. points scored per game) to metrics that adjust for a team's strength of schedule. One aspect yet to be accounted for, however, is the ability of a team's offense and defense to complement one another, termed ‘complementary football’. American football is unique because the same team's offensive and defensive units typically consist of separate sets of players that do not share the field simultaneously, which tempts one to evaluate them independently. Yet some aspects of a team's defensive (offensive) performance may directly impact the complementary unit; for example, turnovers forced by the defense could lead to easier scoring chances for the offense. Our main goal is to identify the most consistently influential features of complementary football in a data-driven way, subsequently adjusting each team's offensive (defensive) performance for that of its complementary unit. To achieve this, for the 2009–2019 CFB seasons, we incorporate natural splines with group penalty approaches, conducting partially constrained optimization to guarantee full adjustment for strength of schedule and the home-field factor.
Journal: Journal of Applied Statistics
Pages: 606-620
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2023.2166905
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2166905
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:606-620
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2129044_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Pushkal Kumar
Author-X-Name-First: Pushkal
Author-X-Name-Last: Kumar
Author-Name: Manas Ranjan Tripathy
Author-X-Name-First: Manas Ranjan
Author-X-Name-Last: Tripathy
Author-Name: Somesh Kumar
Author-X-Name-First: Somesh
Author-X-Name-Last: Kumar
Title: Alternative classification rules for two inverse Gaussian populations with a common mean and order restricted scale-like parameters
Abstract:
The problem of classification into two inverse Gaussian populations with a common mean and ordered scale-like parameters is considered. Surprisingly, the maximum likelihood estimators (MLEs) of the associated model parameters have not been utilized for classification purposes. Note that the MLEs of the model parameters, including the MLE of the common mean, do not have closed-form expressions. In this paper, several classification rules are proposed that use the MLEs and some plug-in type estimators under order restricted scale-like parameters. In the sequel, the risk values of all the proposed estimators are compared numerically, which shows that the proposed plug-in type restricted MLE performs better than others, including the Graybill-Deal type estimator of the common mean. Further, the proposed classification rules are compared in terms of the expected probability of correct classification (EPC) numerically. It is seen that some of our proposed rules have better performance than the existing ones in most of the parameter space. Two real-life examples are considered for application purposes.
Journal: Journal of Applied Statistics
Pages: 407-429
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2129044
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2129044
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:407-429
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2138837_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Cees Diks
Author-X-Name-First: Cees
Author-X-Name-Last: Diks
Author-Name: Marcin Wolski
Author-X-Name-First: Marcin
Author-X-Name-Last: Wolski
Title: New nonparametric measures for instantaneous and Granger-causality tail co-dependence
Abstract:
We propose a new methodology to assess risk spillovers in a time-series framework. Firstly, we introduce an explicit nonparametric measure of cross-sectional conditional tail co-movement, which is intuitively comparable to the Conditional Value-at-Risk (CoVaR). We show that nonlinear CoVaR (NCoVaR) is able to capture even highly nonlinear dependence structures. Secondly, for the purpose of potential contagion analysis, we adapt the measure to be informative about the causality direction between the variables in the Granger causality sense. By showing that the natural estimators of the two metrics are U-statistics, we construct formal nonparametric tests for independence and Granger non-causality. Numerical simulations confirm that in common situations the nonparametric tests have better size and power properties than their parametric counterparts. The methodology is illustrated empirically by assessing risk transmissions between sovereigns and banking sectors in the euro area, which exhibited highly irregular co-movements between asset prices after the global financial crisis. The new measures seem to be less susceptible to these irregularities than their parametric analogues, providing a clearer overview of the underlying sovereign-bank risk feedback loops.
Journal: Journal of Applied Statistics
Pages: 515-533
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2138837
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2138837
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:515-533
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2137478_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Hani Samawi
Author-X-Name-First: Hani
Author-X-Name-Last: Samawi
Author-Name: Ding-Geng Chen
Author-X-Name-First: Ding-Geng
Author-X-Name-Last: Chen
Author-Name: Jingjing Yin
Author-X-Name-First: Jingjing
Author-X-Name-Last: Yin
Author-Name: Marwan Alsharman
Author-X-Name-First: Marwan
Author-X-Name-Last: Alsharman
Title: Performance of diagnostic tests based on continuous bivariate markers
Abstract:
In medical diagnostic research, it is customary to collect multiple continuous biomarker measures to improve the accuracy of diagnostic tests. A prevalent practice is to combine the measurements of these biomarkers into one single composite score. However, incorporating those biomarker measurements into a single score depends on the combination method and may lose vital information needed to make an effective and accurate decision. Furthermore, a diagnostic cut-off is required for such a combined score, and it is difficult to interpret in actual clinical practice. The paper extends the classical biomarkers' accuracy and predictive values from univariate to bivariate markers. Also, we develop a novel pseudo-measures system to maximize the vital information from multiple biomarkers. We specify these pseudo-and-or classifiers for the true positive rate, true negative rate, false-positive rate, and false-negative rate. We use them to redefine classical measures such as the Youden index, diagnostic odds ratio, likelihood ratios, and predictive values. We provide optimal cut-off point selection based on the modified Youden index, with numerical illustrations and real data analysis for this paper's newly developed pseudo measures.
Journal: Journal of Applied Statistics
Pages: 497-514
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2137478
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2137478
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:497-514
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2164562_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Archi Roy
Author-X-Name-First: Archi
Author-X-Name-Last: Roy
Author-Name: Soudeep Deb
Author-X-Name-First: Soudeep
Author-X-Name-Last: Deb
Author-Name: Divya Chakarwarti
Author-X-Name-First: Divya
Author-X-Name-Last: Chakarwarti
Title: Impact of COVID-19 on public social life and mental health: a statistical study of Google Trends data from the USA
Abstract:
The COVID-19 pandemic has caused a significant disruption in the social lives and mental health of people across the world. This study aims to assess this effect using internet search volume data. We categorize the widely searched keywords on the internet into several categories that are relevant in analyzing the public mental health status. Corresponding to each category of keywords, we conduct an appropriate statistical analysis to identify significant changes in the search pattern during the course of the pandemic. The binary segmentation method for changepoint detection, in combination with ARMA-GARCH models, is utilized in this analysis. It helps us detect how people's behavior changed in phases and whether the severity of the pandemic brought forth those shifts in behaviors. Interestingly, we find that rather than the severity of the outbreak, the long duration of the pandemic has affected the public mental health status more. The phases, however, align well with the so-called COVID-19 waves and are consistent for different aspects of social and mental health. We further observe that the results are typically similar across different states as well.
Journal: Journal of Applied Statistics
Pages: 581-605
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2164562
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2164562
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:581-605
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2134316_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Le Chen
Author-X-Name-First: Le
Author-X-Name-Last: Chen
Author-Name: Ruochen Tian
Author-X-Name-First: Ruochen
Author-X-Name-Last: Tian
Author-Name: Guanjie Chen
Author-X-Name-First: Guanjie
Author-X-Name-Last: Chen
Author-Name: Ao Yuan
Author-X-Name-First: Ao
Author-X-Name-Last: Yuan
Author-Name: Chuan-Ming Li
Author-X-Name-First: Chuan-Ming
Author-X-Name-Last: Li
Author-Name: Amy R. Bentley
Author-X-Name-First: Amy R.
Author-X-Name-Last: Bentley
Author-Name: Howard J. Hoffman
Author-X-Name-First: Howard J.
Author-X-Name-Last: Hoffman
Author-Name: Charles Rotimi
Author-X-Name-First: Charles
Author-X-Name-Last: Rotimi
Title: Semiparametric partial linear modeling of risk factors for ear infections: the Early Childhood Longitudinal Study
Abstract:
The Early Childhood Longitudinal Study–Kindergarten Class of 2010–2011 (ECLS-K:2011) ascertained timing of ear infections within age-specified intervals and parent's/caregiver's report of medically diagnosed hearing loss. In this nationally representative, school-based sample of children followed from kindergarten entry through fifth grade, academic performance in reading, mathematics, and science was assessed longitudinally. Prior investigations of this ECLS-K:2011 cohort showed that age has a non-linear, monotonically increasing functional relationship with academic performance. In light of this finding, a semiparametric partial linear model is proposed, in which the effect of age is modeled by an unknown monotonically increasing function along with other regression parameters. The parameters are estimated by a semiparametric maximum likelihood estimator. A test of a constant effect of age is also proposed. Simulation studies are conducted to evaluate the performance of the proposed method, as compared with the commonly used linear model; the former outperforms the latter based on several criteria. We then analyzed ECLS-K:2011 data to compare results of the semiparametric partial linear model estimation with those of classical linear regression models.
Journal: Journal of Applied Statistics
Pages: 430-450
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2134316
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2134316
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:430-450
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2136147_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Qi Lv
Author-X-Name-First: Qi
Author-X-Name-Last: Lv
Author-Name: Yajie Tian
Author-X-Name-First: Yajie
Author-X-Name-Last: Tian
Author-Name: Wenhao Gui
Author-X-Name-First: Wenhao
Author-X-Name-Last: Gui
Title: Statistical inference for Gompertz distribution under adaptive type-II progressive hybrid censoring
Abstract:
The Gompertz distribution is a significant and commonly used lifetime distribution, which plays an important role in reliability engineering. In this paper, we study the statistical inference of the Gompertz distribution based on adaptive Type-II hybrid progressive censoring schemes. From the frequentist perspective, we derive point estimates through the method of maximum likelihood estimation (MLE) and prove the existence of the MLE. Besides the MLE, we propose the stochastic EM algorithm to reduce complexity and simplify computation. We also apply bootstrap methods (Bootstrap-p and Bootstrap-t) to construct confidence intervals. From the Bayesian aspect, the Bayes estimates of the unknown parameters are evaluated by applying the MCMC method, and the average length and coverage rate of credible intervals are also computed. The Bayes inference is based on the squared error loss function and the LINEX loss function. Furthermore, a numerical simulation is conducted to assess the performance of the proposed methods. Finally, a real-life example is considered to illustrate the application and development of the inference methods. In summary, the Bayesian method seems to perform the best among all approaches, while the other approaches also present different advantages.
Journal: Journal of Applied Statistics
Pages: 451-480
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2136147
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2136147
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:451-480
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2137477_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Han Yu
Author-X-Name-First: Han
Author-X-Name-Last: Yu
Author-Name: Alan D. Hutson
Author-X-Name-First: Alan D.
Author-X-Name-Last: Hutson
Title: Inferential procedures based on the weighted Pearson correlation coefficient test statistic
Abstract:
In this note, we evaluated the type I error control of the commonly used t-test found in most statistical software packages for testing the hypothesis $H_0: \rho = 0$ vs. $H_1: \rho > 0$ based on the sample weighted Pearson correlation coefficient. We found the type I error rate is severely inflated in general cases, even under bivariate normality. To address this issue, we derived the large-sample variance of the weighted Pearson correlation. Based on this result, we proposed an asymptotic test and a set of studentized permutation tests. A comprehensive set of simulation studies with a range of sample sizes and a variety of underlying distributions was conducted. The studentized permutation test based on Fisher's Z statistic was shown to robustly control the type I error even in small-sample and non-normality settings. The method was demonstrated with example data on country-level preterm birth rates.
Journal: Journal of Applied Statistics
Pages: 481-496
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2137477
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2137477
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:481-496
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2142537_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Ting Zeng
Author-X-Name-First: Ting
Author-X-Name-Last: Zeng
Author-Name: Solomon W. Harrar
Author-X-Name-First: Solomon W.
Author-X-Name-Last: Harrar
Title: Robust tests for multivariate repeated measures with small samples
Abstract:
Multivariate repeated measures data naturally arise in clinical trials and other fields such as biomedical science, public health, agriculture, social science and so on. For data of this type, the classical approach is to conduct multivariate analysis of variance (MANOVA) based on Wilks' Lambda and other multivariate statistics, which require the assumptions of multivariate normality and homogeneity of within-cell covariance matrices. However, data being analyzed nowadays show marked departure from multivariate normality and homogeneity. This paper proposes a finite-sample test by modifying the sums of squares matrices to make them insensitive to the heterogeneity in MANOVA. The proposed test is invariant to affine transformation and robust against nonnormality. The proposed method can be used in various experimental designs, for example, factorial design and crossover design. Under various simulation settings, the proposed method outperforms the classical Doubly Multivariate Model and Multivariate Mixed Model proposed elsewhere, especially for unbalanced sample sizes with heteroscedasticity. The applications of the proposed method are illustrated with ophthalmology data in factorial and crossover designs. The proposed method successfully identified and validated a significant main effect and demonstrated that univariate analysis could be oversensitive to small but clinically unimportant interactions.
Journal: Journal of Applied Statistics
Pages: 555-580
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2142537
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2142537
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:555-580
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2140332_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20231214T103247 git hash: d7a2cb0857
Author-Name: Helmi Shat
Author-X-Name-First: Helmi
Author-X-Name-Last: Shat
Title: Optimal design of stress levels in accelerated degradation testing for multivariate linear degradation models
Abstract:
In recent years, increasing attention has been paid to accelerated degradation testing (ADT) in order to accurately estimate the reliability properties of systems that are designed to work properly for years or even decades. In this paper, we propose optimal experimental designs for repeated-measures ADTs with competing failure modes that correspond to multiple response components. The marginal degradation paths are expressed using linear mixed effects models. The optimal design is obtained by minimizing the asymptotic variance of the estimator of some quantile of the failure time distribution at the normal use conditions. Numerical examples are introduced to demonstrate the robustness of the proposed optimal designs and to compare their efficiency with standard experimental designs.
Journal: Journal of Applied Statistics
Pages: 534-554
Issue: 3
Volume: 51
Year: 2024
Month: 02
X-DOI: 10.1080/02664763.2022.2140332
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2140332
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:3:p:534-554
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2153812_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Fernando Ferraz do Nascimento
Author-X-Name-First: Fernando Ferraz do
Author-X-Name-Last: Nascimento
Author-Name: Aline Raquel Assunção Nunes
Author-X-Name-First: Aline Raquel
Author-X-Name-Last: Assunção Nunes
Title: Regression models for the full distribution to exceedance data
Abstract:
The list of occurrences linked to significant climate change has grown in recent decades. These changes can be influenced by a set of covariates, such as temperature, location and period of the year. Analyzing the relation among the elements and factors that influence the behavior of such events is extremely important for decision-making in order to minimize damages and losses. Exceedance analysis uses the tail of the distribution based on Extreme Value Theory (EVT). Extensions of these models have been proposed in the literature, such as regression models for the tail parameters and a parametric or semi-parametric distribution for the part that comes before the tail (well known as the bulk distribution). This work presents a new extension of the exceedance model, in which the parameters of the bulk distribution capture the effect of covariates such as location and seasonality. We considered a Bayesian approach in the inference procedure. The estimation was done using Markov chain Monte Carlo (MCMC) methods. Application results for modeling maximum and minimum temperature data showed an efficient estimation of extreme quantiles and a predictive advantage compared to models previously used in the literature.
Journal: Journal of Applied Statistics
Pages: 701-720
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2153812
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2153812
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:701-720
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2145272_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Zihang Zhong
Author-X-Name-First: Zihang
Author-X-Name-Last: Zhong
Author-Name: Min Yang
Author-X-Name-First: Min
Author-X-Name-Last: Yang
Author-Name: Senmiao Ni
Author-X-Name-First: Senmiao
Author-X-Name-Last: Ni
Author-Name: Lixin Cai
Author-X-Name-First: Lixin
Author-X-Name-Last: Cai
Author-Name: Jingwei Wu
Author-X-Name-First: Jingwei
Author-X-Name-Last: Wu
Author-Name: Jianling Bai
Author-X-Name-First: Jianling
Author-X-Name-Last: Bai
Author-Name: Hao Yu
Author-X-Name-First: Hao
Author-X-Name-Last: Yu
Title: The heterogeneity effect of surveillance intervals on progression free survival
Abstract:
Progression-free survival (PFS) is an increasingly important surrogate endpoint in cancer clinical trials. However, the true time of progression is typically unknown if the evaluation of progression status is only scheduled at given surveillance intervals. In addition, comparison between treatment arms under different surveillance schema is not uncommon. Our aim is to explore whether the heterogeneity of the surveillance intervals may interfere with the validity of the conclusion of efficacy based on PFS, and the extent to which the variation would bias the results. We conduct comprehensive simulation studies to explore the aforementioned goals in a two-arm randomized control trial. We introduce three steps to simulate survival data with predefined surveillance intervals under different censoring rate considerations. We report the estimated hazard ratios and examine false positive rate, power and bias under different surveillance intervals, given different baseline median PFS, hazard ratio and censoring rate settings. Results show that larger heterogeneous lengths of surveillance intervals lead to a higher false positive rate and overestimated power, and the effect of the heterogeneous surveillance intervals may depend upon both the life expectancy of the tumor prognoses and the censoring proportion of the survival data. We also demonstrate such a heterogeneity effect of surveillance intervals on PFS in a phase III metastatic colorectal cancer trial. In our opinion, adherence to consistent surveillance intervals should be favored in designing comparative trials. Otherwise, the heterogeneity needs to be appropriately taken into account when analyzing the data.
Journal: Journal of Applied Statistics
Pages: 646-663
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2145272
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2145272
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:646-663
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2156485_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Emiliano Geneyro
Author-X-Name-First: Emiliano
Author-X-Name-Last: Geneyro
Author-Name: Gabriel Núñez-Antonio
Author-X-Name-First: Gabriel
Author-X-Name-Last: Núñez-Antonio
Title: A Bayesian nonparametric model for bounded directional data on the positive orthant of the unit sphere
Abstract:
Directional data appear in several branches of research. In some cases, these directional variables are only defined on subsets of the K-dimensional unit sphere. For example, in some applications, angles measured as responses are limited to the positive orthant. Analysis on subsets of the K-dimensional unit sphere is challenging, and nowadays there are not many proposals that discuss this topic. Thus, from a methodological point of view, it is important to have probability distributions defined on bounded subsets of the K-dimensional unit sphere. Specifically, in this paper, we introduce a nonparametric Bayesian model to describe directional variables restricted to the first orthant. This model is based on a Dirichlet process mixture model with multivariate projected Gamma densities as kernel distributions. We show how to carry out inference for the proposed model based on a slice sampling scheme. The proposed methodology is illustrated using simulated data sets as well as a real data set.
Journal: Journal of Applied Statistics
Pages: 721-739
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2156485
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2156485
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:721-739
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2151989_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Padma Sharma
Author-X-Name-First: Padma
Author-X-Name-Last: Sharma
Title: Selection of random coefficients in ordered response models: a framework to detect heterogeneity in household surveys
Abstract:
This paper develops a Bayesian method to detect heterogeneity in the relationship between covariates and the outcome in models with ordered responses. To this end, we construct an efficient Markov chain Monte Carlo algorithm for a hierarchical Bayesian model that selects random coefficients in ordered models. This method extends an approach for selecting random coefficients in linear mixed models into the ordered setting by adding two enhancements that are relevant to the latter category of models. First, we construct steps to efficiently estimate cut-points by addressing identification and ordering constraints. Second, we develop a framework to evaluate marginal effects that combine the fixed and random effects of each covariate. The marginal effects additionally allow for model uncertainty by averaging across models visited by the selection algorithm. Simulation studies demonstrate that this method detects random effects when they are present, estimates parameters accurately and efficiently samples from the posterior with low autocorrelations across successive draws. On applying this method on data from the survey of consumer expectations, we find clear support for the presence of household-level heterogeneity in relationships between demographic variables, and current as well as expected financial conditions.
Journal: Journal of Applied Statistics
Pages: 682-700
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2151989
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2151989
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:682-700
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2178641_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: J. de Uña-Álvarez
Author-X-Name-First: J.
Author-X-Name-Last: de Uña-Álvarez
Author-Name: A. I. Martínez-Senra
Author-X-Name-First: A. I.
Author-X-Name-Last: Martínez-Senra
Author-Name: M. S. Otero-Giráldez
Author-X-Name-First: M. S.
Author-X-Name-Last: Otero-Giráldez
Author-Name: M. A. Quintás
Author-X-Name-First: M. A.
Author-X-Name-Last: Quintás
Title: Cox regression with doubly truncated responses and time-dependent covariates: the impact of innovation on firm survival
Abstract:
The creation of new firms is an important incentive for the economic growth of a country, since it generates employment, encourages competition, and promotes innovation. In this work, we investigate the survival of Spanish firms which were created in 2001 or later and closed down between 2004 and 2012. The information was gathered from the Technological Innovation Panel (PITEC), a survey focused on technological innovation in Spanish firms. In particular, a Cox regression model with time-dependent covariates was used in order to identify and quantify the determinants of the risk of exit for a firm. The selection bias due to the interval sampling of the firms was corrected by using methods for doubly truncated lifetimes. Interestingly, it is seen how the correction for the selection bias changes both the size and the statistical significance of the effects provided by standard Cox regression.
Journal: Journal of Applied Statistics
Pages: 780-792
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2023.2178641
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2178641
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:780-792
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2143484_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Wentao Ge
Author-X-Name-First: Wentao
Author-X-Name-Last: Ge
Author-Name: Junfeng Shang
Author-X-Name-First: Junfeng
Author-X-Name-Last: Shang
Title: Bootstrap-adjusted quasi-likelihood information criteria for mixed model selection
Abstract:
We propose two model selection criteria relying on the bootstrap approach, denoted by QAICb1 and QAICb2, in the framework of linear mixed models. Similar to the justification of the Akaike Information Criterion (AIC), the proposed QAICb1 and QAICb2 are proven to be asymptotically unbiased estimators of the Kullback–Leibler discrepancy between a candidate model and the true model. However, they are defined on the quasi-likelihood function instead of the likelihood and are proven to be asymptotically equivalent. The proposed selection criteria are constructed from the quasi-likelihood of a candidate model and a bias estimation term in which the bootstrap method is adopted to improve the estimation of the bias caused by using the candidate model to estimate the true model. Simulations across a variety of mixed model settings are conducted to demonstrate that the proposed selection criteria outperform some other existing model selection criteria in selecting the true model. Generalized estimating equations (GEE) are utilized to calculate QAICb1 and QAICb2 in the simulations. The effectiveness of the proposed selection criteria is also demonstrated in an application of Parkinson's Progression Markers Initiative (PPMI) data.
Journal: Journal of Applied Statistics
Pages: 621-645
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2143484
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2143484
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:621-645
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2161488_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Lin Luo
Author-X-Name-First: Lin
Author-X-Name-Last: Luo
Author-Name: Jinzhao Yu
Author-X-Name-First: Jinzhao
Author-X-Name-Last: Yu
Author-Name: Hui Zhao
Author-X-Name-First: Hui
Author-X-Name-Last: Zhao
Title: The sparse estimation of the semiparametric linear transformation model with dependent current status data
Abstract:
In this paper, we study sparse estimation under semiparametric linear transformation models for current status data, also called type I interval-censored data. In this problem, the failure time of interest may be dependent on the censoring time, and the association parameter between them is left unspecified. To address this, we employ a copula model to describe the dependence between them and a two-stage estimation procedure to estimate both the association parameter and the regression parameters. In addition, we propose a penalized maximum likelihood estimation procedure based on broken adaptive ridge regression, and Bernstein polynomials are used to approximate the nonparametric functions involved. The oracle property of the proposed method is established, and the numerical studies suggest that the method works well in practical situations. Finally, the method is applied to an Alzheimer's disease study that motivated this investigation.
Journal: Journal of Applied Statistics
Pages: 759-779
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2161488
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2161488
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:759-779
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2146661_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Roberto Vila
Author-X-Name-First: Roberto
Author-X-Name-Last: Vila
Author-Name: Lucas Alfaia
Author-X-Name-First: Lucas
Author-X-Name-Last: Alfaia
Author-Name: André F.B. Menezes
Author-X-Name-First: André F.B.
Author-X-Name-Last: Menezes
Author-Name: Mehmet N. Çankaya
Author-X-Name-First: Mehmet N.
Author-X-Name-Last: Çankaya
Author-Name: Marcelo Bourguignon
Author-X-Name-First: Marcelo
Author-X-Name-Last: Bourguignon
Title: A model for bimodal rates and proportions
Abstract:
The beta model is the most important distribution for fitting data on the unit interval. However, the beta distribution is not suitable for modeling bimodal unit interval data. In this paper, we propose a bimodal beta distribution constructed using an approach based on the alpha-skew-normal model. We discuss several properties of this distribution, such as bimodality, real moments, entropies and identifiability. Furthermore, we propose a new regression model based on the proposed distribution and discuss its residuals. Estimation is performed by maximum likelihood. A Monte Carlo experiment is conducted to evaluate the performance of these estimators in finite samples, with a discussion of the results. An application is provided to show the modelling competence of the proposed distribution when the data sets show bimodality.
Journal: Journal of Applied Statistics
Pages: 664-681
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2146661
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2146661
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:664-681
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2192445_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Katherine Vorpe
Author-X-Name-First: Katherine
Author-X-Name-Last: Vorpe
Author-Name: Sierra Hessinger
Author-X-Name-First: Sierra
Author-X-Name-Last: Hessinger
Author-Name: Rebekah Poth
Author-X-Name-First: Rebekah
Author-X-Name-Last: Poth
Author-Name: Tatjana Miljkovic
Author-X-Name-First: Tatjana
Author-X-Name-Last: Miljkovic
Title: Clustering regions with dynamic time warping to model obesity prevalence disparities in the United States
Abstract:
Current methods for clustering adult obesity prevalence by state focus on creating a single map of obesity prevalence for a given year in the United States. Comparing these maps for different years may limit our understanding of the progression of state and regional obesity prevalence over time for the purpose of developing targeted regional health policies. In this application note, we adopt the non-parametric Dynamic Time Warping method for clustering longitudinal time series of obesity prevalence by state. This method captures the lead and lag relationship between the time series as part of the temporal alignment, allowing us to produce a single map that captures the regional and temporal clusters of obesity prevalence from 1990 to 2019 in the United States. We identify six regions of obesity prevalence in the United States and forecast future estimates of obesity prevalence based on ARIMA models.
Journal: Journal of Applied Statistics
Pages: 793-807
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2023.2192445
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2192445
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:793-807
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2159339_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: D. Scaldelai
Author-X-Name-First: D.
Author-X-Name-Last: Scaldelai
Author-Name: L. C. Matioli
Author-X-Name-First: L. C.
Author-X-Name-Last: Matioli
Author-Name: S. R. Santos
Author-X-Name-First: S. R.
Author-X-Name-Last: Santos
Title: TreeKDE: clustering multivariate data based on decision tree and using one-dimensional kernel density estimation
Abstract:
In this paper, we present an algorithm for clustering multidimensional data, which we named TreeKDE. It is based on a decision tree structure associated with the optimization of the one-dimensional kernel density estimator function constructed from the orthogonal projections of the data on the coordinate axes. Among the main features of the proposed algorithm, we highlight the automatic determination of the number of clusters and their insertion in a rectangular region. Comparative numerical experiments are presented to illustrate the performance of the proposed algorithm, and the results indicate that TreeKDE is efficient and competitive when compared to other algorithms from the literature. Features such as simplicity and efficiency make the proposed algorithm attractive and promising; it can serve as a basis for further improvement and for the development of new clustering algorithms based on the association between decision trees and kernel density estimators.
Journal: Journal of Applied Statistics
Pages: 740-758
Issue: 4
Volume: 51
Year: 2024
Month: 03
X-DOI: 10.1080/02664763.2022.2159339
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2159339
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:4:p:740-758
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2170991_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Yifan Li
Author-X-Name-First: Yifan
Author-X-Name-Last: Li
Author-Name: Chunjie Wu
Author-X-Name-First: Chunjie
Author-X-Name-Last: Wu
Author-Name: Zhijun Wang
Author-X-Name-First: Zhijun
Author-X-Name-Last: Wang
Author-Name: Zhiming Hu
Author-X-Name-First: Zhiming
Author-X-Name-Last: Hu
Title: Aggregated parameter update schemes for monitoring binary profiles
Abstract:
Profile monitoring is one of the most important topics in statistical process control. Traditional self-starting profile monitoring schemes generally use all historical observations to estimate parameters. Because of the rapid increase in the complexity of modern statistical processes, practitioners often need to deal with massive datasets in process monitoring. However, when the observations of each period are of large sample size and the computation is of high complexity, the traditional method is not economical and urgently needs a parameter update strategy. Under the framework of binary profile monitoring, this paper proposes a novel recursive update strategy based on the aggregated estimation equation (AEE) for massive datasets and designs a self-starting control chart accordingly. Numerical simulation verifies that the proposed method performs better in parameter estimation and process monitoring. In addition, we give the asymptotic property of the proposed monitoring statistic and illustrate our method's superiority with a real-data example.
Journal: Journal of Applied Statistics
Pages: 935-957
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2170991
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2170991
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:935-957
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2207786_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Achmad Choiruddin
Author-X-Name-First: Achmad
Author-X-Name-Last: Choiruddin
Author-Name: Tabita Yuni Susanto
Author-X-Name-First: Tabita
Author-X-Name-Last: Yuni Susanto
Author-Name: Ahmad Husain
Author-X-Name-First: Ahmad
Author-X-Name-Last: Husain
Author-Name: Yuniar Mega Kartikasari
Author-X-Name-First: Yuniar
Author-X-Name-Last: Mega Kartikasari
Title: kppmenet: combining the kppm and elastic net regularization for inhomogeneous Cox point process with correlated covariates
Abstract:
The kppm is a standard procedure to estimate the parameters of the inhomogeneous Cox point process. However, the procedure cannot handle the problem when the models involve correlated covariates. In this study, we develop the kppmenet, a modified version of the kppm, for the inhomogeneous Cox point process involving correlated covariates by considering elastic net regularization. We compare the methodology in a simulation study and apply it to model the distribution of major shallow earthquakes in Sumatra, Indonesia. We conclude that the kppmenet outperforms the kppm when correlated covariates are involved.
Journal: Journal of Applied Statistics
Pages: 993-1006
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2207786
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2207786
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:993-1006
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2164561_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Osafu Augustine Egbon
Author-X-Name-First: Osafu Augustine
Author-X-Name-Last: Egbon
Author-Name: Ezra Gayawan
Author-X-Name-First: Ezra
Author-X-Name-Last: Gayawan
Title: Modeling the spatial patterns of antenatal care utilization in Nigeria with inference based on Pólya-Gamma mixtures
Abstract:
Despite the vast advantages of making antenatal care visits, the service utilization among pregnant women in Nigeria is suboptimal. A five-year monitoring estimate indicated that about 24% of the women who had live births made no visit. The non-utilization induced excessive zeroes in the outcome of interest. Thus, this study adopted a zero-inflated negative binomial model within a Bayesian framework to identify the spatial pattern and the key factors hindering antenatal care utilization in Nigeria. We overcome the intractability associated with posterior inference by adopting a Pólya-Gamma data-augmentation technique to facilitate inference. The Gibbs sampling algorithm was used to draw samples from the joint posterior distribution. Results revealed that type of place of residence, maternal level of education, access to mass media, household work index, and woman's working status have significant effects on the use of antenatal care services. Findings identified substantial state-level spatial disparity in antenatal care utilization across the country. Cost-effective techniques to achieve an acceptable frequency of utilization include the creation of a community-specific awareness to emphasize the importance and benefits of the appropriate utilization. Special consideration should be given to older pregnant women, women in poor antenatal utilization states, and women residing in poor road network regions.
Journal: Journal of Applied Statistics
Pages: 866-890
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2022.2164561
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2164561
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:866-890
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2229973_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Julio C. S. Vasconcelos
Author-X-Name-First: Julio C. S.
Author-X-Name-Last: Vasconcelos
Author-Name: Thiago da Costa Travassos
Author-X-Name-First: Thiago da Costa
Author-X-Name-Last: Travassos
Author-Name: Edwin M. M. Ortega
Author-X-Name-First: Edwin M. M.
Author-X-Name-Last: Ortega
Author-Name: Gauss M. Cordeiro
Author-X-Name-First: Gauss M.
Author-X-Name-Last: Cordeiro
Author-Name: Leonardo Oliveira Reis
Author-X-Name-First: Leonardo
Author-X-Name-Last: Oliveira Reis
Title: Alternative statistical modeling for radical prostatectomy data
Abstract:
Several statistical models have been proposed in recent years, among them is the semiparametric regression. In medicine, there are several situations in which it is impracticable to consider a linear regression for statistical modeling, especially when the data contain explanatory variables that present a nonlinear relationship with the response variable. Another common situation is when the response variable does not have a unimodal shape, and it is not possible to adopt distributions belonging to the symmetric or asymmetric classes. In this context, a semiparametric heteroskedastic regression is proposed based on an extension of the normal distribution. Then, we show the usefulness of this model to analyze the cost of prostate cancer surgery. The predictor variables refer to two groups of patients such that one group receives a multimodal local anesthetic solution (Preemptive Target Anesthetic Solution) and the second group is treated with neuraxial blockade (spinal anesthesia/traditional standard). The other relevant predictor variables are also evaluated, thus allowing for the in-depth interpretation of the predictor variables with a nonlinear effect on the dependent variable cost. The penalized maximum likelihood method is adopted to estimate the model parameters. The new regression is a useful statistical tool for analyzing medical data.
Journal: Journal of Applied Statistics
Pages: 1007-1022
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2229973
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2229973
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:1007-1022
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2172143_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Pedro Henrique T. O. Sousa
Author-X-Name-First: Pedro Henrique T. O.
Author-X-Name-Last: Sousa
Author-Name: Camila P. E. de Souza
Author-X-Name-First: Camila P. E.
Author-X-Name-Last: de Souza
Author-Name: Ronaldo Dias
Author-X-Name-First: Ronaldo
Author-X-Name-Last: Dias
Title: Bayesian adaptive selection of basis functions for functional data representation
Abstract:
Considering the context of functional data analysis, we developed and applied a new Bayesian approach via the Gibbs sampler to select basis functions for a finite representation of functional data. The proposed methodology uses Bernoulli latent variables to assign zero to some of the basis function coefficients with a positive probability. This procedure allows for an adaptive basis selection since it can determine the number of bases and which ones should be selected to represent functional data. Moreover, the proposed procedure measures the uncertainty of the selection process and can be applied to multiple curves simultaneously. The methodology developed can deal with observed curves that may differ due to experimental error and random individual differences between subjects, which one can observe in a real dataset application involving daily numbers of COVID-19 cases in Brazil. Simulation studies show the main properties of the proposed method, such as its accuracy in estimating the coefficients and the strength of the procedure to find the true set of basis functions. Despite having been developed in the context of functional data analysis, we also compared the proposed model via simulation with the well-established LASSO and Bayesian LASSO, which are methods developed for non-functional data.
Journal: Journal of Applied Statistics
Pages: 958-992
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2172143
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2172143
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:958-992
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2170336_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Muhammad Imran
Author-X-Name-First: Muhammad
Author-X-Name-Last: Imran
Author-Name: Jinsheng Sun
Author-X-Name-First: Jinsheng
Author-X-Name-Last: Sun
Author-Name: Xuelong Hu
Author-X-Name-First: Xuelong
Author-X-Name-Last: Hu
Author-Name: Fatima Sehar Zaidi
Author-X-Name-First: Fatima Sehar
Author-X-Name-Last: Zaidi
Author-Name: Anan Tang
Author-X-Name-First: Anan
Author-X-Name-Last: Tang
Title: Investigating zero-state and steady-state performance of MEWMA-CoDa control chart using variable sampling interval
Abstract:
Traditional process monitoring control charts (CCs) focused on sampling methods using fixed sampling intervals (FSIs). The variable sampling intervals (VSI) scheme is receiving increasing attention, in which the sampling interval (SI) length varies according to the process monitoring statistics. A shorter SI is used when the process quality indicates the possibility of an out-of-control (OOC) situation; otherwise, a longer SI is preferred. The VSI multivariate exponentially weighted moving average for compositional data (VSI-MEWMA CoDa) CC, based on a coordinate representation using the isometric log-ratio (ilr) transformation, is proposed in this study. A methodology is proposed to obtain the optimal parameters by considering the zero-state (ZS) average time to signal (ZATS) and the steady-state (SS) average time to signal (SATS). The statistical performance of the proposed CC is evaluated based on a continuous-time Markov chain (CTMC) method for both the ZS and SS cases using a fixed value of the in-control (IC) ATS0. Simulation results demonstrate that the VSI-MEWMA CoDa CC significantly decreases the OOC average time to signal (ATS) compared with the FSI MEWMA CoDa CC. Moreover, it is found that the number of variables (d) has a negative impact on the ATS of the VSI-MEWMA CoDa CC, and the subgroup size (n) has a mildly positive impact on the ATS of the VSI-MEWMA CoDa CC. At the same time, the SATS of the VSI-MEWMA CoDa CC is less than the ZATS of the VSI-MEWMA CoDa CC for all values of n and d. The proposed VSI-MEWMA CoDa CC under steady state performs effectively compared to its competitors, such as the FSI-MEWMA CoDa CC, the VSI-T2 CoDa CC and the FSI-T2 CoDa CC. An example of an industrial problem from a plant in Europe is also given to study the statistical significance of the VSI-MEWMA CoDa CC.
Journal: Journal of Applied Statistics
Pages: 913-934
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2170336
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2170336
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:913-934
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2163229_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: F. Prataviera
Author-X-Name-First: F.
Author-X-Name-Last: Prataviera
Author-Name: E. M. Hashimoto
Author-X-Name-First: E. M.
Author-X-Name-Last: Hashimoto
Author-Name: E. M. M. Ortega
Author-X-Name-First: E. M. M.
Author-X-Name-Last: Ortega
Author-Name: G. M. Cordeiro
Author-X-Name-First: G. M.
Author-X-Name-Last: Cordeiro
Author-Name: V. G. Cancho
Author-X-Name-First: V. G.
Author-X-Name-Last: Cancho
Author-Name: R. Vila
Author-X-Name-First: R.
Author-X-Name-Last: Vila
Title: A new flexible regression model with application to the recovery probability of Covid-19 patients
Abstract:
The aim of this study is to propose a generalized odd log-logistic Maxwell mixture model to analyze the effect of gender and age groups on the lifetimes and recovery probabilities of Chinese individuals with COVID-19. We present new properties of the generalized Maxwell model. The coefficients of the regression and the recovered fraction are estimated by maximum likelihood and Bayesian methods. Further, some simulation studies are done to compare the regressions under different scenarios. Model-checking techniques based on quantile residuals are addressed. The estimated survival functions for the patients are reported by age range and sex. The simulation study showed that the mean squared errors decay toward zero and the average estimates converge to the true parameters as the sample size increases. According to the fitted model, there is a significant difference only in the age group on the lifetime of individuals with COVID-19. Women have a higher probability of recovering than men, and individuals aged ≥60 years have lower recovery probabilities than those aged <60 years. The findings suggest that the proposed model could be a good alternative for analyzing censored lifetimes of individuals with COVID-19.
Journal: Journal of Applied Statistics
Pages: 826-844
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2022.2163229
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2163229
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:826-844
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2163379_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Abdul Salam
Author-X-Name-First: Abdul
Author-X-Name-Last: Salam
Author-Name: Marco Grzegorczyk
Author-X-Name-First: Marco
Author-X-Name-Last: Grzegorczyk
Title: Learning the structure of the mTOR protein signaling pathway from protein phosphorylation data
Abstract:
Statistical learning of the structures of cellular networks, such as protein signaling pathways, is a topical research field in computational systems biology. To get the most information out of experimental data, it is often required to develop a tailored statistical approach rather than applying one of the off-the-shelf network reconstruction methods. The focus of this paper is on learning the structure of the mTOR protein signaling pathway from immunoblotting protein phosphorylation data. Under two experimental conditions eleven phosphorylation sites of eight key proteins of the mTOR pathway were measured at ten non-equidistant time points. For the statistical analysis we propose a new advanced hierarchically coupled non-homogeneous dynamic Bayesian network (NH-DBN) model, and we consider various data imputation methods for dealing with non-equidistant temporal observations. Because of the absence of a true gold standard network, we propose to use predictive probabilities in combination with a leave-one-out cross validation strategy to objectively cross-compare the accuracies of different NH-DBN models and data imputation methods. Finally, we employ the best combination of model and data imputation method for predicting the structure of the mTOR protein signaling pathway.
Journal: Journal of Applied Statistics
Pages: 845-865
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2022.2163379
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2163379
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:845-865
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2162863_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Dong Han
Author-X-Name-First: Dong
Author-X-Name-Last: Han
Author-Name: Fugee Tsung
Author-X-Name-First: Fugee
Author-X-Name-Last: Tsung
Author-Name: Lei Qiao
Author-X-Name-First: Lei
Author-X-Name-Last: Qiao
Title: The optimal CUSUM control chart with a dynamic non-random control limit and a given sampling strategy for small samples sequence
Abstract:
This article proposes a performance measure to evaluate the detection performance of a control chart with a given sampling strategy for finite or small sample sequences and proves that the CUSUM control chart with a dynamic non-random control limit and a given sampling strategy can be optimal under this measure. Numerical simulations and real earthquake data are provided to illustrate that, for different sampling strategies, the CUSUM chart will have different monitoring performance in change-point detection. Among the six sampling strategies that take only a part of the samples, the numerical comparison results show that the uniform sampling strategy (uniformly dispersed sampling strategy) has the best monitoring effect.
Journal: Journal of Applied Statistics
Pages: 809-825
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2022.2162863
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2162863
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:809-825
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2164759_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Chathura Siriwardhana
Author-X-Name-First: Chathura
Author-X-Name-Last: Siriwardhana
Author-Name: K.B. Kulasekera
Author-X-Name-First: K.B.
Author-X-Name-Last: Kulasekera
Author-Name: Somnath Datta
Author-X-Name-First: Somnath
Author-X-Name-Last: Datta
Title: Selection of the optimal personalized treatment from multiple treatments with right-censored multivariate outcome measures
Abstract:
We propose a novel personalized concept for optimal treatment selection in a situation where the response is a multivariate vector that could contain right-censored variables such as survival time. The proposed method can be applied with any number of treatments and outcome variables, under a broad set of models. Following a working semiparametric Single Index Model that relates covariates and responses, we first define a patient-specific composite score, constructed from individual covariates. We then estimate the conditional means of each response, given the patient score, corresponding to each treatment, using a nonparametric smooth estimator. Next, a rank aggregation technique is applied to estimate an ordering of treatments based on ranked lists of treatment performance measures given by the conditional means. We handle the right-censored data by incorporating inverse probability of censoring weighting in the corresponding estimators. An empirical study illustrates the performance of the proposed method in finite sample problems. To show the applicability of the proposed procedure to real data, we also present a data analysis using HIV clinical trial data, which contained a right-censored survival event as one of the endpoints.
Journal: Journal of Applied Statistics
Pages: 891-912
Issue: 5
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2022.2164759
File-URL: http://hdl.handle.net/10.1080/02664763.2022.2164759
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:5:p:891-912
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2176470_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Jinwen Liang
Author-X-Name-First: Jinwen
Author-X-Name-Last: Liang
Author-Name: Maozai Tian
Author-X-Name-First: Maozai
Author-X-Name-Last: Tian
Title: Imputed mean tensor regression for near-sited spatial temporal data
Abstract:
Modern spatial temporal data are often collected from sensor networks. Missing data problems are common to this kind of data, and making accurate imputation is important for many applications. In the unsupervised setting, one technique is to minimize the rank of a tensor or matrix. If we add related covariates, can we get more accurate imputation results? To address this, we transform the original sensor×time measurements into high order tensors by adding additional temporal dimensions and then integrate tensor regression with tensor completion using a nuclear norm penalty. One advantage is that we can simultaneously estimate parameters and impute missing values owing to the clear spatial consistency of near-sited spatial-temporal data. The proposed method does not assume a missing-data mechanism for the response. Theoretical properties of the proposed estimator are investigated. Simulation studies and real data analysis are conducted to verify the efficiency of the estimation procedure.
Journal: Journal of Applied Statistics
Pages: 1057-1075
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2176470
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2176470
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1057-1075
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2178642_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Bo Peng
Author-X-Name-First: Bo
Author-X-Name-Last: Peng
Author-Name: Kai Yang
Author-X-Name-First: Kai
Author-X-Name-Last: Yang
Author-Name: Xiaogang Dong
Author-X-Name-First: Xiaogang
Author-X-Name-Last: Dong
Title: Variable selection for quantile autoregressive model: Bayesian methods versus classical methods
Abstract:
In this article, we introduce three Bayesian variable selection methods for the quantile autoregressive model with explanatory variables. Gibbs sampling algorithms are developed for each method by setting different priors. The numerical simulations suggest that the Gibbs sampling algorithms converge fast and that the Bayesian variable selection methods are reliable. A real example is given to analyze the relationship between the count of total rental bikes and five explanatory variables. Both the simulations and the data example indicate that the proposed methods are feasible, reliable, and appropriate for analyzing the Bike Sharing data set.
Journal: Journal of Applied Statistics
Pages: 1098-1130
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2178642
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2178642
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1098-1130
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2175799_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Zhou Yu
Author-X-Name-First: Zhou
Author-X-Name-Last: Yu
Author-Name: Jie Yang
Author-X-Name-First: Jie
Author-X-Name-Last: Yang
Author-Name: Hsin-Hsiung Huang
Author-X-Name-First: Hsin-Hsiung
Author-X-Name-Last: Huang
Title: Smoothing regression and impact measures for accidents of traffic flows
Abstract:
Traffic pattern identification and accident evaluation are essential for improving traffic planning, road safety, and traffic management. In this paper, we establish classification and regression models to characterize the relationship between traffic flows and different time points and identify different patterns of traffic flows by a negative binomial model with smoothing splines. It provides mean response curves and Bayesian credible bands for traffic flows, a single index, and the log-likelihood difference for traffic flow pattern recognition. We further propose an impact measure for evaluating the influence of accidents on traffic flows based on the fitted negative binomial model. The proposed method has been successfully applied to real-world traffic flows, and it can be used to improve traffic management.
Journal: Journal of Applied Statistics
Pages: 1041-1056
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2175799
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2175799
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1041-1056
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2233143_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: David Angeles
Author-X-Name-First: David
Author-X-Name-Last: Angeles
Author-Name: Sebastian Kurtek
Author-X-Name-First: Sebastian
Author-X-Name-Last: Kurtek
Author-Name: Elizabeth Klein
Author-X-Name-First: Elizabeth
Author-X-Name-Last: Klein
Author-Name: Marielle Brinkman
Author-X-Name-First: Marielle
Author-X-Name-Last: Brinkman
Author-Name: Amy Ferketich
Author-X-Name-First: Amy
Author-X-Name-Last: Ferketich
Title: Geometric framework for statistical analysis of eye tracking heat maps, with application to a tobacco waterpipe study
Abstract:
Health warning labels have been found to increase awareness of the harmful effects of tobacco products. An eye tracking study was conducted to determine the optimal placement and type of a health warning label on tobacco waterpipes. Participants viewed images that each contained one of four waterpipes displaying one of three types of warning labels placed in one of three locations. Typically, statistical analysis of eye tracking data is conducted based on summary statistics such as total dwell time, duration score, and number of visits to an area of interest. However, these summary statistics fail to capture the complete variability in a participant's eye movement. Instead, we propose to estimate heat maps defined on the entire image domain using the raw two-dimensional coordinates of eye movement via kernel density estimation. For statistical analysis of heat maps, we adopt the Fisher–Rao Riemannian geometric framework, which enables computationally efficient comparisons of heat maps, statistical summarization and exploration of variability in a sample of heat maps, and metric-based hierarchical clustering. We apply this framework to eye tracking data from the tobacco waterpipe study and comment on the results in the context of the optimal placement and type of health warning labels on tobacco waterpipes.
Journal: Journal of Applied Statistics
Pages: 1191-1209
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2233143
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2233143
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1191-1209
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2176834_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Serenay Cakar
Author-X-Name-First: Serenay
Author-X-Name-Last: Cakar
Author-Name: Fulya Gokalp Yavuz
Author-X-Name-First: Fulya Gokalp
Author-X-Name-Last: Yavuz
Title: Hybrid statistical and machine learning modeling of cognitive neuroscience data
Abstract:
The nested data structure is prevalent in cognitive measure experiments because observations are taken repeatedly from different brain locations within subjects. The analysis methods used for this data type should account for the dependency structure among the repeated measurements. However, the dependency assumption is largely ignored in the cognitive neuroscience data analysis literature. We consider both statistical and machine learning methods extended to repeated data analysis and compare distinct algorithms in terms of their advantages and disadvantages. Unlike basic algorithm comparison studies, this article analyzes novel neuroscience data, considering the dependency structure for the first time, with several statistical and machine learning methods and their hybrid forms. In addition, the fitting performances of the different algorithms are compared using contaminated data sets and the cross-validation approach. One of our findings suggests that the GLMM tree, including random term indices indicating the location of functional near-infrared spectroscopy optodes nested within experimental units, shows the best predictive performance with the lowest MSE, RMSE, and MAE model performance metrics. However, there is a trade-off between accuracy and speed, since this algorithm requires the highest computational time.
Journal: Journal of Applied Statistics
Pages: 1076-1097
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2176834
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2176834
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1076-1097
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2248413_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Laura L. Tupper
Author-X-Name-First: Laura L.
Author-X-Name-Last: Tupper
Author-Name: Charles R. Keese
Author-X-Name-First: Charles R.
Author-X-Name-Last: Keese
Author-Name: David S. Matteson
Author-X-Name-First: David S.
Author-X-Name-Last: Matteson
Title: Classifying contaminated cell cultures using time series features
Abstract:
We examine the use of time series data, derived from Electric Cell-substrate Impedance Sensing (ECIS), to differentiate between standard mammalian cell cultures and those infected with a mycoplasma organism. With the goal of easy visualization and interpretation, we perform low-dimensional feature-based classification, extracting application-relevant features from the ECIS time courses. We can achieve very high classification accuracy using only two features, which depend on the cell line under examination. Initial results also show the existence of experimental variation between plates and suggest types of features that may prove more robust to such variation. Our paper is the first to perform a broad examination of ECIS time course features in the context of detecting contamination; to combine different types of features to achieve classification accuracy while preserving interpretability; and to describe and suggest possibilities for ameliorating plate-to-plate variation.
Journal: Journal of Applied Statistics
Pages: 1210-1226
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2248413
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2248413
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1210-1226
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2189771_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Vasileios Alevizakos
Author-X-Name-First: Vasileios
Author-X-Name-Last: Alevizakos
Author-Name: Kashinath Chatterjee
Author-X-Name-First: Kashinath
Author-X-Name-Last: Chatterjee
Author-Name: Christos Koukouvinos
Author-X-Name-First: Christos
Author-X-Name-Last: Koukouvinos
Title: Distribution-free Phase II triple EWMA control chart for joint monitoring the process location and scale parameters
Abstract:
Distribution-free or nonparametric control charts are used for monitoring the process parameters when there is a lack of knowledge about the underlying distribution. In this paper, we investigate a single distribution-free triple exponentially weighted moving average control chart based on the Lepage statistic (referred to as the TL chart) for simultaneously monitoring shifts in the unknown location and scale parameters of a univariate continuous distribution. The design and implementation of the proposed chart are discussed using time-varying and steady-state control limits for the zero-state case. The run-length distribution of the TL chart is evaluated by performing Monte Carlo simulations. The performance of the proposed chart is compared to those of the existing EWMA-Lepage (EL) and DEWMA-Lepage (DL) charts. It is observed that the TL chart with a time-varying control limit is superior to its competitors, especially for small to moderate shifts in the process parameters. We also provide a real example from a manufacturing process to illustrate the application of the proposed chart.
Journal: Journal of Applied Statistics
Pages: 1171-1190
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2189771
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2189771
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1171-1190
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2180167_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Canhui Li
Author-X-Name-First: Canhui
Author-X-Name-Last: Li
Author-Name: Weirong Li
Author-X-Name-First: Weirong
Author-X-Name-Last: Li
Author-Name: Wensheng Zhu
Author-X-Name-First: Wensheng
Author-X-Name-Last: Zhu
Title: Penalized robust learning for optimal treatment regimes with heterogeneous individualized treatment effects
Abstract:
The growing popularity of personalized medicine motivates people to explore individualized treatment regimes according to the heterogeneous characteristics of patients. In large-scale data analysis, however, the data are collected at different times and different locations, i.e. subjects are usually drawn from a heterogeneous population, so the optimal treatment regimes also vary for patients across different subgroups. In this paper, we focus on the estimation of optimal treatment regimes for subjects from a heterogeneous population with high-dimensional data. We first remove the main effects of the covariates for each subgroup to eliminate non-ignorable residual confounding. Based on the centralized outcome, we propose a penalized robust learning method that estimates the coefficient matrix of the interactions between covariates and treatment by penalizing pairwise differences of the coefficients of any two subgroups for the same covariate, which can automatically identify the latent complex structure of the coefficient matrix with heterogeneous and homogeneous columns. At the same time, the penalized robust learning can also select the important variables that truly contribute to the individualized treatment decisions with a commonly used sparsity-structure penalty. Extensive simulation studies show that our proposed method outperforms current popular methods, and it is further illustrated in the real analysis of the Tamoxifen breast cancer data.
Journal: Journal of Applied Statistics
Pages: 1151-1170
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2180167
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2180167
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1151-1170
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2173156_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Satwik Acharyya
Author-X-Name-First: Satwik
Author-X-Name-Last: Acharyya
Author-Name: Debdeep Pati
Author-X-Name-First: Debdeep
Author-X-Name-Last: Pati
Author-Name: Shumei Sun
Author-X-Name-First: Shumei
Author-X-Name-Last: Sun
Author-Name: Dipankar Bandyopadhyay
Author-X-Name-First: Dipankar
Author-X-Name-Last: Bandyopadhyay
Title: A monotone single index model for missing-at-random longitudinal proportion data
Abstract:
Beta distributions are commonly used to model proportion-valued response variables, often encountered in longitudinal studies. In this article, we develop semi-parametric Beta regression models for proportion-valued responses, where the aggregate covariate effect is summarized and flexibly modeled using an interpretable monotone time-varying single index transform of a linear combination of the potential covariates. We utilize the potential of single index models, which are effective dimension reduction tools and accommodate link function misspecification in generalized linear mixed models. Our Bayesian methodology incorporates the missing-at-random feature of the proportion response and utilizes Hamiltonian Monte Carlo sampling to conduct inference. We explore finite-sample frequentist properties of our estimates and assess the robustness via detailed simulation studies. Finally, we illustrate our methodology via application to a motivating longitudinal dataset on obesity research recording proportion body fat.
Journal: Journal of Applied Statistics
Pages: 1023-1040
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2173156
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2173156
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1023-1040
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2179567_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Mamadou Lamine Diop
Author-X-Name-First: Mamadou Lamine
Author-X-Name-Last: Diop
Author-Name: William Kengne
Author-X-Name-First: William
Author-X-Name-Last: Kengne
Title: Epidemic change-point detection in general integer-valued time series
Abstract:
In this paper, we consider the structural change in a class of discrete valued time series, where the true conditional distribution of the observations is assumed to be unknown. The conditional mean of the process depends on a parameter $\theta^*$ which may change over time. We provide sufficient conditions for the consistency and the asymptotic normality of the Poisson quasi-maximum likelihood estimator (QMLE) of the model. We consider an epidemic change-point detection and propose a test statistic based on the QMLE of the parameter. Under the null hypothesis of a constant parameter (no change), the test statistic converges to a distribution obtained from increments of a Brownian bridge. The test statistic diverges to infinity under the epidemic alternative, which establishes that the proposed procedure is consistent in power. The effectiveness of the proposed procedure is illustrated by simulated and real data examples.
Journal: Journal of Applied Statistics
Pages: 1131-1150
Issue: 6
Volume: 51
Year: 2024
Month: 04
X-DOI: 10.1080/02664763.2023.2179567
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2179567
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:6:p:1131-1150
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2196752_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Rui Ma
Author-X-Name-First: Rui
Author-X-Name-Last: Ma
Author-Name: Shishun Zhao
Author-X-Name-First: Shishun
Author-X-Name-Last: Zhao
Author-Name: Jianguo Sun
Author-X-Name-First: Jianguo
Author-X-Name-Last: Sun
Author-Name: Shuying Wang
Author-X-Name-First: Shuying
Author-X-Name-Last: Wang
Title: Estimation of accelerated hazards models based on case K informatively interval-censored failure time data
Abstract:
The accelerated hazards model is one of the most commonly used models for regression analysis of failure time data, especially when, for example, the hazard functions may have a monotonicity property. Correspondingly, a large literature has been established for its estimation or inference when right-censored data are observed. Although several methods have also been developed for its inference based on interval-censored data, they apply only to limited situations or rely on assumptions such as independent censoring. In this paper, we consider the situation where one observes case K interval-censored data, the type of failure time data that occurs most often in, for example, medical research such as clinical trials or periodical follow-up studies. For inference, we propose a sieve borrow-strength method; in particular, it allows for informative censoring. The asymptotic properties of the proposed estimators are established. Simulation studies demonstrate that the proposed inference procedure performs well. The method is applied to a real data set arising from an AIDS clinical trial.
Journal: Journal of Applied Statistics
Pages: 1251-1270
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2196752
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2196752
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1251-1270
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2203882_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Isaac E. Cortés
Author-X-Name-First: Isaac E.
Author-X-Name-Last: Cortés
Author-Name: Mário de Castro
Author-X-Name-First: Mário
Author-X-Name-Last: de Castro
Author-Name: Diego I. Gallardo
Author-X-Name-First: Diego I.
Author-X-Name-Last: Gallardo
Title: A new family of quantile regression models applied to nutritional data
Abstract:
This paper introduces a new family of quantile regression models whose response variable follows a reparameterized Marshall-Olkin distribution indexed by quantile, scale, and asymmetry parameters. The family arises by applying the Marshall-Olkin approach to distributions belonging to the location-scale family. Models of higher flexibility, whose structure is similar to that of generalized linear models, were generated by quantile reparameterization. The maximum likelihood (ML) method is presented for the estimation of the model parameters, and simulation studies evaluated the performance of the ML estimators. The advantages of the family are illustrated through an application to a set of nutritional data, whose results indicate it is a good alternative for modeling slightly asymmetric response variables with support on the real line.
Journal: Journal of Applied Statistics
Pages: 1378-1398
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2203882
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2203882
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1378-1398
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2194582_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Seyedeh Mahbubeh Hoseini Baladezaei
Author-X-Name-First: Seyedeh Mahbubeh Hoseini
Author-X-Name-Last: Baladezaei
Author-Name: Einolah Deiri
Author-X-Name-First: Einolah
Author-X-Name-Last: Deiri
Author-Name: Ezzatallah Baloui Jamkhaneh
Author-X-Name-First: Ezzatallah Baloui
Author-X-Name-Last: Jamkhaneh
Title: The balanced discrete Burr–Hatke model and mixing INAR(1) process: properties, estimation, forecasting and COVID-19 applications
Abstract:
The main concern of this paper is providing a flexible discrete model that captures every kind of dispersion (equi-, over- and under-dispersion). Based on the balanced discretization method, a new discrete version of the Burr–Hatke distribution is introduced with the partial moment-preserving property. Some statistical properties of the new distribution are introduced, and the applicability of the proposed model is evaluated by considering counting series. A new integer-valued autoregressive (INAR) process based on mixing the Pegram and binomial thinning operators with discrete Burr–Hatke innovations is introduced, which can properly model contagious data. Different estimation approaches for the parameters of the new process are provided and compared through a Monte Carlo simulation scheme. The performance of the proposed process is evaluated on four data sets of daily COVID-19 death counts in Austria, Switzerland, Nigeria and Slovenia, in comparison with some competitor INAR(1) models, along with a Pearson residual analysis for model assessment. The goodness-of-fit measures affirm the adequacy of the proposed process in modeling all COVID-19 data sets. Fundamental prediction procedures are considered for the new process using classic, modified Sieve bootstrap and Bayesian forecasting methods for all COVID-19 data sets, and it is concluded that the Bayesian forecasting approach provides more reliable results.
Journal: Journal of Applied Statistics
Pages: 1227-1250
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2194582
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2194582
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1227-1250
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2197571_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Sajid Ali
Author-X-Name-First: Sajid
Author-X-Name-Last: Ali
Author-Name: Mariyam Waheed
Author-X-Name-First: Mariyam
Author-X-Name-Last: Waheed
Author-Name: Ismail Shah
Author-X-Name-First: Ismail
Author-X-Name-Last: Shah
Author-Name: Syed Muhammad Muslim Raza
Author-X-Name-First: Syed Muhammad Muslim
Author-X-Name-Last: Raza
Title: Bayesian sample size determination for coefficient of variation of normal distribution
Abstract:
Sample size determination is an active area of research in statistics. Generally, Bayesian methods provide relatively smaller sample sizes than classical techniques; in particular, the average length criterion is more conventional and gives relatively small sample sizes under the given constraints. The objective of this study is to utilize the major Bayesian sample size determination techniques for the coefficient of variation of the normal distribution and assess their performance by comparing the results with the frequentist approach. To this end, we note that the average coverage criterion provides relatively smaller sample sizes than the worst outcome criterion. By comparing with existing frequentist studies, we show that a smaller sample size is required in Bayesian methods to achieve the same efficiency.
Journal: Journal of Applied Statistics
Pages: 1271-1286
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2197571
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2197571
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1271-1286
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2277115_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Sameer Poongadan
Author-X-Name-First: Sameer
Author-X-Name-Last: Poongadan
Author-Name: M. C. Lineesh
Author-X-Name-First: M. C.
Author-X-Name-Last: Lineesh
Title: Forecasting of the true satellite carbon monoxide data with ensemble empirical mode decomposition, singular value decomposition and moving average
Abstract:
The forecasting of carbon monoxide in the atmosphere is essential, as it pollutes the atmosphere and hence causes severe health problems for humans. This study proposes a time-series prognosis EEMD-SVD-MA technique, which incorporates Ensemble Empirical Mode Decomposition, Singular Value Decomposition and Moving Average, to predict future values of carbon monoxide data taken from the Indian region. The collected data are non-linear, and the technique can be applied to non-stationary and non-linear data. The approach has three levels: an EEMD level, an SVD level and an MA level. The first level deploys EEMD to fragment the data series into a limited number of Intrinsic Mode Function (IMF) components along with a residue. To denoise each IMF component, SVD is deployed in the second level. In the third level, each denoised IMF component is predicted by MA. The future values of the original data are obtained by adding all the predicted series of the components. In this study, we propose two variants of the model, EEMD-SVD-MA(3) and EEMD-SVD-MA(4), and compare the results with other forecasting techniques, namely LSTM (Long Short Term Memory network), EMD-LSTM, EMD-MA, EEMD-MA and CEEMDAN-MA. The results show that the proposed EEMD-SVD-MA model is more efficient than the other models.
Journal: Journal of Applied Statistics
Pages: 1412-1426
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2277115
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2277115
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1412-1426
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2272223_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Nikola Surjanovic
Author-X-Name-First: Nikola
Author-X-Name-Last: Surjanovic
Author-Name: Thomas M. Loughin
Author-X-Name-First: Thomas M.
Author-X-Name-Last: Loughin
Title: Improving the Hosmer-Lemeshow goodness-of-fit test in large models with replicated Bernoulli trials
Abstract:
The Hosmer-Lemeshow (HL) test is a commonly used global goodness-of-fit (GOF) test that assesses the quality of the overall fit of a logistic regression model. In this paper, we give results from simulations showing that the type I error rate (and hence power) of the HL test decreases as model complexity grows, provided that the sample size remains fixed and binary replicates (multiple Bernoulli trials) are present in the data. We demonstrate that a generalized version of the HL test (GHL) presented in previous work can offer some protection against this power loss. These results are also supported by application of both the HL and GHL test to a real-life data set. We conclude with a brief discussion explaining the behavior of the HL test, along with some guidance on how to choose between the two tests. In particular, we suggest the GHL test to be used when there are binary replicates or clusters in the covariate space, provided that the sample size is sufficiently large.
Journal: Journal of Applied Statistics
Pages: 1399-1411
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2272223
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2272223
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1399-1411
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2202464_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Ufuk Beyaztas
Author-X-Name-First: Ufuk
Author-X-Name-Last: Beyaztas
Author-Name: Mujgan Tez
Author-X-Name-First: Mujgan
Author-X-Name-Last: Tez
Author-Name: Han Lin Shang
Author-X-Name-First: Han
Author-X-Name-Last: Lin Shang
Title: Robust scalar-on-function partial quantile regression
Abstract:
Compared with the conditional mean regression-based scalar-on-function regression model, the scalar-on-function quantile regression is robust to outliers in the response variable. However, it is susceptible to outliers in the functional predictor (called leverage points). This is because the influence function of the regression quantiles is bounded in the response variable but unbounded in the predictor space. The leverage points may alter the eigenstructure of the predictor matrix, leading to poor estimation and prediction results. This study proposes a robust procedure to estimate the model parameters in the scalar-on-function quantile regression method and produce reliable predictions in the presence of both outliers and leverage points. The proposed method is based on a functional partial quantile regression procedure. We propose a weighted partial quantile covariance to obtain functional partial quantile components of the scalar-on-function quantile regression model. After the decomposition, the model parameters are estimated via a weighted loss function, where the robustness is obtained by iteratively reweighting the partial quantile components. The estimation and prediction performance of the proposed method is evaluated by a series of Monte-Carlo experiments and an empirical data example. The results compare favorably with several existing methods. The method is implemented in an R package robfpqr.
Journal: Journal of Applied Statistics
Pages: 1359-1377
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2202464
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2202464
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1359-1377
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2197587_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Jun Ye
Author-X-Name-First: Jun
Author-X-Name-Last: Ye
Title: Functional principal component models for sparse and irregularly spaced data by Bayesian inference
Abstract:
The area of functional principal component analysis (FPCA) has seen relatively few contributions from Bayesian inference. A Bayesian method in FPCA is developed for the cases of continuous and binary observations with sparse and irregularly spaced data. In the proposed Markov chain Monte Carlo (MCMC) method, a Gibbs sampler approach is adopted to update the different variables based on their conditional posterior distributions. In FPCA, a set of eigenfunctions is suggested under the Stiefel manifold, and samples are drawn from a Langevin–Bingham matrix variate distribution. Penalized splines are used to model the mean trajectory and eigenfunction trajectories in generalized functional mixed models, and the proposed model is cast into a mixed-effects model framework for Bayesian inference. To determine the number of principal components, a reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithm is implemented. Four different simulation settings are conducted to demonstrate competitive performance against non-Bayesian approaches in FPCA. Finally, the proposed method is illustrated through the analysis of body mass index (BMI) data by gender and ethnicity.
Journal: Journal of Applied Statistics
Pages: 1287-1317
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2197587
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2197587
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1287-1317
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2198178_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Yonghui Liu
Author-X-Name-First: Yonghui
Author-X-Name-Last: Liu
Author-Name: Jing Wang
Author-X-Name-First: Jing
Author-X-Name-Last: Wang
Author-Name: Víctor Leiva
Author-X-Name-First: Víctor
Author-X-Name-Last: Leiva
Author-Name: Alejandra Tapia
Author-X-Name-First: Alejandra
Author-X-Name-Last: Tapia
Author-Name: Wei Tan
Author-X-Name-First: Wei
Author-X-Name-Last: Tan
Author-Name: Shuangzhe Liu
Author-X-Name-First: Shuangzhe
Author-X-Name-Last: Liu
Title: Robust autoregressive modeling and its diagnostic analytics with a COVID-19 related application
Abstract:
Autoregressive models in time series are useful in various areas. In this article, we propose a skew-t autoregressive model. We estimate its parameters using the expectation-maximization (EM) method and develop an influence methodology based on local perturbations for its validation. We obtain the normal curvatures for four perturbation strategies to identify influential observations, and then assess their performance through Monte Carlo simulations. An example of financial data analysis is presented to study daily log-returns for Brent crude futures and investigate the possible impact of the COVID-19 pandemic.
Journal: Journal of Applied Statistics
Pages: 1318-1343
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2198178
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2198178
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1318-1343
Template-Type: ReDIF-Article 1.0
# input file: CJAS_A_2199177_J.xml processed with: repec_from_jats12.xsl darts-xml-transformations-20240209T083504 git hash: db97ba8e3a
Author-Name: Rouba A. Chahine
Author-X-Name-First: Rouba A.
Author-X-Name-Last: Chahine
Author-Name: Inmaculada Aban
Author-X-Name-First: Inmaculada
Author-X-Name-Last: Aban
Title: Analysis of survival outcomes using likelihood ratio test in trials incorporating patient's treatment choice
Abstract:
Methods for designing and analyzing multiple-arm survival trials that incorporate the patient's treatment choice are needed. In these trials, patients are randomized into two groups, random and choice. Participants in the choice group choose their treatment, which is not current standard practice in randomized clinical trials. In this paper, we propose a new method based on the likelihood function to design and analyze these trials with time-to-event outcomes in the presence of non-informative right censoring. We use simulations to evaluate the methods for Weibull outcomes, both complete and censored. Finally, we provide an illustration for designing a study in which we discuss some design considerations and demonstrate the methods.
Journal: Journal of Applied Statistics
Pages: 1344-1358
Issue: 7
Volume: 51
Year: 2024
Month: 05
X-DOI: 10.1080/02664763.2023.2199177
File-URL: http://hdl.handle.net/10.1080/02664763.2023.2199177
File-Format: text/html
File-Restriction: Access to full text is restricted to subscribers.
Handle: RePEc:taf:japsta:v:51:y:2024:i:7:p:1344-1358