Ranked set sampling

Ranked set sampling is an alternative to simple random sampling that can sometimes offer large improvements in precision. It was originally proposed in connection with estimating herbage yield in a paper by G A McIntyre [Ref 1]. In recent years it has been applied particularly to problems in environmental science. There is now quite a large literature, much of it summarised in the monograph by Chen, Bai & Sinha [Ref 2]. The encyclopedia article by Patil is another good introduction [Ref 3].

I first came across this technique many years ago, also in connection with estimating herbage yield [Refs 4,5]. A few years later I became interested in ranked set sampling again, after I was involved in an evaluation of the method for estimating spray deposits on the leaves of fruit trees. This page provides a brief description of this work with links to publications and original data.

Estimating spray deposits

The variable of interest here is the spray deposit per unit area of leaf. This is moderately time-consuming to measure precisely, but if leaves are sprayed with a fluorescent dye, the resulting deposits can be ranked visually under ultraviolet light. An example image is shown on the right.

Evaluation study

For our evaluation study we used two different sprayer settings, and sampled 125 leaves from each. In the lab, each set of 125 was randomly divided into 25 sets of 5. Four different rankers then ranked both the upper and the lower leaf surfaces in each set. This was repeated with a different randomization, to give a large amount of information about ranking errors.
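The random division into sets, and the way a ranked set sample would then be drawn from them, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the deposit values are simulated stand-ins for the visual ranking, ranking is assumed to be error-free, and the selection rule (one leaf per set, cycling through ranks 1 to 5) is the standard balanced design rather than anything specific to this study, in which all leaves were in fact measured. The function names are hypothetical.

```python
import random

def divide_into_sets(leaf_ids, set_size, rng):
    """Randomly divide the leaves into sets of equal size."""
    shuffled = list(leaf_ids)
    rng.shuffle(shuffled)
    return [shuffled[i:i + set_size] for i in range(0, len(shuffled), set_size)]

def ranked_set_sample(sets, rank_key):
    """From set j, keep the leaf whose within-set rank is (j mod set_size) + 1."""
    chosen = []
    for j, leaf_set in enumerate(sets):
        ranked = sorted(leaf_set, key=rank_key)  # stands in for visual ranking
        chosen.append(ranked[j % len(leaf_set)])
    return chosen

rng = random.Random(1)
# Hypothetical deposit values, used only to produce a ranking.
deposit = {leaf: rng.lognormvariate(0.0, 0.5) for leaf in range(125)}
sets = divide_into_sets(range(125), 5, rng)
sample = ranked_set_sample(sets, rank_key=deposit.get)
print(len(sets), len(sample))  # prints "25 25"
```

With 25 sets of 5, each rank from 1 to 5 is represented five times among the 25 leaves selected for measurement.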

For this evaluation study, all leaves were measured. To obtain the 'true' measurement for each leaf surface, we used an image analysis system to obtain a total pixel grey-scale value. This was then divided by the leaf area to give a measure of deposit density.

This study was done with Dr Roy Murray of East Malling Research.

Results

In this study, two factors reduce the potential benefits of ranked set sampling in relation to simple random sampling:

* the cost of collecting additional leaves and ranking them;

* errors that arise in the ranking process.

When these factors are taken into account, the relative precision of ranked set sampling is estimated to be about 1.5 - in other words about 50% more samples would be needed to achieve the same precision using ordinary random sampling.
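Relative precision here is the ratio of the variance of the sample mean under simple random sampling to its variance under ranked set sampling, for the same number of measured leaves. The Python sketch below estimates this ratio by simulation. It assumes normally distributed values, error-free ranking and no allowance for the cost of collecting and ranking extra leaves, so it illustrates the idealised upper limit rather than the figure of about 1.5 reported above. All names and settings are hypothetical.

```python
import random
import statistics

def srs_mean(n, rng):
    """Mean of a simple random sample of n values."""
    return statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n)])

def rss_mean(set_size, cycles, rng):
    """Mean of a balanced ranked set sample with perfect ranking."""
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set, rank it, and measure only one unit.
            candidates = sorted(rng.gauss(0.0, 1.0) for _ in range(set_size))
            measured.append(candidates[rank])
    return statistics.fmean(measured)

def relative_precision(set_size=5, cycles=5, reps=4000, seed=0):
    """Monte Carlo estimate of Var(SRS mean) / Var(RSS mean)."""
    rng = random.Random(seed)
    n = set_size * cycles  # same number of measurements for both schemes
    srs = [srs_mean(n, rng) for _ in range(reps)]
    rss = [rss_mean(set_size, cycles, rng) for _ in range(reps)]
    return statistics.variance(srs) / statistics.variance(rss)

rp = relative_precision()
print(round(rp, 2))
```

Read the other way round, a relative precision of 1.5 means that 25 leaves measured under ranked set sampling are worth roughly 1.5 x 25, or about 38, leaves measured under simple random sampling.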

There was reasonable consistency between the different rankers (Kendall's coefficient of concordance was about 0.9) and estimates of relative precision were similar for all four rankers. Nor was there any great difference in relative precision between the upper and lower leaf surfaces or between the two different sprayer settings. A more detailed description of this work was published as a conference paper [Ref 6].
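For readers unfamiliar with Kendall's coefficient of concordance, the following Python sketch computes it for a single set of leaves ranked by several rankers, using the usual formula for complete rankings without ties. The ranks shown are invented for illustration, and how the coefficient was combined across sets in the study is not described here.

```python
def kendalls_w(rankings):
    """Kendall's W for m complete rankings (no ties) of the same n items."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    # Sum of squared deviations of the rank sums from their mean.
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical ranks (1 = lowest deposit) from four rankers for one set of five leaves.
rankings = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 4, 5],
    [2, 1, 3, 4, 5],
    [1, 2, 3, 5, 4],
]
print(kendalls_w(rankings))  # prints 0.875
```

A value of 1 corresponds to complete agreement between rankers, and a value of 0 to no agreement at all.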

You can download the raw data from this study as a text file.

In an unpublished paper I have used a generalisation of Mallows' model to explore the ranking errors made by the four rankers [Ref 7].

Multiple measurements

A specific problem that arises with spray deposits is the need to collect data for both leaf surfaces. Deposits on the two surfaces tend to be weakly negatively correlated - if the leaf happens to be orientated so that one surface receives a lot of spray, the other surface tends to receive a low deposit.

If leaves are selected on the basis of their ranking on just one surface, the resulting sample may be very unbalanced with respect to its ranking on the other surface. This leads to a loss of efficiency. However, a simple heuristic algorithm, which can be implemented without too much additional effort, ensures that the leaves measured are reasonably balanced with respect to their ranks on both surfaces.
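The imbalance itself is easy to demonstrate by simulation. The Python sketch below selects leaves purely on their upper-surface rank and then tabulates the lower-surface ranks of the selected leaves. It illustrates the problem only and is not the heuristic algorithm of [Ref 8]; the simulated deposits, the strength of the negative correlation and all names are assumptions.

```python
import random
from collections import Counter

def ranks_within_set(values):
    """Return ranks 1..k for the values in one set (1 = smallest)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

rng = random.Random(2)
set_size, cycles = 5, 5
upper_counts, lower_counts = Counter(), Counter()
for j in range(set_size * cycles):
    # Hypothetical deposits, weakly negatively correlated between surfaces.
    upper = [rng.gauss(0.0, 1.0) for _ in range(set_size)]
    lower = [-0.3 * u + rng.gauss(0.0, 1.0) for u in upper]
    upper_ranks = ranks_within_set(upper)
    lower_ranks = ranks_within_set(lower)
    target = j % set_size + 1          # upper-surface rank to select in this set
    pick = upper_ranks.index(target)   # position of the selected leaf
    upper_counts[target] += 1
    lower_counts[lower_ranks[pick]] += 1

print(sorted(upper_counts.items()))  # each upper-surface rank is selected 5 times
print(sorted(lower_counts.items()))  # lower-surface ranks are generally uneven
```

The upper-surface ranks are balanced by construction, whereas the lower-surface ranks of the same leaves are left to chance.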

I gave a short talk about this work at the very enjoyable SPRUCE Advanced Workshop on Environmental Sampling and Monitoring, held in Estoril, Portugal on 22-24 March 2001. A written version was subsequently published in Environmental & Ecological Statistics [Ref 8].

References

[1] McIntyre, G.A. (1952) A method for unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research, 3, 385-390.

[2] Chen, Z., Bai, Z. & Sinha, B.K. (2004) Ranked Set Sampling: Theory and Applications. Springer Lecture Notes in Statistics, No. 176.

[3] Patil, G.P. (2002) Ranked set sampling. In Encyclopedia of Environmetrics, El-Shaarawi, A.H. & Piegorsch, W.W. (eds.), Volume 3, pp. 1684-1690. Wiley: Chichester. [pdf]

[4] Cobby, J.M., Ridout, M.S., Bassett, P.J. & Large, R.V. (1985) An investigation into the use of ranked set sampling on grass and grass-clover swards. Grass and Forage Science, 40, 257-263.

[5] Ridout, M.S. & Cobby, J.M. (1987) Ranked set sampling with non-random selection of sets and errors in ranking. Applied Statistics, 36, 145-152.

[6] Murray, R.A., Ridout, M.S. & Cross, J.V. (2000) The use of ranked set sampling in spray deposit assessment. Aspects of Applied Biology, 57, 141-146. [pdf]

[7] Ridout, M.S. (2001) Modelling ranking errors in ranked set sampling using a generalisation of Mallows' phi-model. Unpublished manuscript. [pdf]

[8] Ridout, M.S. (2003) On ranked set sampling for multiple characteristics. Environmental & Ecological Statistics, 10, 255-262. [pdf]

Martin Ridout
