U.S. Department of Commerce

Volume 100 Number 1 January 2002

Fishery Bulletin

U.S. Department of Commerce

Donald L. Evans

Secretary

National Oceanic and Atmospheric Administration

Scott B. Gudes

Acting Under Secretary for Oceans and Atmosphere

National Marine Fisheries Service

William T. Hogarth

Acting Assistant Administrator for Fisheries

Scientific Editor

Dr. John V. Merriner

Editorial Assistant

Sarah Shoffler

Center for Coastal Fisheries and Habitat Research, 101 Pivers Island Road, Beaufort, NC 28516

The Fishery Bulletin (ISSN 0090-0656) is published quarterly by the Scientific Publications Office, National Marine Fisheries Service, NOAA, 7600 Sand Point Way NE, BIN C15700, Seattle, WA 98115-0070. Periodicals postage is paid at Seattle, WA, and at additional mailing offices. POSTMASTER: Send address changes for subscriptions to Fishery Bulletin, Superintendent of Documents, Attn.: Chief, Mail List Branch, Mail Stop SSOM, Washington, DC 20402-9373.

Although the contents of this publication have not been copyrighted and may be reprinted entirely, reference to source is appreciated.

The Secretary of Commerce has determined that the publication of this periodical is necessary according to law for the transaction of public business of this Department. Use of funds for printing of this periodical has been approved by the Director of the Office of Management and Budget.

For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402. Subscription price per year: $45.00 domestic and $56.25 foreign. Cost per single issue: $28.00 domestic and $35.00 foreign. See back for order form.

Managing Editor

Sharyn Matriotti

National Marine Fisheries Service Scientific Publications Office 7600 Sand Point Way NE, BIN C15700 Seattle, Washington 98115-0070

Editorial Committee

Dr. Andrew E. Dizon, National Marine Fisheries Service
Dr. Harlyn O. Halvorson, University of Massachusetts, Boston
Dr. Ronald W. Hardy, University of Idaho, Hagerman
Dr. Richard D. Methot, National Marine Fisheries Service
Dr. Theodore W. Pietsch, University of Washington, Seattle
Dr. Joseph E. Powers, National Marine Fisheries Service
Dr. Harald Rosenthal, Universität Kiel, Germany
Dr. Fredric M. Serchuk, National Marine Fisheries Service

Fishery Bulletin web site: fishbull.noaa.gov

The Fishery Bulletin carries original research reports and technical notes on investigations in fishery science, engineering, and economics. It began as the Bulletin of the United States Fish Commission in 1881; it became the Bulletin of the Bureau of Fisheries in 1904 and the Fishery Bulletin of the Fish and Wildlife Service in 1941. Separates were issued as documents through volume 46; the last document was No. 1103. Beginning with volume 47 in 1931 and continuing through volume 62 in 1963, each separate appeared as a numbered bulletin. A new system began in 1963 with volume 63 in which papers are bound together in a single issue of the bulletin. Beginning with volume 70, number 1, January 1972, the Fishery Bulletin became a periodical, issued quarterly. In this form, it is available by subscription from the Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402. It is also available free in limited numbers to libraries, research institutions, State and Federal agencies, and in exchange for other scientific publications.

U.S. Department of Commerce

Seattle, Washington

Contents

The conclusions and opinions expressed in Fishery Bulletin are solely those of the authors and do not represent the official position of the National Marine Fisheries Service (NOAA) or any other agency or institution.

The National Marine Fisheries Service (NMFS) does not approve, recommend, or endorse any proprietary product or proprietary material mentioned in this publication. No reference shall be made to NMFS, or to this publication furnished by NMFS, in any advertising or sales promotion which would indicate or imply that NMFS approves, recommends, or endorses any proprietary product or proprietary material mentioned herein, or which has as its purpose an intent to cause directly or indirectly the advertised product to be used or purchased because of this NMFS publication.

Articles

1-10 Blick, D. James, and Peter T. Hagen

The use of agreement measures and latent class models to assess the reliability of classifying thermally marked otoliths

11-25 Carmona-Suarez, Carlos A., and Jesus E. Conde

Local distribution and abundance of swimming crabs (Callinectes spp. and Arenaeus cribrarius) on a tropical arid beach

26-34 Crabtree, Roy E., Peter B. Hood, and Derke Snodgrass

Age, growth, and reproduction of permit (Trachinotus falcatus) in Florida waters

35-41 Denson, Michael R., Wallace E. Jenkins, Arnold G. Woodward, and Theodore I. J. Smith

Tag-reporting levels for red drum (Sciaenops ocellatus) caught by anglers in South Carolina and Georgia estuaries

42-50 Faunce, Craig H., Heather M. Patterson, and Jerome J. Lorenz

Age, growth, and mortality of the Mayan cichlid (Cichlasoma urophthalmus) from the southeastern Everglades

51-62 Hastings, Kelly K., and William J. Sydeman

Population status, seasonal variation in abundance, and long-term population trends of Steller sea lions (Eumetopias jubatus) at the South Farallon Islands, California

63-73 McBride, Richard S., Michael P. Fahay, and Kenneth W. Able

Larval and settlement periods of the northern searobin (Prionotus carolinus) and the striped searobin (P. evolans)

74-80 Pennington, Michael, Liza-Mare Burmeister, and Vidar Hjellvik

Assessing the precision of frequency distributions estimated from trawl-survey samples

81-89 Potts, Jennifer C., and Charles S. Manooch III

Estimated ages of red porgy (Pagrus pagrus) from fishery-dependent and fishery-independent data and a comparison of growth parameters

90-105 Romanov, Evgeny V.

Bycatch in the tuna purse-seine fisheries of the western Indian Ocean

106-116 Sainte-Marie, Bernard, and Denis Chabot

Ontogenetic shifts in natural diet during benthic stages of American lobster (Homarus americanus), off the Magdalen Islands

117-127 Zug, George R., George H. Balazs, Jerry A. Wetherall, Denise M. Parker, and Shawn K. K. Murakawa

Age and growth of Hawaiian seaturtles (Chelonia mydas): an analysis based on skeletochronology

Notes

128-133 DiNardo, Gerard T., Edward E. DeMartini, and Wayne R. Haight

Estimates of lobster-handling mortality associated with the Northwestern Hawaiian Islands lobster-trap fishery

134-142 Graves, John E., Brian E. Luckhurst, and Eric D. Prince

An evaluation of pop-up satellite tags for estimating postrelease survival of blue marlin (Makaira nigricans) from a recreational fishery

143-148 Hazin, Fabio H. V., Paulo G. Oliveira, and Matt K. Broadhurst

Reproduction of blacknose shark (Carcharhinus acronotus) in coastal waters off northeastern Brazil

149-152 Porch, Clay E., Charles A. Wilson, and David L. Nieland

A new growth model for red drum (Sciaenops ocellatus) that accommodates seasonal and ontogenic changes in growth rates

153 Subscription form

Abstract-Otolith thermal marking is an efficient method for mass marking hatchery-reared salmon and can be used to estimate the proportion of hatchery fish captured in a mixed-stock fishery. Accuracy of the thermal pattern classification depends on the prominence of the pattern, the methods used to prepare and view the patterns, and the training and experience of the personnel who determine the presence or absence of a particular pattern. Estimating accuracy rates is problematic when no secondary marking is available and no error-free standards exist. Agreement measures, such as kappa ($\kappa$), provide a relative measure of the reliability of the determinations when independent readings by two readers are available, but the magnitude of $\kappa$ can be influenced by the proportion of marked fish. If a third reader is used or if two or more groups of paired readings are examined, latent class models can provide estimates of the error rates of each reader. Applications of $\kappa$ and latent class models are illustrated by a program providing contribution estimates of hatchery-reared chum and sockeye salmon in Southeast Alaska.

The use of agreement measures and latent class models to assess the reliability of classifying thermally marked otoliths*

D. James Blick

Peter T. Hagen

Alaska Department of Fish and Game

Division of Commercial Fisheries

10107 Bentwood Place

Juneau, Alaska 99802-5526

E-mail address (for P. T. Hagen, contact author): peter_hagen@fishgame.state.ak.us

Manuscript accepted 16 April 2001. Fish. Bull. 100:1-10(2002).

The ability to induce patterns in salmon otoliths by manipulating water temperatures has proved to be an efficient means for marking large numbers of salmon (Volk et al., 1990). When salmon embryos or alevins are exposed to a rapid drop in temperature, otolith growth is temporarily disrupted, and this results in a discontinuity in the otolith's microstructure. When viewed under transmitted light microscopy, this discontinuity appears as a dark ring. By controlling the number of temperature drops and the timing between drops, a coded pattern of dark rings can be recorded on the otolith and this pattern can be recovered from otoliths of older fish by removing the overlaying material and exposing the otolith core. For hatcheries that release a large number of fish, this type of marking method has been shown to be particularly cost effective for marking 100% of the releases (Munk et al., 1993).

Several fisheries management programs in Alaska use thermal marking to estimate hatchery contributions to commercial fisheries (Hagen et al., 1995). Typically, several hundred salmon otoliths are systematically collected during each two- or three-day commercial opening during the fishing season. The otoliths and sampling data are shipped to a processing laboratory where a subsample of otoliths (generally 50 to 100) is processed immediately to meet in-season management needs; a portion of the remaining otoliths is processed later to provide an overall estimate of hatchery contribution to the fisheries.

The process by which a reader determines the presence or absence of a thermal mark in an otolith can be characterized as one of pattern recognition and image matching. Prior to examining otoliths of unknown origin, the readers gain familiarity with the patterns likely to be encountered by carefully examining fry otoliths that were obtained after thermal marking but prior to their release into the wild. Because there can be wide variation in the appearance of the thermal marks within a mark group (due in part to differences in developmental stages at marking), a single mark group may be represented by a variety of patterns. As a result, secondary characteristics and measurements of the patterns are sometimes necessary to identify an otolith to a mark group. The examination is also used to confirm that all the hatchery fish have been successfully marked.

The process of making a determination on otoliths from returning adult salmon can become problematic because wild salmon may also contain otolith patterns that can mimic the features imposed through thermal marking. Referred to as "noisy patterns," their presence can increase the rate of false positives. Conversely, if the hatchery employs poor temperature control or unintended disruptions occur around the period of marking, it may be difficult to identify the otolith as that of a hatchery fish, and this would increase the rate of false negatives. Differences between readers in skill and training level, and how they process otoliths, can add to the uncertainty in estimating the accuracy of the readings and the rates of false positives and negatives.

* Contribution PP-184 of the Alaska Department of Fish and Game, Commercial Fisheries Division, Juneau, Alaska 99802-5526.

Otolith marking generally takes place without any secondary marking, such as fin-clipping or coded-wire-tagging; therefore the accuracy of a reading cannot directly be determined through conventional methods that make use of a "gold standard" (known origin sample) or other error-free classification methods. To ensure that the information provided to the Alaskan fisheries managers is accurate, each otolith is independently examined by two readers, and a third reading is used to resolve differences between the first two readings. The resolved readings are used to estimate the contribution of hatchery fish, and the presumption of accuracy is based on the premise that, through multiple readings, all marked fish are either correctly identified or that errors, if present, are inconsequential. Developing the analytical tools to determine the veracity of that assumption is the objective of this investigation, and by establishing such tools, quality control standards for recovering thermal marks can be developed.

In developing the tools to measure the quality of otolith readings, three questions are addressed:

1 How to assess the reliability of otolith readings when no standards are available.

2 How to estimate the proportion of hatchery marks when there is disagreement between two or more readers.

3 How the precision of the estimate of the proportion is influenced by classification error.

We discuss two approaches: 1) indices of agreement typically used in reliability studies, and 2) latent class models where classification errors are estimated for each reader even though the true error rate is considered unknown. The data requirements and their attendant assumptions are presented for each approach. The methods are illustrated by examining among-reader comparisons of chum salmon (Oncorhynchus keta) and sockeye salmon (Oncorhynchus nerka) otoliths collected from programs that monitor inseason contributions of hatchery fish in several commercial fisheries in Southeast Alaska (Hagen et al., 1995). The results are used to provide recommendations for monitoring the quality of otolith readings for thermal marking programs.

Table 1

Notation used to show the cross-classification of a sample of $n$ otoliths by two readers to either hatchery (H) or wild stock (W) assignment. Row and column sums are indicated by the subscript "$\cdot$".

                      Reader 2
                      H               W               Total
Reader 1    H         $n_{HH}$        $n_{HW}$        $n_{H\cdot}$
            W         $n_{WH}$        $n_{WW}$        $n_{W\cdot}$
            Total     $n_{\cdot H}$   $n_{\cdot W}$   $n$

2 is infallible (or is considered a "gold standard"), unbiased estimates of the accuracy and error rates of reader 1 and the proportion of hatchery stocks ($p$) are given by

$$\hat{\pi}_{H|H} = n_{HH}/n_{\cdot H}, \qquad \hat{\pi}_{W|H} = n_{WH}/n_{\cdot H} = 1 - \hat{\pi}_{H|H},$$

$$\hat{\pi}_{W|W} = n_{WW}/n_{\cdot W}, \qquad \hat{\pi}_{H|W} = n_{HW}/n_{\cdot W} = 1 - \hat{\pi}_{W|W},$$

$$\hat{p} = n_{\cdot H}/n$$

(where, for example, $\pi_{W|H}$ refers to the probability that reader 1 classifies an otolith as W when its true state is H). These estimates reflect the fact that reader 2 is infallible; the accuracy rates ($\pi_{H|H}$, $\pi_{W|W}$) and the error rates ($\pi_{W|H}$, $\pi_{H|W}$) are conditional on the numbers of hatchery or wild stock otoliths as determined by reader 2.

No standard available

If a standard is not available, an unbiased estimate of $p$ can be obtained if the accuracy rates for reader 1 are known. The estimate is

$$p^* = \frac{n_H/n + \pi_{W|W} - 1}{\pi_{H|H} + \pi_{W|W} - 1},$$

where $n_H$ is the number of otoliths classified as hatchery otoliths. If the accuracy rates are estimated, then $p^*$ will no longer be unbiased, but will be much less biased than the estimator $n_H/n$ and will in general have a much smaller mean-squared error (Rogan and Gladen, 1978). For a Bayesian approach to this problem, see Viana et al. (1993) and Joseph et al. (1995).
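The correction above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `rogan_gladen` and the input counts are hypothetical, and the accuracy rates are assumed known.

```python
def rogan_gladen(n_hatchery_calls, n_total, acc_h, acc_w):
    """Corrected proportion p* = (n_H/n + pi_W|W - 1) / (pi_H|H + pi_W|W - 1).

    acc_h is pi_H|H and acc_w is pi_W|W, both assumed known here.
    """
    denom = acc_h + acc_w - 1.0
    if denom <= 0.0:
        raise ValueError("accuracy rates must sum to more than 1")
    return (n_hatchery_calls / n_total + acc_w - 1.0) / denom


# Hypothetical reader: 140 of 1000 otoliths called hatchery,
# with both accuracy rates assumed equal to 0.9.
naive = 140 / 1000
p_star = rogan_gladen(140, 1000, acc_h=0.9, acc_w=0.9)
print(f"naive = {naive:.3f}, corrected = {p_star:.3f}")
```

With these assumed inputs the uncorrected proportion of 0.14 is pulled down to about 0.05, because most of the hatchery calls are false positives from the much larger wild component.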

Methods

Standard available

A sample of $n$ otoliths, which are examined by two readers, can be cross-classified as hatchery (H) or wild stock (W) as in Table 1. Suppose we wish to estimate the accuracy rate (probability of making a correct classification) or conversely, the error rate (probability of making a wrong classification). If we know nothing about reader 1, but reader

Agreement measures When accuracy rates are unavailable, statistics that measure "agreement" between readers are often calculated (e.g. Fleiss, 1981). One such index is simply the proportion of observed agreement ($P_o$), defined as

$$P_o = (n_{HH} + n_{WW})/n.$$

Another index, called kappa ($\kappa$), corrects $P_o$ for the degree of agreement that is expected by chance alone. It is defined as

$$\kappa = (P_o - P_e)/(1 - P_e),$$

where $P_e$ = expected agreement = $(n_{H\cdot}n_{\cdot H} + n_{W\cdot}n_{\cdot W})/n^2$. The divisor, $1 - P_e$, constrains $\kappa$ to be less than or equal to one, and if all agreement is due to chance ($P_o = P_e$), then $\kappa$ equals zero. Note that with $\kappa$, independence between readers is assumed in order to calculate expected agreement.
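As a small sketch of the two indices defined above, the following Python function computes observed agreement and kappa from the four cell counts of a two-reader table. The function name `agreement` is an assumption made for illustration; the counts used are those of Table 2A.

```python
def agreement(n_hh, n_hw, n_wh, n_ww):
    """Return (P_o, kappa) for a 2x2 cross-classification.

    The first subscript is reader 1's call and the second is reader 2's call.
    """
    n = n_hh + n_hw + n_wh + n_ww
    p_o = (n_hh + n_ww) / n
    # Row sums belong to reader 1, column sums to reader 2.
    row_h, row_w = n_hh + n_hw, n_wh + n_ww
    col_h, col_w = n_hh + n_wh, n_hw + n_ww
    p_e = (row_h * col_h + row_w * col_w) / n ** 2
    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Counts from Table 2A.
p_o, kappa = agreement(81, 9, 9, 901)
print(f"P_o = {p_o:.2f}, kappa = {kappa:.2f}")
```

The standard error of kappa is not computed here; Fleiss (1981) gives the formulas.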

An example of how agreement indices can be used to monitor readings is shown in Figure 1, which displays $\kappa$ and its standard error for 2874 chum otolith readings divided into 27 groups based on different reader pairs and capture locations. Included are $P_o$'s for four of the groups. The results indicate that $\kappa$ levels were similar between the different groups, suggesting overall consistency in readings, although some of the groups had lower values, which in practice would invite further investigation.

The $P_o$'s in Figure 1 have a different rank order than the $\kappa$ values. This apparent discrepancy highlights a potential problem in interpretation when using agreement indices to draw conclusions. To help illustrate this point, consider the following examples (Table 2). Table 2A is generated as the expected counts, given $\pi_{H|H} = 0.9$ and $\pi_{W|W} = 1.0$ for both readers, and $p = 0.1$. In this case, $P_o = 0.98$ and $\kappa = 0.89$. On the other hand, Table 2B is generated under the same assumptions except that $\pi_{H|H} = 0.5$. In this case $P_o$ drops only slightly to 0.95, whereas $\kappa$ drops to 0.47. Because the hatchery stock is rare, the inability of the readers to detect the mark is not well reflected by $P_o$, whereas $\kappa$ reflects it better by correcting for the high level of chance agreement.

Now let $\pi_{H|H} = 0.9$ and $\pi_{W|W} = 0.9$ for both readers, and $p = 0.5$ (Table 2C). In this case, $P_o = 0.82$ and $\kappa = 0.64$. On the other hand, Table 2D is generated under the same assumptions except that $p = 0.05$. In this case, $P_o$ remains unchanged at 0.82, but $\kappa$ drops to 0.25.

In none of the above examples is the index "wrong." Rather, as is the case with most indices, interpretation is affected by the values of the underlying parameters. In the latter example (Table 2, C-D), even though $P_o$ is the same for C and D, the scale it is being compared with has changed, thus changing the value of $\kappa$. This increases the difficulty of comparing $\kappa$ across populations with different underlying proportions. Note also that Table 2D could have been derived from $\pi_{H|H} = 0.5$ and $\pi_{W|W} = 0.944$ for both readers, and $p = 0.19$. Thus, without additional information, it is impossible to draw reliable conclusions about reader accuracies or the proportion of hatchery marks.
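The ambiguity noted for Table 2D can be checked numerically. The sketch below generates expected counts for two independent readers who share the same accuracy rates, then compares the two parameter sets mentioned in the text. The function name `expected_table` is hypothetical, and equal accuracy for both readers is the assumption used in Table 2.

```python
def expected_table(p, acc_h, acc_w, n=1000):
    """Expected 2x2 counts for two independent readers with identical accuracy.

    Keys are (reader 1 call, reader 2 call); acc_h is pi_H|H, acc_w is pi_W|W.
    """
    given_h = {"H": acc_h, "W": 1.0 - acc_h}   # P(call | true hatchery)
    given_w = {"H": 1.0 - acc_w, "W": acc_w}   # P(call | true wild)
    return {
        (i, j): n * (p * given_h[i] * given_h[j]
                     + (1.0 - p) * given_w[i] * given_w[j])
        for i in "HW" for j in "HW"
    }


# Parameters stated for Table 2D, and the alternative set from the text.
table_d = expected_table(p=0.05, acc_h=0.9, acc_w=0.9)
table_alt = expected_table(p=0.19, acc_h=0.5, acc_w=0.944)

for cell in table_d:
    print(cell, round(table_d[cell], 1), round(table_alt[cell], 1))
```

Every cell of the two tables agrees to within one otolith in a sample of 1000, so two-reader data of this kind cannot separate the two explanations.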

Although agreement measures can be ambiguously interpreted, in practice they can still serve a useful monitoring role during routine comparisons when the circumstances of the readings are fairly well characterized. The interpretive difficulties with indices such as $\kappa$ and $P_o$ become apparent when trying to translate agreement measures into statements about the accuracy of different readers and about the influence of reading error on the contribution estimates.

Latent class models An alternative approach is to try to estimate $\pi_{H|H}$ and $\pi_{W|W}$ for each reader, along with $p$. Although at first thought this may seem impossible, it can be shown that either by setting a few constraints or by collecting additional information, estimation is indeed possible. This problem falls into the category of latent class modeling (e.g. Everitt, 1984; Bartholomew, 1987; McCutcheon, 1987; Clogg, 1995). Latent class models (LCMs) belong to a family of latent variable models that hypothesize the existence of unobservable "latent" variables, about which information can be obtained only through measurements on observable "manifest" variables. LCMs specifically restrict the latent and manifest variables to be categorical. In the present situation, the latent variable is the true class (H or W) to which the otolith belongs, whereas the manifest variables are the readers' classifications. Such models have been used for assessing reliability of diagnostic tests in the medical field over the last 20 years (see Walter and Irwig, 1988; Formann, 1996, for reviews).

Figure 1

The values of $\kappa$ (±1 SE) from 27 groups of paired readings of chum salmon otoliths (total=2874), plotted against group number. The groups are based on pairs of different readers examining otoliths collected at different times and locations. The proportion of agreement ($P_o$) is shown next to groups 4, 7, 9, and 12 for comparison with the value of $\kappa$.

Returning to the problem with two readers, neither of which is a standard, there are five essential parameters to estimate: $\pi^{(1)}_{H|H}$, $\pi^{(2)}_{H|H}$, $\pi^{(1)}_{W|W}$, $\pi^{(2)}_{W|W}$, and $p$, with only 3 df (four pieces of data, $n_{HH}$, $n_{HW}$, $n_{WH}$, $n_{WW}$, minus one because the sample size, $n$, is fixed). Thus, the model is overparameterized, and either constraints on the parameters or more data are needed. Possible constraints include 1) considering that two of the parameters are known (e.g. $\pi^{(1)}_{W|W} = \pi^{(2)}_{W|W} = 1$, i.e. both readers always call a wild stock correctly, there are no "false positives"), or 2) considering that two sets of parameters are equal (e.g. $\pi^{(1)}_{H|H} = \pi^{(2)}_{H|H}$, $\pi^{(1)}_{W|W} = \pi^{(2)}_{W|W}$, i.e. the accuracy rates are the same for both readers).

Although there may be times when such constraints are realistic, in general they will not be; therefore more information will be necessary. One way to generate more information is to have a third independent reader (Walter, 1984). With three readers, there are seven essential parameters: $\pi^{(1)}_{H|H}, \ldots, \pi^{(3)}_{H|H}$, $\pi^{(1)}_{W|W}, \ldots, \pi^{(3)}_{W|W}$, and $p$. There are also $2^3 - 1 = 7$ df, so that all the parameters are estimable. Estimation is most commonly done by the method of maximum likelihood.

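If readings are assumed to be independent among readers and among otoliths, the likelihood function is

$$L = \prod_{i=H,W}\ \prod_{j=H,W}\ \prod_{k=H,W} \left[ p\,\pi^{(1)}_{i|H}\,\pi^{(2)}_{j|H}\,\pi^{(3)}_{k|H} + (1-p)\,\pi^{(1)}_{i|W}\,\pi^{(2)}_{j|W}\,\pi^{(3)}_{k|W} \right]^{n_{ijk}},$$

where $n_{ijk}$ is the number of otoliths classified as $i$, $j$, and $k$ by readers 1, 2, and 3, respectively.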

This likelihood function must be maximized numerically, and methods for this computation will be discussed later. If more than three readers are used, there are extra degrees of freedom that can be used to assess goodness-of-fit. For example, with four readers there will be nine parameters with 15 df, leaving 6 df for goodness-of-fit. Pearson chi-square or likelihood ratio $G^2$ tests would both be applicable.

Another way to generate additional information was proposed by Hui and Walter (1980). Suppose there are two or more strata with different hatchery proportions in each stratum. For example, catch could be stratified temporally or spatially. If it is assumed that $\pi_{H|H}$ and $\pi_{W|W}$ remain constant over strata, then a solution for just two readers may be obtained. For example, if there are two readers and two strata, then there are six parameters: $\pi^{(1)}_{H|H}$, $\pi^{(2)}_{H|H}$, $\pi^{(1)}_{W|W}$, $\pi^{(2)}_{W|W}$, $p_1$, and $p_2$, with $2(2^2 - 1) = 6$ df. Increasing the number of strata increases the degrees of freedom; e.g. three strata for two readers gives $3(2^2 - 1) = 9$ df for 7 parameters. The likelihood function for two readers and $S$ strata is

$$L = \prod_{g=1}^{S}\ \prod_{i=H,W}\ \prod_{j=H,W} \left[ p_g\,\pi^{(1)}_{i|H}\,\pi^{(2)}_{j|H} + (1-p_g)\,\pi^{(1)}_{i|W}\,\pi^{(2)}_{j|W} \right]^{n_{gij}},$$

where $n_{gij}$ is the number of otoliths in stratum $g$ classified as $i$ by reader 1 and $j$ by reader 2.
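The parameter and degrees-of-freedom bookkeeping used in this section can be written as a short helper. This is an illustrative sketch: the function name `lcm_counts` is hypothetical, and it assumes two accuracy rates per reader that are constant across strata, plus one hatchery proportion per stratum.

```python
def lcm_counts(readers, strata=1):
    """Return (parameters, df, df left over for goodness-of-fit).

    Parameters: pi_H|H and pi_W|W for each reader, plus one p per stratum.
    Data df: each stratum contributes 2**readers cells minus one,
    because its sample size is fixed.
    """
    params = 2 * readers + strata
    df = strata * (2 ** readers - 1)
    return params, df, df - params


for readers, strata in [(2, 1), (3, 1), (4, 1), (2, 2), (2, 3)]:
    params, df, spare = lcm_counts(readers, strata)
    print(f"{readers} readers, {strata} strata: "
          f"{params} parameters, {df} df, {spare} spare")
```

A negative spare count marks an overparameterized model, zero marks a model that is just identified, and a positive count leaves degrees of freedom for a goodness-of-fit test.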

Table 2

Examples from cross-classification data generated as expected counts from a sample of 1000 otoliths based on different accuracy rates for identifying hatchery fish ($\pi_{H|H}$) and wild fish ($\pi_{W|W}$) under different mark proportions ($p$). The examples used illustrate differences between observed agreement ($P_o$) and chance-corrected agreement ($\kappa$) under different underlying conditions.

A. $\pi_{H|H} = 0.9$, $\pi_{W|W} = 1.0$, $p = 0.1$; $P_o = 0.98$, $\kappa = 0.89$

                      Reader 2
                      H        W        Total
Reader 1    H         81       9        90
            W         9        901      910
            Total     90       910      1000

B. $\pi_{H|H} = 0.5$, $\pi_{W|W} = 1.0$, $p = 0.1$; $P_o = 0.95$, $\kappa = 0.47$

                      Reader 2
                      H        W        Total
Reader 1    H         25       25       50
            W         25       925      950
            Total     50       950      1000

C. $\pi_{H|H} = 0.9$, $\pi_{W|W} = 0.9$, $p = 0.5$; $P_o = 0.82$, $\kappa = 0.64$

                      Reader 2
                      H        W        Total
Reader 1    H         410      90       500
            W         90       410      500
            Total     500      500      1000

D. $\pi_{H|H} = 0.9$, $\pi_{W|W} = 0.9$, $p = 0.05$; $P_o = 0.82$, $\kappa = 0.25$

                      Reader 2
                      H        W        Total
Reader 1    H         50       90       140
            W         90       770      860
            Total     140      860      1000

A third way to supply additional information is to take a Bayesian approach (see "Discussion" section). By speci- fying prior distributions of the model parameters, unique estimates can be obtained (Joseph et al., 1995).

A critical assumption in the above models is that readings are independent. Specifically, the reading of each otolith by a given reader is independent of any other reading by the same reader, and each reading by various readers on a given otolith is independent given the true state of the otolith. In principle, the latter assumption may be difficult to meet, especially if all readers examine the same otolith. The fact that the otolith is not prepared independently by each reader could induce a dependence among the readers. Also, variability in the readability of the mark due to the marking process can induce a dependence. Such dependence can bias the estimators of $\pi$ and $p$ (Vacek, 1985). Note that this latter assumption of independence is also required for $\kappa$.

One remedy for the problem of dependence due to preparation is to require independent preparations. This, however, requires additional otoliths, and with only two otoliths per fish, this would limit the number of readers to two. But in practice, this may not be a large concern. Typically, the second reader has the option to provide additional processing effort to the first otolith or, if needed, to process the second otolith. In almost all cases additional preparation is not done and readers feel they are able to extract sufficient information about the presence or absence of a mark from each other's preparations. In addition, reader accuracy rates obtained by LCM do not appear to vary systematically with the reading order, which also suggests that preparation-induced dependency is not a significant factor.

Dependency associated with variability in the appearance of the mark may be harder to address. A general solution is to model the dependence with additional parameters (e.g. Vacek, 1985; Qu et al., 1996; Yang and Becker, 1997; Qu and Hadgu, 1998; Albert et al., 2001). Modeling dependence requires either more readers or more strata. These modeling approaches are complicated and are currently evolving (see Albert et al., 2001). Alternatively, additional latent classes may be added (Christenson et al., 1992; Formann, 1994), e.g. a third class of otoliths from ambiguous sources.

In the previous discussion concerning three or more readers, we implied that readers were different individu- als. This need not be so; what is required are three or more independent readings. If it were possible for the same in- dividual to read the same otolith more than once, indepen- dently, then the number of different readers could be re- duced. If independence could not be met, the dependence could be modeled, as discussed above.

Another critical assumption, but one that should be met most of the time, is that the individual accuracy rates are known to be either greater than or less than the error rates (e.g. $\pi_{H|H} > \pi_{W|H}$ and $\pi_{W|W} > \pi_{H|W}$, which implies that $\pi_{H|H}$ and $\pi_{W|W}$ are either greater than or less than 0.5) because of an inherent symmetry in the problem that results in the same likelihood function being generated when the error rates are switched with the accuracy rates.
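The symmetry can be verified directly for the three-reader model. In the sketch below, the parameter values are hypothetical, and the "mirror" solution swaps the roles of the two classes: $p$ becomes $1 - p$ and each accuracy rate is replaced by the error rate for the other class.

```python
from itertools import product


def pattern_prob(pattern, p, acc_h, acc_w):
    """Probability of one set of reader calls ('H' or 'W') under the LCM."""
    like_h, like_w = p, 1.0 - p
    for call, a_h, a_w in zip(pattern, acc_h, acc_w):
        like_h *= a_h if call == "H" else 1.0 - a_h
        like_w *= a_w if call == "W" else 1.0 - a_w
    return like_h + like_w


# Hypothetical parameter values for three readers.
p, acc_h, acc_w = 0.3, [0.9, 0.85, 0.7], [0.95, 0.9, 0.92]

# Mirror solution: class labels swapped, accuracies replaced by error rates.
p_m = 1.0 - p
acc_h_m = [1.0 - a for a in acc_w]
acc_w_m = [1.0 - a for a in acc_h]

max_gap = max(
    abs(pattern_prob(pat, p, acc_h, acc_w)
        - pattern_prob(pat, p_m, acc_h_m, acc_w_m))
    for pat in product("HW", repeat=3)
)
print(f"largest difference in cell probabilities: {max_gap:.2e}")
```

Because every cell probability is identical under the two parameter sets, the data alone cannot distinguish them, which is why the accuracy rates are restricted to one side of 0.5.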

Computation Formulas for estimating $\kappa$ and its standard error are straightforward (Fleiss, 1981). Estimates can also be obtained from several software packages including PROC FREQ in SAS (SAS Institute, 1989).

Maximizing either of the likelihood functions for the LCMs requires a numerical procedure. The most straightforward is to use an optimization routine such as "Solver" in Excel (Microsoft Corporation, 1993) or "nlminb" in S-PLUS (Statistical Sciences, 1995). Alternatively, the EM algorithm (Dempster et al., 1977; Dawid and Skene, 1979; McLachlan and Krishnan, 1997) can be easily used. The simplicity of the EM algorithm follows from the recognition that the LCM is an example of a finite mixture problem, specifically, in this case, a mixture of multivariate Bernoulli distributions with mixing parameter $p$ (Everitt, 1984). Use of the EM algorithm for such mixture problems in fisheries is well documented, e.g. for stock composition estimates (Millar, 1987; Pella et al., 1996) and for age-length keys (Kimura and Chikuni, 1987). A more efficient alternative to the EM algorithm is to use iteratively reweighted least squares (Agresti, 1990). This method is relatively easy to implement in software such as PROC NLIN in SAS (SAS Institute, 1989). Perhaps the most direct and efficient way would be to use LCM software. We are not aware of any routines for LCMs in any major statistical package at present, but several independent LCM packages exist (for a review, see Clogg, 1995; and for an Internet listing see http://ourworld.compuserve.com/homepages/jsuebersax/index.htm).
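To make the EM route concrete, here is a minimal Python sketch for the single-stratum model with independent readers. It is not the SAS code referred to in the paper: the function name `em_lcm`, the starting values, the fixed iteration count, and the test parameters are all assumptions chosen for illustration. No standard errors are computed.

```python
from itertools import product


def em_lcm(counts, n_iter=2000):
    """EM estimates of p, pi_H|H, and pi_W|W for independent readers.

    counts maps a tuple of reader calls, e.g. ('H', 'W', 'H'), to a count.
    Starting values above 0.5 keep the solution on the intended side of
    the label-switching symmetry.
    """
    patterns = list(counts)
    n_readers = len(patterns[0])
    total = sum(counts.values())
    p, acc_h, acc_w = 0.5, [0.8] * n_readers, [0.8] * n_readers
    for _ in range(n_iter):
        # E-step: posterior probability that each pattern is a hatchery fish.
        post_h = {}
        for pat in patterns:
            like_h, like_w = p, 1.0 - p
            for r, call in enumerate(pat):
                like_h *= acc_h[r] if call == "H" else 1.0 - acc_h[r]
                like_w *= acc_w[r] if call == "W" else 1.0 - acc_w[r]
            post_h[pat] = like_h / (like_h + like_w)
        # M-step: weighted proportions using the posterior weights.
        w_h = sum(counts[pat] * post_h[pat] for pat in patterns)
        w_w = total - w_h
        p = w_h / total
        for r in range(n_readers):
            acc_h[r] = sum(counts[pat] * post_h[pat]
                           for pat in patterns if pat[r] == "H") / w_h
            acc_w[r] = sum(counts[pat] * (1.0 - post_h[pat])
                           for pat in patterns if pat[r] == "W") / w_w
    return p, acc_h, acc_w


# Hypothetical check: expected counts for 1000 otoliths under known values.
true_p, true_h, true_w = 0.3, [0.9, 0.85, 0.7], [0.95, 0.9, 0.92]
counts = {}
for pat in product("HW", repeat=3):
    like_h, like_w = true_p, 1.0 - true_p
    for call, a_h, a_w in zip(pat, true_h, true_w):
        like_h *= a_h if call == "H" else 1.0 - a_h
        like_w *= a_w if call == "W" else 1.0 - a_w
    counts[pat] = 1000 * (like_h + like_w)

p_hat, acc_h_hat, acc_w_hat = em_lcm(counts)
print(round(p_hat, 3), [round(a, 3) for a in acc_h_hat],
      [round(a, 3) for a in acc_w_hat])
```

With real data, several runs from different starting values would be prudent, since the likelihood can have local maxima.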

As with many maximum likelihood problems, where numerical methods must be used, complications can arise. Constraints may at times be needed to ensure that parameter estimates fall in acceptable intervals (e.g. [0,1] for $p$ and [0.5,1] for the $\pi$'s). Also the likelihood function may have local maxima, which means that several runs with varying starting values may be necessary to identify the global maximum. Finally, estimates of standard errors may entail additional computing. PROC NLIN in SAS provides asymptotic (i.e. large-sample) standard errors. Jackknife and bootstrap estimates are relatively easy to program, the jackknife being much less computationally intensive.

Finally, the Bayesian programs discussed in Joseph et al. (1995) can be found at http://www.epi.mcgill.ca/Joseph/software.html.

Examples

The first example analyzes the results of three readers examining 570 chum otoliths. The samples were taken from a common location, and the readers were familiar with the patterns. Each reading was made without knowl- edge of prior readings. The data, along with pairwise k estimates and the LCM parameter estimates (using PROC NLIN in SAS; see appendix for code) are presented in Table 3.

These results indicate that the third reader is signifi- cantly (a=0.05) less able to correctly identify a hatchery mark when it is present and that there are no significant differences among readers in their ability to detect a wild mark when it is present. These conclusions are readily ap- parent from the table of results, and although the pairwise K"'s are consistent with these results, they are more dif- ficult to interpret. With the variance due to sampling es- timated to be (0.7379X1 - 0.7379)/(570 - 1) = 0.0003399, misclassification error contributes only 0.36% to the total variance.

The second example consists of two readers with four spatial strata. Samples were obtained from sockeye salmon caught in four neighboring Alaskan gillnet fisheries in central Southeast Alaska. The data and the LCM estimates are shown in Table 4. These estimates indicate that the readers are not statistically different in their ability to detect hatchery marks, whereas the second reader is better able to distinguish wild marks. With eight parameters and 12 df there are 4 df available for a goodness-of-fit test. Pearson's chi-square yields 4.83, which with 4 df, has a p-value of 0.306, thus indicating an acceptable model fit. Misclassification error contributes from about 8% to 14% to the total variance in the estimates of the proportion of hatchery stock.
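The goodness-of-fit calculation can be sketched from the values in Table 4. The Python sketch below computes expected counts under the two-reader model with a separate p for each district and common accuracy rates, then sums the Pearson contributions. It assumes conditionally independent readers and uses the estimates as rounded to three decimals in Table 4, so the statistic should be close to, but not exactly, the published 4.83. Function names are hypothetical; the 4-df upper-tail probability uses the closed form exp(−x/2)(1 + x/2).

```python
import math

# Observed counts by district (order: 108-30, 108-50, 106-41, 106-30).
OBSERVED = {
    "HH": [152, 127, 85, 20],
    "HW": [11, 9, 21, 5],
    "WH": [2, 6, 5, 1],
    "WW": [271, 382, 832, 411],
}
# Rounded LCM estimates from Table 4.
P_HAT = [0.366, 0.257, 0.096, 0.047]   # hatchery proportion by district
PI_H = [0.980, 0.964]                  # P(read H | true H), readers 1 and 2
PI_W = [0.984, 0.997]                  # P(read W | true W), readers 1 and 2


def pattern_prob(pattern, p):
    """Model probability of a reading pattern in a district with proportion p."""
    joint_h, joint_w = p, 1.0 - p
    for j, reading in enumerate(pattern):
        joint_h *= PI_H[j] if reading == "H" else 1.0 - PI_H[j]
        joint_w *= PI_W[j] if reading == "W" else 1.0 - PI_W[j]
    return joint_h + joint_w


def pearson_chi_square():
    """Sum of (observed - expected)^2 / expected over all district cells."""
    chi2 = 0.0
    for d, p in enumerate(P_HAT):
        n = sum(OBSERVED[pat][d] for pat in OBSERVED)
        for pat in OBSERVED:
            expected = n * pattern_prob(pat, p)
            chi2 += (OBSERVED[pat][d] - expected) ** 2 / expected
    return chi2


def chi2_upper_tail_4df(x):
    """P(chi-square with 4 df > x), exact closed form for 4 df."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


stat = pearson_chi_square()
print(f"Pearson chi-square (rounded estimates) = {stat:.2f}")
print(f"p-value for the published 4.83 = {chi2_upper_tail_4df(4.83):.3f}")
```

The degrees of freedom follow from 4 districts × 3 free cells = 12, less 8 estimated parameters (four proportions and four accuracy rates).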

Design considerations

Design of an otolith reading program is complicated by misclassification error. An important consideration is the precision of the estimates, in particular the precision of the estimate of p. Table 5 shows the asymptotic standard error of p for various combinations of p, $\pi_{H|H}$, and $\pi_{W|W}$ for the three-reader model with unknown accuracies, and the one-, two-, and three-reader models with accuracies assumed known. Although this table is derived for a sample of 1000 otoliths, the ratio of any two standard errors within the table would be the same for any sample size (assuming the sample size is large enough to approximate the asymptotic conditions). It is evident that misclassification inflates the standard error over the usual binomial case (right-most column). The table also makes clear the increase in the uncertainty of estimating p when the accuracies also have


Table 3

Cross-classification data and results for 570 chum otoliths examined by three readers, showing the parameter estimates and standard errors from the latent class model, followed by a comparison of the differences among reader pairs by using kappa and the latent class model (LCM) accuracy rates. The data show that the high agreement among readers as to hatchery and wild classification (e.g. HHH=406 and WWW=135) is reflected in the overall high accuracy rates estimated from the LCM. However, the model also shows that reader 3 has a significantly lower accuracy rate in detecting hatchery marks ($\pi^{(3)}_{H|H}$=0.969) than the other readers.

Reading   Count   LCM parameter       Estimate   SE
HHH         406   $\pi^{(1)}_{H|H}$   0.998      0.002
HHW          13   $\pi^{(2)}_{H|H}$   0.998      0.002
HWH           1   $\pi^{(3)}_{H|H}$   0.969      0.008
WHH           1   $\pi^{(1)}_{W|W}$   0.958      0.017
HWW           6   $\pi^{(2)}_{W|W}$   0.986      0.010
WHW           2   $\pi^{(3)}_{W|W}$   0.957      0.017
WWH           6   p                   0.738      0.018
WWW         135

Reader pairs   $\kappa$   SE      Difference in $\pi_{H|H}$   SE      Difference in $\pi_{W|W}$   SE
1 and 2        0.954      0.014    0.000                      0.004   -0.028                      0.020
1 and 3        0.882      0.022    0.029                      0.009    0.000                      0.024
2 and 3        0.901      0.021    0.029                      0.009    0.028                      0.020

Table 4

Cross-classification data for 2340 sockeye otoliths examined by two readers and stratified by four fishing districts, showing the estimates of the latent class parameters and their standard errors. Between-reader comparison is based on whether the differences in accuracy estimates are significantly different from zero. The results indicate that the readers were not statistically different in detecting hatchery marks ($\pi_{H|H}$) but were statistically different in detecting wild marks ($\pi_{W|W}$). LCM = latent class model.

          Fishing districts
          108-30   108-50   106-41   106-30
HH           152      127       85       20
HW            11        9       21        5
WH             2        6        5        1
WW           271      382      832      411
n            436      524      943      437

LCM parameter       Estimate   SE      Reader difference   SE
$\pi^{(1)}_{H|H}$   0.980      0.013    0.017              0.025
$\pi^{(2)}_{H|H}$   0.964      0.021
$\pi^{(1)}_{W|W}$   0.984      0.005   -0.013              0.006
$\pi^{(2)}_{W|W}$   0.997      0.003
$p_{108-30}$        0.366      0.024
$p_{108-50}$        0.257      0.020
$p_{106-41}$        0.096      0.010
$p_{106-30}$        0.047      0.011

to be estimated in the three-reader case. For example, if $\pi_{H|H}$ = $\pi_{W|W}$ = 0.8 for all three readers, one would have to have almost twice (0.035/0.019 = 1.84) the sample size to estimate a p of about 0.5. Once accuracy estimates for the readers are obtained, dropping one or even two readers may be appropriate, although the assumption must be made that the accuracy rates will be constant for the remainder of the program. Maintaining two readers will allow for that


Table 5

Asymptotic standard errors for the estimated proportion of marked fish p, for various combinations of accuracy rates in identifying hatchery fish, $\pi_{H|H}$, and wild fish, $\pi_{W|W}$, and mark proportion p, for a sample of 1000 otoliths. Values are reported for the cases where accuracy rates, $\pi$, are the same and assumed known for one, two, or three readers and for the case where $\pi$'s are estimated for three readers. Table illustrates how misclassification will increase standard errors in the estimate of hatchery proportion.

$\pi_{H|H}$                           0.8     0.8     0.8     0.9     0.9     0.9     1.0     1.0     1.0
$\pi_{W|W}$                           0.8     0.9     1.0     0.8     0.9     1.0     0.8     0.9     1.0

Model                           p
3 readers ($\pi$'s estimated)   0.1   0.032   0.016   0.011   0.023   0.013   0.010   0.018   0.011   0.009
                                0.3   0.034   0.021   0.017   0.024   0.017   0.015   0.020   0.015   0.014
                                0.5   0.035   0.023   0.019   0.023   0.018   0.016   0.019   0.016   0.016
                                0.7   0.034   0.024   0.020   0.021   0.017   0.015   0.017   0.015   0.014
                                0.9   0.032   0.023   0.018   0.016   0.013   0.011   0.011   0.010   0.009
3 readers                       0.1   0.013   0.011   0.010   0.011   0.010   0.009   0.010   0.010   0.009