U.S. Department of Commerce
Volume 100 Number 1 January 2002
Fishery Bulletin
U.S. Department of Commerce
Donald L. Evans
Secretary
National Oceanic and Atmospheric Administration
Scott B. Gudes
Acting Under Secretary for Oceans and Atmosphere
National Marine Fisheries Service
William T. Hogarth
Acting Assistant Administrator for Fisheries
Scientific Editor
Dr. John V. Merriner
Editorial Assistant
Sarah Shoffler
Center for Coastal Fisheries and Habitat Research, 101 Pivers Island Road, Beaufort, NC 28516
The Fishery Bulletin (ISSN 0090-0656) is published quarterly by the Scientific Publications Office, National Marine Fisheries Service, NOAA, 7600 Sand Point Way NE, BIN C15700, Seattle, WA 98115-0070. Periodicals postage is paid at Seattle, WA, and at additional mailing offices. POSTMASTER: Send address changes for subscriptions to Fishery Bulletin, Superintendent of Documents, Attn.: Chief, Mail List Branch, Mail Stop SSOM, Washington, DC 20402-9373.
Although the contents of this publication have not been copyrighted and may be reprinted entirely, reference to source is appreciated.
The Secretary of Commerce has determined that the publication of this periodical is necessary according to law for the transaction of public business of this Department. Use of funds for printing of this periodical has been approved by the Director of the Office of Management and Budget.
For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402. Subscription price per year: $45.00 domestic and $56.25 foreign. Cost per single issue: $28.00 domestic and $35.00 foreign. See back for order form.
Managing Editor
Sharyn Matriotti
National Marine Fisheries Service, Scientific Publications Office, 7600 Sand Point Way NE, BIN C15700, Seattle, Washington 98115-0070
Editorial Committee
Dr. Andrew E. Dizon, National Marine Fisheries Service
Dr. Harlyn O. Halvorson, University of Massachusetts, Boston
Dr. Ronald W. Hardy, University of Idaho, Hagerman
Dr. Richard D. Methot, National Marine Fisheries Service
Dr. Theodore W. Pietsch, University of Washington, Seattle
Dr. Joseph E. Powers, National Marine Fisheries Service
Dr. Harald Rosenthal, Universität Kiel, Germany
Dr. Fredric M. Serchuk, National Marine Fisheries Service
Fishery Bulletin web site: fishbull.noaa.gov
The Fishery Bulletin carries original research reports and technical notes on investigations in fishery science, engineering, and economics. It began as the Bulletin of the United States Fish Commission in 1881; it became the Bulletin of the Bureau of Fisheries in 1904 and the Fishery Bulletin of the Fish and Wildlife Service in 1941. Separates were issued as documents through volume 46; the last document was No. 1103. Beginning with volume 47 in 1931 and continuing through volume 62 in 1963, each separate appeared as a numbered bulletin. A new system began in 1963 with volume 63, in which papers are bound together in a single issue of the bulletin. Beginning with volume 70, number 1, January 1972, the Fishery Bulletin became a periodical, issued quarterly. In this form, it is available by subscription from the Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402. It is also available free in limited numbers to libraries, research institutions, State and Federal agencies, and in exchange for other scientific publications.
Seattle, Washington
Contents
The conclusions and opinions expressed in Fishery Bulletin are solely those of the authors and do not represent the official position of the National Marine Fisheries Service (NOAA) or any other agency or institution.
The National Marine Fisheries Service (NMFS) does not approve, recommend, or endorse any proprietary product or proprietary material mentioned in this publication. No reference shall be made to NMFS, or to this publication furnished by NMFS, in any advertising or sales promotion which would indicate or imply that NMFS approves, recommends, or endorses any proprietary product or proprietary material mentioned herein, or which has as its purpose an intent to cause directly or indirectly the advertised product to be used or purchased because of this NMFS publication.
Articles
1-10 Blick, D. James, and Peter T. Hagen
The use of agreement measures and latent class models to assess the reliability of classifying thermally marked otoliths
11-25 Carmona-Suarez, Carlos A., and Jesus E. Conde
Local distribution and abundance of swimming crabs (Callinectes spp. and Arenaeus cribrarius) on a tropical arid beach
26-34 Crabtree, Roy E., Peter B. Hood, and Derke Snodgrass
Age, growth, and reproduction of permit (Trachinotus falcatus) in Florida waters
35-41 Denson, Michael R., Wallace E. Jenkins, Arnold G. Woodward, and Theodore I. J. Smith
Tag-reporting levels for red drum (Sciaenops ocellatus) caught by anglers in South Carolina and Georgia estuaries
42-50 Faunce, Craig H., Heather M. Patterson, and Jerome J. Lorenz
Age, growth, and mortality of the Mayan cichlid (Cichlasoma urophthalmus) from the southeastern Everglades
51-62 Hastings, Kelly K., and William J. Sydeman
Population status, seasonal variation in abundance, and long-term population trends of Steller sea lions (Eumetopias jubatus) at the South Farallon Islands, California
63-73 McBride, Richard S., Michael P. Fahay, and Kenneth W. Able
Larval and settlement periods of the northern searobin (Prionotus carolinus) and the striped searobin (P. evolans)
74-80 Pennington, Michael, Liza-Mare Burmeister, and Vidar Hjellvik
Assessing the precision of frequency distributions estimated from trawl-survey samples
81-89 Potts, Jennifer C., and Charles S. Manooch III
Estimated ages of red porgy (Pagrus pagrus) from fishery-dependent and fishery-independent data and a comparison of growth parameters
90-105 Romanov, Evgeny V.
Bycatch in the tuna purse-seine fisheries of the western Indian Ocean
106-116 Sainte-Marie, Bernard, and Denis Chabot
Ontogenetic shifts in natural diet during benthic stages of American lobster (Homarus americanus), off the Magdalen Islands
117-127 Zug, George R., George H. Balazs, Jerry A. Wetherall, Denise M. Parker, and Shawn K. K. Murakawa
Age and growth of Hawaiian seaturtles (Chelonia mydas): an analysis based on skeletochronology
Notes
128-133 DiNardo, Gerard T., Edward E. DeMartini, and Wayne R. Haight
Estimates of lobster-handling mortality associated with the Northwestern Hawaiian Islands lobster-trap fishery
134-142 Graves, John E., Brian E. Luckhurst, and Eric D. Prince
An evaluation of pop-up satellite tags for estimating postrelease survival of blue marlin (Makaira nigricans) from a recreational fishery
143-148 Hazin, Fabio H. V., Paulo G. Oliveira, and Matt K. Broadhurst
Reproduction of blacknose shark (Carcharhinus acronotus) in coastal waters off northeastern Brazil
149-152 Porch, Clay E., Charles A. Wilson, and David L. Nieland
A new growth model for red drum (Sciaenops ocellatus) that accommodates seasonal and ontogenic changes in growth rates
153 Subscription form
Abstract: Otolith thermal marking is an efficient method for mass marking hatchery-reared salmon and can be used to estimate the proportion of hatchery fish captured in a mixed-stock fishery. Accuracy of the thermal pattern classification depends on the prominence of the pattern, the methods used to prepare and view the patterns, and the training and experience of the personnel who determine the presence or absence of a particular pattern. Estimating accuracy rates is problematic when no secondary marking is available and no error-free standards exist. Agreement measures, such as kappa ($\kappa$), provide a relative measure of the reliability of the determinations when independent readings by two readers are available, but the magnitude of $\kappa$ can be influenced by the proportion of marked fish. If a third reader is used or if two or more groups of paired readings are examined, latent class models can provide estimates of the error rates of each reader. Applications of $\kappa$ and latent class models are illustrated by a program providing contribution estimates of hatchery-reared chum and sockeye salmon in Southeast Alaska.
The use of agreement measures and latent class models to assess the reliability of classifying thermally marked otoliths*
D. James Blick
Peter T. Hagen
Alaska Department of Fish and Game
Division of Commercial Fisheries
10107 Bentwood Place
Juneau, Alaska 99802-5526
E-mail address (for P. T. Hagen, contact author): peter_hagen@fishgame.state.ak.us
Manuscript accepted 16 April 2001. Fish. Bull. 100:1-10 (2002).
The ability to induce patterns in salmon otoliths by manipulating water temperatures has proved to be an efficient means for marking large numbers of salmon (Volk et al., 1990). When salmon embryos or alevins are exposed to a rapid drop in temperature, otolith growth is temporarily disrupted, and this results in a discontinuity in the otolith's microstructure. When viewed under transmitted light microscopy, this discontinuity appears as a dark ring. By controlling the number of temperature drops and the timing between drops, a coded pattern of dark rings can be recorded on the otolith, and this pattern can be recovered from otoliths of older fish by removing the overlying material and exposing the otolith core. For hatcheries that release a large number of fish, this type of marking method has been shown to be particularly cost effective for marking 100% of the releases (Munk et al., 1993).
Several fisheries management programs in Alaska use thermal marking to estimate hatchery contributions to commercial fisheries (Hagen et al., 1995). Typically, several hundred salmon otoliths are systematically collected during each two- or three-day commercial opening during the fishing season. The otoliths and sampling data are shipped to a processing laboratory where a subsample of otoliths (generally 50 to 100) is processed immediately to meet inseason management needs; a portion of the remaining otoliths is processed later to provide an overall estimate of hatchery contribution to the fisheries.
The process by which a reader determines the presence or absence of a thermal mark in an otolith can be characterized as one of pattern recognition and image matching. Prior to examining otoliths of unknown origin, the readers gain familiarity with the patterns likely to be encountered by carefully examining fry otoliths that were obtained after thermal marking but prior to their release into the wild. Because there can be wide variation in the appearance of the thermal marks within a mark group (due in part to differences in developmental stages at marking), a single mark group may be represented by a variety of patterns. As a result, secondary characteristics and measurements of the patterns are sometimes necessary to identify an otolith to a mark group. The examination is also used to confirm that all the hatchery fish have been successfully marked.
The process of making a determination on otoliths from returning adult salmon can become problematic because wild salmon may also contain otolith patterns that can mimic the features imposed through thermal marking. Referred to as "noisy patterns," their presence can increase the rate of false positives. Conversely, if the hatchery employs poor temperature control or unintended disruptions occur around the period of marking, it may be difficult to identify the otolith as that of a
* Contribution PP184 of the Alaska Department of Fish and Game, Commercial Fisheries Division, Juneau, Alaska 99802-5526.
hatchery fish, and this would increase the rate of false negatives. Differences between readers in skill and training level, and how they process otoliths, can add to the uncertainty in estimating the accuracy of the readings and the rates of false positives and negatives.
Otolith marking generally takes place without any secondary marking, such as fin-clipping or coded-wire-tagging; therefore the accuracy of a reading cannot directly be determined through conventional methods that make use of a "gold standard" (known origin sample) or other error-free classification methods. To ensure that the information provided to the Alaskan fisheries managers is accurate, each otolith is independently examined by two readers, and a third reading is used to resolve differences between the first two readings. The resolved readings are used to estimate the contribution of hatchery fish, and the presumption of accuracy is based on the premise that, through multiple readings, all marked fish are either correctly identified or that errors, if present, are inconsequential. Developing the analytical tools to determine the veracity of that assumption is the objective of this investigation, and by establishing such tools, quality control standards for recovering thermal marks can be developed.
In developing the tools to measure the quality of otolith readings, three questions are addressed:
1 How to assess the reliability of otolith readings when no standards are available.
2 How to estimate the proportion of hatchery marks when there is disagreement between two or more readers.
3 How the precision of the estimate of the proportion is influenced by classification error.
We discuss two approaches: 1) indices of agreement typically used in reliability studies, and 2) latent class models where classification errors are estimated for each reader even though the true error rate is considered unknown. The data requirements and their attendant assumptions are presented for each approach. The methods are illustrated by examining among-reader comparisons of chum salmon (Oncorhynchus keta) and sockeye salmon (Oncorhynchus nerka) otoliths collected from programs that monitor inseason contributions of hatchery fish in several commercial fisheries in Southeast Alaska (Hagen et al., 1995). The results are used to provide recommendations for monitoring the quality of otolith readings for thermal marking programs.
Methods
Standard available
A sample of $n$ otoliths, which are examined by two readers, can be cross-classified as hatchery (H) or wild stock (W) as in Table 1. Suppose we wish to estimate the accuracy rate (probability of making a correct classification) or, conversely, the error rate (probability of making a wrong classification). If we know nothing about reader 1, but reader 2 is infallible (or is considered a "gold standard"), unbiased estimates of the accuracy and error rates of reader 1 and the proportion of hatchery stocks ($p$) are given by
$$\hat{\pi}_{H|H} = n_{HH}/n_{.H}, \qquad \hat{\pi}_{W|H} = n_{WH}/n_{.H} = 1 - \hat{\pi}_{H|H},$$
$$\hat{\pi}_{W|W} = n_{WW}/n_{.W}, \qquad \hat{\pi}_{H|W} = n_{HW}/n_{.W} = 1 - \hat{\pi}_{W|W}, \qquad \hat{p} = n_{.H}/n$$
(where, for example, $\pi_{W|H}$ refers to the probability that reader 1 classifies an otolith as W when its true state is H). These estimates reflect the fact that reader 2 is infallible; the accuracy rates ($\pi_{H|H}$, $\pi_{W|W}$) and the error rates ($\pi_{W|H}$, $\pi_{H|W}$) are conditional on the numbers of hatchery or wild stock otoliths as determined by reader 2.
Table 1
Notation used to show the cross-classification of a sample of $n$ otoliths by two readers to either hatchery (H) or wild stock (W) assignment. Row and column sums are indicated by the subscript "."
                      Reader 2
                    H           W
Reader 1    H    $n_{HH}$    $n_{HW}$    $n_{H.}$
            W    $n_{WH}$    $n_{WW}$    $n_{W.}$
                 $n_{.H}$    $n_{.W}$    $n$
No standard available
If a standard is not available, an unbiased estimate of $p$ can be obtained if the accuracy rates for reader 1 are known. The estimate is
$$p^* = \frac{n_{H.}/n + \pi_{W|W} - 1}{\pi_{H|H} + \pi_{W|W} - 1},$$
where $n_{H.}$ is the number of otoliths classified as hatchery otoliths. If the accuracy rates are estimated, then $p^*$ will no longer be unbiased, but will be much less biased than the estimator $n_{H.}/n$ and will in general have a much smaller mean-squared error (Rogan and Gladen, 1978). For a Bayesian approach to this problem, see Viana et al. (1993) and Joseph et al. (1995).
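The correction for a single reader with known accuracy rates (the Rogan and Gladen estimator cited in the text) can be sketched in a few lines. This is only an illustration: the function name `corrected_proportion` and the example numbers are assumptions made for the sketch and do not come from the study.

```python
def corrected_proportion(n_called_h, n_total, acc_h, acc_w):
    """Hatchery proportion corrected for one reader's misclassification.

    acc_h is the reader's pi_{H|H} and acc_w is pi_{W|W}; both are
    assumed known here, which the text notes is rarely the case.
    """
    denom = acc_h + acc_w - 1.0
    if denom <= 0:
        raise ValueError("accuracy rates must sum to more than 1")
    raw = n_called_h / n_total          # uncorrected estimate n_H. / n
    return (raw + acc_w - 1.0) / denom  # corrected estimate p*

# Illustrative (hypothetical) numbers: 140 of 1000 otoliths called hatchery
# by a reader who is right 90% of the time for both classes.
p_star = corrected_proportion(140, 1000, acc_h=0.9, acc_w=0.9)
print(round(p_star, 4))
```

With these inputs the raw proportion of 0.14 is pulled down sharply, because most of the otoliths called hatchery are false positives from the much larger wild component.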
Agreement measures When accuracy rates are unavailable, statistics that measure "agreement" between readers are often calculated (e.g. Fleiss, 1981). One such index is simply the proportion of observed agreement ($P_o$), defined as
$$P_o = (n_{HH} + n_{WW})/n.$$
Another index, called kappa ($\kappa$), corrects $P_o$ for the degree of agreement that is expected by chance alone. It is defined as
$$\kappa = (P_o - P_e)/(1 - P_e),$$
where $P_e$ = expected agreement = $(n_{H.}n_{.H} + n_{W.}n_{.W})/n^2$. The divisor, $1 - P_e$, constrains $\kappa$ to be less than or equal to one, and if all agreement is due to chance ($P_o = P_e$), then $\kappa$ equals zero. Note that with $\kappa$, independence between readers is assumed in order to calculate expected agreement.
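As a minimal sketch of the two agreement indices, the function below computes observed agreement, expected agreement, and kappa from the four cells of a two-reader table. The function name `agreement_indices` is an assumption for illustration; the example counts are those listed for panel A of Table 2 in this article.

```python
def agreement_indices(n_hh, n_hw, n_wh, n_ww):
    """Return (P_o, P_e, kappa) for a 2x2 reader-by-reader table.

    The first subscript is reader 1's call, the second is reader 2's call.
    """
    n = n_hh + n_hw + n_wh + n_ww
    p_obs = (n_hh + n_ww) / n
    row_h, row_w = n_hh + n_hw, n_wh + n_ww   # reader 1 margins
    col_h, col_w = n_hh + n_wh, n_hw + n_ww   # reader 2 margins
    p_exp = (row_h * col_h + row_w * col_w) / n ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    kappa = (p_obs - p_exp) / (1.0 - p_exp)
    return p_obs, p_exp, kappa

p_obs, p_exp, kappa = agreement_indices(81, 9, 9, 901)
print(round(p_obs, 2), round(kappa, 2))
```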
An example of how agreement indices can be used to monitor readings is shown in Figure 1, which displays $\kappa$ and its standard error for 2874 chum otolith readings divided into 27 groups based on different reader pairs and capture locations. Included are $P_o$'s for four of the groups. The results indicate that $\kappa$ levels were similar between the different groups, suggesting overall consistency in readings, although some of the groups had lower values, which in practice would invite further investigation.
The $P_o$'s in Figure 1 have a different rank order than the $\kappa$ values. This apparent discrepancy highlights a potential problem in interpretation when using agreement indices to draw conclusions. To help illustrate this point, consider the following examples (Table 2). Table 2A is generated as the expected counts, given $\pi_{H|H}$ = 0.9 and $\pi_{W|W}$ = 1.0 for both readers, and $p$ = 0.1. In this case, $P_o$ = 0.98 and $\kappa$ = 0.89. On the other hand, Table 2B is generated under the same assumptions except that $\pi_{H|H}$ = 0.5. In this case $P_o$ drops only slightly to 0.95, whereas $\kappa$ drops to 0.47. Because the hatchery stock is rare, the inability of the readers to detect the mark is not well reflected by $P_o$, whereas $\kappa$ reflects it better by correcting for the high level of chance agreement.
Now let $\pi_{H|H}$ = 0.9 and $\pi_{W|W}$ = 0.9 for both readers, and $p$ = 0.5 (Table 2C). In this case, $P_o$ = 0.82 and $\kappa$ = 0.64. On the other hand, Table 2D is generated under the same assumptions except that $p$ = 0.05. In this case, $P_o$ remains unchanged at 0.82, but $\kappa$ drops to 0.25.
In none of the above examples is the index "wrong." Rather, as is the case with most indices, interpretation is affected by the values of the underlying parameters. In the latter example (Table 2, C-D), even though $P_o$ is the same for C and D, the scale it is being compared with has changed, thus changing the value of $\kappa$. This increases the difficulty of comparing $\kappa$ across populations with different underlying proportions. Note also that Table 2D could have been derived from $\pi_{H|H}$ = 0.5 and $\pi_{W|W}$ = 0.944 for both readers, and $p$ = 0.19. Thus, without additional information, it is impossible to draw reliable conclusions about reader accuracies or the proportion of hatchery marks.
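The ambiguity described above can be checked numerically. The sketch below generates expected two-reader counts under the stated assumptions (independent readers with identical accuracy rates) and compares the two parameter sets that the text says lead to Table 2D. The function name `expected_counts` is an assumption for illustration.

```python
def expected_counts(n, p, acc_h, acc_w):
    """Expected 2x2 counts for two independent readers with equal accuracy.

    Keys are reader 1's call followed by reader 2's call, e.g. "HW".
    """
    cells = {}
    for a in "HW":
        for b in "HW":
            # probability of the call pair when the otolith is truly H
            given_h = ((acc_h if a == "H" else 1 - acc_h)
                       * (acc_h if b == "H" else 1 - acc_h))
            # probability of the call pair when the otolith is truly W
            given_w = ((acc_w if a == "W" else 1 - acc_w)
                       * (acc_w if b == "W" else 1 - acc_w))
            cells[a + b] = n * (p * given_h + (1 - p) * given_w)
    return cells

table_2d = expected_counts(1000, p=0.05, acc_h=0.9, acc_w=0.9)
alternative = expected_counts(1000, p=0.19, acc_h=0.5, acc_w=0.944)
for key in ("HH", "HW", "WH", "WW"):
    print(key, round(table_2d[key], 1), round(alternative[key], 1))
```

The two sets of counts differ by less than one otolith per cell, which is why a single two-reader table cannot separate low accuracy from a low hatchery proportion.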
Although agreement measures can be ambiguously interpreted, in practice they can still serve a useful monitoring role during routine comparisons when the circumstances of the readings are fairly well characterized. The interpretive difficulties with indices such as $\kappa$ and $P_o$ become apparent when trying to translate agreement measures into statements about the accuracy of different readers and about the influence of reading error on the contribution estimates.
Latent class models An alternative approach is to try to estimate $\pi_{H|H}$ and $\pi_{W|W}$ for each reader, along with $p$. Although at first thought this may seem impossible, it can be shown that either by setting a few constraints or by collecting additional information, estimation is indeed possible. This problem falls into the category of latent class modeling (e.g. Everitt, 1984; Bartholomew, 1987; McCutcheon, 1987; Clogg, 1995). Latent class models (LCMs) belong to a family of latent variable models that hypothesize the existence of unobservable "latent" variables, about which information can be obtained only through measurements on observable "manifest" variables. LCMs specifically restrict the latent and manifest variables to be categorical. In the present situation, the latent variable is the true class (H or W) to which the otolith belongs, whereas the manifest variables are the readers' classifications. Such models have been used for assessing reliability of diagnostic tests in the medical field over the last 20 years (see Walter and Irwig, 1988; Formann, 1996, for reviews).
Figure 1
[Plot of $\kappa$ against group number.] The values of $\kappa$ (±1 SE) from 27 groups of paired readings of chum salmon otoliths (total=2874). The groups are based on pairs of different readers examining otoliths collected at different times and locations. The proportion of agreement ($P_o$) is shown next to groups 4, 7, 9, and 12 for comparison with the value of $\kappa$.
Returning to the problem with two readers, neither of which is a standard, there are five essential parameters to estimate: $\pi^{(1)}_{H|H}$, $\pi^{(2)}_{H|H}$, $\pi^{(1)}_{W|W}$, $\pi^{(2)}_{W|W}$, and $p$, with only 3 df (four pieces of data, $n_{HH}$, $n_{HW}$, $n_{WH}$, $n_{WW}$, minus one because the sample size, $n$, is fixed). Thus, the model is overparameterized, and either constraints on the parameters or more data are needed. Possible constraints include 1) considering that two of the parameters are known (e.g. $\pi^{(1)}_{W|W} = \pi^{(2)}_{W|W} = 1$, i.e. both readers always call a wild stock correctly, there are no "false positives"), or 2) considering that two sets of parameters are equal (e.g. $\pi^{(1)}_{H|H} = \pi^{(2)}_{H|H}$, $\pi^{(1)}_{W|W} = \pi^{(2)}_{W|W}$, i.e. the accuracy rates are the same for both readers).
Although there may be times when such constraints are realistic, in general they will not be; therefore more information will be necessary. One way to generate more information is to have a third independent reader (Walter, 1984). With three readers, there are seven essential parameters: $\pi^{(r)}_{H|H}$ and $\pi^{(r)}_{W|W}$ for readers $r$ = 1, 2, 3, and $p$. There is also $2^3 - 1 = 7$ df, so that all the parameters are estimable. Estimation is most commonly done by the method of maximum likelihood.
If readings are assumed to be independent among readers and among otoliths, the likelihood function is
$$L = \prod_{i=H,W}\ \prod_{j=H,W}\ \prod_{k=H,W} \left[ p\,\pi^{(1)}_{i|H}\pi^{(2)}_{j|H}\pi^{(3)}_{k|H} + (1-p)\,\pi^{(1)}_{i|W}\pi^{(2)}_{j|W}\pi^{(3)}_{k|W} \right]^{n_{ijk}}.$$
This likelihood function must be maximized numerically, and methods for this computation will be discussed later. If more than three readers are used, there are extra degrees of freedom that can be used to assess goodness-of-fit.
For example, with four readers there will be nine parameters with 15 df, leaving 6 df for goodness-of-fit. Pearson chi-square or likelihood ratio $G^2$ tests would both be applicable. Another way to generate additional information was proposed by Hui and Walter (1980). Suppose there are two or more strata with different hatchery proportions in each stratum. For example, catch could be stratified temporally or spatially. If it is assumed that $\pi_{H|H}$ and $\pi_{W|W}$ remain constant over strata, then a solution for just two readers may be obtained. For example, if there are two readers and two strata, then there are six parameters: $\pi^{(1)}_{H|H}$, $\pi^{(2)}_{H|H}$, $\pi^{(1)}_{W|W}$, $\pi^{(2)}_{W|W}$, $p_1$, and $p_2$, with $2(2^2 - 1) = 6$ df. Increasing the number of strata increases the degrees of freedom; e.g. three strata for two readers gives $3(2^2 - 1) = 9$ df for 7 parameters. The likelihood function for two readers and $S$ strata is
$$L = \prod_{g=1}^{S}\ \prod_{i=H,W}\ \prod_{j=H,W} \left[ p_g\,\pi^{(1)}_{i|H}\pi^{(2)}_{j|H} + (1-p_g)\,\pi^{(1)}_{i|W}\pi^{(2)}_{j|W} \right]^{n_{gij}}.$$
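The bookkeeping of parameters and degrees of freedom used in this section can be written as a short helper. This is a sketch under the assumptions stated in the text (two accuracy rates per reader held constant over strata, one hatchery proportion per stratum); the function name `lcm_budget` is hypothetical.

```python
def lcm_budget(n_readers, n_strata=1):
    """Return (df, n_params, df_left_for_fit) for the two-class model.

    df:        n_strata * (2**n_readers - 1) free cell counts
    n_params:  two accuracy rates per reader plus one proportion per stratum
    A negative remainder means the model is overparameterized.
    """
    df = n_strata * (2 ** n_readers - 1)
    n_params = 2 * n_readers + n_strata
    return df, n_params, df - n_params

for readers, strata in [(2, 1), (3, 1), (4, 1), (2, 2), (2, 3)]:
    print(readers, strata, lcm_budget(readers, strata))
```

The cases listed reproduce the counts quoted in the text: two readers in one stratum fall two parameters short, three readers are exactly identified, and four readers or three strata leave degrees of freedom for a goodness-of-fit test.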
Table 2
Examples from cross-classification data generated as expected counts from a sample of 1000 otoliths based on different accuracy rates for identifying hatchery fish ($\pi_{H|H}$) and wild fish ($\pi_{W|W}$) under different mark proportions ($p$). The examples used illustrate differences between observed agreement ($P_o$) and chance-corrected agreement ($\kappa$) under different underlying conditions.

A   $\pi_{H|H}$ = 0.9, $\pi_{W|W}$ = 1.0, $p$ = 0.1; $P_o$ = 0.98, $\kappa$ = 0.89
                      Reader 2
                    H       W     Total
Reader 1    H      81       9       90
            W       9     901      910
        Total      90     910     1000

B   $\pi_{H|H}$ = 0.5, $\pi_{W|W}$ = 1.0, $p$ = 0.1; $P_o$ = 0.95, $\kappa$ = 0.47
                      Reader 2
                    H       W     Total
Reader 1    H      25      25       50
            W      25     925      950
        Total      50     950     1000

C   $\pi_{H|H}$ = 0.9, $\pi_{W|W}$ = 0.9, $p$ = 0.5; $P_o$ = 0.82, $\kappa$ = 0.64
                      Reader 2
                    H       W     Total
Reader 1    H     410      90      500
            W      90     410      500
        Total     500     500     1000

D   $\pi_{H|H}$ = 0.9, $\pi_{W|W}$ = 0.9, $p$ = 0.05; $P_o$ = 0.82, $\kappa$ = 0.25
                      Reader 2
                    H       W     Total
Reader 1    H      50      90      140
            W      90     770      860
        Total     140     860     1000
A third way to supply additional information is to take a Bayesian approach (see "Discussion" section). By specifying prior distributions of the model parameters, unique estimates can be obtained (Joseph et al., 1995).
A critical assumption in the above models is that readings are independent. Specifically, the reading of each otolith by a given reader is independent of any other reading by the same reader, and each reading by various readers on a given otolith is independent given the true state of the otolith. In principle, the latter assumption may be difficult to meet, especially if all readers examine the same otolith. The fact that the otolith is not prepared independently by each reader could induce a dependence among the readers. Also, variability in the readability of the mark due to the marking process can induce a dependence. Such dependence can bias the estimators of $\pi$ and $p$ (Vacek, 1985). Note that this latter assumption of independence is also required for $\kappa$.
One remedy for the problem of dependence due to preparation is to require independent preparations. This, however, requires additional otoliths, and with only two otoliths per fish, this would limit the number of readers to two. But in practice, this may not be a large concern. Typically, the second reader has the option to provide additional processing effort to the first otolith or, if needed, to process the second otolith. In almost all cases additional preparation is not done, and readers feel they are able to extract sufficient information about the presence or absence of a mark from each other's preparations. In addition, reader accuracy rates obtained by LCM do not appear to vary systematically with the reading order, which also suggests that preparation-induced dependency is not a significant factor.
Dependency associated with variability in the appearance of the mark may be harder to address. A general solution is to model the dependence with additional parameters (e.g. Vacek, 1985; Qu et al., 1996; Yang and Becker, 1997; Qu and Hadgu, 1998; Albert et al., 2001). Modeling dependence requires either more readers or more strata. These modeling approaches are complicated and are currently evolving (see Albert et al., 2001). Alternatively, additional latent classes may be added (Christenson et al., 1992; Formann, 1994), e.g. a third class of otoliths from ambiguous sources.
In the previous discussion concerning three or more readers, we implied that readers were different individuals. This need not be so; what is required are three or more independent readings. If it were possible for the same individual to read the same otolith more than once, independently, then the number of different readers could be reduced. If independence could not be met, the dependence could be modeled, as discussed above.
Another critical assumption, but one that should be met most of the time, is that the individual accuracy rates are known to be either greater than or less than the error rates (e.g. $\pi_{H|H} > \pi_{W|H}$ and $\pi_{W|W} > \pi_{H|W}$, which implies that $\pi_{H|H}$ and $\pi_{W|W}$ are either greater than or less than 0.5) because of an inherent symmetry in the problem that results in the same likelihood function being generated when the error rates are switched with the accuracy rates.
Computation Formulas for estimating $\kappa$ and its standard error are straightforward (Fleiss, 1981). Estimates can also be obtained from several software packages including PROC FREQ in SAS (SAS Institute, 1989).
Maximizing either of the likelihood functions for the LCMs requires a numerical procedure. The most straightforward is to use an optimization routine such as "Solver" in Excel (Microsoft Corporation, 1993) or "nlminb" in S-PLUS (Statistical Sciences, 1995). Alternatively, the EM algorithm (Dempster et al., 1977; Dawid and Skene, 1979; McLachlan and Krishnan, 1997) can be easily used. The simplicity of the EM algorithm follows from the recognition that the LCM is an example of a finite mixture problem, specifically, in this case, a mixture of multivariate Bernoulli distributions with mixing parameter $p$ (Everitt, 1984). Use of the EM algorithm for such mixture problems in fisheries is well documented, e.g. for stock composition estimates (Millar, 1987; Pella et al., 1996) and for age-length keys (Kimura and Chikuni, 1987). A more efficient alternative to the EM algorithm is to use iteratively reweighted least squares (Agresti, 1990). This method is relatively easy to implement in software such as PROC NLIN in SAS (SAS Institute, 1989). Perhaps the most direct and efficient way would be to use LCM software. We are not aware of any routines for LCMs in any major statistical package at present, but several independent LCM packages exist (for a review, see Clogg, 1995; and for an Internet listing see http://ourworld.compuserve.com/homepages/jsuebersax/index.htm).
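To make the EM option concrete, the following is a minimal sketch for the single-stratum model with independent readers. It is not the authors' code (they report using PROC NLIN in SAS); the function name `em_latent_class`, the starting values, and the fixed iteration count are assumptions chosen for illustration. The counts are the three-reader chum readings listed in Table 3 of this article.

```python
def em_latent_class(counts, n_iter=2000, start_acc=0.8, start_p=0.5):
    """EM for a two-class latent class model with independent readers.

    counts maps a reading pattern such as "HHW" (one letter per reader)
    to the number of otoliths with that pattern. Starting accuracies
    above 0.5 keep the solution on the intended side of the label-switching
    symmetry. Returns (p, acc_h, acc_w), where acc_h[r] estimates
    pi_{H|H} and acc_w[r] estimates pi_{W|W} for reader r.
    """
    n_readers = len(next(iter(counts)))
    total = sum(counts.values())
    p = start_p
    acc_h = [start_acc] * n_readers
    acc_w = [start_acc] * n_readers
    for _ in range(n_iter):
        # E-step: posterior probability that each pattern is a hatchery fish
        post = {}
        for pattern in counts:
            like_h, like_w = p, 1.0 - p
            for r, call in enumerate(pattern):
                like_h *= acc_h[r] if call == "H" else 1.0 - acc_h[r]
                like_w *= acc_w[r] if call == "W" else 1.0 - acc_w[r]
            post[pattern] = like_h / (like_h + like_w)
        # M-step: weighted proportions of correct calls within each class
        n_h = sum(counts[k] * post[k] for k in counts)
        n_w = total - n_h
        p = n_h / total
        for r in range(n_readers):
            acc_h[r] = sum(counts[k] * post[k]
                           for k in counts if k[r] == "H") / n_h
            acc_w[r] = sum(counts[k] * (1.0 - post[k])
                           for k in counts if k[r] == "W") / n_w
    return p, acc_h, acc_w

table_3 = {"HHH": 406, "HHW": 13, "HWH": 1, "WHH": 1,
           "HWW": 6, "WHW": 2, "WWH": 6, "WWW": 135}
p, acc_h, acc_w = em_latent_class(table_3)
print(round(p, 3), [round(a, 3) for a in acc_h], [round(a, 3) for a in acc_w])
```

The output can be compared with the estimates reported in Table 3 (for example, $p$ = 0.738). A production version would add a convergence check, several starting points, and standard errors, as discussed in the paragraphs that follow.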
As with many maximum likelihood problems, where numerical methods must be used, complications can arise. Constraints may at times be needed to ensure that parameter estimates fall in acceptable intervals (e.g. [0,1] for $p$ and [0.5,1] for the $\pi$'s). Also, the likelihood function may have local maxima, which means that several runs with varying starting values may be necessary to identify the global maximum. Finally, estimates of standard errors may entail additional computing. PROC NLIN in SAS provides asymptotic (i.e. large-sample) standard errors. Jackknife and bootstrap estimates are relatively easy to program, the jackknife being much less computationally intensive.
Finally, the Bayesian programs discussed in Joseph et al. (1995) can be found at http://www.epi.mcgill.ca/Joseph/software.html.
Examples
The first example analyzes the results of three readers examining 570 chum otoliths. The samples were taken from a common location, and the readers were familiar with the patterns. Each reading was made without knowledge of prior readings. The data, along with pairwise $\kappa$ estimates and the LCM parameter estimates (using PROC NLIN in SAS; see appendix for code), are presented in Table 3.
These results indicate that the third reader is significantly ($\alpha$=0.05) less able to correctly identify a hatchery mark when it is present and that there are no significant differences among readers in their ability to detect a wild mark when it is present. These conclusions are readily apparent from the table of results, and although the pairwise $\kappa$'s are consistent with these results, they are more difficult to interpret. With the variance due to sampling estimated to be (0.7379)(1 - 0.7379)/(570 - 1) = 0.0003399, misclassification error contributes only 0.36% to the total variance.
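The variance arithmetic in this paragraph can be reproduced directly. The sketch below assumes, as an interpretation not spelled out in the text, that total variance is the sum of the binomial sampling variance and a misclassification component, so that the stated 0.36% share implies the total; the variable names are illustrative.

```python
p_hat = 0.7379        # estimated hatchery proportion (from the text)
n = 570               # number of otoliths read
misclass_share = 0.0036   # 0.36% of total variance, as stated in the text

# Binomial sampling variance with the n - 1 divisor used in the text
sampling_var = p_hat * (1.0 - p_hat) / (n - 1)

# Assumption: total = sampling + misclassification, so sampling variance
# is the remaining (1 - share) of the total
total_var = sampling_var / (1.0 - misclass_share)
total_se = total_var ** 0.5

print(round(sampling_var, 7), round(total_se, 3))
```

Under that assumption the implied standard error agrees, to three decimals, with the value of 0.018 listed for $p$ in Table 3.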
The second example consists of two readers with four spatial strata. Samples were obtained from sockeye salmon caught in four neighboring Alaskan gillnet fisheries in central Southeast Alaska. The data and the LCM estimates are shown in Table 4. These estimates indicate that the readers are not statistically different in their ability to detect hatchery marks, whereas the second reader is better able to distinguish wild marks. With eight parameters and 12 df, there are 4 df available for a goodness-of-fit test. Pearson's chi-square yields 4.83, which with 4 df has a p-value of 0.306, thus indicating an acceptable model fit. Misclassification error contributes from about 8% to 14% to the total variance in the estimates of the proportion of hatchery stock.
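The goodness-of-fit p-value quoted above can be checked without a statistics package, because the chi-square upper tail has a closed form when the degrees of freedom are even. The helper below is a sketch; the name `chi2_upper_tail_even_df` is an assumption for illustration.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), for even df only.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)**k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds (x/2)**k / k! incrementally
        total += term
    return math.exp(-half) * total

p_value = chi2_upper_tail_even_df(4.83, 4)
print(round(p_value, 3))
```

Because the reported statistic of 4.83 is itself rounded, the result can differ from the published 0.306 in the third decimal place.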
Design considerations
Design of an otolith reading program is complicated by misclassification error. An important consideration is the precision of the estimates, in particular the precision of the estimate of p. Table 5 shows the asymptotic standard error of p for various combinations of p, $\pi_{H|H}$, and $\pi_{W|W}$ for the three-reader model with unknown accuracies, and the one-, two-, and three-reader models with accuracies assumed known. Although this table is derived for a sample of 1000 otoliths, the ratio of any two standard errors within the table would be the same for any sample size (assuming the sample size is large enough to approximate the asymptotic conditions). It is evident that misclassification inflates the standard error over the usual binomial case (rightmost column). The table also makes clear the increase in the uncertainty of estimating p when the accuracies also have
Table 3

Cross-classification data and results for 570 chum otoliths examined by three readers showing the parameter estimates and standard errors from the latent class model, followed by a comparison of the differences among reader pairs by using kappa and the latent class model (LCM) accuracy rates. The data show that the high agreement among readers as to hatchery and wild classification (e.g. HHH=406 and WWW=135) is reflected in the overall high accuracy rates estimated from the LCM. However the model also shows that reader 3 has a significantly lower accuracy rate in detecting hatchery marks ($\pi^{(3)}_{H|H}$=0.969) than the other readers.

Reading   Count   LCM parameter         Estimate   SE
HHH       406     $\pi^{(1)}_{H|H}$     0.998      0.002
HHW        13     $\pi^{(2)}_{H|H}$     0.998      0.002
HWH         1     $\pi^{(3)}_{H|H}$     0.969      0.008
WHH         1     $\pi^{(1)}_{W|W}$     0.958      0.017
HWW         6     $\pi^{(2)}_{W|W}$     0.986      0.010
WHW         2     $\pi^{(3)}_{W|W}$     0.957      0.017
WWH         6     p                     0.738      0.018
WWW       135

Reader pairs   κ       SE      Difference in $\pi_{H|H}$   SE      Difference in $\pi_{W|W}$   SE
1 and 2        0.954   0.014   0.000                       0.004   0.028                       0.020
1 and 3        0.882   0.022   0.029                       0.009   0.000                       0.024
2 and 3        0.901   0.021   0.029                       0.009   0.028                       0.020
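The pairwise κ values in Table 3 can be recomputed directly from the eight cross-classification counts. The sketch below uses the standard Cohen's kappa formula and assumes that the three letters of each reading pattern give the calls of readers 1, 2, and 3 in that order; the function name `pairwise_kappa` is illustrative only.

```python
# Counts from Table 3; each key gives the calls of readers 1, 2, 3
# (H = hatchery, W = wild). The reader order is an assumption.
counts = {"HHH": 406, "HHW": 13, "HWH": 1, "WHH": 1,
          "HWW": 6, "WHW": 2, "WWH": 6, "WWW": 135}

def pairwise_kappa(counts, i, j):
    """Cohen's kappa for readers i and j (0-based positions in the pattern)."""
    n = sum(counts.values())
    # Observed agreement: both readers made the same call.
    agree = sum(c for pat, c in counts.items() if pat[i] == pat[j])
    # Marginal numbers of hatchery calls for each reader.
    h_i = sum(c for pat, c in counts.items() if pat[i] == "H")
    h_j = sum(c for pat, c in counts.items() if pat[j] == "H")
    p_obs = agree / n
    # Agreement expected by chance from the two sets of marginals.
    p_exp = (h_i * h_j + (n - h_i) * (n - h_j)) / n**2
    return (p_obs - p_exp) / (1 - p_exp)

for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(f"readers {i + 1} and {j + 1}: kappa = {pairwise_kappa(counts, i, j):.3f}")
```

With this reader ordering the three values round to 0.954, 0.882, and 0.901, matching the κ column of the table. The standard errors and the LCM accuracy differences are not reproduced here because they require fitting the latent class model.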
Table 4

Cross-classification data for 2340 sockeye otoliths examined by two readers and stratified by four fishing districts showing the estimates of the latent class parameters and their standard errors. Between-reader comparison is based on whether the difference in accuracy estimates is significantly different from zero. The results indicate that the readers were not statistically different in detecting hatchery marks ($\pi_{H|H}$) but were statistically different in detecting wild marks ($\pi_{W|W}$). LCM = latent class model.

                Fishing districts
Reading   108-30   108-50   106-41   106-30
HH        152      127       85       20
HW         11        9       21        5
WH          2        6        5        1
WW        271      382      832      411
n         436      524      943      437

LCM parameter                              Estimate       SE             Reader difference   SE
$\pi^{(1)}_{H|H}$, $\pi^{(2)}_{H|H}$       0.980, 0.964   0.013, 0.021   0.017               0.025
$\pi^{(1)}_{W|W}$, $\pi^{(2)}_{W|W}$       0.984, 0.997   0.005, 0.003   0.013               0.006
$p_{108\text{-}30}$                        0.366          0.024
$p_{108\text{-}50}$                        0.257          0.020
$p_{106\text{-}41}$                        0.096          0.010
$p_{106\text{-}30}$                        0.047          0.011
to be estimated in the three-reader case. For example, if $\pi_{H|H} = \pi_{W|W} = 0.8$ for all three readers, one would have to have almost twice (0.035/0.019=1.84) the sample size to estimate a p of about 0.5. Once accuracy estimates for the readers are obtained, dropping one or even two readers may be appropriate, although the assumption must be made that the accuracy rates will be constant for the remainder of the program. Maintaining two readers will allow for that
Blick and Hagen: Use of agreement measures and latent class models to assess the reliability of classifying thermally marked otoliths
Table 5

Asymptotic standard errors for the estimated proportion of marked fish p, for various combinations of accuracy rates in identifying hatchery fish, $\pi_{H|H}$, and wild fish, $\pi_{W|W}$, and mark proportion p, for a sample of 1000 otoliths. Values are reported for the cases where accuracy rates, π, are the same and assumed known for one, two, or three readers and for the case where π's are estimated for three readers. Table illustrates how misclassification will increase standard errors in the estimate of hatchery proportion.

                      $\pi_{H|H}$:   0.8     0.8     0.8     0.9     0.9     0.9     1.0     1.0     1.0
                      $\pi_{W|W}$:   0.8     0.9     1.0     0.8     0.9     1.0     0.8     0.9     1.0
Readers               p
3 readers             0.1            0.032   0.016   0.011   0.023   0.013   0.010   0.018   0.011   0.009
(π's estimated)       0.3            0.034   0.021   0.017   0.024   0.017   0.015   0.020   0.015   0.014
                      0.5            0.035   0.023   0.019   0.023   0.018   0.016   0.019   0.016   0.016
                      0.7            0.034   0.024   0.020   0.021   0.017   0.015   0.017   0.015   0.014
                      0.9            0.032   0.023   0.018   0.016   0.013   0.011   0.011   0.010   0.009
3 readers             0.1            0.013   0.011   0.010   0.011   0.010   0.009   0.010   0.010   0.009
