Developing and validating a nestling photographic aging guide for cavity-nesting birds: an example with the European Bee-eater (Merops apiaster)

Accurate estimation of nestling age is essential in avian demography studies as well as in population ecology and conservation. For example, it can be useful for synchronizing nest visits with events of particular interest, such as the age at which young can be safely ringed, or in choosing the best period to attain the most accurate calculation of laying or hatching dates. We constructed a photographic guide for aging European Bee-eater (Merops apiaster) nestlings to 3-day age classes and evaluated the aging method by performing a validation exercise with several observers with no previous experience in aging bee-eater nestlings. The aging guide for bee-eater nestlings allowed estimating age to within 3 days with an average accuracy of 0.85. We found the optimal period for aging nestlings was between days 13‒18 (with accuracy between 0.94 and 0.99), during which the status of feather development was more easily distinguishable from the preceding and subsequent age classes. During the first 3 days after hatching, nestlings could also be aged with high accuracy (0.93). The small size of the nestling in relation to the eggs and the nestling’s inability to raise its head during these first days allowed for good discrimination from the subsequent age class. Nestlings in the two oldest classes (classes 25 and 28) were correctly aged in only about half of assignments (0.55 sensitivity) and nestlings belonging to class 7 (days 7‒9) were the least correctly identified (0.38 sensitivity). However, by visiting the nests at 12-day intervals it is possible to achieve the highest accuracy in age estimation with the smallest disturbance and logistic investment. This study highlighted how indirect methods and a simple protocol can be established and employed to quickly estimate nestling age in cases where handling nestlings is challenging or impossible, while minimizing disturbance in and around the nest.


Open Access. Avian Research. *Correspondence: joana.santoscosta@ua.pt. 1 Department of Biology & CESAM, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal. Full list of author information is available at the end of the article.

Background
Assessing bird productivity and inter-annual variation in hatching and laying dates allows linking potential phenological changes to demographic rates (e.g. Fletcher et al. 2013; Cruz-Mcdonnell and Wolf 2016; Tomotani et al. 2018). Accurate estimation of nestling age is therefore essential in avian demography studies and also in population ecology and conservation (e.g. Eeva and Lehikoinen 1996; Saunders and Ingram 1998; Marchesi et al. 2002). Establishing nestling age can also be used to synchronize subsequent nest visits with the age at which young can be safely ringed (Fyfe and Oldendorff 1976; Tomotani et al. 2018) or in choosing the best period for attaining the most accurate back-calculation of laying or hatching dates (Marchesi et al. 2002). However, determining nestling age often requires handling the birds, and in some species nests are difficult to access, due to their location in cavities (e.g. bee-eaters, Meropidae; Fry 1984) or high in trees or cliffs (e.g. raptors; Moritsch 1983). In addition, frequent nest visits may affect nesting success by drawing the attention of potential predators to the nest or by changing parental behaviour. Ultimately, it can also lead to nest abandonment (Gotmark 1992). Reducing the frequency and length of nest visits (including nestling handling) is therefore very desirable and alternative indirect methods for estimating nestling age can be advantageous.
Nestlings are commonly aged using growth curves of morphological traits constructed from individuals of known age (e.g. Green and Tyler 2005; Pande et al. 2011; Saunders et al. 2015). However, estimating chick age using directly measured biometrics alone may be inaccurate during specific periods of development (Rodway 1997; Brown et al. 2011). An alternative, or complement, is the use of guides based on photographs of nestlings of known age, together with a description of their qualitative changes in appearance throughout the growth period (e.g. opening of the eyes, stages of feather development). This method may be used with higher success than biometric aging (Brown et al. 2013) while minimizing nest disturbance and avoiding bird handling (Moritsch 1985; Boal 1994; Saunders et al. 2015).
Here we present a photographic guide to determine nestling age in European Bee-eaters (Merops apiaster), based on visible traits that allow aging without excavation and extraction of nestlings from their burrows. The European Bee-eater (hereafter bee-eater) is an Afro-Palearctic migrant that breeds colonially, digging its nest in sloping hillsides or flat ground. Nest chambers are usually difficult to access, with burrows either straight or curving to one side, and extending for 0.7-2 m (Fry 1984). Females lay 4-10 eggs at 1-2-day intervals and incubation lasts around 20 days, beginning after the first egg is laid. This results in hatching asynchrony, with hatching of all nestlings taking 2-9 days. Fledging of the young occurs after 30 days (Lessells and Avery 1989). The oldest nestling(s) is usually larger and tends to monopolize access to food by positioning itself in front of its siblings, inside the nest chamber (Lessells and Avery 1989). For this reason, the oldest nestling(s) is also more developed than later-hatched siblings, an order that is maintained during growth (Lessells and Avery 1989).
In this study, we (1) provide a photographic guide for aging bee-eater nestlings into 3-day age classes that can be used in the field by any observer when hatching date is unknown, and (2) evaluate this aging method by performing a validation exercise on several observers with no previous experience in aging bee-eater nestlings. Finally, (3) we propose a nest visitation scheme that allows nestling age to be determined to within 3 days with high accuracy, while minimizing the number of nest visits.

Recording nestling development
Fieldwork was conducted at two breeding colonies in Portugal (38.1° N, 7.0° W; 38.6° N, 8.9° W) between May and July of 2016 and 2017. We visited each nest every 3 days in order to minimize the intensity of the monitoring and avoid potential detrimental effects on the growth and survival of fledglings (Gotmark 1992). We inspected bee-eater nests after clutch initiation with an adapted "burrowscope" consisting of a webcam (Microsoft Lifecam HD-3000) attached to an LED light for illumination and connected to a laptop with a 2 m USB cable for image recording. In seven nests, hatching date could be assigned to within a 3-day period. These nests were subsequently monitored at 3-day intervals until all the nestlings had fledged. During each nest visit, we recorded several photographic images of nestlings to document all noticeable aspects of their development. Since bee-eaters hatch asynchronously, we consistently targeted the oldest nestling(s), recorded and aged during the first visit following hatching, as a reference for the quantification of development, and therefore used that nestling(s) for the production of the photographic images on each subsequent visit. During each visit, all chicks were checked to ensure that the oldest nestling(s) was recorded and its development monitored. For six nests in which two nestlings hatched during the first 3 days, we monitored the development of both nestlings at each visit. Those two nestlings always presented the most advanced stage of development within their brood, and their development was similar at every visit (JSC pers. obs.).

Photographic aging guide
We grouped the recorded nestling images into ten age classes at 3-day intervals (1-3, 4-6, 7-9, 10-12, 13-15, 16-18, 19-21, 22-24, 25-27 and 28-30 days; age classes henceforward indicated by the first number of the age interval), where the first interval (i.e. 1-3) corresponds to the period when the first nestling(s) of each nest hatched. We selected only good-quality images of the representative nestling(s) for constructing the photographic aging guide (hereafter aging guide). We used images from several nests and from the same nestlings at distinct development stages to illustrate each age class. For each age class, we described the most prominent characteristics and how they changed throughout development. We selected the most identifiable characteristics based on detailed descriptions of the oldest nestling(s) and grouped them into four main features: head, plumage, relative size and behaviour. Specifically, we noted changes in the head: eye opening, bill size and colour; plumage: feather colour and stages of development (e.g. when pins emerge, unsheathing of pins); size in relation to eggs; and behaviour (e.g. being able to raise head).
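The class labels follow a simple rule: each class is named after the first day of its 3-day interval. A minimal sketch of that mapping, assuming age is counted in whole days from 1 to 30 (the function name `age_class` is ours and does not come from the study):

```python
def age_class(day: int) -> int:
    """Return the age-class label (first day of the 3-day interval) for an age in days."""
    if not 1 <= day <= 30:
        raise ValueError("age must be between 1 and 30 days")
    # Days 1-3 map to class 1, days 4-6 to class 4, and so on up to class 28.
    return 3 * ((day - 1) // 3) + 1


# The ten class labels used in the guide, derived from the rule above.
CLASS_LABELS = sorted({age_class(d) for d in range(1, 31)})
```

For instance, a nestling known to be 8 days old would be filed under class 7 by this rule.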

Testing the nestling aging guide
In order to check the usefulness and effectiveness of the aging guide, we conducted a test with six observers with no prior experience in aging bee-eater nestlings but with differing levels of experience in handling other bird species. The test consisted of two stages. (1) Learning: the test structure was explained to all observers (see below), each having one copy of the photographic aging guide (Additional file 1: Table S1). In order to evaluate how an observer would perform with little prior experience, each observer had 2 min to read and learn how to interpret the guide before the start of the test. The observers could consult the guide during the test (i.e. for assigning nestlings to a specific age class).
(2) Test structure: a selection of 30 unique images of the first-hatched nestling(s) (three images from each age class, different from the ones included in the guide) was randomly split into three sets of 10 images, each set composing a trial. Each image in a set was displayed for 40 s to all observers simultaneously, using a projector in a common room. Based on previous field experience, we considered that 40 s would allow identifying the oldest nestling and obtaining images (JSC pers. obs.), therefore mimicking field conditions while minimizing disturbance. Each observer thus had 40 s to view the image and assign it to an age class, after which the following image in the set was presented, until all 10 images from the set had been shown and the trial ended. The three trials were run in succession and were intended to capture potential experience acquired by the observers during the test itself. Observers were not allowed to make any comments during the test and there were no intervals between trials.
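The random allocation of the 30 images to three trials could be sketched as follows. The text does not say how the randomization was carried out, so the image identifiers, the seed and the function name are placeholders, and the split is a plain shuffle that is not stratified by age class.

```python
import random


def split_into_trials(image_ids, n_trials=3, seed=None):
    """Shuffle image IDs and split them into equally sized trials."""
    if len(image_ids) % n_trials != 0:
        raise ValueError("number of images must be divisible by the number of trials")
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_trials
    return [shuffled[i * size:(i + 1) * size] for i in range(n_trials)]


# Hypothetical IDs: three images for each of the ten age classes (1, 4, ..., 28).
images = [f"class{c:02d}_img{k}" for c in range(1, 29, 3) for k in (1, 2, 3)]
trials = split_into_trials(images, seed=42)
```

Each element of `trials` is then one set of 10 images, shown in sequence to all observers.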

Statistical analysis
We used generalized linear mixed models (GLMMs) to explore differences in the proportion of correct assignments between classes and trials, using package lme4 (Bates et al. 2015) with binomial error structure and logit link function. The response variable, age class estimated by a given observer during the test, was coded as 1 when the image displayed was assigned to the correct class, or as 0 when the image displayed was incorrectly assigned to a different class. We considered age class and trial as fixed factors and observer as a random factor. We constructed full, reduced (including only one of the fixed factors) and null models (including only the random factor) that were ranked according to AICc. The model with the lowest AICc value was considered to have the best fit to our data. Models that differed by less than 2 AICc points from the best one were considered to provide similar support to the data (Burnham and Anderson 2002). For the top-ranked models, we performed pairwise comparisons between levels of each fixed factor using package emmeans (Lenth 2019). P-values were adjusted for multiple comparisons using the Tukey method.
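The ranking step can be illustrated independently of the model fitting. The sketch below does not fit any GLMM (the study used lme4 in R); it only applies the standard small-sample correction, AICc = -2 logLik + 2k + 2k(k + 1)/(n - k - 1), to log-likelihoods supplied by the user and flags models within 2 units of the best one. The log-likelihoods and parameter counts in the usage lines are made-up values for illustration only and are not the study's results.

```python
def aicc(log_lik: float, k: int, n: int) -> float:
    """Small-sample corrected AIC for a model with k parameters fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(models: dict, n: int, threshold: float = 2.0):
    """Rank models by AICc and flag those within `threshold` units of the best.

    `models` maps a model name to a (log-likelihood, number of parameters) pair.
    Returns a list of (name, AICc, delta AICc, similar_support) tuples, best first.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(name, score, score - best, (score - best) < threshold)
            for name, score in ranked]


# Made-up log-likelihoods and parameter counts, for illustration only.
candidates = {
    "null": (-120.0, 2),
    "class": (-95.0, 11),
    "trial": (-119.0, 4),
    "class+trial": (-94.0, 13),
}
ranking = rank_models(candidates, n=180)
```

The `similar_support` flag corresponds to the "less than 2 AICc points" criterion described above.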
In order to assess the predictive ability of our aging guide and evaluate performance within each nestling age class, we first constructed a cross-tabulation (confusion matrix) of actual and observer-assigned age classes. For each age class we constructed a 2 × 2 table of assignments (see example for class 7 provided in Additional file 1: Table S2) and each assignment made by a given observer was categorized as: true positive (TP) when the focal class (age class for which the confusion matrix is being constructed) was correctly assigned (e.g. when the displayed image showed age class 7 and the observer classified it as age class 7); true negative (TN) when a different class (i.e. not the focal) was displayed and was not assigned to the focal class; false positive (FP) when a different class being displayed was assigned as the focal class; and false negative (FN) when the focal class was being displayed but a different class was assigned. Additionally, we used confusion matrices for each age class to generate five performance metrics: Accuracy: total proportion of correct assignments, (TP + TN)/(TP + TN + FP + FN); Sensitivity: proportion of the images showing the focal class that were correctly assigned, TP/(TP + FN); Precision: proportion of images assigned as the focal class that were in fact showing the focal class, TP/(TP + FP); False positive rate: proportion of images showing different classes that were assigned as the focal class, FP/(FP + TN); False negative rate: proportion of images showing the focal class that were assigned to a different class, FN/(TP + FN). We calculated the percentage of nestling age assignments that were correct (i.e. sensitivity) and under- or over-estimated by one or two age classes (no incorrect under- or over-estimation was recorded beyond two classes away from the focal class, Additional file 1: Table S3). Additionally, we developed a visitation protocol to maximize the probability of correctly estimating age of nestlings with minimum disturbance. To do so, we selected accuracy as the performance metric, since it takes into account both types of error (false positives and false negatives).
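As a worked illustration of the five metrics, the sketch below collapses a multi-class confusion matrix into the 2 × 2 table for one focal class and applies the definitions given above. The nested-dictionary layout, the function names and the toy counts are our own assumptions; the counts are invented and are not taken from Table S3.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Safe division: returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def one_vs_rest_metrics(confusion: dict, focal) -> dict:
    """Compute per-class metrics from confusion[actual][assigned] = count."""
    tp = confusion[focal].get(focal, 0)
    # Focal class displayed but another class assigned.
    fn = sum(n for assigned, n in confusion[focal].items() if assigned != focal)
    # Another class displayed but the focal class assigned.
    fp = sum(row.get(focal, 0) for actual, row in confusion.items() if actual != focal)
    total = sum(sum(row.values()) for row in confusion.values())
    tn = total - tp - fn - fp
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "precision": _ratio(tp, tp + fp),
        "false_positive_rate": _ratio(fp, fp + tn),
        "false_negative_rate": _ratio(fn, tp + fn),
    }


# Toy three-class matrix with invented counts (rows: actual, columns: assigned).
toy = {
    "A": {"A": 8, "B": 2},
    "B": {"A": 1, "B": 6, "C": 3},
    "C": {"C": 10},
}
metrics_b = one_vs_rest_metrics(toy, "B")
```

Note that TN counts every assignment in which neither the displayed nor the assigned class is the focal one, which is why accuracy can stay high for a class whose sensitivity is low.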
All the analyses were performed in R 3.4.3 (R Core Team 2017).

Results
In total, 180 age estimations of bee-eater nestlings of 10 age classes were made by six participants, during the three trials of the test, resulting in 30 answers per participant, and 18 per age class. The model containing age class as the fixed factor was the most parsimonious model and ranked as the top model (Additional file 1: Table S4). The proportion of correct assignments was significantly different between several classes (Tables 1 and 2). Specifically, class 13 received more correct answers than class 7, and class 16 presented more correct answers than classes 7, 25 and 28 (Table 2, Additional file 1: Figure S1). The absence of variance reported for the random factor suggests low variation in the proportion of correct assignments between observers (Table 1, Additional file 1: Figure S1). The difference between AICc of model 1 (including class) and model 2 (including class and trial) was less than two units (AICc = 208.2, ∆AICc = 1.7, Additional file 1: Table S4). However, we found no differences in the proportion of correct assignments between trials (Additional file 1: Table S5; Additional file 1: Figure S1).
Observers using the aging guide (Additional file 1: Table S1) classified nestling age with a mean accuracy of 0.85, and seven out of ten age classes were correctly identified with an accuracy above 0.80 (Table 3). Most nestlings in the sample were misestimated by only one class and never by more than two classes (Table 4, Additional file 1: Table S3). The most frequently correct age classes were 1, 13 and 16 (≥ 0.88 sensitivity, Table 3), with seven classes having sensitivity of at least 0.70. Conversely, classes 25 and 28 were only correctly identified in about half of assignments (0.55 sensitivity), and class 7 was the least correctly identified (0.38 sensitivity; Table 3). Additionally, class 4 was correctly assigned in 77% of cases and had both the lowest precision (0.48) and the highest false positive rate (0.09; Table 3). This was reflected in a considerable proportion of nestlings from classes 7 and 10 being incorrectly assigned to class 4 (0.62 and 0.28 false negative rate, respectively; Tables 3 and 4). Classes 22, 25 and 28 were often incorrectly assigned (0.28-0.45 false negative rate and < 0.68 precision; Tables 3 and 4). But while class 22 was exclusively underestimated, assignment of class 25 was biased in both directions, whilst class 28 could only be underestimated (Table 4).

Table 1 Summary table of GLMM for the top-ranking model (Class + (1|observer)) testing the differences in the proportion of correct assignments between classes
The model was run with a binomial error structure and logit link function. N = 18 age estimations per age class. Estimates, standard errors (SE), and 95% confidence intervals (95% lower and upper CI) are presented. Positive and negative estimates indicate a higher or lower proportion of correct assignments of a given class compared to age class 1 (intercept)

By adopting a protocol with 12-day visit intervals, nestling age can be determined to within 3 days with an accuracy of 0.85 to 0.99, with only two visits to the nest (Table 5). The first visit should be made during the first 12 days after hatching, to allow for a second visit before fledging in order to confirm or adjust age with a high level of accuracy (> 0.85). For example, if the first visit to a nest takes place during days 1-3, the age of nestling(s) will likely be correctly assigned to class 1 with an accuracy of 0.93. By visiting that same nest 12 days later, the age of nestlings can be confirmed with 0.94 accuracy, as the age class of nestling(s) during that period will be 13. Conversely, if during the first visit the nestling(s) in that same nest are assigned to class 4 or 7, it is possible to adjust the age classification with 0.94 accuracy in a visit 12 days later, as nestling(s) will be 13-15 days old. If nests are only visited after day 12, accuracy will be at least 0.94 until day 18 and at least 0.87 until day 21, although in these cases a second visit at the suggested 12-day interval would not improve accuracy (Tables 3 and 5). However, by determining hatching via observation of provisioning (i.e. adults entering the nest cavity carrying food items), a visit to the colony at 12-day intervals ensures that nests can be visited within the first 12 days after hatching.
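The arithmetic behind the two-visit schedule can be sketched as below. This is only an illustration of the 12-day offset between visits, not a reproduction of Table 5: the function names are ours, and the back-calculation assumes that the second-visit assignment is treated as the more reliable one, as in the class 13 example above.

```python
VISIT_INTERVAL_DAYS = 12  # interval between visits proposed in the text
CLASS_WIDTH_DAYS = 3      # each age class spans three days
LAST_CLASS = 28           # final age class (days 28-30)

VALID_CLASSES = range(1, LAST_CLASS + 1, CLASS_WIDTH_DAYS)


def second_visit_class(first_visit_class: int):
    """Expected age class 12 days after the first visit.

    Returns None when the nestling would already be past the last age class,
    i.e. when no second visit during the nestling period is possible.
    """
    if first_visit_class not in VALID_CLASSES:
        raise ValueError("unknown age class")
    expected = first_visit_class + VISIT_INTERVAL_DAYS
    return expected if expected <= LAST_CLASS else None


def revised_first_visit_class(second_visit_assigned: int) -> int:
    """Back-calculate the first-visit class implied by the second-visit assignment."""
    if second_visit_assigned not in VALID_CLASSES:
        raise ValueError("unknown age class")
    revised = second_visit_assigned - VISIT_INTERVAL_DAYS
    if revised < 1:
        raise ValueError("second-visit class is too young for a visit 12 days earlier")
    return revised
```

Under these assumptions, a nest first recorded as class 4 but assigned to class 13 on the second visit would have its first-visit age revised to class 1.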

Discussion
Photographic aging guides for nestlings have been developed and used for several species (e.g. Boal 1994; Fernaz et al. 2012; Amiot et al. 2014), but the accuracy of age estimation has seldom been assessed (but see Brown et al. 2013; Wails et al. 2014; Wilkins and Brown 2015; Brown and Alianell 2017). Here, we show that high levels of accuracy in age estimation can be achieved (0.85-0.99 accuracy) with only two nest visits during the entire nestling development period.
The aging guide for bee-eater nestlings allows estimating age to within 3 days with an average accuracy of 0.85. While some age classes can be estimated with an accuracy above 0.90 (classes 1, 13 and 16), others have lower accuracy (classes 7, 25 and 28, range: 0.68-0.75). This is probably because nestlings in some classes have very distinctive characteristics (e.g. small size of nestlings in class 1; onset of emergence or unsheathing of pins in classes 13 and 16), whereas characteristics in other classes are less obvious, as the degree of change varies among developmental stages.
Overall, nestling age could not be estimated with the same accuracy throughout the growth period. It may thus be advantageous to visit nests for age estimation in periods that have the highest accuracy (classes 1, 13 and 16). In bee-eaters, we found the optimal period for aging nestlings to be between days 13-18 (with accuracy between 0.94 and 0.99, Table 3). During this period, the status of feather development is more easily distinguishable from the preceding and subsequent age classes, since there is an evident growth of pins and unsheathing of body feathers as feather colours become gradually more visible. During the first 3 days after hatching (class 1), nestlings can also be aged with high accuracy (0.93, Table 3), which is similar to several passerine species (Brown and Alianell 2017). The small size of the nestling in relation to the eggs and the nestling's inability to raise its head during these first days allow clear discrimination from the subsequent classes. Aging nestlings 7-9 days old was most challenging, and these were frequently misclassified (always underestimated as class 4, Table 4). This is likely due to slow growth, and thus the lack of evident size differences between these age classes, which are only distinguishable by the appearance of a light grey coloration of the flight feather tracts (indicating the emergence of the pins) on nestlings of class 7. Additionally, the oldest bee-eater nestlings (classes 25 and 28) were frequently misclassified, similarly to what was reported in Common Terns (Sterna hirundo, Wails et al. 2014), Eastern Bluebirds (Sialia sialis, Wilkins and Brown 2015), House Wrens (Troglodytes aedon, Brown et al. 2013) and Carnaby's cockatoos (Calyptorhynchus latirostris, Saunders et al. 2015). In bee-eaters, the underestimation of these classes likely occurred due to the difficulty in observing the featherless patches in the ventral and anal regions, and the unsheathing of rectrices, which are characteristic of age class 25. These skin patches are only visible when nestlings are optimally positioned towards the camera. Given their age and relatively high mobility, this is more easily achieved in the field than in the still images displayed during the validation exercise.
The observers' ability to correctly assign age during the final stages of development might also have been influenced by variable growth rates between nestlings from different nests which were of the same age class. Differences in development rates between broods are more apparent at older ages due to several factors. Food provisioning to growing chicks, mediated by presence of helpers (Fry 1984), and suitable weather conditions for flying insects (Arbeiter et al. 2016) are known to influence growth rate and survival of bee-eater nestlings. However, it is unlikely that weather conditions limited food availability during our study, as mean maximum temperatures were above 29 °C and total precipitation below 5 mm throughout the nestling provisioning period (June/July, Arbeiter et al. 2016; IPMA weather reports 2019). Number of nestlings per nest, paired with sibling competition, also creates additional variation in individual nestling development (Lessells and Avery 1989). However, it was not possible to account for the number of nestlings in each brood, and nestlings from larger broods may develop more slowly than nestlings from smaller broods (Nilsson and Gårdmark 2001). Although between-brood variation of development rates might not be an issue for direct observations in the field, intra-brood variation can be relevant, as younger nestlings may develop at a slower rate than first-hatched nestlings (Bryant and Tatner 1990). It is therefore recommended that the larger nestling(s) (i.e. first hatched) is targeted on each visit, in order to minimize potential differences of individual growth rates between siblings.

Table 5 Key to attain highest accuracy in age assignment of bee-eater nestlings on a 12-day interval visit schedule
Correction on the second visit following potential misclassifications during the first visit, in accordance with most likely mis-assigned classes from 7 -

Although we found no significant improvement in observer performance across our test trials, the percentage of correct estimations slightly increased from trial one to trial three (Additional file 1: Figure S1). This suggests that training of observers can further increase age assignment accuracy, as indicated in other studies (Weinberg and Roth 1994; Brown et al. 2011; Wails et al. 2014). Longer training of the observers beyond a two-minute period may further improve aging accuracy and is recommended for field studies.
It should also be noted that this test did not entirely replicate field conditions, as recorded images were displayed with a projector rather than being viewed on a laptop by the nest. In any case, apart from outdoor conditions, which will likely differ (e.g. temperature, light reflection), displaying projected images rather than viewing them on a laptop or other mobile electronic display is unlikely to increase the error rate. Furthermore, when in the field, there can be ample opportunity to clarify any less obvious nestling characteristics in real time. During the experimental setup, observers were tested under stringent and fast-paced conditions, as observation time was limited to 40 s per photograph. It is likely that the method we propose may allow higher accuracy in age estimation if observers are given enough time for detailed observation and evaluation, particularly for those age classes where lower accuracy was recorded. The images selected for the guide show (whenever possible) an example of the most and least developed phenotype within the day range of each age class, in order to further aid observers. In addition, the images are accompanied by a description of identifiable changes in the main developmental characteristics. The most reliable characters for age estimation are those that change at a faster rate. In the case of bee-eater nestlings this is feather development, similar to that reported in Eastern Kingbirds (Tyrannus tyrannus) and Eastern Phoebes (Sayornis phoebe, Murphy 1981), several species of North American passerines (Jongsomjit et al. 2007) and Barn Swallows (Fernaz et al. 2012). But the use of plumage development alone may lead to under- or overestimations in age (Wails et al. 2014), as we observed in the two older classes and during the initial 6 days of development. Therefore, field observers should rely on a combination of developmental characteristics as much as possible (Fernaz et al. 2012; Wails et al. 2014), in order to increase the accuracy of age estimation.
Considerable care must be taken when examining nests of bee-eaters, as pre-fledging nestlings and adults can get trapped in the tunnel while trying to flee the approaching camera. Thus, we recommend extra care when adults are present and avoiding nest inspection during the later stages of development whenever possible.

Conclusions
With this guide we were able to estimate the hatch date to within 3 days. We suggest visiting the colony and nests at 12-day intervals to achieve the highest accuracy metrics with the smallest disturbance and logistic investment. This study highlights how indirect methods and a simple protocol can be established and employed to quickly estimate nestling age in cases where nestling handling is complicated or impossible, while minimizing disturbance in and around the nest.
Additional file 1: Table S1. Photographic guide of bee-eater nestlings from hatch to fledging. Table S2. Table illustrating the confusion matrix. Table S3. Confusion matrix comparing predicted classes by the observers in the test to the actual classes. Table S4. Ranking of candidate models explaining the ability to predict bee-eater nestling's age. Table S5. Pairwise comparisons between trials. Figure S1. Variation on the mean (± SE) percentage of correctly assigned estimates for age class, observer and trial.