Monday, July 18, 2011

Trends in the data...

As a research biologist I am always looking for patterns in the environment or in the data I collect that could lead to new conclusions and knowledge about the species we study. Usually these patterns are mostly hidden from view and we rely on statistics to tease them out. During my recent field season I noticed a very strong trend among the six non-failed goshawk nests we were monitoring at the end of my seventh week in the field. It appeared that the late spring weather might have had an impact on the number of nestlings. The earliest hatching nest (as determined by the age of the nestling) had only a single nestling, whereas the two latest hatching nests each had three! A plot of the six nests looks like this...

Northern Goshawk nestling counts by hatching date for first six nests discovered.

Look at that amazing trend line. Not surprisingly, with a pattern like that the trend is very significant. A simple regression has a p-value less than 0.001 and an R-square of 0.95. This is unheard of in ecological research. To have such a strong p-value and R-square value with only six points is amazing. This was definitely looking like a strong conclusion. I looked through my research summaries (Book: The Goshawk by Robert Kenward) for other research with the same conclusion. I didn't find any. While this book is not comprehensive, it is nearly so. I became more optimistic that we might be on to something.

During week eight, we performed our job too well. We discovered two more occupied nests and these two did not fit the mold. Adding these two points into the analysis produced a picture that looks like this... (new points in red)

Northern Goshawk nestling counts by hatching date for all eight nests discovered.

The statistical significance goes right out the window... The p-value rises to 0.27 and the R-square drops to 0.06. What little trend remains in the regression line is driven almost entirely by the first point.
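For readers who want to see how fragile a six-point regression can be, here is a minimal Python sketch. The numbers in it are made up for illustration and are NOT my field data, and the names fit_line and permutation_p_value are hypothetical helpers of my own. It fits a least-squares line and reports R-square, then estimates a p-value with an exact permutation test, which is a stand-in for the t-test a regression package would normally report, so its p-values will not match the ones quoted above.

```python
from itertools import permutations
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def permutation_p_value(x, y):
    """Exact two-sided permutation p-value for a linear trend.

    Shuffles y against x in every possible order and counts how often the
    shuffled data show a trend at least as strong as the observed one.
    Only the cross-product sum changes under shuffling, so comparing its
    absolute value is equivalent to comparing R-square.
    """
    mx, my = mean(x), mean(y)
    dx = [xi - mx for xi in x]
    dy = [yi - my for yi in y]
    observed = abs(sum(a * b for a, b in zip(dx, dy)))
    hits = total = 0
    for shuffled in permutations(dy):
        total += 1
        if abs(sum(a * b for a, b in zip(dx, shuffled))) >= observed - 1e-9:
            hits += 1
    return hits / total


# HYPOTHETICAL data: days after the earliest hatch, and nestlings per nest.
days_six = [0, 5, 9, 12, 16, 18]
nestlings_six = [1, 2, 2, 2, 3, 3]

# Two more HYPOTHETICAL nests that do not follow the pattern.
days_eight = days_six + [3, 17]
nestlings_eight = nestlings_six + [3, 1]

slope_six, _, r2_six = fit_line(days_six, nestlings_six)
slope_eight, _, r2_eight = fit_line(days_eight, nestlings_eight)
p_six = permutation_p_value(days_six, nestlings_six)
p_eight = permutation_p_value(days_eight, nestlings_eight)

print(f"six nests:   slope={slope_six:.3f}  R-square={r2_six:.2f}  p={p_six:.3f}")
print(f"eight nests: slope={slope_eight:.3f}  R-square={r2_eight:.2f}  p={p_eight:.3f}")
```

With so few points, each nest carries a large share of the fit, so two nests that break the pattern are enough to flatten the line.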

The lack of significance does not mean that early season nests didn't face unique challenges affecting brood size; it just means that this study cannot conclude that there is a correlation.

Brood size by hatch date is not part of my thesis, but it might have been nice to have a freebie significant result. Anyway, I would rather have the two new nests for my thesis, which is challenged by sample size, than conclude that brood size is related to hatch date.

