Project Title: | Estimating Mean Fish Length in a Kelp Forest Tank |
---|---|
Contributors: | Andrew Malloy & Jackson Hsu |
For best results, view the charts after submitting the Google Form.
Sample Mean Definition
A mean is a measure of center, calculated by adding up the individual observations and dividing that sum by the number of observations. We often cannot observe every individual in a population, so we use a sample (a subset of the population) to estimate the population mean. This is why it is important to select a sample that represents your population well.
In our activity, we take the lengths of five fish, add them, then divide this sum by five (the number of fish in our sample). For the purpose of this activity, we use the terms mean and average interchangeably because they are calculated with the same equation. Mathematically, the sample mean is represented by the symbol X-bar (an X with a bar on top), and the equation used to calculate it is:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where $X_i$ is the length of the i-th fish and $n$ is the number of fish in our sample. The symbol ∑ indicates that we are calculating a sum of our selected observations.
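As a minimal sketch of this calculation, the short Python example below computes a sample mean for five fish. The function name `sample_mean` and the lengths are hypothetical values chosen for illustration; they are not data from the activity.

```python
def sample_mean(lengths):
    """Return the sum of the observations divided by how many there are."""
    if not lengths:
        raise ValueError("need at least one observation")
    return sum(lengths) / len(lengths)

# Hypothetical lengths (in cm) for the five fish in one sample.
sample = [12.0, 15.5, 9.0, 20.0, 13.5]

x_bar = sample_mean(sample)
print(x_bar)  # 14.0, since the lengths sum to 70.0 and n = 5
```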
We can use our calculated sample mean as a reference point against which to compare individual data points.
The Takeaway
In our activity, we want to show the importance of taking a sample from the population. We also want to show that different samples can produce different results. The self-sample dot plot could reveal a human bias in our choices by showing where the mean lengths of self-selected samples fall in comparison to those of randomly selected samples.
Human bias usually causes us to select samples based on size, proximity, or visual characteristics. For example, we may choose bigger or more brightly colored fish over those that are smaller or harder to see.
In statistics, we try our best to avoid bias by using random sampling.
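The effect of this kind of bias can be illustrated with a small simulation. The sketch below is only one way to model it, under stated assumptions: the tank contents are made-up lengths, and "human bias" is approximated by making each fish's chance of being picked proportional to its length. The function names are hypothetical and are not part of the activity.

```python
import random
from statistics import mean

def random_sample_mean(population, n, rng):
    """Mean length of n fish drawn uniformly at random, without replacement."""
    return mean(rng.sample(population, n))

def size_biased_sample_mean(population, n, rng):
    """Mean length of n fish drawn without replacement, where longer fish
    are more likely to be picked (an assumed stand-in for human bias)."""
    remaining = list(population)
    chosen = []
    for _ in range(n):
        # Selection probability is proportional to fish length.
        pick = rng.choices(remaining, weights=remaining, k=1)[0]
        remaining.remove(pick)
        chosen.append(pick)
    return mean(chosen)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical tank: 50 fish with made-up lengths between 5 and 40 cm.
tank = [round(rng.uniform(5, 40), 1) for _ in range(50)]

trials = 2000
random_avg = mean(random_sample_mean(tank, 5, rng) for _ in range(trials))
biased_avg = mean(size_biased_sample_mean(tank, 5, rng) for _ in range(trials))

print(f"population mean:            {mean(tank):.1f} cm")
print(f"average random-sample mean: {random_avg:.1f} cm")
print(f"average biased-sample mean: {biased_avg:.1f} cm")
```

Under these assumptions, the random-sample means should cluster around the population mean, while the size-biased means should sit noticeably above it.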
A Real-World Application
Fish populations are often estimated based on a random sample. This helps fisheries managers determine the health of fish stocks and make regulatory decisions. If we only sampled the largest fish, we would risk overestimating fish size and would fail to portray the population accurately in terms of fecundity and generational recruitment.
The ocean sport fishery is randomly sampled throughout California. Fish are measured to obtain lengths and weights, which are then compared against harvest limits, often set in pounds for the state. When fish cannot be weighed, weights can be estimated through length-weight regressions. In this way, gathering lengths is one way to estimate the total pounds of fish harvested for a given population of sport-caught fish and to determine the status of the fishery (e.g., is a species being overfished?).
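To make the idea of a length-weight regression concrete, here is a minimal sketch. It assumes the commonly used power-law form W = a·L^b, fitted by ordinary least squares on log-transformed data; the source does not say which model the fishery actually uses. The function names, the fish measurements, and the units (cm and g) are all hypothetical.

```python
import math

def fit_length_weight(lengths, weights):
    """Fit log(W) = log(a) + b * log(L) by least squares; return (a, b)."""
    xs = [math.log(length) for length in lengths]
    ys = [math.log(weight) for weight in weights]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope on the log-log scale
    a = math.exp(y_mean - b * x_mean)    # intercept, back-transformed
    return a, b

def estimate_weight(length, a, b):
    """Predict weight from length using W = a * L**b."""
    return a * length ** b

# Hypothetical fish that were both measured (cm) and weighed (g).
measured_lengths = [20.0, 25.0, 30.0, 35.0, 40.0]
measured_weights = [80.0, 156.25, 270.0, 428.75, 640.0]

a, b = fit_length_weight(measured_lengths, measured_weights)

# Hypothetical fish that could only be measured, not weighed.
unweighed_lengths = [22.0, 31.0, 38.0]
total = sum(estimate_weight(length, a, b) for length in unweighed_lengths)
print(f"a = {a:.4f}, b = {b:.2f}, estimated total weight = {total:.0f} g")
```

Summing the predicted weights gives an estimate of total harvest weight for fish that were only measured, which is the role the regression plays in the paragraph above.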
Just like in the kelp forest tank example, it would be impractical and arguably impossible to measure every fish caught and landed in the state of California, let alone along the entire length of the Pacific coastline. A random sample allows us to gather data efficiently and in a timely manner to keep fisheries sustainable along our coast.