Bee Colony Survey Data — United States

Bee Colony Statistical Data from 1987–2017

Lisa
Jun 5, 2023

PROJECT OVERVIEW:

Objective: Identify bee species from images and track colony statistics reported by the U.S. Department of Agriculture.

Data description: Census and survey data from the USDA and colony loss data from Bee Informed.

DATA SUMMARY:

· “Bee Colony Survey Data by State data was retrieved from the United States Department of Agriculture National Agricultural Statistics Service Quick Stats Dataset with the selection criteria shown in the file Search criteria for bee colony census attached to this dataset.

· Bee Colony Census Data by County data was retrieved from the United States Department of Agriculture National Agricultural Statistics Service Quick Stats Dataset with the selection criteria shown in the file Search criteria for bee colony survey attached to this dataset.

· Bee Colony Loss file from the @makeovermonday dataset 2018w18-bee-colony-loss. Extended to include census region and division data from data.world fact tables”1 (data.world).

EXECUTIVE SUMMARY:

According to survey data, there has been a recent decline in the bee population, partly due to Colony Collapse Disorder (CCD). The U.S. Department of Agriculture is actively monitoring bee populations and implementing measures to counter CCD and other factors that contribute to bee loss. One way to support this monitoring is to use machine learning to identify different bee species from image data. To this end, we trained and tested an SVM model that predicts bee species from images. Our model achieved a score of 0.68 and an AUC (area under the ROC curve) of 0.74.

There are still concerns regarding decreasing bee populations and climate change. To address the current challenges, recommendations include maintaining colonies and studying CCD. Mitigating the impact of climate change is also crucial, with potential solutions including developing a climate-proof beehive for indoor use or controlled bee reservoirs.

Another solution is the development of robotic bees with AI — an area still in research and development. In the meantime, it is important to continue reducing risks to bees and finding a cure for CCD. Raising funds for research and promoting awareness of these issues can also be effective paths forward.2

This is an overview of my project, which aims to identify bee species from images. Identifying different bee species will allow researchers to collect field data more quickly and effectively.

Beehives are important because of the pollination services they provide. According to bee experts at the Food and Agriculture Organization of the United Nations, a third of the world’s food production depends on bees. In the United States alone, honeybees pollinate $15 billion worth of crops each year, including more than 130 types of fruits, nuts, and vegetables.

However, diseases like colony collapse disorder threaten these species. For example, the bee industry is currently having difficulty meeting pollination demand for almonds. If research cannot solve CCD, beekeepers will be unable to meet demand for this and other crops. Climate change is another threat: it affects the ability of bees to forage, which decreases their population.

This project aims to address several business questions. Can we help identify species and control populations of bees? Are there trends in the use of historical data, i.e., images? Can we train a model to predict species based on a series of images? Can we evaluate the test predictions and score a model for future use?

The data includes bee colony loss counts and population figures by state and county. We also have photos of bumblebees and honeybees, along with grayscale versions and RGB values.

First, I researched summary statistics, such as beekeeper counts, and total bee count values over the years by state. In one example, California decreased and then re-grew its bee colony population over the years. There appears to have been a decrease in 2007 and an increase in 2012, but that does not tell the entire story.
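A summary like this can be sketched with pandas. The block below is only an illustration: the column names (state, year, colonies) and the numbers are made up for the example and are not the real USDA schema or values.

```python
import pandas as pd

# Made-up rows standing in for the USDA survey file. The column names
# (state, year, colonies) are assumptions, not the real schema.
survey = pd.DataFrame({
    "state": ["CALIFORNIA", "CALIFORNIA", "CALIFORNIA", "TEXAS", "TEXAS", "TEXAS"],
    "year": [2002, 2007, 2012, 2002, 2007, 2012],
    "colonies": [100, 80, 120, 50, 55, 60],
})

# One row per year, one column per state, summing colony counts.
by_state = survey.pivot_table(
    index="year", columns="state", values="colonies", aggfunc="sum"
)

# Percentage change between consecutive survey years for each state.
pct_change = by_state.pct_change() * 100
```

Looking at the percentage change next to the raw totals makes dips and recoveries easier to compare across states of very different sizes.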

This bee colony survey summarizes bee values over the years by state for various data items, such as inventory, loss, and loss attributed to Colony Collapse Disorder (CCD). The threat from CCD and overall colony loss is increasing. Keeping track of statistics like these helps scientists understand the largest threats to bee colonies and to global crop production.

Dendrogram clustering (hierarchical clustering) is a technique for grouping similar data items together. It produces a tree-like structure called a dendrogram, where each observation is a leaf and the branches show how observations are merged into clusters. The height at which two branches join indicates how dissimilar the merged groups are: a low join means the items are similar, and a high join means they are not. This technique is particularly useful for visualizing relationships in complex datasets. By using dendrogram clustering, we can gain insights into the underlying structure of the data and identify patterns that may not be immediately visible.
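As a rough sketch of how this works in Python, the block below builds a dendrogram with SciPy. The feature rows and the labels A to E are invented for illustration, and Ward linkage is my assumption here, not necessarily the method used in the project.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Invented feature rows (for example, two numeric summaries per state).
features = np.array([
    [1.0, 1.1],
    [1.2, 0.9],
    [5.0, 5.2],
    [5.1, 4.9],
    [9.0, 9.1],
])
labels = ["A", "B", "C", "D", "E"]

# Each row of `merges` records one join: the two clusters merged,
# the height (dissimilarity) of the join, and the new cluster size.
merges = linkage(features, method="ward")

# Build the tree layout without drawing it; drop no_plot to render
# the figure (rendering requires matplotlib).
tree = dendrogram(merges, labels=labels, no_plot=True)

# Cut the tree so that at most three clusters remain.
clusters = fcluster(merges, t=3, criterion="maxclust")
```

Cutting the tree at a chosen number of clusters turns the picture into group labels that can be joined back onto the original table.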

Image Recognition with Python

Color channels can provide more information about an image. A picture of an ocean will be bluer, whereas a picture of a field will be greener. This kind of information can be useful when building models or examining the differences between images. By examining the kernel density estimate for each of the color channels on the same plot, we can understand how they differ.
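Here is a minimal sketch of the per-channel density estimate. It uses a synthetic image instead of a real bee photo, and the channel means are arbitrary values chosen so the three curves separate clearly.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Synthetic 32x32 RGB image standing in for a photo. The channel means
# (60, 120, 190) are arbitrary and make the image look bluish.
image = np.stack([
    rng.normal(60, 20, (32, 32)),   # red
    rng.normal(120, 20, (32, 32)),  # green
    rng.normal(190, 20, (32, 32)),  # blue
], axis=-1).clip(0, 255)

# Evaluate each channel's density on the same 0-255 intensity grid.
grid = np.linspace(0, 255, 256)
densities = {}
for i, name in enumerate(["red", "green", "blue"]):
    channel = image[:, :, i].ravel()
    densities[name] = gaussian_kde(channel)(grid)

# Intensity at which each channel's density peaks.
peaks = {name: grid[np.argmax(d)] for name, d in densities.items()}
```

Plotting each entry of `densities` against `grid` on one set of axes gives the overlaid curves described above; a bluish image shows its blue curve peaking furthest to the right.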

The goal is to develop a model that predicts bee species and to evaluate its accuracy. The plan is to build features from the image data, then train and test a model using PCA and SVM methods. Finally, we aim to predict image labels and score the model so it can be evaluated for future use.
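The train, predict, and score steps can be sketched with scikit-learn as follows. This is not the project's actual code: the features are random numbers standing in for flattened image features, and the number of PCA components, the linear kernel, and the use of accuracy as the score are all my assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Synthetic stand-in for image features: 200 "images", 64 features each.
# Labels 0 and 1 play the role of the two bee classes.
n_images = 200
labels = rng.integers(0, 2, size=n_images)
features = rng.normal(0, 1, size=(n_images, 64)) + labels[:, None] * 0.8

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=42, stratify=labels
)

# Scale, reduce dimensionality with PCA, then classify with an SVM.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    SVC(kernel="linear", probability=True, random_state=42),
)
model.fit(X_train, y_train)

# Predicted labels give the accuracy; predicted probabilities give the AUC.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_score)
```

Keeping the scaler and PCA inside the pipeline means they are fitted on the training split only, so the test score is not inflated by information from the held-out images.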

In summary, the project aims to track bee colony statistics, predict bee species, and focus on image recognition. This will provide insights into business questions regarding bee populations with a machine learning model that can predict species with accuracy.

Data Sources:

https://data.world/makeovermonday/2018w18-bee-colony-loss

https://data.world/finley/bee-colony-statistical-data-from-1987-2017

https://app.datacamp.com/learn/projects/412
