A summary of how Zabble develops its AI object detection models, and how we use industry metrics and benchmarks to develop, test, and evaluate our AI systems.
At Zabble, we are committed to helping our customers streamline their zero waste goals through AI. Our primary focus is on AI-assisted bin fullness estimation and contaminant object detection, achieved through two distinct AI models. This blog's major focus is our object detection model, which analyzes images to identify items and predict potential contaminants. Once we cover the basics here, a subsequent blog post will show how these metrics apply to our models and to other industry standards.
We first need to understand what AI actually is. Artificial Intelligence is software that mimics human intelligence (for the most part) to perform certain tasks within larger systems, including the ability to reason, learn, and act autonomously. Sounds impressive (and maybe a little scary), but in reality, AI is just intelligent software built on classical mathematics, probability, and statistics.
AI encompasses many subfields, and things can get difficult to follow pretty quickly. Our focus is primarily on Computer Vision and Deep Learning, the technologies powering both object detection and fullness estimation at Zabble.
Computer Vision and Deep Learning are both subfields within AI, with Deep Learning being a subfield of Machine Learning, which itself is a subfield within AI (you see what we mean?). Computer Vision is the subfield that extracts information from images or videos to perform specific tasks, and Deep Learning is the subfield that builds intelligent software from data. Deep Learning is the most effective methodology for creating AI, which is why so many tech companies are obsessed with collecting your data!
Zabble’s AI uses Computer Vision models that extract key information from images to identify contaminants. This means that these models are built using Deep Learning paradigms.
We have only scratched the surface, and haven't even discussed Traditional AI vs. Generative AI! Generative AI creates new data that resembles the data it was trained on, such as images, video, audio, and text (think ChatGPT). Traditional AI analyzes existing data to make decisions, classify, or predict outcomes. At Zabble, our AI models are categorized as Traditional AI. However, that doesn't mean Generative AI can't be used for waste identification. In fact, powerful Vision-Language Models (VLMs) could play a key role here. But that's a topic for another time! GenAI solutions are simply very demanding when it comes to computing power: the larger the AI brain, the larger the computer it needs to run on. Zabble aims to streamline the identification process with AI that analyzes data quickly, runs efficiently on clients' iOS devices, and works even without internet connectivity (which GenAI solutions generally require).
Object detection is a Computer Vision paradigm that aims to localize and classify objects within an image or video sequence. In other words, it seeks to answer two questions: what objects are present, and where are they?
But recall that object detection AI is built on classical mathematics, probability, and statistics. At the model level, the AI effectively asks itself: which regions of the image are likely to contain an object, and what class does each of those regions most probably belong to?
Now, of course, the model isn’t explicitly thinking through these questions, it’s simply performing the necessary mathematical computations to provide the answers to these questions.
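To make this concrete, here is a minimal sketch of the kind of output an object detection model produces: each detection pairs a bounding box (the localization answer) with a class label and a confidence score (the classification answer). The `Detection` structure, class names, and coordinate values are illustrative assumptions, not Zabble's actual model output.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels
    label: str          # predicted class, e.g. "Can"
    confidence: float   # model's probability for that class

# Hypothetical detections for one bin photo
detections = [
    Detection(box=(34, 50, 120, 160), label="Can", confidence=0.91),
    Detection(box=(140, 60, 260, 180), label="Snack Wrapper", confidence=0.87),
]

for d in detections:
    print(f"{d.label}: {d.confidence:.2f} at {d.box}")
```

The model emits many such candidates and keeps only those whose confidence clears a chosen cutoff.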
In the example below, we can see what we mean by localization and classification. The AI model identifies two regions of interest and provides a bounding box around each object. It then classifies these regions into the classes it finds most probable, in this case "Can" and "Snack Wrapper." The rest of Zabble's AI system then matches the items to the list of contaminants provided by the customer in the template. Finally, since this is a Landfill bin, it selects "Can" as a contaminant.
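The matching step described above can be sketched as a simple lookup: given a bin type and the detected labels, flag any label that appears in the customer's contaminant template for that bin. The template contents and function name here are hypothetical, for illustration only.

```python
# Hypothetical contaminant template: per bin type, which detected
# items count as contaminants (items that do NOT belong in that bin).
template = {
    "Landfill": {"Can", "Bottle"},
    "Recycling": {"Food Scraps"},
}

def flag_contaminants(bin_type, detected_labels):
    """Return the detected labels that are contaminants for this bin."""
    contaminants = template.get(bin_type, set())
    return [label for label in detected_labels if label in contaminants]

# A "Can" in a Landfill bin is flagged; the "Snack Wrapper" is not.
print(flag_contaminants("Landfill", ["Can", "Snack Wrapper"]))  # ['Can']
```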
Object detection within waste management presents challenges due to the high density and unstructured nature of waste environments, which can lead to inaccurate predictions.
A key challenge is managing intraclass variation (differences within a class) and interclass variation (similarities between classes). They may be inherent or due to image conditions like varying camera modalities, image noise, distortion, and compression artifacts. Additionally, models must function in unconstrained environments with challenges like poor lighting and occlusion.
In the waste management industry, spatial arrangements are especially complex, as objects are densely packed and often appear disfigured or crushed, creating a chaotic environment. This increases object variability, making accurate classification and localization more difficult. We have to handle all of these challenges while keeping our models small enough to run efficiently on mobile devices, which limits their brain capacity. But that's part of the fun! Overall, object detection in waste management remains an area of active research, and at Zabble, we continuously work to improve our approach.
Object detection metrics are essential for understanding AI models and evaluating their performance. The field relies on a set of well-known metrics to evaluate models, and researchers have designed standardized benchmarks to compare models against each other and determine which one is best suited for a given need.
Object detection models are often evaluated based on their accuracy using a well-known benchmark dataset called COCO, which challenges models to detect and classify common objects. A more difficult benchmark is Open Images V7, created by Google, which includes a wider range of objects and more complex scenes. Models that perform well on COCO often exhibit significantly lower accuracy on Open Images V7.
In other words, the complexity of the scene or application space, the number of classes being identified, and the amount of available training data all play a major role in a model's accuracy. Unfortunately, no standard benchmark exists for object detection in waste management, so we evaluate our models using the widely adopted object detection metric known as Mean Average Precision.
Mean Average Precision (mAP) is a commonly used metric that captures a model's ability to both classify an object within an image and localize it. While you can think of mAP as accuracy, it's actually more complicated than that. For our purposes, we may refer to mAP and accuracy interchangeably.
In the following example, there is a ground truth bounding box in green and two distinct predicted bounding boxes in red. To assess the accuracy of localization, we use Intersection over Union (IoU), which measures how closely a predicted bounding box aligns with the ground truth box. If the predicted bounding box has at least 50% overlap with the true bounding box, it is counted as correct.
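IoU is simple enough to compute by hand. The sketch below, with made-up box coordinates, shows the standard calculation: the area where the two boxes overlap, divided by the total area they cover together.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x_min, y_min, x_max, y_max)."""
    # Overlapping rectangle (clamped to zero if the boxes don't touch)
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

ground_truth = (0, 0, 100, 100)
prediction = (50, 0, 150, 100)   # shifted half a box-width to the right
print(round(iou(ground_truth, prediction), 3))
```

Note that this prediction covers half of the ground truth box yet scores only about 0.33, not 0.5, because IoU penalizes both the missed area and the extra area. It would therefore not count as correct at a 50% IoU threshold.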
We'll spare you the math, but from here, we aggregate all correct and incorrect predictions and average them to calculate mAP. In this example, an IoU threshold of 0.5 (50%) is used, which we refer to as mAP50. The threshold can also be raised to be more stringent, depending on the needs of the application. In object detection, you will often see both mAP50 and mAP50-95. mAP50-95 is the average of the mAP calculations across IoU thresholds ranging from 0.5 to 0.95 in increments of 0.05.
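The averaging step above can be sketched in a few lines. The per-threshold mAP values below are made up purely for illustration; in practice each one comes from a full precision/recall computation at that IoU threshold.

```python
# The ten IoU thresholds used for mAP50-95: 0.50, 0.55, ..., 0.95
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

def map50_95(map_per_threshold):
    """Average mAP across the ten IoU thresholds from 0.5 to 0.95."""
    return sum(map_per_threshold) / len(map_per_threshold)

# Illustrative (not real) per-threshold mAP values: accuracy drops
# as the IoU requirement becomes stricter.
example = [0.80, 0.78, 0.75, 0.70, 0.64, 0.56, 0.45, 0.32, 0.18, 0.06]
print(round(map50_95(example), 3))  # 0.524
```

This is why mAP50-95 is always lower than mAP50: the stricter thresholds drag the average down.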
Which one should you use or maximize? That largely depends on your goals. If detecting object presence is the priority and rough localization is sufficient, mAP50 is robust enough. However, for applications where precise localization is critical, such as autonomous driving, mAP50-95 is the better choice. At Zabble, we optimize for mAP50 while using mAP50-95 as a comparison metric against other models. By using both metrics in conjunction, we gain a more comprehensive understanding of our object detection models and make informed decisions about how to optimize their performance for different applications.
To learn more about how ZabbleAI's object detection models perform, read our next blog post, titled "How Zabble's Object Detection Model Stacks Up Against Industry Benchmarks", where we delve deeper into how Zabble utilizes these metrics to develop, test, and evaluate its state-of-the-art object detection AI models!
If you want to know more about how Zabble applies these AI models to detect contamination through our mobile application, please reach out to us for a live demo.
Wednesday, April 2, 2025