Orchard Segmentation

A key challenge for harvesting robots is perceiving their surroundings. A real orchard is inherently complex: the robot must navigate and work around obstacles such as branches and foliage while accurately identifying and reaching the fruit.

In response to this challenge, we developed a Panoptic-DeepLab architecture tailored to orchard environments. The network identifies the main obstacles and also pinpoints the target, which for our harvesting robot is apples. It segments each scene (see Fig. 1) into five categories: apple, branch, foliage, sky, and ground. Our dataset covers a wide range of orchard scenes, fruit morphologies, and environmental conditions, giving a broad representation of real-world fruit detection scenarios.

Figure 1. Orchard segmentation results.
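
To illustrate how a five-category panoptic output might be consumed downstream, here is a minimal sketch that merges a semantic map and an instance map into one panoptic map, then counts pixels per category and distinct apples. The class ids, the `LABEL_DIVISOR` value, and the choice to treat only apples as countable instances are assumptions made for this example; the encoding `class_id * divisor + instance_id` is a common convention, not necessarily the one used in our system.

```python
import numpy as np

# Hypothetical class ids: the five categories are from the text, the numbering is not.
CLASSES = {0: "apple", 1: "branch", 2: "foliage", 3: "sky", 4: "ground"}
APPLE_ID = 0
THING_IDS = [APPLE_ID]   # assumption: only apples are countable instances
LABEL_DIVISOR = 1000     # assumption: upper bound on instances per class


def merge_panoptic(semantic, instance):
    """Encode class and instance in one map: class_id * LABEL_DIVISOR + instance_id."""
    semantic = np.asarray(semantic, dtype=np.int64)
    instance = np.asarray(instance, dtype=np.int64)
    is_thing = np.isin(semantic, THING_IDS)
    # Branch, foliage, sky, and ground pixels carry no instance id.
    return semantic * LABEL_DIVISOR + np.where(is_thing, instance, 0)


def summarize(panoptic):
    """Return pixel counts per category and the number of distinct apple segments."""
    class_ids = panoptic // LABEL_DIVISOR
    pixels = {name: int((class_ids == cid).sum()) for cid, name in CLASSES.items()}
    apple_count = len(np.unique(panoptic[class_ids == APPLE_ID]))
    return pixels, apple_count


if __name__ == "__main__":
    semantic = [[0, 0, 3], [0, 1, 4], [2, 1, 4]]
    instance = [[1, 1, 0], [2, 0, 0], [0, 0, 0]]
    print(summarize(merge_panoptic(semantic, instance)))
```

Keeping class and instance in a single integer map makes it cheap to recover either one with integer division or modulo.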

Because branches pose the most significant obstacle, we further developed skeleton-lead CNNs. This network skeletonizes the branches (see Fig. 2), turning them into graph structures and improving branch segmentation accuracy. As a by-product, the branch graphs can be combined with the corresponding depth images to efficiently construct a 3D representation of the scene (see Fig. 3). The segmentation, branch graphs, and 3D representation together serve as input for the robot's planning and decision-making.
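
As a rough illustration of what turning a skeleton into a graph can mean, the sketch below takes an already skeletonized binary mask and links neighbouring skeleton pixels into an adjacency structure, then reads off branch tips and branching points. It is not the skeleton-lead CNN itself: the function names and the choice of 8-connectivity are assumptions for this example, and the skeleton is assumed to be one pixel wide.

```python
import numpy as np

# Offsets of the 8 neighbours around a pixel (assumption: 8-connectivity).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def skeleton_to_graph(skeleton):
    """Map each skeleton pixel (row, col) to the list of its skeleton neighbours."""
    skel = np.asarray(skeleton, dtype=bool)
    pixels = {(int(r), int(c)) for r, c in zip(*np.nonzero(skel))}
    graph = {p: [] for p in pixels}
    for r, c in pixels:
        for dr, dc in NEIGHBOURS:
            q = (r + dr, c + dc)
            if q in pixels:
                graph[(r, c)].append(q)
    return graph


def endpoints_and_junctions(graph):
    """Degree-1 pixels are branch tips; pixels with degree >= 3 are branching points."""
    ends = sorted(p for p, nbrs in graph.items() if len(nbrs) == 1)
    junctions = sorted(p for p, nbrs in graph.items() if len(nbrs) >= 3)
    return ends, junctions


if __name__ == "__main__":
    # A small Y-shaped skeleton: two diagonal arms meeting a vertical stem.
    skel = np.zeros((5, 5), dtype=bool)
    for r, c in [(0, 0), (1, 1), (2, 2), (1, 3), (0, 4), (3, 2), (4, 2)]:
        skel[r, c] = True
    print(endpoints_and_junctions(skeleton_to_graph(skel)))
```

With 8-connectivity, several adjacent pixels around a junction can all reach degree three or more (for example at a T-shaped crossing), so a fuller pipeline would merge such clusters into a single node before further use.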

Figure 2. Branch graphs output by skeleton-lead CNNs.
Figure 3. 3D representation of branches and apples.
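
One plausible way to lift branch-graph nodes or apple pixels into 3D from a depth image is pinhole back-projection. The sketch below assumes a depth image aligned with the color image, depth in metres, and known camera intrinsics `fx`, `fy`, `cx`, `cy`; all of these names and conventions are assumptions for illustration, not a description of our actual pipeline.

```python
import numpy as np


def backproject(pixels, depth, fx, fy, cx, cy):
    """Lift (row, col) pixels to camera-frame XYZ points with a pinhole model.

    X = (col - cx) * Z / fx,  Y = (row - cy) * Z / fy,  Z = depth[row, col].
    """
    pixels = np.asarray(pixels, dtype=np.int64).reshape(-1, 2)
    rows, cols = pixels[:, 0], pixels[:, 1]
    z = np.asarray(depth, dtype=np.float64)[rows, cols]
    # Assumption: non-positive or non-finite depth marks a missing measurement.
    valid = np.isfinite(z) & (z > 0)
    rows, cols, z = rows[valid], cols[valid], z[valid]
    x = (cols - cx) * z / fx
    y = (rows - cy) * z / fy
    return np.stack([x, y, z], axis=1)


if __name__ == "__main__":
    depth = np.full((4, 4), 2.0)
    depth[0, 0] = 0.0  # simulated hole in the depth image
    nodes = [(0, 0), (2, 2), (1, 3)]  # e.g. branch-graph nodes as (row, col)
    print(backproject(nodes, depth, fx=2.0, fy=2.0, cx=2.0, cy=2.0))
```

Because only the graph nodes (rather than every branch pixel) need to be lifted, the 3D structure can be built from a small number of depth lookups.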