
How to Improve Computer Vision in Robotics for Agriculture?

Growth in the world population has driven a corresponding rise in demand for agricultural products. With limited resources, increasing food production to meet that growing demand is a difficult task. Food production remains one of the most important occupations among rural people, yet its methods are often underdeveloped. Now, artificial intelligence and robotics in agriculture are leading this industry with powerful computer vision technology that trains machines to improve farming and farm productivity. Researchers, engineers, and farmers have come up with a variety of solutions to these challenges, including better farming techniques, precision farming, and farm automation.

Computer Vision for Agriculture

Computer vision may seem simple enough on the surface, but underneath it is a complex, interdisciplinary field that draws on a wide range of technologies, both old and new.

Depending on the imaging requirements, computer vision can use a variety of cameras as the "eyes" of the machine. A common approach to fruit detection is to use a colour camera to identify fruit on the tree and an additional stereo camera to estimate each fruit's relative position for automatic harvesting. But that is far from the only setup: infrared, multispectral, thermal, and even 3D cameras are all in use, and detection itself can be based on points or curves, or on an object's shape or texture. For agricultural applications, detecting a given object is not enough. Produce is rarely uniform; it varies in size and colour and requires additional processing to identify fruit accurately.

In short, the "seeing" aspect of computer vision is just the start. The next step is to identify, or understand, the image. "Understanding" an image is quite a complex matter, and the process generally relies on machine learning or deep learning. Deep learning is essentially a subset of machine learning that uses artificial neural networks to collect information, identify patterns, and learn as it goes. It is more complex because it mimics the way the human brain learns, and it can be applied to hard problems such as natural language processing and image recognition.
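As a concrete illustration of the "seeing" step, here is a minimal sketch of colour-based fruit detection with OpenCV. The file name, hue thresholds, and area cutoff are placeholder assumptions; a real system would tune them per crop, camera, and lighting.

```python
import cv2
import numpy as np

# Hypothetical input image; any photo of a fruit tree would do.
image = cv2.imread("orchard.jpg")

# Convert to HSV, where hue isolates colour better than raw RGB.
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Assumed hue range for ripe red fruit (red wraps around the hue
# circle, so a full system would also check the 170-180 range).
lower = np.array([0, 120, 70])
upper = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# Remove speckle noise, then extract candidate fruit regions.
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for c in contours:
    if cv2.contourArea(c) > 500:  # arbitrary minimum size in pixels
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
```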

Computer Vision in Robotics for Path Detection

In agriculture, well-trained robots can perform a variety of tasks: planting, weeding, harvesting, and so on. Autonomous agricultural robots have recently been widely adopted to increase crop productivity and work efficiency, and a navigation system is an essential part of any such robot. Computer vision-based navigation systems are especially common thanks to their low cost, ease of use, and the wide availability of vision sensors. One of their major challenges is the precise detection of crop rows to guide the robot. Path detection is a core problem for mobile robots and autonomous vehicles in general, yet most current methods give reliable results only in certain structured environments. This blog offers a new vision-based approach to finding crop rows across different fields.
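For context on why structured environments are easier, the sketch below shows a classical crop-row baseline that needs no training: an excess-green (ExG) vegetation index followed by a Hough line fit. This is a common reference approach, not the method used in this post, and the file name and Hough parameters are assumptions.

```python
import cv2
import numpy as np

# Hypothetical field image from the robot's front camera.
frame = cv2.imread("field.jpg")
b, g, r = cv2.split(frame.astype(np.float32))

# Excess-green index (ExG = 2G - R - B), a standard way to separate
# vegetation from soil without any learned model.
exg = cv2.normalize(2 * g - r - b, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, veg = cv2.threshold(exg, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Fit straight lines to the vegetation mask as crop-row candidates.
lines = cv2.HoughLinesP(veg, 1, np.pi / 180, threshold=100,
                        minLineLength=80, maxLineGap=20)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
```

Methods like this break down when rows curve, weeds blur the row boundaries, or lighting changes, which is what motivates the learning-based approach described next.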

To train a computer vision-based AI model, annotated images are used so that machine learning algorithms can make the object of interest recognizable to the machine and generalize to similar predictions. To solve the navigation problem and detect the path, image segmentation is generally the best choice: it assigns every pixel to a crop-row class or a background class (further classes, such as persons or vehicles, can be added as needed). Many types of models can handle this task, but limited annotated data is a major obstacle to generalizing a trained model to crop rows in different fields. Although CNNs have made great strides in image segmentation, they generally require a large number of densely annotated training images and still struggle to generalize to new objects of the same category. Few-shot segmentation was therefore developed to learn how to perform image segmentation from only a few annotated examples. Among the available few-shot segmentation models, we use PANet, a novel Prototype Alignment Network that learns class-specific prototype representations from a few support images in an embedding space, then segments the query images by matching each pixel to the learned prototypes. To train the PANet model, we annotated 170 images, saved them in Pascal VOC format, and split them as follows:

Table 1: Split of the annotated dataset into training and validation sets

Training (70%): 120 images
Validation (30%): 50 images
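A minimal sketch of how such a 70/30 split might be scripted is shown below. The folder layout, and the assumption that each JPEG sits next to its Pascal VOC XML annotation, are illustrative, not the actual project pipeline.

```python
import random
import shutil
from pathlib import Path

# Hypothetical layout: 170 annotated images, each JPEG stored next
# to its Pascal VOC XML annotation file.
src = Path("dataset/annotated")
train_dir, val_dir = Path("dataset/train"), Path("dataset/val")
train_dir.mkdir(parents=True, exist_ok=True)
val_dir.mkdir(parents=True, exist_ok=True)

images = sorted(src.glob("*.jpg"))
random.seed(42)                    # fixed seed so the split is reproducible
random.shuffle(images)

split = int(len(images) * 0.7)     # 70% training, 30% validation
for i, img in enumerate(images):
    dest = train_dir if i < split else val_dir
    shutil.copy(img, dest / img.name)
    xml = img.with_suffix(".xml")  # copy the matching annotation too
    if xml.exists():
        shutil.copy(xml, dest / xml.name)
```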

Before training PANet, we set the number of ways to one, the number of shots to between 5 and 7, and the number of training steps to 30,000. After training, the model achieved a 72% mIoU score on the validation dataset. Here are some examples from testing the model on new images:

Image 1: Sample results from testing the model on new images
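For reference, the mIoU score quoted above can be computed from predicted and ground-truth label maps as in the minimal NumPy sketch below; it is independent of the PANet code itself, and the two-class setup mirrors the crop-row/background segmentation described earlier.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union between two integer label maps.

    Here num_classes=2 covers the background (0) and crop-row (1) classes.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:              # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x3 label maps: each class overlaps on 2 of its 4 union pixels.
pred = np.array([[0, 1, 1],
                 [0, 0, 1]])
target = np.array([[0, 1, 0],
                   [0, 1, 1]])
print(mean_iou(pred, target))  # 0.5
```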

For the next step, the challenge will be detecting the centre line of the crop rows and comparing it to the vertical middle axis of the image (the camera was placed at the centre of the robot), which yields the lateral offset the robot must correct.
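One simple way to turn the segmentation mask into such a comparison is a per-row centroid estimate, as in the sketch below. This is an illustrative approach under that assumption, not the project's actual controller; the function name and sign convention are hypothetical.

```python
import numpy as np

def lateral_offset(mask):
    """Estimate how far the crop-row centre line sits from the image centre.

    mask: binary segmentation output (1 = crop row, 0 = background).
    Returns an offset in pixels; the sign tells the robot which way to steer.
    """
    _, w = mask.shape
    cols = np.arange(w)
    centres = []
    for row in mask:               # centroid column of crop-row pixels, per image row
        if row.sum() > 0:
            centres.append((cols * row).sum() / row.sum())
    if not centres:
        return 0.0                 # no crop row visible: hold course
    return float(np.mean(centres) - w / 2)

# Toy 4x6 mask with the crop row one column right of the image centre.
mask = np.zeros((4, 6), dtype=np.uint8)
mask[:, 4] = 1
print(lateral_offset(mask))  # 1.0 (row centre at column 4, image centre at 3)
```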

With an innate love of writing code to solve complex problems, Ahmed Sghir has been programming since the age of eighteen, having taught himself to code through books and online courses. He is fluent in a variety of programming languages and an excellent team player when collaborating on code. Ahmed Sghir studied Mechatronics Engineering at the University of Tunis in Tunisia. For his final-year project, he worked as an independent researcher on complex problems in brain tumor segmentation using multiple technologies. After graduation, he worked remotely for Trabotyx as a Computer Vision and Machine Learning freelancer for six months, helping bring autonomous agri-robots to production.