Of course, the problems caused by excess fertilization would be alleviated if farmers could make accurate determinations of the right amounts of fertilizer and seed to apply to their lands. Figuring out how much to apply isn’t too difficult if the cropland in question has uniform physical characteristics, such as elevation and soil quality. “When you have a very homogeneous field, and very flat, things are not varied—it’s easier to predict,” says Barbosa. The simple multiple linear regression algorithms that are already in use do it just fine.
The challenge is much greater, though, if the important attributes vary significantly across a field.
To tackle that challenge, Barbosa developed four different convolutional neural network (CNN) architectures for predicting yield. Each of them uses a different approach to combining the available data on a specific cornfield. To test the CNNs, he used data that had been collected from nine cornfields located in Illinois, Ohio, Nebraska, and Kansas.
He divided each field into a grid of 5-by-5-meter cells and provided each CNN with data on five attributes of each cell: elevation, soil quality, satellite imagery, and the levels of fertilization and seeding, both of which varied across the field. He then evaluated how accurate each CNN’s yield predictions were.
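As a rough illustration of the data layout described above, the sketch below stores one record per 5-by-5-meter cell and then splits the grid into one 2-D map per attribute. The class name `Cell`, the field names, the units, and the helpers `grid_shape` and `to_channels` are all hypothetical; the article does not say how the data was actually stored, and real satellite imagery would likely have more than one value per cell.

```python
from dataclasses import dataclass

CELL_SIZE_M = 5  # cell edge length from the article: 5-by-5-meter cells


@dataclass
class Cell:
    """One grid cell with the five attributes named in the article.

    Field names, units, and scalar types are illustrative assumptions.
    """
    elevation: float        # assumed unit: meters
    soil_quality: float     # assumed to be a single numeric index
    satellite: float        # assumed single-band value per cell
    fertilizer_rate: float  # assumed applied rate
    seed_rate: float        # assumed applied rate


ATTRIBUTES = ["elevation", "soil_quality", "satellite", "fertilizer_rate", "seed_rate"]


def grid_shape(width_m: float, height_m: float) -> tuple[int, int]:
    """Number of whole cells (rows, cols) that fit in a rectangular field."""
    return int(height_m // CELL_SIZE_M), int(width_m // CELL_SIZE_M)


def to_channels(grid: list[list[Cell]]) -> dict[str, list[list[float]]]:
    """Split a grid of cells into one 2-D map per attribute."""
    return {
        name: [[getattr(cell, name) for cell in row] for row in grid]
        for name in ATTRIBUTES
    }


# Toy 20 m x 10 m field with made-up values, purely for illustration.
rows, cols = grid_shape(width_m=20, height_m=10)
grid = [
    [Cell(200.0 + r, 0.5, 0.3, 150.0, 32000.0) for _ in range(cols)]
    for r in range(rows)
]
channels = to_channels(grid)
```

Keeping one map per attribute is convenient for the CNNs discussed next, because each architecture differs mainly in when those per-attribute maps are combined.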
Barbosa discovered that the most effective CNN, dubbed “Late Fusion” (LF), outperformed not just his other three CNNs, but also the prior solutions, including a random forest model and a multiple linear regression model.
The LF architecture differs from the other CNNs in that it first analyzes each one of the five cell attributes by itself across the entire cornfield, examining how each individual attribute varies across all the cells in the cornfield. Only afterwards are the separate findings on the five attributes brought together to form a cornfield-wide analysis encompassing all five features.
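The late-fusion idea described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Barbosa's model: the function names (`conv3x3`, `relu`, `late_fusion`), the single 3x3 filter per attribute, the fixed kernels, and the weighted-sum fusion step are all hypothetical, and a real CNN would learn its kernels and weights from data. Only two attributes with toy, centered values are used to keep the example short.

```python
def conv3x3(channel, kernel):
    """Same-size 3x3 filter with zero padding over one 2-D attribute map.

    As in most CNN libraries, this is technically cross-correlation.
    """
    rows, cols = len(channel), len(channel[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += channel[rr][cc] * kernel[dr + 1][dc + 1]
            out[r][c] = total
    return out


def relu(feature_map):
    """Element-wise nonlinearity: negative values become zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def late_fusion(channels, kernels, fusion_weights):
    """Extract features per attribute first, then combine them."""
    # Step 1: each attribute map is processed by its own branch.
    features = [relu(conv3x3(ch, k)) for ch, k in zip(channels, kernels)]
    # Step 2: only now are the branches fused, one weight per branch.
    rows, cols = len(features[0]), len(features[0][0])
    return [
        [sum(w * f[r][c] for w, f in zip(fusion_weights, features))
         for c in range(cols)]
        for r in range(rows)
    ]


# Toy inputs: an identity kernel simply passes each cell through.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
elevation = [[1.0, 2.0], [3.0, 4.0]]
fertilizer = [[-1.0, 1.0], [1.0, -1.0]]
fused = late_fusion([elevation, fertilizer], [identity, identity], [1.0, 10.0])
# fused == [[1.0, 12.0], [13.0, 4.0]]
```

The point of the sketch is the ordering: the nonlinearity is applied inside each branch, so no attribute influences another until the final combining step.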
Why would the LF strategy work the best? “Probably because of the physics of what’s happening in the field,” says Barbosa.
Early fusion, in which the attributes are combined before any features are extracted from them, would work better than late fusion if, for example, fertilizer interacted with soil quality in a very fine-grained way, so that the model needed to capture the attributes’ interaction at high resolution. Barbosa’s findings suggest that the attributes instead interact with each other in a more general way.
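For contrast, here is a minimal sketch of what early fusion could look like: every attribute map feeds a single multi-channel filter, so the attributes are already mixed inside each 3x3 window before any nonlinearity is applied. The function name `early_fusion`, the kernels, and the toy values are hypothetical and are not taken from Barbosa's architectures.

```python
def early_fusion(channels, kernels):
    """Single multi-channel 3x3 filter with zero padding, followed by ReLU.

    All attribute maps contribute to the same sum at every cell, so
    cell-level interactions between attributes are captured immediately.
    """
    rows, cols = len(channels[0]), len(channels[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for channel, kernel in zip(channels, kernels):
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            total += channel[rr][cc] * kernel[dr + 1][dc + 1]
            # The nonlinearity sees the attributes already combined.
            out[r][c] = max(0.0, total)
    return out


# Toy inputs: each kernel only looks at the center cell of its window.
elevation_kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
fertilizer_kernel = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]
elevation = [[1.0, 2.0], [3.0, 4.0]]
fertilizer = [[-1.0, 1.0], [1.0, -1.0]]
mixed = early_fusion([elevation, fertilizer],
                     [elevation_kernel, fertilizer_kernel])
# mixed == [[0.0, 12.0], [13.0, 0.0]]
```

In the two cells where the output is zero, a negative fertilizer term cancels the elevation signal before the nonlinearity. A late-fusion design would instead apply the nonlinearity to each attribute separately and combine the results afterwards, so that kind of cell-by-cell cancellation would not occur.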
“Because of the characteristics of fields, it’s more efficient to make the feature extraction from each input independently and then combine them. It’s a more simple model and I think that’s one of the reasons it worked better,” he says.
Since the work featured in deeplearning.ai was published, Barbosa has had a second paper accepted that refines the optimization approach, which finds the best fertilizer and seed rate maps to improve yield. The next important step will be to quantify the uncertainty in the LF CNN’s recommendations, as that will be key to farmers’ decision-making process.
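To make the idea of searching for fertilizer and seed rate maps concrete, here is a deliberately simple sketch. It is not the optimization method from Barbosa's papers, which the article does not describe: it swaps in an exhaustive per-cell search over a few candidate rates, and `toy_yield` is a made-up stand-in for a trained yield model. All names and numbers are hypothetical.

```python
from itertools import product


def toy_yield(soil, fert, seed):
    """Hypothetical stand-in for a trained yield model (not the article's CNN).

    Assumes yield peaks when the fertilizer rate matches the soil index
    and the seed rate is 0.5, with all quantities scaled to [0, 1].
    """
    return -(fert - soil) ** 2 - (seed - 0.5) ** 2


def best_rate_maps(soil_map, fert_options, seed_options, predict=toy_yield):
    """For each cell, pick the (fertilizer, seed) pair with the highest
    predicted yield, and return the two resulting rate maps."""
    fert_map, seed_map = [], []
    for row in soil_map:
        fert_row, seed_row = [], []
        for soil in row:
            fert, seed = max(
                product(fert_options, seed_options),
                key=lambda rates: predict(soil, rates[0], rates[1]),
            )
            fert_row.append(fert)
            seed_row.append(seed)
        fert_map.append(fert_row)
        seed_map.append(seed_row)
    return fert_map, seed_map


soil_map = [[0.1, 0.9], [0.5, 0.6]]
fert_map, seed_map = best_rate_maps(soil_map, [0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
# fert_map == [[0.0, 1.0], [0.5, 0.5]]
# seed_map == [[0.5, 0.5], [0.5, 0.5]]
```

A per-cell search like this ignores the spatial context that a CNN uses and says nothing about how confident the prediction is, which is why quantifying uncertainty matters before such maps are handed to farmers.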
Source: illinois.edu