Cleaning Yield Monitor Data Matters for Better Farm Decisions


Yield monitor data is one of the most essential datasets growers use in multiple decision-making processes. This data is in point-based geospatial format, collected by harvesters using yield monitor systems. Yield monitor data can provide information on grain flow, moisture, speed, swath width, time and date, and GPS locations.  

While harvesting in the field, yield data points are typically collected every 1 to 2 seconds. The harvester travel speed and data logging rate determine the distance between two yield monitor points (usually 5 to 10 feet). The header width (swath width, ranging from 15 to 40 feet) determines the spacing between two adjacent harvest passes. This yield monitor data is used to create maps that help growers understand crop performance based on the field variability in nutrients, topography, moisture, management, varieties, and pest/disease pressure. Using historical field yield maps, growers can select the appropriate seeding rate, site-specific fertilizer rates, or choose the proper variety/hybrid by identifying yield zones, such as high-, medium-, or low-performing areas within the field (Figure 1).
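The point spacing described above is simple arithmetic: travel speed multiplied by the logging interval. The sketch below is only an illustration. The function names and the example values (4 mph, a 1-second interval, a 30-foot header) are assumptions, not settings from any particular yield monitor.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600
SQFT_PER_ACRE = 43560

def point_spacing_ft(speed_mph, log_interval_s):
    """Distance traveled between two logged yield points, in feet."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * log_interval_s

def area_per_point_ac(spacing_ft, swath_width_ft):
    """Ground area represented by one yield point, in acres."""
    return spacing_ft * swath_width_ft / SQFT_PER_ACRE

# Illustrative values only: 4 mph, one record per second, 30 ft header.
spacing = point_spacing_ft(speed_mph=4.0, log_interval_s=1.0)
area = area_per_point_ac(spacing, swath_width_ft=30.0)
print(f"{spacing:.1f} ft between points, {area:.4f} ac per point")
```

Because each point represents only a few thousandths of an acre, a single bad reading can visibly distort a yield map.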
 


Figure 1. Spatial distribution of soybean yield across a quarter-section field. The yield variability map was created using yield monitor data, and such maps illustrate the variability of yield, which is influenced by within-field topography (such as summits, backslopes, and foot slopes) and different soil types. Soybean yields ranged from 20 to 60 bushels per acre. Image provided by Deepak Joshi, K-State Extension.

 

Sources of error in the yield monitor data

Before yield maps are used to guide decisions, the underlying data must be cleaned to remove erroneous points. Yield monitor records can be unreliable or inaccurate for several reasons, such as:

  • Speed change: Inconsistent harvester speed, such as high speed at the beginning of a harvester pass or slow speed at the end while turning the harvester from one harvest pass to another, can result in extremely low or extremely high yield readings.
  • Turn and overlap areas: Driving the harvester with the header down in the harvesting position across already harvested areas may result in zero yield readings, because the system keeps logging while no grain passes the grain flow sensor.
  • Incomplete harvest pass: When a field is nearly harvested, some passes may not use the full header width. For example, if a 40-foot-wide header normally takes 16 corn rows in a single pass, the final remaining strip may be narrower, such as only 11 rows. If the monitor still assumes the full header width, the calculated yield per acre will be inaccurately low.
  • Flow delay: Flow delay is the time grain takes to travel through the combine and reach the flow sensor. Flow delays are typically 5 seconds and need to be corrected so that each yield reading is assigned to the location where the grain was actually harvested.
  • Moisture delay: Similarly, moisture delay is the time lag between harvesting the crop and the moisture sensor measuring its moisture content. It also needs to be corrected to obtain accurate, high-quality data.
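The error sources listed above can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the procedure used by any specific yield monitor or cleaning software. The field names (`flow_lb_s`, `speed_ft_s`, `swath_ft`), the speed and yield thresholds, and the function names are all hypothetical. The 56 lb/bu figure is the standard bushel weight for corn, and moisture adjustment is ignored for simplicity.

```python
SQFT_PER_ACRE = 43560
CORN_LB_PER_BU = 56.0  # standard bushel weight for corn; moisture adjustment ignored

def yield_bu_ac(flow_lb_s, speed_ft_s, swath_ft, lb_per_bu=CORN_LB_PER_BU):
    """Yield per acre from grain flow, travel speed, and harvested width."""
    acres_per_second = speed_ft_s * swath_ft / SQFT_PER_ACRE
    return flow_lb_s / lb_per_bu / acres_per_second

def shift_flow(points, delay_steps):
    """Pair each logged position with the flow recorded delay_steps records later."""
    shifted = []
    for i in range(len(points) - delay_steps):
        row = dict(points[i])
        row["flow_lb_s"] = points[i + delay_steps]["flow_lb_s"]
        shifted.append(row)
    return shifted

def clean(points, min_speed=3.0, max_speed=10.0, min_yield=10.0, max_yield=350.0):
    """Drop points with unusual speed, no grain flow, or implausible yield.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    kept = []
    for p in points:
        if not (min_speed <= p["speed_ft_s"] <= max_speed):
            continue  # speed change at the start or end of a pass, or while turning
        if p["flow_lb_s"] <= 0:
            continue  # header down over already harvested ground
        y = yield_bu_ac(p["flow_lb_s"], p["speed_ft_s"], p["swath_ft"])
        if min_yield <= y <= max_yield:
            kept.append({**p, "yield_bu_ac": y})
    return kept

# Incomplete pass: a strip that truly yields 200 bu/ac, cut 27.5 ft wide
# (11 of 16 rows) at 6 ft/s, delivers about 42.4 lb/s of grain.
print(f"{yield_bu_ac(42.424, 6.0, 27.5):.1f}")  # width actually cut
print(f"{yield_bu_ac(42.424, 6.0, 40.0):.1f}")  # full 40 ft header assumed
```

With a 1-second logging interval, a 5-second flow delay corresponds to `delay_steps=5` in `shift_flow`. Before shifting, it is worth checking whether the monitor or its software has already applied a delay correction, since applying it twice would misplace the readings again. The last two lines show the incomplete-pass effect: the same grain flow gives roughly 200 bu/ac with the width actually cut, but only about 137.5 bu/ac when the full header width is assumed.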

Case study to demonstrate the importance of cleaning yield monitoring data

A corn field was harvested in the first week of October in McPherson County, KS, using a combine harvester equipped with a yield monitoring system. The raw dataset contained yield records collected every one to two seconds as the combine moved through the field. Each harvest pass was 30 feet wide, and consecutive yield points within a pass were 8.6 feet apart on average (Figure 2a). The raw data included a range of operational and agronomic information such as grain yield, grain flow rate, grain moisture content, combine travel speed, swath width, time and date, and the GPS coordinates of every harvested point. In total, the raw dataset consisted of 13,044 individual data points, with a mean yield of 156 bu/ac across the field. Raw yield values ranged from 5 to 3,963 bu/ac.

The raw data were then cleaned to remove erroneous points, resulting in a more accurate representation of the true field performance (Figure 2b). Cleaning removed approximately 1,600 erroneous points, reducing the standard deviation from 67 bu/ac to 22 bu/ac (Table 1). Standard deviation is a measure of how variable the yield values are. The higher standard deviation in the raw data indicates that many of the extreme values in the raw dataset were not representative of actual field conditions; these outliers inflated the apparent variability of yield. After cleaning, the mean yield changed very little (156 bu/ac in the raw data to 154 bu/ac in the cleaned data), indicating that the true yield values were preserved, while the much lower standard deviation gives a far more accurate and reliable representation of the true spatial yield variability across the field.
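To see why a handful of extreme points can inflate the standard deviation while barely moving the mean, consider the small Python sketch below. The yield values are made up for illustration and are not taken from the case-study field. The 40 to 300 bu/ac limits and the `summarize` helper are likewise hypothetical.

```python
import statistics

def summarize(yields):
    """Basic summary statistics for a list of yield values (bu/ac)."""
    mean = statistics.mean(yields)
    std = statistics.stdev(yields)  # sample standard deviation
    return {
        "n": len(yields),
        "mean": mean,
        "std": std,
        "cv_pct": 100 * std / mean,  # coefficient of variation, in percent
        "min": min(yields),
        "max": max(yields),
    }

# Made-up values: eight plausible readings plus two erroneous extremes.
raw = [150, 148, 162, 155, 158, 151, 149, 160, 4, 306]

# Hypothetical plausibility limits, chosen only for this illustration.
cleaned = [y for y in raw if 40 <= y <= 300]

for label, data in (("raw", raw), ("cleaned", cleaned)):
    s = summarize(data)
    print(f"{label}: n={s['n']} mean={s['mean']:.1f} "
          f"std={s['std']:.1f} cv={s['cv_pct']:.0f}%")
```

In this toy dataset the two extreme values nearly cancel in the mean but dominate the standard deviation, which is the same pattern seen in the case study. The coefficient of variation is the standard deviation divided by the mean; for the raw case-study data, 67 / 156 × 100 is about 43%.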

 


Figure 2. Yield monitor data before cleaning (a) and after cleaning (b), highlighting the improvement in data quality for spatial analysis. Images by Deepak Joshi, K-State Extension.


Table 1. A comparison of statistical summaries for raw and cleaned yield monitor data highlights the importance of removing erroneous observations prior to analysis.

Statistics     Total data points   Mean (bu/ac)   STD* (bu/ac)   CV* (%)   Min (bu/ac)   Max (bu/ac)   Range (bu/ac)
Raw data       13,044              156            67             43        5             3,963         5 to 3,963
Cleaned data   12,434              154            22             15        41            348           41 to 348

*STD (standard deviation) and CV (coefficient of variation) are measures of variability.
 

Take-home message

Overall, yield monitor data is essential for understanding a field's spatial variability. It supports many management decisions, including seeding and fertilizer rates matched to within-field variability. However, the real value of this data is realized only after it has been cleaned and analyzed. Raw or uncleaned data may produce inaccurate yield maps, leading to poor decisions.

 

Deepak Joshi, Precision Agriculture Specialist
drjoshi@ksu.edu

Logan Simon, Southwest Area Agronomist
lsimon@ksu.edu

Tina Sullivan, Northeast Area Agronomist
tsullivan@ksu.edu

Lucas Haag, Cropping Systems Agronomist at Tribune
lhaag@ksu.edu


