Hi Simon,
Thanks very much for contacting us about this.
The short answer to your question is yes: I think you certainly have room to expand your granule search volume against the CMR, measured as the number of searches per second or per minute that the CMR handles.
Over the past 30 days, EarthEngineBot has issued an average of 11.7K requests/day, with a single-day max of 23.6K. During the largest single week in the past 3 months (the week of 5/4/25), there was an average of 53 requests/min, with spikes >= 159 requests/min 5% of the time and a max of 504 requests/min. During that week, 5% of the time, the CMR handled >= 5 requests/second, with a max of 12 requests/second.
This is not a large percentage of overall CMR Search traffic, which regularly exceeds 3M requests/day.
While the overall volume of EarthEngineBot queries is significantly lower than that of many other clients, the per-request search latency for EarthEngineBot is actually higher than the overall figure for all other clients. This doesn't necessarily mean you can't increase your throughput, only that, on average, EarthEngineBot search requests are more costly. For the month of April, your search requests took an average of 1.3 seconds to complete, with 5% taking 4+ seconds. That is something to be aware of as you increase the number of concurrent search threads.
The larger question of how much volume is acceptable versus too much is harder to answer. The CMR does not provide specific search volume guidance, because search signatures and target search spaces vary so widely. Put simply, searches against some datasets are more costly than others, primarily because of the volume of available data. Some of your search requests may be handled more readily by the CMR than others, so even across your target datasets, some volume increases may be barely noticeable while others have more impact.
For that reason, we don't give a one-size-fits-all threshold for search volume. At some point a client could load the system so heavily that per-search latency rises, the added volume yields diminishing returns for that client, and overall system performance and stability suffer for all users. It is difficult to predict that threshold in general.
I would be cautious about increasing the number of worker threads if it means increasing the number of concurrent queries against the same datasets. We're working to improve this, but we've seen issues with overloading a single dataset when a client simply runs more threads against different temporal ranges of that dataset. If you could structure your harvesting queries so that the additional workers cover more target datasets, rather than more subsets of the same datasets, that would help both your performance and the system's stability.
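In case it is useful, here is a rough Python sketch of one way a client could keep added workers spread across datasets. It is only an illustration under assumptions: the names (PerDatasetLimiter, harvest, search_fn) are hypothetical, and the per-dataset cap of 2 is a placeholder, not a limit published by the CMR. The search_fn argument stands in for whatever function your harvester already uses to issue a granule search.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 20      # total worker threads (example value)
MAX_PER_DATASET = 2   # hypothetical cap, NOT a CMR-published limit


class PerDatasetLimiter:
    """Allows at most `limit` in-flight searches per dataset id."""

    def __init__(self, limit):
        self._limit = limit
        self._lock = threading.Lock()
        self._semaphores = {}

    def _semaphore_for(self, dataset_id):
        # Create the semaphore lazily, guarded so two threads never race.
        with self._lock:
            if dataset_id not in self._semaphores:
                self._semaphores[dataset_id] = threading.BoundedSemaphore(self._limit)
            return self._semaphores[dataset_id]

    def run(self, dataset_id, fn, *args):
        # Block until a slot for this dataset is free, then run the search.
        with self._semaphore_for(dataset_id):
            return fn(*args)


def harvest(tasks, search_fn, max_workers=MAX_WORKERS, per_dataset=MAX_PER_DATASET):
    """Run search_fn(dataset_id, temporal_range) for every task.

    `tasks` is a list of (dataset_id, temporal_range) pairs. Results are
    returned in the same order as `tasks`.
    """
    limiter = PerDatasetLimiter(per_dataset)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(limiter.run, dataset_id, search_fn, dataset_id, temporal_range)
            for dataset_id, temporal_range in tasks
        ]
        return [future.result() for future in futures]
```

A worker that is waiting for a dataset slot still occupies a thread, so a scheme like this works best when the task list alternates between datasets rather than listing every temporal range of one dataset back to back.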
I suggest doubling your workers from 10 to 20, running for a while, and seeing how it performs. Please feel free to contact us at support@earthdata.nasa.gov for a performance and stability check. If that looks OK, then you should be able to increase it again.
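If you want a quick client-side check while you run at the higher worker count, something along these lines may help. It is a sketch only: it compares your own measured request durations against the April figures quoted earlier in this message (1.3 seconds on average, 5% of requests at 4+ seconds). The function names and the 25% tolerance are assumptions for illustration, not CMR guidance.

```python
import statistics

BASELINE_MEAN_S = 1.3   # April average, from this message
BASELINE_P95_S = 4.0    # April: 5% of requests took 4+ seconds


def latency_summary(durations_s):
    """Return the mean and the nearest-rank 95th percentile, in seconds."""
    if not durations_s:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_s)
    # Nearest-rank percentile: ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {"mean": statistics.fmean(ordered), "p95": ordered[rank - 1]}


def looks_degraded(summary, tolerance=1.25):
    """True if mean or p95 exceeds the baseline by more than `tolerance`.

    The 1.25 default is a hypothetical threshold chosen for illustration.
    """
    return (
        summary["mean"] > BASELINE_MEAN_S * tolerance
        or summary["p95"] > BASELINE_P95_S * tolerance
    )
```

For example, you could collect durations for an hour at 20 workers, call latency_summary on them, and hold off on a further increase if looks_degraded returns True.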
For general reference, here are a couple of links to our published guidance:
https://wiki.earthdata.nasa.gov/display/ED/CMR+Client+Partner+User+Guide#CMRClientPartnerUserGuide-BestPracticesforCMRClientOperations
https://wiki.earthdata.nasa.gov/display/CMR/CMR+Harvesting+Best+Practices
I hope this helps.
Regards,
John Teague
CMR Operations