As a followup to 3bf8c8ceb6 that added the
parse cache, add a small short lived cache on the workers to effectively
debounce the number of Software::new events sent up to the proxies.
User-Agents are highly repetitive, workers often see exact duplicate
user-agents on the same orig_h. Worse, due to NAT, virtualization, and
the proliferation of Electron based applications, variations of the same
user-agent can be seen at the same time. For example:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.6613.18 Safari/537.36 Zoom/6.2.0 (1855)
When these two user-agents are seen concurrently, the software framework
will log each flip as a new user-agent. This can be fixed separately on
the proxy side, but a reduction of Software::new events is still needed
to reduce cluster communication overhead as well as the load on the
proxies.
With a 10 minute cache on the workers, this should greatly reduce the
number of redundant user-agents logged in the software.log