NMS-19560: Pre populate metadata for each service #8319
Retrieve metadata once for each service instead of retrieving it for each RPC invocation. Whenever metadata updates, we should clear the parameters for the service.
cgorantla left a comment:
I haven't tested this thoroughly, which is why I kept it in Draft, but I think this should work. I guess we can apply a similar pattern to Collectd/Thresholding.
I really don't know if it's a good idea to tackle the performance problem in this way. For me, it carries too many risks of indeterminate behaviour, which would then be very difficult to investigate.
My idea for optimisation is as follows. Currently, the interpolator methods work with scope instances. I would change this to ScopeProvider, i.e. the scope is not always generated, but only when a string actually contains a metadata expression. If it works as I imagine, this would eliminate all getScopeFor...() calls for the thresholds, as we do not have any metadata expressions there in the default configuration. And even for other spots in OpenNMS this will eliminate a lot of calls to the EntityScopeProvider.
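The ScopeProvider idea above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual OpenNMS interpolator: the `Scope`, `ScopeProvider`, and `interpolate` names here are simplified stand-ins, and the single-expression replacement is deliberately naive. The point is only that the (potentially expensive) scope is never built when a string contains no `${...}` metadata expression.

```java
// Hypothetical sketch: lazy scope construction via a provider.
// Scope/ScopeProvider are simplified stand-ins for the real interfaces.
public class LazyInterpolation {
    interface Scope { String get(String key); }
    interface ScopeProvider { Scope getScope(); }

    // Only build the scope when the input actually contains a
    // metadata expression; plain strings skip all scope/DB work.
    static String interpolate(String input, ScopeProvider provider) {
        if (!input.contains("${")) {
            return input; // no expression: no scope lookup at all
        }
        final Scope scope = provider.getScope();
        // Naive single-expression replacement, for illustration only.
        final int start = input.indexOf("${");
        final int end = input.indexOf('}', start);
        final String key = input.substring(start + 2, end);
        return input.substring(0, start) + scope.get(key) + input.substring(end + 1);
    }

    public static void main(String[] args) {
        final boolean[] built = { false };
        final ScopeProvider provider = () -> {
            built[0] = true; // pretend this is the expensive DB-backed path
            return key -> "node-1";
        };
        final String plain = interpolate("5000", provider);
        assert !built[0] : "scope must not be built for plain strings";
        final String resolved = interpolate("${node:label}", provider);
        assert built[0] && "node-1".equals(resolved);
        System.out.println(plain + " " + resolved);
    }
}
```

With default threshold configurations containing no metadata expressions, every interpolation would take the early-return path, which is what eliminates the `getScopeFor...()` calls.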
    final Map<String, Object> raw = m_configService.getParameterMap();
    if (m_entityScopeProvider != null) {
        final Scope scope = new FallbackScope(
            m_entityScopeProvider.getScopeForNode(m_service.getNodeId()),
Our current implementation of a Scope retrieves, for instance, a node entity and delegates calls to the entity's properties. So it does not fill in strings eagerly but calls the entity's methods on demand. This means DAO objects are kept inside the scope, and additional handling for node updates isn't necessary, I think. My attempt to keep MapScope objects in the cache was not as successful as I had hoped: that approach created additional load on the database whenever the cache was refreshed.
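The difference between the two approaches described above can be sketched like this. Note `NodeEntity`, `LazyNodeScope`, and `snapshot` are illustrative names, not the real OpenNMS classes: the on-demand scope holds the entity and reads a property only when asked, while a MapScope-style snapshot re-reads every property each time the cache is rebuilt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch contrasting on-demand scopes with eager snapshots.
public class OnDemandScope {
    static class NodeEntity {
        int labelReads = 0; // counts property accesses (stand-in for DB load)
        String getLabel() { labelReads++; return "node-1"; }
    }

    // On-demand: keeps the entity and resolves properties only when asked.
    static class LazyNodeScope {
        private final NodeEntity entity;
        LazyNodeScope(NodeEntity entity) { this.entity = entity; }
        String get(String key) {
            return "label".equals(key) ? entity.getLabel() : null;
        }
    }

    // Eager snapshot (MapScope-style): copies all properties up front,
    // so every cache refresh re-reads the whole entity.
    static Map<String, String> snapshot(NodeEntity entity) {
        final Map<String, String> map = new HashMap<>();
        map.put("label", entity.getLabel());
        return map;
    }

    public static void main(String[] args) {
        final NodeEntity entity = new NodeEntity();
        final LazyNodeScope scope = new LazyNodeScope(entity);
        assert entity.labelReads == 0;  // nothing read yet
        scope.get("label");
        assert entity.labelReads == 1;  // read only on demand
        snapshot(entity);
        snapshot(entity);               // each refresh re-reads the entity
        assert entity.labelReads == 3;
    }
}
```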
    }
    intf.removeMetaData(context);
    getDao().update(intf);
    sendNodeMetadataUpdatedEvent(intf);
A poller scope consists of a service scope, an interface scope and a node scope. The node scope includes asset information and things like sys-object-id; the interface scope includes things like if-description. So just handling meta-data changes isn't enough, I think. We also need to send events when nodes, interfaces and services are altered via ReST. Also, a simple rescan can change some of the node's fields.
Yeah, there are two events I'm keeping track of: node updated and node scan completed. Both of these can take care of all the updates. If we have a gap somewhere in our ReST API, we can add those events.
    final Scope scope = new FallbackScope(
        m_entityScopeProvider.getScopeForNode(m_service.getNodeId()),
        m_entityScopeProvider.getScopeForInterface(m_service.getNodeId(), m_service.getIpAddr()),
        m_entityScopeProvider.getScopeForService(m_service.getNodeId(), m_service.getAddress(), m_service.getSvcName()),
Even in this case, scopes and interfaces are retrieved more often than necessary. Of course, this only happens when scheduling, but it isn't distributed over the different poller calls over time. This may have an impact on scheduled provisiond runs.
Once a refresh happens, we clear the current interpolated parameters, and getParameters runs on its own schedule. We are not fetching anything that isn't necessary. It's the same code from PollerRequestBuilder, moved here; instead of doing it on every RPC invocation, we do it once when the service is scheduled. Now the parameters hold the actual interpolated values instead of the raw ones.
Yeah, that makes sense. That should definitely help: if we don't have metadata expressions, we don't load any scopes or make DB calls. My current implementation in this PR moves interpolation from every poll cycle to once per service at the start. On every node update and node scan complete, we clear the parameters, and the next service schedule interpolates them again. This doesn't introduce any new bottlenecks; DB calls are still per-service and still staggered by the scheduler. The only thing is that we need to ensure all node metadata updates are covered by events. We already have event-based lifecycle updates in Pollerd/Collectd and Thresholding, so we should be able to handle this. We can test all of this more broadly and decide on next steps.
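The interpolate-once-then-invalidate-on-event pattern discussed above can be sketched as follows. This is a hypothetical simplification, not the PR's actual code: `ParameterCache`, `getParameters`, and `onNodeEvent` are illustrative names, with the event handler standing in for the nodeUpdated / nodeScanCompleted subscriptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: interpolate once per service, cache the result,
// and clear it when a node update / node scan completed event arrives.
public class ParameterCache {
    private final Map<Integer, Map<String, Object>> cache = new ConcurrentHashMap<>();

    // Called on each poll; interpolation runs only on a cache miss.
    Map<String, Object> getParameters(int nodeId, Supplier<Map<String, Object>> interpolate) {
        return cache.computeIfAbsent(nodeId, id -> interpolate.get());
    }

    // Stand-in for the nodeUpdated / nodeScanCompleted event handler.
    void onNodeEvent(int nodeId) {
        cache.remove(nodeId);
    }

    public static void main(String[] args) {
        final ParameterCache pc = new ParameterCache();
        final int[] runs = { 0 };
        final Supplier<Map<String, Object>> interp = () -> {
            runs[0]++; // stand-in for the expensive scope lookup + interpolation
            return Map.of("retry", 2);
        };
        pc.getParameters(1, interp);
        pc.getParameters(1, interp);
        assert runs[0] == 1;   // second poll reuses cached parameters
        pc.onNodeEvent(1);     // metadata changed
        pc.getParameters(1, interp);
        assert runs[0] == 2;   // re-interpolated on the next schedule
    }
}
```

The correctness of this scheme rests entirely on the invalidation events being complete, which matches the caveat above: any metadata change that never fires an event would leave stale parameters in the cache.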
Retrieve metadata once for each Poller request instead of retrieving it for each RPC invocation.
Whenever metadata updates, we will refresh the request and interpolate the parameters again.
Runtime attributes will still be resolved for each RPC request.
External References