- All Implemented Interfaces:
- org.quartz.Job
@PersistJobDataAfterExecution
@DisallowConcurrentExecution
public class OrphanedBuildMonitorJob
extends Object
implements org.quartz.Job
This class looks for orphaned builds, i.e. builds that claim to be in a certain state even though the server status
makes it clear that they will never be able to transition out of that state.
Currently, the following situations are detected:
- A build claims to be queued, but has not been in the queue for an extended period of time.
- A build claims to be active, but no agent is actually building it.
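The two situations can be sketched as a single classification step. This is a minimal illustration only: `OrphanCheck`, `ClaimedState`, `Suspicion`, and the two key sets are hypothetical names that are not part of this class, and the check flags a suspicion at one point in time without tracking how long it has lasted.

```java
import java.util.Set;

/** Hypothetical helper; all names here are assumptions made for illustration. */
final class OrphanCheck {
    enum ClaimedState { QUEUED, ACTIVE, OTHER }
    enum Suspicion { NONE, QUEUED_BUT_NOT_IN_QUEUE, ACTIVE_BUT_NO_AGENT }

    private OrphanCheck() {}

    /**
     * Compares the state a build claims with what the server actually knows.
     *
     * @param buildKey       identifier of the build (assumed to be a string key)
     * @param claimed        the state the build claims to be in
     * @param queuedKeys     keys of builds currently in the queue
     * @param keysBeingBuilt keys of builds that some agent is currently building
     */
    static Suspicion classify(String buildKey, ClaimedState claimed,
                              Set<String> queuedKeys, Set<String> keysBeingBuilt) {
        if (claimed == ClaimedState.QUEUED && !queuedKeys.contains(buildKey)) {
            return Suspicion.QUEUED_BUT_NOT_IN_QUEUE;
        }
        if (claimed == ClaimedState.ACTIVE && !keysBeingBuilt.contains(buildKey)) {
            return Suspicion.ACTIVE_BUT_NO_AGENT;
        }
        return Suspicion.NONE;
    }
}
```

A result other than `NONE` only marks the build as suspect; it does not by itself justify removing the build.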
If we find a problematic build, we cannot simply remove it right away:
1. An agent may already be working on it but has not reported in yet (agentId == null, but everything is OK).
2. The agent may be unresponsive (agentId != null; the agent may or may not come back).
3. Situation 1 may have occurred and may be followed by situation 2.
And so on.
In case 1, we should give the agent some time to report in.
In case 2 or 3, we should let the AgentManager remove the agent and the build; it will do so in heartbeatTimeoutSeconds + heartbeat seconds.
Waiting heartbeatTimeoutSeconds + 2 x heartbeat seconds before taking action therefore seems like a good idea.
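The waiting rule can be expressed as a small calculation. The following is a sketch under stated assumptions, not the actual implementation: `OrphanGracePeriod`, `gracePeriod`, `shouldTakeAction`, and the `suspectSince` timestamp are hypothetical names, and both heartbeat values are assumed to be given in seconds.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper; names and signatures are assumptions made for illustration. */
final class OrphanGracePeriod {
    private OrphanGracePeriod() {}

    /** Grace period of heartbeatTimeoutSeconds + 2 x heartbeat, both in seconds. */
    static Duration gracePeriod(long heartbeatTimeoutSeconds, long heartbeatSeconds) {
        return Duration.ofSeconds(heartbeatTimeoutSeconds + 2 * heartbeatSeconds);
    }

    /** True only once the build has looked orphaned for strictly longer than the grace period. */
    static boolean shouldTakeAction(Instant suspectSince, Instant now, Duration grace) {
        return Duration.between(suspectSince, now).compareTo(grace) > 0;
    }
}
```

With an assumed timeout of 60 seconds and a heartbeat of 10 seconds, the grace period is 80 seconds. The extra heartbeat beyond the AgentManager's own deadline leaves it room to clean up first, so this job acts only on builds that the AgentManager did not handle.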