Scheduler job executes more than once

SugarCRM version: 11.0.3 (Build 292 P) (Q2 2021) (Professional)

I have a monthly invoice-renewal cron job. It runs on the first day of each month and creates around 4,000 invoices.
The problem is that a few invoices are randomly created one or two extra times, usually no more than 10 duplicates each month.

I have checked the job_queue table: every job's status is done, requeue is 0, and none of them has a retry_count.

I have no clue where to look, or why these scheduler jobs run more than once.

Any suggestion would be appreciated.

  • This sounds like a custom scheduler job. It's hard to pinpoint an exact cause without seeing your scheduler's code.

    Most custom schedulers should be in:

    custom/Extension/modules/Schedulers/Ext/ScheduledTasks

    The script name should match the Job field you see in the Scheduler. So, for example, if the Job field says function::generateInvoices,

    then you should find the code at: custom/Extension/modules/Schedulers/Ext/ScheduledTasks/generateInvoices.php
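    For reference, such a file usually follows the standard scheduled-task registration shape; the sketch below is a generic illustration (the invoice logic is a placeholder, and in a real instance $job_strings is provided by Sugar rather than initialized in the file):

```php
<?php
// Sketch of a custom scheduled-task file, e.g. the hypothetical
// custom/Extension/modules/Schedulers/Ext/ScheduledTasks/generateInvoices.php
// Only the registration shape matters here; the body is a placeholder.

$job_strings = $job_strings ?? array(); // normally defined by Sugar before this file loads

// Register the function name so it appears as function::generateInvoices
array_push($job_strings, 'generateInvoices');

function generateInvoices()
{
    // ... select records and create invoices here ...
    return true; // returning true tells the scheduler the job succeeded
}
```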

    If your instance is old enough, the code may still be in the old entry point format and therefore in:

    custom/entry_points/

    My guess is that there is some flaw in the logic that is picking up some records more than once, perhaps a bad join in a query?

    I would check the corresponding script, review the criteria it uses to select records for processing, and look for any flaw in that logic that could cause some records to be picked up twice.

    Also check that the appropriate records are flagged as processed once the invoice is generated, so that if the scheduler does run more than once it does not create duplicates.
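    A minimal sketch of that idea, using an in-memory SQLite table (table and column names are hypothetical; in a real Sugar scheduler you would go through the DBManager and your module's actual table):

```php
<?php
// Hypothetical "claim" pattern: flag the record as processed atomically,
// and only create the invoice if this run actually won the claim.

$db = new PDO('sqlite::memory:');
$db->exec("CREATE TABLE invoice_queue (id TEXT PRIMARY KEY, processed INTEGER DEFAULT 0)");
$db->exec("INSERT INTO invoice_queue (id) VALUES ('rec-1')");

function claimRecord(PDO $db, string $id): bool
{
    // The WHERE clause makes the claim atomic: only the first run updates
    // the row, so a concurrent second run sees 0 affected rows and skips it.
    $stmt = $db->prepare(
        "UPDATE invoice_queue SET processed = 1 WHERE id = ? AND processed = 0"
    );
    $stmt->execute([$id]);
    return $stmt->rowCount() === 1;
}

$first  = claimRecord($db, 'rec-1'); // this run claims the record
$second = claimRecord($db, 'rec-1'); // a duplicate run cannot
```

    Creating the invoice only when the claim succeeds means a second overlapping run simply finds nothing left to process.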

    If all of that is correct, then it's possible that some records are not being flagged correctly; maybe an error particular to a record causes it not to be flagged?
    Adding some logging, and monitoring the PHP and Sugar logs while the job is running, may help you identify which records are being reprocessed and why.

    Hopefully something stands out to you once you find the right script.
    FrancescaS

  • Thank you Francesca,

    yes, it's a custom scheduler job.

    The list of invoices that need to be created matches job_queue. For example, 4,000 invoices should be created for 2022-10, and the number of rows in job_queue is also 4,000. However, 4,008 invoices were created in the end.

    We don't use a flag, but we do check whether any invoice already exists for the month before creating one.

    The duplicate invoices are generated randomly every month, and we have found no clue as to why.

    We found the article below, which mentions a parameter related to scheduler jobs. At first we thought the job was running too long and being triggered again, so we tried increasing the parameter, but we still get duplicate invoices, and the job_queue table shows that every job succeeded with no retries.

    jobs.min_retry_interval


    support.sugarcrm.com/.../

    Finally, thank you for the suggestions. We may add a processed flag if we still can't find the root cause.

  • Hi everyone,

    For those of you who are encountering this problem, here are some observations and how we avoid this scenario. It also happened in version 12.0.2.

    1. When we added logging to the custom job script, we confirmed that a single job was picked up multiple times.

    2. Then we added a simple file-locking mechanism: the script checks whether the job has a corresponding lock file; if not, it creates the file, executes the job, and deletes the file. If the file is found, the script exits.
    This brought the number of duplicates down drastically, but some jobs were still executed simultaneously.

    3. To avoid the parallel-run scenario, we introduced a random sleep of between 0 and 1 second at the start of the script. After that we had zero duplicates.

    4. We had a max_cron_jobs value close to 200 in production. After bringing it down to 50 (the default is 25, I think), the duplicates stopped. I do not fully understand the significance of the max_cron_jobs value; I guess a very high value makes the processing of the job queue unreliable.
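    Steps 2 and 3 above can be sketched roughly like this (the lock path and the $job callable are placeholders; note that opening with mode 'x' makes the check-and-create a single atomic operation, which is a little safer than a separate file-exists check followed by a create):

```php
<?php
// Rough sketch of the lock-file + random-delay guard described above.
// Lock location and the $job callable are hypothetical.

function runWithLock(string $jobId, callable $job): bool
{
    // Step 3: random 0-1 s sleep to stagger workers that pick up
    // the same job at almost the same moment.
    usleep(random_int(0, 1000000));

    $lockFile = sys_get_temp_dir() . "/invoice_job_{$jobId}.lock";

    // Step 2: mode 'x' fails if the file already exists, so checking for
    // and creating the lock happen atomically.
    $fh = @fopen($lockFile, 'x');
    if ($fh === false) {
        return false; // another worker holds the lock; exit without running
    }

    try {
        $job(); // the actual invoice-generation work
    } finally {
        fclose($fh);
        @unlink($lockFile); // release the lock when done
    }
    return true;
}
```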

    So we have brought down the value of max_cron_jobs, and, given the criticality of the job, we will also keep the locking and delay mechanisms in our script to ensure the same job is not run twice.
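    For step 4, the value can be overridden in config_override.php; the snippet below shows the key as I believe it appears in Sugar's config (verify against your own config.php before relying on it):

```php
<?php
// config_override.php - cap how many queued jobs one cron run may execute.
// 50 is just the value that worked in the scenario described above.
$sugar_config['cron']['max_cron_jobs'] = 50;
```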

    Hope this is of use to someone.

    Ramprasath Karunakaran