How do we reduce storage cost and improve performance by cleaning BPM tables?

Hello there,

I am working with a Sugar Cloud customer on version 13.0, and they are asking for the official way to clean up all BPM SQL tables, removing old and orphaned records.

They already have a cleanup process for the pmse_inbox SQL table through the application tool.

How can we efficiently, effectively, and safely clean up SQL tables such as pmse_inbox, pmse_bpm_flow, and pmse_bpm_form_action, both on an ongoing basis and as a one-off?

In the pmse_bpm_form_action table alone, the customer has about 3 million records.

Are there existing best practices, knowledge base articles, or tutorials on how to carry out this performance and cost-saving exercise correctly?

We would also like to avoid having this time out the schedulers and fail mid-way, especially for the initial cleanups.

Thank you for your help!

  • Hi Enrico,

    We had a serious problem with the growth of the database, and I've found the Data Archive function helped to reduce the size. 


    Bud Hartley | Cape Foulwind, NZ (and Oregon, USA)

  • Thank you  ,

    I am not sure if it is the specific system I am looking at, or if you have something special on yours, but the only pmse_ table we see is pmse_Inbox.

    See here:

    I was also looking at the exact conditions for deleting records from all related tables that are no longer of interest, as we would not want to inadvertently delete information about in-flight, long-running processes...

    --

    Enrico Simonetti

    Sugar veteran (from 2007)

    www.naonis.tech


    Feel free to reach out for consulting regarding:

    • API Integration and Automation Services
    • Sugar Architecture
    • Sugar Performance Optimisation
    • Sugar Consulting, Best Practices and Technical Training
    • AWS and Sugar Technical Help
    • CTO-as-a-service
    • Solutions-as-a-service
    • and more!

    All active SugarCRM certifications

    Actively working remotely with customers based in APAC and in the United States

  • Hi  ,

    The 'pmse_Inbox' selection in the Data Archiver actually has a much broader function in that it also cleans other tables like pmse_bpm_flow, pmse_bpm_thread, and pmse_bpm_form. One thing to be cautious of when using the Data Archiver for this purpose is whether your customer has any SugarBPM definitions with start events set to trigger only once for the lifetime of a record. For this type of start event to work, a record needs to be retained in pmse_inbox from when the event previously triggered. The Data Archiver currently does not account for this scenario and will remove this history. You can check whether your customer has any BPM definitions with these start events by running the following query:

    SELECT 
        * 
    FROM 
        pmse_bpm_event_definition pbed1 
    LEFT JOIN 
        pmse_project pp1 
    ON 
        pbed1.prj_id = pp1.id 
    WHERE 
        pbed1.evn_params = 'updated' 
    AND 
        pp1.deleted = 0 
    AND 
        pbed1.deleted = 0;

    Even if they do not currently have BPM definitions set up in this way, they would need to know never to do so without first disabling the Data Archiver job for pmse_Inbox. For these reasons, I do not recommend enabling the Data Archiver for SugarBPM cleanup. 

    There are enhancements logged to improve a few different aspects:

    • 90933 - Give the 'pmse_Inbox' a more appropriate name in the Data Archiver module list so users know its true scope
    • 91044 - Allow Data Archiver to use reports to delete records (this enhancement was referenced to me as a potential solution for properly retaining BPM historical records for the use case cited above)
    • 92330 - The Data Archiver is limited to 10,000 records per run and has a timeout of 30 minutes when running on the 'pmse_Inbox' selection

    We recently performed some cleanup for a customer in their pmse_inbox and pmse_bpm_flow tables. These were the queries we had Sugar Support execute to trim down the table size and retain historical references of records previously going through BPM processes since they had a number of BPM definitions with the aforementioned start event criteria:

    CREATE TABLE 
        pmse_bpm_flow_cleaned LIKE pmse_bpm_flow;
    
    INSERT INTO 
        pmse_bpm_flow_cleaned 
    SELECT 
        * 
    FROM 
        pmse_bpm_flow 
    WHERE 
        pmse_bpm_flow.cas_id NOT IN (
            SELECT 
                cas_id 
            FROM 
                pmse_inbox 
            WHERE 
                cas_status IN ('COMPLETED', 'TERMINATED') 
            AND 
                date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY) 
        ) 
    OR 
        pmse_bpm_flow.cas_index = 1;
    
    RENAME TABLE 
        pmse_bpm_flow TO pmse_bpm_flow_backup, 
        pmse_bpm_flow_cleaned TO pmse_bpm_flow;
    
    CREATE TABLE 
        pmse_inbox_cleaned LIKE pmse_inbox;
        
    INSERT INTO 
        pmse_inbox_cleaned 
    SELECT 
        * 
    FROM 
        pmse_inbox 
    WHERE 
        pmse_inbox.cas_status NOT IN ('COMPLETED', 'TERMINATED') 
    OR 
        pmse_inbox.date_modified > DATE_SUB(NOW(), INTERVAL 90 DAY) 
    OR 
        pmse_inbox.cas_id NOT IN (
            SELECT cas_id FROM pmse_bpm_flow WHERE cas_index = '1'
        );
    
    RENAME TABLE 
        pmse_inbox TO pmse_inbox_backup, 
        pmse_inbox_cleaned TO pmse_inbox;
        
    DROP TABLE IF EXISTS 
        pmse_bpm_flow_backup, 
        pmse_inbox_backup;
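
    As a sanity check before swapping the tables, you can count how many pmse_bpm_flow rows would actually be dropped. This is the complement of the retention criteria above (assuming the same 90-day window, and that cas_id is unique per case in pmse_inbox):

    SELECT 
        COUNT(*) 
    FROM 
        pmse_bpm_flow f 
    JOIN 
        pmse_inbox i 
    ON 
        f.cas_id = i.cas_id 
    WHERE 
        i.cas_status IN ('COMPLETED', 'TERMINATED') 
    AND 
        i.date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY) 
    AND 
        f.cas_index <> 1;

    If that count looks reasonable, the INSERT ... SELECT and RENAME steps should reduce the table by roughly that many rows.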

    Credit goes to  for those cleanup queries as he had helped someone in SugarClub previously with this same type of request. 

    Chris

  • Hello  ,

    Thanks a lot for the queries, very useful !

    One question, though: when I ran the following query to count the number of rows:

    SELECT 
        count(*)
    FROM 
        pmse_bpm_flow 
    WHERE 
        pmse_bpm_flow.cas_id NOT IN (
            SELECT 
                cas_id 
            FROM 
                pmse_inbox 
            WHERE 
                cas_status IN ('COMPLETED', 'TERMINATED') 
            AND 
                date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY) 
        ) 
    OR 
        pmse_bpm_flow.cas_index = 1;

    I'm getting the same number of rows as the total of the pmse_bpm_flow table, which tells me there is nothing to delete at all :/ I have more than 8.8 million rows ...

    Is there anything we should check in this query? This is very strange, isn't it?
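
    One thing I am planning to check is whether any pmse_inbox rows actually meet the deletion criteria in the subquery. If that count is zero, the NOT IN matches every row and the outer query keeps everything:

    SELECT 
        COUNT(*) 
    FROM 
        pmse_inbox 
    WHERE 
        cas_status IN ('COMPLETED', 'TERMINATED') 
    AND 
        date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY);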

    What would be the corresponding queries to clean up the tables pmse_bpm_form_action and pmse_bpm_thread? 

    Thanks a lot and have a nice day !

    Best regards,

    Enes

