How do we reduce storage cost and improve performance by cleaning BPM tables?

Hello there,

I am working with a Sugar Cloud customer on version 13.0, and they are asking for the official way to clean up all BPM SQL tables of old/orphaned records.

They already have a cleanup process for the pmse_inbox SQL table through the application tool.

How can we clean up SQL tables such as pmse_inbox, pmse_bpm_flow and pmse_bpm_form_action efficiently, effectively and safely, both on an ongoing basis and as a one-off?

In the pmse_bpm_form_action table alone, the customer has about 3M records.

Are there existing best practices, knowledge base articles or tutorials on how to achieve this performance and cost-saving exercise correctly?

We would also like to avoid having the schedulers time out and fail mid-way, especially during the initial cleanups.

Thank you for your help!

  • Hi Enrico,

We had a serious problem with database growth, and I found the Data Archive function helped to reduce the size.

      

    Bud Hartley | Cape Foulwind, NZ (and Oregon, USA)

  • That helps if you can hard delete the records.

    The archiver has two options: move to an archive table (does nothing for DB size) or hard delete.

    It would be great if there was an option to archive to an off-site database so that customers had the option to have a data warehouse of older information off of their main Sugar DB.

    Of course this is less of an issue for those of us On-Site since disk space can be more easily managed when you have full control of the database.

    Francesca

  • Hey!

I believe you *could*, in theory, do some of that, aside from deleting the archived table from Sugar, which you would need Sugar to do for you.

It is not for the faint of heart, and it requires a lot of scripting plus a server to perform the transformation for the "integration".

The archived data goes into a new table (*_archive). You could then request a Sugar backup via the API, download it, extract its contents programmatically, load the SQL dump into a temporary database, and push the _archive tables to the final off-site archival destination.

    But then you would need someone to delete the archived table for you...

    --

    Enrico Simonetti

    Sugar veteran (from 2007)

    www.naonis.tech


    Feel free to reach out for consulting regarding:

    • API Integration and Automation Services
    • Sugar Architecture
    • Sugar Performance Optimisation
    • Sugar Consulting, Best Practices and Technical Training
    • AWS and Sugar Technical Help
    • CTO-as-a-service
    • Solutions-as-a-service
    • and more!

    All active SugarCRM certifications

    Actively working remotely with customers based in APAC and in the United States

  • Thank you  ,

I am not sure if it is specific to the system I am looking at, or if you have something special on yours, but the only pmse_ table we see is pmse_Inbox.


I was also looking at the actual conditions for deleting rows in all related tables that are no longer of interest, as we would not want to inadvertently delete information about in-flight, long-running processes...

    --

    Enrico Simonetti

    Sugar veteran (from 2007)

    www.naonis.tech



  • Hi  ,

The 'pmse_Inbox' selection in the Data Archiver actually has a much broader function in that it also cleans other tables like pmse_bpm_flow, pmse_bpm_thread, and pmse_bpm_form. One thing to be cautious of when using the Data Archiver for this purpose is whether your customer has any SugarBPM definitions with start events set to trigger only once for the lifetime of a record. For this type of start event to work, a record needs to be retained in the pmse_inbox table from when the event previously triggered. The Data Archiver currently does not account for this scenario and will remove this history. You can see if your customer has any BPM definitions with these start events by running the following query:

    SELECT 
        * 
    FROM 
        pmse_bpm_event_definition pbed1 
    LEFT JOIN 
        pmse_project pp1 
    ON 
        pbed1.prj_id = pp1.id 
    WHERE 
        pbed1.evn_params = 'updated' 
    AND 
        pp1.deleted = 0 
    AND 
        pbed1.deleted = 0;

Even if they do not currently have BPM definitions set up this way, they would need to know never to do so without first disabling the Data Archiver job for pmse_Inbox. For that reason, I do not recommend enabling the Data Archiver for SugarBPM cleanup.

    There are enhancements logged to improve a few different aspects:

• 90933 - Give the 'pmse_Inbox' selection a more appropriate name in the Data Archiver module list so users know its true scope
    • 91044 - Allow Data Archiver to use reports to delete records (this enhancement was referenced to me as a potential solution for properly retaining BPM historical records for the use case cited above)
    • 92330 - The Data Archiver is limited to 10,000 records per run and has a timeout of 30 minutes when running on the 'pmse_Inbox' selection

    We recently performed some cleanup for a customer in their pmse_inbox and pmse_bpm_flow tables. These were the queries we had Sugar Support execute to trim down the table size and retain historical references of records previously going through BPM processes since they had a number of BPM definitions with the aforementioned start event criteria:

    CREATE TABLE 
        pmse_bpm_flow_cleaned LIKE pmse_bpm_flow;
    
    INSERT INTO 
        pmse_bpm_flow_cleaned 
    SELECT 
        * 
    FROM 
        pmse_bpm_flow 
    WHERE 
        pmse_bpm_flow.cas_id NOT IN (
            SELECT 
                cas_id 
            FROM 
                pmse_inbox 
            WHERE 
                cas_status IN ('COMPLETED', 'TERMINATED') 
            AND 
                date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY) 
        ) 
    OR 
        pmse_bpm_flow.cas_index = 1;
    
    RENAME TABLE 
        pmse_bpm_flow TO pmse_bpm_flow_backup, 
        pmse_bpm_flow_cleaned TO pmse_bpm_flow;
    
    CREATE TABLE 
        pmse_inbox_cleaned LIKE pmse_inbox;
        
    INSERT INTO 
        pmse_inbox_cleaned 
    SELECT 
        * 
    FROM 
        pmse_inbox 
    WHERE 
        pmse_inbox.cas_status NOT IN ('COMPLETED', 'TERMINATED') 
    OR 
        pmse_inbox.date_modified > DATE_SUB(NOW(), INTERVAL 90 DAY) 
    OR 
        pmse_inbox.cas_id NOT IN (
            SELECT cas_id FROM pmse_bpm_flow WHERE cas_index = 1
        );
    
    RENAME TABLE 
        pmse_inbox TO pmse_inbox_backup, 
        pmse_inbox_cleaned TO pmse_inbox;
        
    DROP TABLE IF EXISTS 
        pmse_bpm_flow_backup, 
        pmse_inbox_backup;
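
    As a sanity check before the final DROP, a few verification queries can confirm the cleaned tables kept what you expect. This is only a sketch against the same table names and 90-day window used above; adjust to your own retention rules:

    ```sql
    -- Row counts before and after the swap: the backup should always be
    -- at least as large as the cleaned table.
    SELECT COUNT(*) FROM pmse_bpm_flow_backup;
    SELECT COUNT(*) FROM pmse_bpm_flow;

    -- No open (non-completed/terminated) process should have lost its flow
    -- rows; this should normally return 0, except for cases that never had
    -- flow rows to begin with.
    SELECT COUNT(*)
    FROM pmse_inbox i
    WHERE i.cas_status NOT IN ('COMPLETED', 'TERMINATED')
      AND i.cas_id NOT IN (SELECT cas_id FROM pmse_bpm_flow);
    ```

    Only once those checks look right would it be safe to drop the _backup tables.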

Credit goes to  for those cleanup queries, as he had previously helped someone in SugarClub with this same type of request.

    Chris

Wow! I just looked, and the option to select "pmse_BpmFlow" isn't available when I try to create a new archive. I did this one about a year ago.

FYI: Chris is correct that the trigger history of a previous process set to fire on create or on first update is lost if you delete that history. I have a couple of processes that trigger that way, and the good news is that it wasn't a problem for me, since those processes check for the presence of a value in a field that should have been populated when the process first triggered.

    Bud Hartley | Cape Foulwind, NZ (and Oregon, USA)

  • Thank you  !

I'll keep the queries in mind. I did have a tool that was doing something similar, but it no longer works on Sugar 13.0. During my "break", something major broke.

I would have to figure out how pmse_bpm_form_action works and how that table relates to the rest of the BPM tables, as it is not trimmed by any of these scripts/queries, and it's the huge one for this customer.
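
    If it turns out that pmse_bpm_form_action also references cases via a cas_id column, as the other BPM tables do (an assumption that would need verifying against the actual schema first), a first sketch could follow the same create/insert/rename pattern as Chris's queries above:

    ```sql
    -- ASSUMPTION: pmse_bpm_form_action has a cas_id column linking it to
    -- pmse_inbox. Verify the schema and test on a copy before running.
    CREATE TABLE
        pmse_bpm_form_action_cleaned LIKE pmse_bpm_form_action;

    INSERT INTO
        pmse_bpm_form_action_cleaned
    SELECT *
    FROM pmse_bpm_form_action
    WHERE pmse_bpm_form_action.cas_id NOT IN (
        SELECT cas_id
        FROM pmse_inbox
        WHERE cas_status IN ('COMPLETED', 'TERMINATED')
          AND date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY)
    );

    RENAME TABLE
        pmse_bpm_form_action TO pmse_bpm_form_action_backup,
        pmse_bpm_form_action_cleaned TO pmse_bpm_form_action;
    ```

    I'd keep the _backup table around until the counts are verified.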

    Cheers!

    --

    Enrico Simonetti

    Sugar veteran (from 2007)

    www.naonis.tech



  • Hello  ,

Thanks a lot for the queries, very useful!

One question, though: when I ran the following query to count the number of rows:

    SELECT 
        count(*)
    FROM 
        pmse_bpm_flow 
    WHERE 
        pmse_bpm_flow.cas_id NOT IN (
            SELECT 
                cas_id 
            FROM 
                pmse_inbox 
            WHERE 
                cas_status IN ('COMPLETED', 'TERMINATED') 
            AND 
                date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY) 
        ) 
    OR 
        pmse_bpm_flow.cas_index = 1;

I'm getting the same number of rows as the total for the pmse_bpm_flow table, which tells me there is nothing to delete at all :/ I have more than 8.8 million rows...

Is there anything we should check in this query? This seems very strange, doesn't it?
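
    One possibility we wondered about is standard SQL NOT IN semantics: if the inner subquery returns no rows at all (e.g. no COMPLETED/TERMINATED processes older than 90 days), NOT IN is true for every row, so the count matches the whole table; conversely, if the subquery returns even one NULL cas_id, NOT IN is true for no row. We were thinking of checking the subquery on its own:

    ```sql
    -- How many completed/terminated cases older than 90 days exist at all?
    -- If this is 0, the cleanup query has nothing to remove by design.
    SELECT COUNT(*)
    FROM pmse_inbox
    WHERE cas_status IN ('COMPLETED', 'TERMINATED')
      AND date_modified < DATE_SUB(NOW(), INTERVAL 90 DAY);

    -- Any NULL cas_id values that would make NOT IN match nothing?
    SELECT COUNT(*)
    FROM pmse_inbox
    WHERE cas_id IS NULL;
    ```

    Would that be the right direction to investigate?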

What would be the queries to clean up the pmse_bpm_form_action and pmse_bpm_thread tables?

Thanks a lot, and have a nice day!

    Best regards,

    Enes