
How to turn off cohort cache on Atlas

Hello

We realized that cohort results are cached in Atlas and are not updated following underlying CDM changes unless the cohort definition itself is changed. This is an issue for us since our OMOP database gets monthly updates.

Can you please advise how to turn off the cohort cache? I couldn't find instructions on the Atlas/WebAPI wiki site.

Thank you
Jack

There are 2 params for this:

    <cache.generation.invalidAfterDays>30</cache.generation.invalidAfterDays>
    <cache.generation.cleanupInterval>3600000</cache.generation.cleanupInterval>

The invalidAfterDays can be set to 0, which will cause the cleanup to remove cached data for anything prior to today. Set it to -1 if you want to include the current day as well.

The cleanupInterval is in milliseconds, so with the value above the job will clean up the cache every hour. If you need it to run more frequently, reduce it to something like 60000 to have it run every minute.
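For example, to keep cached cohort results from outliving the day they were generated, the two properties might be set like this (a sketch; where exactly these properties live depends on how you configure your WebAPI build, e.g. a Maven settings profile):

    <cache.generation.invalidAfterDays>0</cache.generation.invalidAfterDays>
    <cache.generation.cleanupInterval>60000</cache.generation.cleanupInterval>

With those values the cleanup runs every minute and discards anything cached before the current day.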

@Chris_Knoll

Thanks, Chris. I have a few questions:

  1. Is the setting applied to each cohort independently, or is it a system-wide cleanup schedule regardless of when the cohort was generated?
  2. Does the cleanupInterval take effect only when invalidAfterDays is set to -1?

Thanks
Jack

1: It is the entire cohort cache.

2: The cleanup interval is just a polling interval… every time the interval elapses, the cleanup removes any cohort results in the cache that are older than invalidAfterDays. -1 just ensures that everything is invalid.
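If it helps to picture the rule, the cleanup is roughly equivalent to the pseudo-SQL below; the table and column names are made up for illustration and are not the actual WebAPI schema:

-- Illustrative only: 'generation_cache' and 'created_date' are hypothetical names.
-- A cached entry is stale when its generation date plus invalidAfterDays falls before today.
DELETE FROM generation_cache
WHERE created_date + (:invalidAfterDays * INTERVAL '1 day') < CURRENT_DATE;
-- With invalidAfterDays = -1, created_date minus one day is always before today,
-- so every entry is removed on each cleanup pass.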

@Chris_Knoll

These two parameters can be set to clean up the cache fast enough that cached results only live for a very short period, but the caching itself is never actually turned off. Is my understanding correct?

In the pom.xml file there is also a setting for result cache warming; what does it do?
<cdm.result.cache.warming.enable>true</cdm.result.cache.warming.enable>

Thanks
Jack

The result.cache.warming setting is related to caching the data sources results.
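If you want to skip that warming, presumably the same pom.xml property can be flipped off (a sketch, assuming false disables it the same way true enables it):

<cdm.result.cache.warming.enable>false</cdm.result.cache.warming.enable>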

What gets stored in cdm_cache vs. achilles_cache?

The cdm_cache is defined as:

CREATE TABLE ${ohdsiSchema}.cdm_cache
(
    id                      int8 NOT NULL DEFAULT nextval('${ohdsiSchema}.cdm_cache_seq'),
    concept_id              int4 NOT NULL,
    source_id               int4 NOT NULL,
    record_count            int8 NULL,
    descendant_record_count int8 NULL,
    person_count            int8 NULL,
    descendant_person_count int8 NULL,
    CONSTRAINT cdm_cache_pk PRIMARY KEY (id),
    CONSTRAINT cdm_cache_un UNIQUE (concept_id, source_id),
    CONSTRAINT cdm_cache_fk FOREIGN KEY (source_id) REFERENCES ${ohdsiSchema}.source (source_id) ON DELETE CASCADE
);

This stores record counts by source_id. It was found that it's quicker to query record counts by concept ID from the WebAPI database than to query out to MPP platforms (Redshift, Netezza, Spark, etc.), so these records are copied here after they are first requested.
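Because those counts are copied from the CDM, they can also go stale after a monthly refresh. A manual workaround (just a sketch, not an official WebAPI feature) is to clear the cached rows for the refreshed source so the counts are re-queried and re-cached on the next request; 'MY_CDM' below is a placeholder for your own source_key:

-- Sketch of a manual reset after a CDM refresh; replace 'MY_CDM' with your source_key.
DELETE FROM ${ohdsiSchema}.cdm_cache
WHERE source_id = (
    SELECT source_id FROM ${ohdsiSchema}.source WHERE source_key = 'MY_CDM'
);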

achilles_cache is defined as:

CREATE TABLE ${ohdsiSchema}.achilles_cache
(
    id         bigint  NOT NULL DEFAULT nextval('${ohdsiSchema}.achilles_cache_seq'),
    source_id  int4    NOT NULL,
    cache_name varchar NOT NULL,
    cache      text,
    CONSTRAINT achilles_cache_pk PRIMARY KEY (id),
    CONSTRAINT achilles_cache_fk FOREIGN KEY (source_id) REFERENCES ${ohdsiSchema}."source" (source_id) ON DELETE CASCADE
);

This table caches report data as JSON (in the cache text column) for the Achilles reports. After building the report data from the CDM source, WebAPI caches those results as JSON in this table.
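The same kind of manual reset applies here if the Achilles results for a source have been regenerated (again just a sketch with a placeholder source_key):

-- Sketch: clear the cached Achilles report JSON for one source so it is rebuilt on the next request.
DELETE FROM ${ohdsiSchema}.achilles_cache
WHERE source_id = (
    SELECT source_id FROM ${ohdsiSchema}.source WHERE source_key = 'MY_CDM'
);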

For Broadsea, I’m messing with adding these as env variables, so for instance to have the cache always be invalid:

CACHE_GENERATION_INVALIDAFTERDAYS=-1
CACHE_GENERATION_CLEANUPINTERVAL=3600000
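Assuming those variables end up wired through to the WebAPI settings (that wiring is the part I'm still working on), a docker-compose override for Broadsea might look roughly like this; the ohdsi-webapi service name is an assumption about your compose file:

services:
  ohdsi-webapi:        # service name assumed; adjust to match your Broadsea compose file
    environment:
      CACHE_GENERATION_INVALIDAFTERDAYS: "-1"
      CACHE_GENERATION_CLEANUPINTERVAL: "3600000"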

Yes, the reason the -1 value works is that the cleanup adds invalidAfterDays to the generation date, and if the current date is later than that result, the cache is invalid. -1 causes the result to always be in the past (the day before generation), so the cache will always be considered dirty.


Great! Seems to work fine on my local Broadsea instance, will add to develop branch.
