
Atlas Custom Features - SQL based

Hello,

I’d like to begin adding some SQL-based custom features to our Atlas instance to enhance the characterizations we produce. One notable feature is location: we have state-level information for our patients.

I’ve seen in FeatureExtraction a bit on how to do this, but I’d like to leverage validated work if possible. Does anyone have some examples of SQL-based custom features, such as using the location.state field?

Thanks,
Ajit

Bump

@gregk do you have any resources on this?

Hi @Ajit_Londhe ,
I use the following SQL script to calculate BMI using the measurement table.
I first wrote the script in PostgreSQL, then replaced the table names, schemas, and cohort number with the relevant parameters.
-- Prevalence-style feature: cohort members whose average BMI is in [25, 30)
select
cast(4060705 as bigint)*1000 AS covariate_id,
'BMI (25-29)' AS covariate_name,
4060705 AS concept_id,
count(distinct subject_id) AS sum_value,
sum(BMI)/count(*) AS average_value
from
-- BMI = weight (kg) / height (m)^2; height is stored in cm, hence the /100
(select subject_id, AVG(weight.value_as_number/(height.value_as_number/100*height.value_as_number/100)) as BMI
from @cohort_table cohort
join @cdm_database_schema.measurement height on cohort.subject_id = height.person_id and height.measurement_concept_id = 3036277
join @cdm_database_schema.measurement weight on cohort.subject_id = weight.person_id and weight.measurement_concept_id = 3025315
where cohort_definition_id = @cohort_id
and weight.value_as_number/(height.value_as_number/100*height.value_as_number/100) >= 25
and weight.value_as_number/(height.value_as_number/100*height.value_as_number/100) < 30
group by subject_id) bmi_by_person
GROUP BY 1,2,3
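For the state-level feature asked about at the top of the thread, the same pattern as the BMI query above might look roughly like the sketch below. It is not validated work and has not been run in Atlas. The assumptions are: state comes from location.state via person.location_id, the covariate_id scheme (smallest location_id per state times 1000, plus 999) is a made-up placeholder, concept_id is set to 0, `||` is used for string concatenation, and `render` is a toy stand-in for SqlRender so the query can be tried against an in-memory SQLite database.

```python
import re
import sqlite3

# Template in the style of the BMI query; tokens starting with the at-sign
# are placeholders filled in by render() below.
STATE_FEATURE_SQL = """
SELECT
  -- placeholder id scheme (assumption): smallest location_id of the state
  (SELECT MIN(l2.location_id)
     FROM @cdm_database_schema.location l2
    WHERE l2.state = l.state) * 1000 + 999 AS covariate_id,
  'State: ' || l.state AS covariate_name,
  0 AS concept_id,
  COUNT(DISTINCT c.subject_id) AS sum_value,
  -- share of the whole cohort that lives in this state
  COUNT(DISTINCT c.subject_id) * 1.0 / (
    SELECT COUNT(DISTINCT subject_id)
      FROM @cohort_table
     WHERE cohort_definition_id = @cohort_id
  ) AS average_value
FROM @cohort_table c
JOIN @cdm_database_schema.person p ON c.subject_id = p.person_id
JOIN @cdm_database_schema.location l ON p.location_id = l.location_id
WHERE c.cohort_definition_id = @cohort_id
GROUP BY l.state
ORDER BY l.state
"""


def render(sql, **params):
    """Replace each placeholder token with its value (toy stand-in for SqlRender)."""
    return re.sub(r"@(\w+)", lambda m: str(params[m.group(1)]), sql)


def make_toy_cdm():
    """Build a tiny in-memory database with just the tables the query touches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cohort (cohort_definition_id INTEGER, subject_id INTEGER);
        CREATE TABLE person (person_id INTEGER, location_id INTEGER);
        CREATE TABLE location (location_id INTEGER, state TEXT);
        INSERT INTO cohort VALUES (1, 1), (1, 2), (1, 3), (2, 4);
        INSERT INTO person VALUES (1, 10), (2, 11), (3, 12), (4, 10);
        INSERT INTO location VALUES (10, 'NJ'), (11, 'NJ'), (12, 'NY');
    """)
    return conn


def state_feature_rows(conn, cohort_id):
    """Render the template for one cohort and return the feature rows."""
    sql = render(
        STATE_FEATURE_SQL,
        cohort_table="main.cohort",
        cdm_database_schema="main",
        cohort_id=cohort_id,
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in state_feature_rows(make_toy_cdm(), cohort_id=1):
        print(row)
```

On the toy data, cohort 1 has two of its three members in NJ and one in NY, so the query returns one row per state with those counts and proportions. Persons without a location_id drop out of the join, which may or may not be what you want for a characterization.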


@guy_livne Thanks for kicking this off. In a similar way I created a feature for visit provider specialism, both number of events and number of persons. Happy to share when tested.

Would be great if we can create a resource for these features.

By the way, does anyone have experience creating a custom SQL-based ‘distribution’ feature (with mean and quartiles)? Is this even possible in Atlas at the moment? I only seem to be able to create a SQL-based prevalence feature.

Hi all, do we already have a resource as @MaximMoinat mentioned? Where can I find the SQL used for the already available feature analyses in ATLAS? To create custom feature analyses it would be useful to have access to existing code.

Largely it’s coming from FeatureExtraction. The queries (parameterized with SqlRender) are here:

Thanks @Ajit_Londhe! I’ve come across that recently but looking at the list it seems to be missing many of the features e.g. drugs, observations, etc.? Or are these also based off these SQL scripts?

There’s a lot of parameterization in these scripts, so DomainConcept.sql, for instance, can be re-used across different concept domains (drug, condition, etc).
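To illustrate what that kind of re-use means, here is a minimal sketch of one template rendered for two domains. This is not the contents of DomainConcept.sql: the placeholder names `domain_table` and `concept_column`, the `DOMAINS` mapping, and the `render` helper (a toy stand-in for SqlRender) are all hypothetical, and time windows are left out entirely.

```python
import re
import sqlite3

# One template, reused per domain by swapping the table and concept column.
DOMAIN_COUNT_SQL = """
SELECT d.@concept_column AS concept_id,
       COUNT(DISTINCT c.subject_id) AS sum_value
FROM @cohort_table c
JOIN @cdm_database_schema.@domain_table d ON c.subject_id = d.person_id
WHERE c.cohort_definition_id = @cohort_id
GROUP BY d.@concept_column
ORDER BY d.@concept_column
"""

# Hypothetical mapping from a domain label to its CDM table and concept column.
DOMAINS = {
    "drug": ("drug_exposure", "drug_concept_id"),
    "condition": ("condition_occurrence", "condition_concept_id"),
}


def render(sql, **params):
    """Replace each placeholder token with its value (toy stand-in for SqlRender)."""
    return re.sub(r"@(\w+)", lambda m: str(params[m.group(1)]), sql)


def make_toy_cdm():
    """Build a tiny in-memory database with a cohort and two domain tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cohort (cohort_definition_id INTEGER, subject_id INTEGER);
        CREATE TABLE drug_exposure (person_id INTEGER, drug_concept_id INTEGER);
        CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
        INSERT INTO cohort VALUES (1, 1), (1, 2);
        INSERT INTO drug_exposure VALUES (1, 111), (2, 111), (1, 222);
        INSERT INTO condition_occurrence VALUES (1, 333), (1, 333);
    """)
    return conn


def domain_counts(conn, domain, cohort_id):
    """Render the shared template for one domain and return (concept_id, persons)."""
    table, column = DOMAINS[domain]
    sql = render(
        DOMAIN_COUNT_SQL,
        cohort_table="main.cohort",
        cdm_database_schema="main",
        domain_table=table,
        concept_column=column,
        cohort_id=cohort_id,
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = make_toy_cdm()
    for domain in DOMAINS:
        print(domain, domain_counts(conn, domain, cohort_id=1))
```

The point is only that the drug and condition features come from the same SQL text with different parameter values, which is why a list of script files can look shorter than the list of feature analyses.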
