Can we migrate from "gender" to "sex" please?

That’s a vocabulary problem and easy to fix. Plus it is a data problem: My hunch is that those cases are not well captured. Remember, for the vast majority of patients this gets entered by the front desk staff of an office or a registration office of a hospital. The gender information is probably more useful for those cases. And in international databases we may be even further away from the correct detail.

…or from a data transfer from one EHR to another. Also, many patients' demographics may come from old paper charts if the patient has been with the practice longer than the EHR has.

Hmm. Perhaps the Gender column is named correctly with the best available data. Perhaps the problem is that we are trying to treat it as biological sex?

Perhaps we shouldn’t remove the Gender field from the Person table, but simply add the Sex column. If both fields are there, it means that those doing ETLs would not be able to ignore the differentiation, and could search for the best available data. Having both fields would mean those doing queries could also be more precise about what exactly they mean: gender or biological sex.

I believe the need for distinction is being recognized by publishers.
[Sex and Gender Analysis Policies of Peer-Reviewed Journals | Gendered Innovations]

That’s one choice, but it probably doesn’t help us much:

  • Sex is static. It is established at birth, and you get born only once. So, it should be in PERSON. The field we have (gender_concept_id) is actually used as if it were sex_concept_id. It just has the wrong name.
  • Gender is dynamic. You can change it. Which means it cannot be in PERSON, but in OBSERVATION with a timestamp. All we need to do is to make sure we have an agreed gender concept convention.

The naming problem can be addressed as soon as we go to the next major version. There are other things we also want to change. If folks are eager - start rolling the drum.

In the meantime: We can add explanatory and apologetic language to the PERSON table documentation, explaining that gender_concept_id is wrongly named and really should be sex_concept_id, but otherwise it is all working.

Again, we don’t have a problem with the content. We have a problem with naming. Functionally, a variable or table field name is a memory address pointer for the processor. It means nothing. So, if we can live with having to apologize while we are working on a new version we are just fine.

Thanks for this update. In fact, gender identity now has much clearer coding options, and sex is still considered to be biological sex (by journals, by the NIH, and others). I’d cast a vote for sex (adding intersex as a code) and gender, although gender could be an observation, as we code it now at our site. Gender identity can be fluid in a small number of people.

As an ETL’er this is not always possible, as both fields may be in the demographics section, which does not have timestamps attached to them, at least not in the EHR that we use.

I see no way to change this that does not cause pain to someone, somewhere.

Thank you all so much for thinking about this. It’s not an easy topic.

I’m no expert, but I understand that most Intersex chromosomal anomalies are discovered during puberty. Hence, this could be more dynamic than one might anticipate.

For transgender people receiving hormone replacement therapy, the rate of detransitioning is extremely low. Hence, this is probably a bit more static than one might expect.

That’s good. However, adding another column wouldn’t be as disruptive.

If the goal is to have one field rather than two, then this seems like a good first step. Even so, 90% of users don’t read manuals. So, perhaps a further step might be to change on-screen field names and such, in the cases where it’s easy to do so. It’d create a discrepancy. That said, it would let those building queries with graphical interfaces start using scientifically accepted terminology.

People who are either Intersex or Transgender make up about 2% of our population. Intersex people (those with chromosome anomalies) are more than 1% of the population. Those who are Transgender are also perhaps 1% of the population. For what it’s worth, I see these statistics as being somewhat like left-handed reporting, where it went from 4% (1920) to 12% (1960), leveling out and remaining around 12% since then. Only after it becomes socially acceptable to admit to these categories do we get accurate measurements.

Regardless, making the change will require funding. So, if we agree at the community level that it’s useful and state a direction to be more inclusive (being more useful to studies of these populations), then perhaps it would permit interested parties to write grants to cover development and data migration expenses. Those with relevant data sets may also be interested in collaborating further with OHDSI network studies.

Let me see if I understand this correctly. We are trying to accurately reflect both “sex” and “gender” in the CDM, with these considerations:

  1. There are three categories of sex, a congenital characteristic assessed by the appearance of genitalia/reproductive organs: Female, Male, and Intersex.

  2. A person identified at birth as “male” or “female” might subsequently be reclassified as “intersex” based on clinical observations. No other transitions of “sex” exist, barring clerical or observational errors at delivery.

  3. Gender is a construct of perception of personal characteristics, both physical and non-physical, which is influenced by genetic characteristics and social norms.

  4. There are presently many gender terms in use, and it appears likely that the list will continue to grow/evolve.

  5. At any time a person might identify with more than one gender or revise their characteristic.

If so, it seems that Gender needs to be a time-stamped record in the Observation table, hopefully standardized to a vocabulary curated by another entity.
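
To make the idea concrete, here is a minimal Python sketch of gender kept as time-stamped observation records, with a lookup of the most recent value known as of a given date. The field names follow the OBSERVATION table, but the concept IDs are hypothetical placeholders (negative numbers, not real vocabulary entries), since no gender concept convention has been agreed yet.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical placeholder concept IDs, NOT real OMOP vocabulary entries.
GENDER_IDENTITY_CONCEPT = -1   # the "gender identity" observation itself
GENDER_VALUE_A = -101          # one possible answer
GENDER_VALUE_B = -102          # another possible answer


@dataclass(frozen=True)
class GenderObservation:
    person_id: int
    observation_concept_id: int
    value_as_concept_id: int
    observation_date: date


def gender_as_of(
    observations: List[GenderObservation], person_id: int, as_of: date
) -> Optional[int]:
    """Return the latest gender value recorded on or before `as_of`, if any."""
    candidates = [
        o
        for o in observations
        if o.person_id == person_id
        and o.observation_concept_id == GENDER_IDENTITY_CONCEPT
        and o.observation_date <= as_of
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda o: o.observation_date).value_as_concept_id
```

A query for a date before the first record returns None, which keeps "unknown at that time" distinct from any recorded value.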

I am not disagreeing with you, but please do not make the timestamp mandatory, or else document that it may be set to a magic number, since our EHR does not keep any history of demographics (other than the audit trace, and there is no way security is letting me use that to do OMOP work).

@Mark:

Sounds plausible. For cases where we don’t have an event date we use the “history of” solution, and the date of that is when the patient was initiated (obs_period_start_date if you have nothing else to hang your hat on). This problem is not specific to a change of gender. And we should certainly not add that to demographics.
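
A rough Python sketch of that fallback, for illustration only: use the event date when the source has one, otherwise fall back to the start of the observation period. The function and the `is_fallback` flag are hypothetical, not part of the CDM; the flag is just one way an ETL could keep track of which dates were not actually recorded.

```python
from datetime import date
from typing import NamedTuple, Optional


class ResolvedDate(NamedTuple):
    observation_date: date
    is_fallback: bool  # True when the source had no event date


def resolve_observation_date(
    event_date: Optional[date], obs_period_start_date: date
) -> ResolvedDate:
    """Prefer the recorded event date; otherwise use the observation period start."""
    if event_date is not None:
        return ResolvedDate(event_date, False)
    return ResolvedDate(obs_period_start_date, True)
```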

We are specifically being asked to capture both sex at birth and gender for our NIH transfusion project.
Agree that Gender goes to observation; there are some terms in the terminology, but they are not exhaustive.
When we don’t find a recorded date in the EHR, there is generally a datetime that the data was entered. We use that essentially as Christian mentions, as an “as of” date. We know that as of that date this was the status of that data element. This is a common occurrence for many data elements. Instead of arguing over dates vs. datetimes, it would be even more helpful if the verbatim concept could be extended to other tables. In our ETL across dozens of different EHRs this is an issue we often find.

The only way for me to get the date it was entered is with an audit trace, as our system considers both Gender and Sex as demographics, which only record when the patient was created and last updated, for any reason; there is no way to know what was updated.

I am a firm believer that bad data is worse than no data, and this would almost always be bad data.

Edit: The reason I am being so obstinate about this is that there are already other places where I have been instructed to make ‘best guesses’ at when data was ‘valid as’. This bothers me at a very deep level. I spent too many years doing software for engineering, where such data would skew the entire data set, for this to ever sit well with me. There needs to be some magic number that says that one does not know when said event happened.

I apologize if I have ruffled any feathers.

@Mark I am curious as to how you handle phone numbers, which are far more likely to change than gender. The distinction that I see being that it may be more important to know gender at any given point in time than phone number.

So, I don’t know how unpopular this perspective is, but I’ll put it out there anyways:

The dates in the CDM are the dates we observed them, not necessarily the actual date it happened. So, consider a condition that goes undiagnosed for years. The date it goes into the CDM is the date it’s observed, not backfilled to when it happened. Similarly for ‘history of’ contexts, the date of the ‘history of’ observation is the date we observed they have a historical situation…we don’t put the date back 5 years if they indicate it happened 5 years ago, we record the date they tell us and mark it as a history of at least 5 years ago of X.

The reason why I’m OK with this perspective is that patients may not receive treatments for some condition X until they know about it, and when we do predictions and phenotypes about conditions, these events are based on when they were known, not an inferred date about when they happened. Most of the time the actual event date is very close to the observed date (like their visit to the hospital is spot on, and drug exposures are very close…)

Others may not be OK with that…and that’s OK too.

We don’t expose PMI, so that is not an issue.

I am new here, but came here to say if anyone wants to work on this issue please count me in. I agree that it is an opportunity. What I love about OHDSI is that it is so open and inclusive. I am good at grant writing and can help with that, too.

Thank you. The congenital designation (“sex”) would only be from “male” or “female” to “intersex”. Moreover, it’d be a small subset of the patient population that would be updated. Permitting this to be changed in a subsequent ETL when it is discovered isn’t a horrible choice. In this way, various quality reports would better reflect the data. One could think of viewing an intersex designation as a data correction, and not a new observation.

Perhaps Gender may still be valuable at the top level, with the acknowledgement that it is the best known value, alongside Race and Ethnicity. Perhaps this could be useful in data quality reports where the Sex != Gender. For example, when checking Testosterone or Estrogen boundary conditions, if the Sex is “female” but the Gender is “male”, one might admit higher than normal Testosterone levels and look for hormone therapy.
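
As a small illustration of the Sex != Gender check, here is a Python sketch that lists the people whose recorded sex and gender differ so a quality report can treat them separately. The `Demographics` record and its string values are assumptions for the example; a real implementation would compare concept IDs from whatever convention is eventually agreed.

```python
from typing import Iterable, List, NamedTuple, Optional


class Demographics(NamedTuple):
    person_id: int
    sex: Optional[str]     # e.g. "female", "male", "intersex"; None if unknown
    gender: Optional[str]  # best known value; None if unknown


def sex_gender_discordant(rows: Iterable[Demographics]) -> List[int]:
    """Return person_ids where both values are known and they differ."""
    return [
        r.person_id
        for r in rows
        if r.sex is not None and r.gender is not None and r.sex != r.gender
    ]
```

Rows with a missing value are skipped on purpose, so an unknown gender is not reported as a mismatch.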

I think if this is to be addressed, and it should be since Gender is the wrong column name, we will need funding that is distributed across the various groups that would be impacted. I think a coordinating center would be needed.

@DanielleBoyce notes a funding opportunity which might support this conversion work. The optional LOI is due May 20th. The proposal is due June 26th. There are eligibility constraints on both the PI and the institution.

Thank you for sharing - I was lost and thought I sent it to the larger group. :laughing: I am already connected with AIM Ahead through an OHDSI course I am developing for them and am happy to reach out if anyone is interested!
