We are trying to customize the DQD rules to suit our site. I have a few questions, listed below. Could you help us with these?
a) I see that DQD has two contexts, verification and validation. How can I find the rules under the validation category? In the GitHub CSV files, I don’t see any column that indicates the context (except for 3 rows in check_descriptions.csv). But in the DQD dashboard, I can see around 402 validation-context rules. I am trying to locate these 402 DQ checks that fall under the validation category. Or does validation apply to only 3 scenarios, such as person completeness and nulls in non-nullable fields across different tables? Are there any other validation-based DQ checks?
b) Where can I find information on the external benchmarks/values used for our validation checks? I see that for validation checks, the data is compared with an external source. What is the comparator here?
For example, our dataset had 402 validation checks, of which 1 failed. I would like to find out where DQD gets the external benchmark from. Against which value is it comparing our raw data? I know that for verification we can find the threshold limits for columns in the Excel sheet. But for validation, where can we find this?
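To make question (b) concrete, here is a minimal sketch of the kind of filtering that would isolate the failed validation check and the threshold recorded against it. It assumes the DQD results are available as a dictionary with a "CheckResults" list, and that each row carries CHECK_NAME, CONTEXT, FAILED and THRESHOLD_VALUE keys. Those key names, and the sample rows themselves, are assumptions made for illustration, not something confirmed from the DQD output.

```python
# Invented sample rows. The key names (CHECK_NAME, CONTEXT, FAILED,
# THRESHOLD_VALUE) are assumptions about the results layout, not confirmed.
sample_results = {
    "CheckResults": [
        {"CHECK_NAME": "isRequired", "CONTEXT": "Validation",
         "FAILED": 0, "THRESHOLD_VALUE": 0},
        {"CHECK_NAME": "measurePersonCompleteness", "CONTEXT": "Validation",
         "FAILED": 1, "THRESHOLD_VALUE": 95},
        {"CHECK_NAME": "cdmField", "CONTEXT": "Verification",
         "FAILED": 0, "THRESHOLD_VALUE": 0},
    ]
}


def checks_in_context(results, context):
    """Return the check rows whose CONTEXT matches, ignoring case."""
    wanted = context.lower()
    return [row for row in results["CheckResults"]
            if str(row.get("CONTEXT", "")).lower() == wanted]


def failed_checks(rows):
    """Return only the rows flagged as failed (FAILED == 1)."""
    return [row for row in rows if row.get("FAILED") == 1]


validation_rows = checks_in_context(sample_results, "Validation")
failed_validation = failed_checks(validation_rows)

print(len(validation_rows), "validation checks,",
      len(failed_validation), "failed")
for row in failed_validation:
    # The threshold stored with the row is the value the check was compared to.
    print(row["CHECK_NAME"], "threshold:", row["THRESHOLD_VALUE"])
```

With a real results file, the dictionary would come from json.load on that file instead of the inline sample. If the rows do carry a threshold value, that would at least show which number the failed check was compared against, though not where that number originates.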
c) In the field_level.csv file, I see columns like validPrevalenceLowThreshold, etc. I am unable to understand how these fields are used. What are these fields for, and are they actually used by any DQ checks?