My apologies if this is already someplace obvious, but is there an existing list of the tests run by Achilles? I know there is SQL code, but I was looking for a list of the issues it tests for. Having such a list might also help people think of other tests.
+1
I agree that a list would be useful.
Some background on this:
There is a list of pre-computed counts, and rules are executed against that list.
There may be multiple rules per analysis ID.
I suggested distinguishing them by sub-ID. See here
See also this old post about it here
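To make the sub-ID idea concrete, here is a minimal sketch of how such an inventory could be kept. It is not how Achilles itself is implemented: the `Rule` class, the IDs, the descriptions, and the thresholds are all hypothetical, chosen only to show several rules sharing one analysis ID and being told apart by a sub-ID.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """One data quality rule (hypothetical structure, not the Achilles schema)."""
    analysis_id: int              # ID of the pre-computed count the rule reads
    sub_id: int                   # distinguishes rules that share an analysis ID
    description: str              # human-readable statement of what is tested
    check: Callable[[int], bool]  # returns True when the count looks problematic

    @property
    def rule_key(self) -> str:
        # Combined key such as "1.2": analysis ID 1, second rule
        return f"{self.analysis_id}.{self.sub_id}"


# Illustrative rules only; the IDs and thresholds are made up.
RULES: List[Rule] = [
    Rule(1, 1, "count is zero", lambda n: n == 0),
    Rule(1, 2, "count is below 100", lambda n: n < 100),
    Rule(2, 1, "count is negative", lambda n: n < 0),
]


def list_rules(rules: List[Rule]) -> List[str]:
    """Return the readable inventory of tests, one line per rule."""
    return [f"{r.rule_key}: {r.description}" for r in rules]


def run_rules(counts: Dict[int, int], rules: List[Rule]) -> List[str]:
    """Apply each rule to its pre-computed count and collect the findings."""
    findings = []
    for rule in rules:
        if rule.analysis_id in counts and rule.check(counts[rule.analysis_id]):
            findings.append(f"{rule.rule_key}: {rule.description}")
    return findings
```

Calling `list_rules(RULES)` gives the kind of plain list of tests asked for above, and `run_rules` shows that the same definitions can drive the checks, so the list and the code cannot drift apart.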
Thanks @Vojtech_Huser – that list is a good start. Sorry I missed your post on this. I guess we need to find a way to put the Achilles documentation all together. It would be good to have a brief overview of everything Achilles in terms of methodology details – it is really something that doesn’t exist elsewhere. Maybe we can pester @Patrick_Ryan to help . . .
@mark_danese, your guilt trip won’t work… OK, so maybe it will :)
Several of us are looking forward to the upcoming data quality code-a-thon that @mgkahn and @toanong are coordinating. That seems like an ideal opportunity to figure out a systematic way to maintain the inventory of data quality checks that can be / are performed on the aggregate summary statistics that are generated for any given source. I’ll take this on during/after that meeting.
That sounds great. You guys created a great tool that people want to use. Good problem to have.