It takes a lot of digging: there are many validation studies, and you have to read the methods section of each one to find out what source was used as the gold standard.
This topic is very interesting to me because we store the sensitivity and specificity of algorithms for exposures and outcomes in our software, so examples like these are very useful to us. And, of course, the effect of the misclassification on the risk estimate is important too! Here are a few that might be useful (I haven't reviewed them fully):
Identification of Physician-Diagnosed Alzheimer's Disease and Related Dementias in Population-Based Administrative Data: A Validation Study Using Family Physicians' Electronic Medical Records.
Development and Validation of an Algorithm to Identify Patients with Multiple Myeloma Using Administrative Claims Data.
Missing clinical and behavioral health data in a large electronic health record (EHR) system.
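
On the point about misclassification and the risk estimate, here is a minimal sketch of how a reported sensitivity and specificity could be used to back-correct an observed risk ratio. It assumes non-differential outcome misclassification (the same sensitivity and specificity in exposed and unexposed groups) and uses the standard simple bias-analysis correction. The function names and the numbers are made up for illustration; this is not how our software does it, and it ignores the uncertainty in the validation estimates themselves.

```python
def corrected_cases(observed_cases, total, sensitivity, specificity):
    """Estimate the true number of cases from the observed (algorithm-flagged) count.

    Observed cases = Se * true_cases + (1 - Sp) * (total - true_cases),
    which is solved here for true_cases.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("Se + Sp must exceed 1 for the correction to be defined")
    return (observed_cases - (1.0 - specificity) * total) / denom


def corrected_risk_ratio(cases_exposed, n_exposed,
                         cases_unexposed, n_unexposed,
                         sensitivity, specificity):
    """Risk ratio after correcting both groups with the same Se and Sp."""
    true_exposed = corrected_cases(cases_exposed, n_exposed,
                                   sensitivity, specificity)
    true_unexposed = corrected_cases(cases_unexposed, n_unexposed,
                                     sensitivity, specificity)
    if true_exposed <= 0 or true_unexposed <= 0:
        raise ValueError("Corrected counts are not positive; inputs are incompatible")
    return (true_exposed / n_exposed) / (true_unexposed / n_unexposed)


if __name__ == "__main__":
    # Hypothetical cohort: 125 of 1000 exposed and 87.5 of 1000 unexposed
    # are flagged by an algorithm with Se = 0.80 and Sp = 0.95.
    observed_rr = (125 / 1000) / (87.5 / 1000)
    corrected_rr = corrected_risk_ratio(125, 1000, 87.5, 1000, 0.80, 0.95)
    print(f"observed RR:  {observed_rr:.2f}")
    print(f"corrected RR: {corrected_rr:.2f}")
```

With these made-up inputs the observed risk ratio of about 1.43 corrects to 2.0, which shows the usual pattern of non-differential misclassification pulling the estimate toward the null.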