a subset of the training data that is expected to contain additional regularities, for instance pages from the same site or spans from the same document.
If the responses are not based on the entire sample, a description of the portion of the sample whose responses are being reported appears here (e.g., women, or those who favor a given policy).