How is this value obtained? Submitted bacterial profiles are run through the decision trees that random forest previously built to train the classifiers. Each tree gives a classification (roughly: yes or no, the mapped data is or is not close to the model). Combining the classifications of all voting trees gives the voting tree probability. The voting tree probability therefore reflects the degree of similarity between the signatures and the unknown samples. The more similar a sample's bacterial signature is to the classifier's fecal signatures, the higher the probability in the source prediction output.
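As a rough illustration of how tree votes can be combined, the sketch below computes the probability as the fraction of trees voting "yes". This is an assumption about the aggregation, which is the usual approach for random forests; the function name and the example votes are hypothetical and are not taken from the tool itself.

```python
def voting_tree_probability(votes):
    """Return the fraction of trees that voted 'yes' (True).

    Assumption: each tree casts one equally weighted yes/no vote.
    """
    if not votes:
        raise ValueError("at least one tree vote is required")
    return sum(1 for v in votes if v) / len(votes)


# Hypothetical votes from a 10-tree forest for one submitted sample.
example_votes = [True, True, False, True, True, True, False, True, True, False]
probability = voting_tree_probability(example_votes)
print(probability)  # 0.7 (7 of 10 trees voted "yes")
```

A real forest typically contains hundreds of trees, so the resulting probability is much finer grained than in this 10-tree example.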
What is the difference between "low confidence" and "high confidence" in the fecal signature detection confidence index? For each classifier, we optimized the voting tree cut-off (decision cut-off) at which a sample is considered contaminated or not. This cut-off was chosen to maximize the sensitivity and the specificity of the classifier. Above the decision cut-off, a sample is classified as "high confidence", i.e., the fecal signature has been detected in the submitted sample. Below the decision cut-off, a sample is classified as "low confidence", i.e., a fraction of the fecal signature was recovered in the submitted sample, but not enough for the sample to be classified as contaminated.
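One common way to pick a cut-off that balances sensitivity and specificity is to maximize their sum (Youden's J statistic). The sketch below illustrates that idea only; it is an assumption, not a description of the optimization actually used for the classifiers, and the function name, the example data, and the "at or above" comparison are all hypothetical.

```python
def choose_cutoff(probabilities, is_contaminated):
    """Pick the candidate cut-off that maximizes sensitivity + specificity.

    Assumption: a sample is called contaminated when its voting tree
    probability is at or above the cut-off.
    """
    positives = sum(1 for y in is_contaminated if y)
    negatives = len(is_contaminated) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both contaminated and clean samples")

    pairs = list(zip(probabilities, is_contaminated))
    best_cutoff, best_score = None, -1.0
    for cutoff in sorted(set(probabilities)):
        true_pos = sum(1 for p, y in pairs if y and p >= cutoff)
        true_neg = sum(1 for p, y in pairs if not y and p < cutoff)
        score = true_pos / positives + true_neg / negatives
        if score > best_score:
            best_cutoff, best_score = cutoff, score
    return best_cutoff


# Hypothetical validation data: probabilities and known contamination status.
probs = [0.1, 0.2, 0.35, 0.6, 0.8, 0.9]
labels = [False, False, False, True, True, True]
print(choose_cutoff(probs, labels))  # 0.6
```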
"low confidence" in opposition to "low confidence (trace level detection)" indicates that at the voting tree probability observed in the submitted sample, the classifier had a specificity of at least 80% when evaluated using raw fecal samples.
Please consult our FAQ for more details about the global, draft, and excluded classifiers.