Hi Robyn,
I am evaluating the accuracy of PICRUSt2 predictions with a custom trait table.
What is the recommended procedure for a rigorous Leave-One-Out Cross-Validation (LOOCV) of PICRUSt2 SC accuracy when the test sequences are already present in the default reference tree?
To ensure no "training-test contamination," I am deciding between two approaches:
1. Table Removal only: Keep the species/tip in the default reference tree but remove the corresponding entry from the trait table (-i input).
2. Tree Pruning + Table Removal: Prune the species/tip from the reference tree entirely and remove it from the trait table, then re-place the sequence using place_seqs.py.
Is pruning the tree necessary to accurately simulate the PICRUSt2 pipeline's performance on novel sequences?
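To make the table-removal step shared by both approaches concrete, here is a minimal sketch of what I have in mind. It assumes the trait table is tab-separated with the tip ID in the first column; the function name `leave_one_out_tables` and the example IDs and trait columns are hypothetical, not part of PICRUSt2.

```python
import csv
import io


def leave_one_out_tables(trait_tsv_text):
    """Yield (held_out_id, held_out_values, reduced_table_text) per tip.

    Assumes a tab-separated trait table whose first row is a header and
    whose first column holds the tip ID (an assumption for illustration).
    """
    rows = list(csv.reader(io.StringIO(trait_tsv_text), delimiter="\t"))
    header, records = rows[0], rows[1:]
    for i, held_out in enumerate(records):
        # Keep every record except the one being held out for testing.
        kept = records[:i] + records[i + 1:]
        out = io.StringIO()
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(kept)
        yield held_out[0], held_out[1:], out.getvalue()


# Hypothetical three-tip trait table used only to illustrate the split.
example = "assembly\ttraitA\ttraitB\ntip1\t1\t0\ntip2\t0\t2\ntip3\t3\t1\n"
splits = list(leave_one_out_tables(example))
```

Each reduced table would be passed as the trait table for one fold, and the held-out values compared against the prediction for that tip. For approach 2, the same held-out ID would additionally be pruned from the reference tree before re-running place_seqs.py; that step is not shown here.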
Thank you,
Abel Tan