A phylogeny-based benchmarking test for orthology inference reveals the limitations of function-based validation