
I see TF5xxxxx families are “created by hcluster”. What is the algorithm behind?


TreeFam clusters (the TF5xxxxx series) are created by hcluster_sg, a hierarchical clustering program for sparse graphs. In short, hcluster_sg performs hierarchical clustering under mean distance. It reads an input file that describes the similarity between pairs of sequences and, at each step, joins the two nearest nodes. When two nodes are joined, the distance between the joined node and every other node is updated to the mean distance. The procedure repeats until every remaining merge is blocked by one of three rules:

• Do not merge clusters A and B if the total number of edges between A and B is smaller than |A|*|B|/3, where |A| and |B| are the sizes of A and B, respectively. This rule guarantees that each cluster is compact.
• Do not join A to any other cluster if |A| > 500. This rule avoids huge clusters, which would be a computational burden for multiple alignment and tree building.
• Do not join A and B if both A and B contain plant genes or both A and B contain fungal genes. This rule tries to find animal gene
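The procedure described above can be sketched roughly in Python. This is only an illustration, not the hcluster_sg source code: the function name `cluster_sparse`, its parameters, and the `outgroup` mapping are hypothetical. It also assumes that "nearest" means highest similarity score, that the mean is taken over the edges actually present between two clusters, and that each pair of sequences appears at most once in the input.

```python
def cluster_sparse(edges, max_size=500, density_divisor=3, outgroup=None):
    """Greedy mean-linkage clustering of a sparse similarity graph (sketch).

    edges    -- iterable of (seq_a, seq_b, similarity), each pair listed once
    outgroup -- hypothetical mapping seq -> label such as "plant" or "fungi"
    """
    outgroup = outgroup or {}
    members = {}  # cluster id -> set of sequence ids
    links = {}    # (id1, id2), id1 < id2 -> [edge count, summed similarity]
    for a, b, score in edges:
        members.setdefault(a, {a})
        members.setdefault(b, {b})
        if a != b:
            links[tuple(sorted((a, b)))] = [1, score]

    def labels(cid):
        return {outgroup[m] for m in members[cid] if m in outgroup}

    def blocked(i, j, count):
        size_i, size_j = len(members[i]), len(members[j])
        # Rule 1: too few edges between the clusters (count < |A|*|B|/3).
        if count * density_divisor < size_i * size_j:
            return True
        # Rule 2: a cluster larger than max_size is not joined to anything.
        if size_i > max_size or size_j > max_size:
            return True
        # Rule 3: both clusters carry the same outgroup label.
        return bool(labels(i) & labels(j))

    while True:
        # Pick the permitted pair with the highest mean similarity.
        best = None
        for (i, j), (count, total) in links.items():
            if blocked(i, j, count):
                continue
            mean = total / count
            if best is None or mean > best[0]:
                best = (mean, i, j)
        if best is None:
            break  # every remaining merge is blocked
        _, keep, gone = best
        members[keep] |= members.pop(gone)

        # Re-point the absorbed cluster's edges and pool counts and sums,
        # so the next mean covers all edges between the merged clusters.
        merged = {}
        for (i, j), (count, total) in links.items():
            i = keep if i == gone else i
            j = keep if j == gone else j
            if i == j:
                continue  # edge is now internal to the merged cluster
            slot = merged.setdefault(tuple(sorted((i, j))), [0, 0])
            slot[0] += count
            slot[1] += total
        links = merged

    return sorted(sorted(m) for m in members.values())
```

For example, two tightly connected triangles joined by a single weak edge stay separate, because one edge between two clusters of size 3 is fewer than 3*3/3 = 3:

```python
edges = [("a", "b", 90), ("a", "c", 80), ("b", "c", 85),
         ("d", "e", 95), ("d", "f", 70), ("e", "f", 75),
         ("c", "d", 10)]
print(cluster_sparse(edges))  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```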
