Fig. 1.
From: Predicting the hosts of prokaryotic viruses using GCN-based semi-supervised learning

The pipeline of HostG. I: A pre-trained CNN model encodes contigs into node feature vectors. II: BLASTN is used to create virus-host connections. III: Protein clusters are created using DIAMOND-based BLASTP and MCL, and are then used to create virus-virus connections. IV: The knowledge graph is created by combining the node features and the edge connections. A GCN is then trained on this graph and used to assign taxonomic labels.
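
Step IV of the pipeline can be illustrated with a minimal sketch. It is not HostG's implementation: it assumes the commonly used graph-convolution propagation rule ReLU(Â X W), where Â is the self-looped, symmetrically normalized adjacency matrix, and the caption does not state which variant HostG uses. The toy graph, the feature and weight dimensions, and the function names `normalized_adjacency` and `gcn_layer` are all hypothetical.

```python
import numpy as np

def normalized_adjacency(num_nodes, edges):
    """Build A_hat = D^-1/2 (A + I) D^-1/2 from an undirected edge list."""
    adj = np.eye(num_nodes)  # identity adds a self-loop to every node
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    deg = adj.sum(axis=1)  # degree including the self-loop, so never zero
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_hat, features, weights):
    """One graph-convolution layer: ReLU(A_hat @ X @ W)."""
    return np.maximum(a_hat @ features @ weights, 0.0)

# Hypothetical toy graph: nodes 0-1 stand for viruses, nodes 2-3 for hosts.
# (0, 1) plays the role of a virus-virus edge (step III);
# (0, 2) and (1, 3) play the role of virus-host edges (step II).
edges = [(0, 1), (0, 2), (1, 3)]

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))  # stand-in for CNN-encoded node features (step I)
weights = rng.normal(size=(8, 3))   # untrained layer weights, for illustration only

a_hat = normalized_adjacency(4, edges)
hidden = gcn_layer(a_hat, features, weights)  # one row of hidden features per node
```

In a semi-supervised setting, layers like this would be stacked and trained with a classification loss computed only on the labeled nodes, so that label information propagates along the edges to the unlabeled ones.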