Fig. 1. | BMC Biology

Fig. 1.

From: Predicting the hosts of prokaryotic viruses using GCN-based semi-supervised learning

The pipeline of HostG. I: A pre-trained CNN model encodes contigs into node feature vectors. II: BLASTN is used to create virus-host connections. III: Protein clusters are built using DIAMOND-based BLASTP and MCL, and are then used to create virus-virus connections. IV: The knowledge graph is built by combining the node features with the edge connections, and a GCN is then trained on this graph to assign taxonomic labels.
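
Step IV can be illustrated with a minimal sketch of a two-layer GCN forward pass over a toy knowledge graph. This is not the HostG implementation: the symmetric normalization with self-loops is the standard Kipf and Welling form and is assumed here, the four-node graph, the feature and layer sizes, and the helper names (`matmul`, `normalized_adjacency`, `gcn_forward`) are hypothetical, and the random features stand in for the CNN embeddings of step I. The weights are untrained, so the output only shows the shape of the computation, not meaningful predictions.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} (assumed Kipf-Welling normalization)."""
    n = len(adj)
    # Add self-loops so every node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gcn_forward(adj, features, w1, w2):
    """Two propagation steps: ReLU(A_norm X W1), then softmax(A_norm H W2)."""
    a_norm = normalized_adjacency(adj)
    hidden = [[max(v, 0.0) for v in row] for row in matmul(matmul(a_norm, features), w1)]
    logits = matmul(matmul(a_norm, hidden), w2)
    return [softmax(row) for row in logits]


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy graph: nodes 0 and 1 are viruses, nodes 2 and 3 are prokaryotes.
# (0, 2) and (1, 3) play the role of BLASTN virus-host edges (step II);
# (0, 1) plays the role of a virus-virus edge from shared protein clusters (step III).
edges = [(0, 2), (1, 3), (0, 1)]
n_nodes = 4
adj = [[0.0] * n_nodes for _ in range(n_nodes)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0

rng = random.Random(0)
features = random_matrix(rng, n_nodes, 8)  # stand-in for CNN node features (step I)
w1 = random_matrix(rng, 8, 16)             # untrained layer weights
w2 = random_matrix(rng, 16, 3)             # 3 hypothetical taxonomic classes

a_norm = normalized_adjacency(adj)
probs = gcn_forward(adj, features, w1, w2)
print(len(probs), len(probs[0]))  # 4 3
```

In a semi-supervised setting, the loss would be computed only on nodes with known labels, while the propagation through the edges lets unlabeled nodes receive a prediction as well.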
