Relationship Extraction (RE) from biomedical literature is an important and challenging problem in both text mining and bioinformatics. Although various approaches have been proposed to extract protein-protein interaction types, their accuracy leaves considerable room for improvement. In this paper, two supervised learning algorithms based on a newly defined "bio-semantic token subsequence" are proposed for multi-class biomedical relationship classification. The first approach computes a "bio-semantic token subsequence kernel", whereas the second explicitly extracts weighted features from bio-semantic token subsequences. Both approaches outperform several alternatives reported in the literature on multi-class protein-protein interaction classification.
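To make the contrast between the two approaches concrete, the following is a minimal sketch of a generic gap-weighted token subsequence representation, not the definition used in this paper. It assumes that a sentence has already been tokenized and that entity mentions have been replaced by a semantic class label such as `PROTEIN`; that label, the function names, and the `max_len` and `decay` parameters are all hypothetical choices made for illustration. The first function extracts explicit weighted features, and the second obtains a kernel value as the dot product of two such feature maps.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(tokens, max_len=2, decay=0.5):
    """Map a token list to weighted (possibly gapped) subsequence features.

    Each subsequence of up to max_len tokens contributes decay ** span,
    where span is the number of input tokens it stretches over, so
    subsequences with gaps receive less weight than contiguous ones.
    This weighting is an assumption, not the paper's scheme.
    """
    features = defaultdict(float)
    for length in range(1, max_len + 1):
        # combinations() yields index tuples in increasing order.
        for positions in combinations(range(len(tokens)), length):
            span = positions[-1] - positions[0] + 1
            key = tuple(tokens[i] for i in positions)
            features[key] += decay ** span
    return dict(features)


def subsequence_kernel(tokens_a, tokens_b, max_len=2, decay=0.5):
    """Kernel value as the dot product of the two explicit feature maps."""
    feats_a = subsequence_features(tokens_a, max_len, decay)
    feats_b = subsequence_features(tokens_b, max_len, decay)
    return sum(w * feats_b[k] for k, w in feats_a.items() if k in feats_b)


if __name__ == "__main__":
    a = ["PROTEIN", "binds", "PROTEIN"]
    b = ["PROTEIN", "inhibits", "PROTEIN"]
    print(subsequence_features(a))
    print(subsequence_kernel(a, b))
```

Enumerating subsequences explicitly is only practical for short sentences and small `max_len`; kernel formulations typically avoid the enumeration with dynamic programming, while the explicit feature map can be passed directly to any linear classifier.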
Bioinformatics and Biomedicine