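Abstract:
Modeled on how a human reader scans a sentence for contextual evidence to resolve ambiguity, the method computes the semantic similarity and relatedness between an ambiguous string and the sentence containing it, and uses these measures as a context calculation model. The contextual information shared by the ambiguous string and its sentence is then used to resolve overlapping ambiguities in Chinese word segmentation. Compared with classical statistics-based methods, segmentation accuracy improves substantially.
Keywords: Chinese automatic word segmentation; overlapping ambiguity; semantic relation; context calculation model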
DOI:
Classification number:
Funding:
Context based Approach to Overlapping Ambiguity Resolution in Chinese Word Segmentation
YIN Qian
Abstract:
This method simulates how people search a sentence for contextual evidence when resolving overlapping ambiguity in Chinese word segmentation. It computes the semantic similarity and correlation between an overlapping ambiguous string and the sentence in which it occurs, and builds a context calculation model on these measures to resolve the ambiguity using contextual information. Experimental results show that, compared with the traditional statistics-based approach, the algorithm achieves considerably higher segmentation accuracy.
Key words: Chinese word segmentation; overlapping ambiguity; semantic correlation; context calculation model
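
The idea summarized in the abstract, choosing between competing segmentations of an overlapping ambiguous string by how well each fits the rest of the sentence, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `RELATEDNESS` table is a hand-written toy lexicon, the averaging in `context_score` is one simple way to combine pairwise scores, and the function names are hypothetical. The paper's actual similarity and correlation measures are not specified in the abstract.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy relatedness scores between word pairs (NOT from the paper).
RELATEDNESS: Dict[Tuple[str, str], float] = {
    ("小学", "毕业"): 0.9,   # "primary school" ~ "graduate"
    ("学", "毕业"): 0.1,     # "learn" ~ "graduate"
    ("从小", "电脑"): 0.2,   # "since childhood" ~ "computer"
    ("学", "电脑"): 0.8,     # "learn" ~ "computer"
}


def relatedness(a: str, b: str) -> float:
    """Symmetric lookup in the toy table; unknown pairs score 0."""
    return RELATEDNESS.get((a, b), RELATEDNESS.get((b, a), 0.0))


def context_score(candidate: Sequence[str], context: Sequence[str]) -> float:
    """Average relatedness between each candidate word and each context word."""
    pairs = [(w, c) for w in candidate for c in context]
    if not pairs:
        return 0.0
    return sum(relatedness(w, c) for w, c in pairs) / len(pairs)


def resolve(candidates: List[List[str]], context: Sequence[str]) -> List[str]:
    """Return the candidate segmentation best supported by the context words."""
    return max(candidates, key=lambda cand: context_score(cand, context))


# The overlapping ambiguous string "从小学" can be cut as 从/小学 or 从小/学.
candidates = [["从", "小学"], ["从小", "学"]]

# Context words come from the rest of the sentence (assumed already segmented).
print(resolve(candidates, ["他", "毕业"]))  # sentence like 他从小学毕业
print(resolve(candidates, ["他", "电脑"]))  # sentence like 他从小学电脑
```

With this toy table, the context word 毕业 favors the cut 从/小学, while 电脑 favors 从小/学. A real system would replace the table with scores derived from a semantic lexicon or corpus statistics.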