XPaF: an open-source parsing framework
XPath-based Parsing Framework (XPaF) is a simple, convenient open-source parsing framework for extracting relations (subject-predicate-object triples) from HTML and XML documents.
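Code example (an input HTML document, followed by the parser definition that extracts relations from it):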
<table>
  <tr>
    <td class="name">Aaron</td>
    <td class="occ">Engineer</td>
  </tr>
  <tr>
    <td class="name">Jennifer</td>
    <td class="occ">Archeologist</td>
  </tr>
</table>
parser_name: "my_parser"
relation_tmpls {
  subject: "//td[@class='name']"
  predicate: "occupation"
  object: "//td[@class='occ']"
  subject_cardinality: MANY
  object_cardinality: MANY
}
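To make the effect of this parser definition concrete, here is a minimal Python sketch that imitates it using only the standard library. It is not XPaF itself and does not use XPaF's API. The function name extract_triples is hypothetical, and the sketch assumes that MANY/MANY cardinality pairs the i-th subject with the i-th object, which requires both XPath queries to return the same number of nodes.

```python
import xml.etree.ElementTree as ET

# The sample document from above. It is well-formed XML, so ElementTree can parse it.
HTML = """<table>
<tr>
<td class="name">Aaron</td>
<td class="occ">Engineer</td>
</tr>
<tr>
<td class="name">Jennifer</td>
<td class="occ">Archeologist</td>
</tr>
</table>"""


def extract_triples(markup, subject_xpath, predicate, object_xpath):
    """Return (subject, predicate, object) triples, pairing nodes by position.

    Assumption: MANY/MANY cardinality means the i-th subject node is paired
    with the i-th object node.
    """
    root = ET.fromstring(markup)
    # ElementTree supports only a subset of XPath and expects relative paths,
    # so an absolute "//..." expression is prefixed with ".".
    subjects = [el.text for el in root.findall("." + subject_xpath)]
    objects = [el.text for el in root.findall("." + object_xpath)]
    if len(subjects) != len(objects):
        raise ValueError("subject and object counts differ; cannot pair them")
    return [(s, predicate, o) for s, o in zip(subjects, objects)]


triples = extract_triples(
    HTML,
    subject_xpath="//td[@class='name']",
    predicate="occupation",
    object_xpath="//td[@class='occ']",
)
for triple in triples:
    print(triple)
# ('Aaron', 'occupation', 'Engineer')
# ('Jennifer', 'occupation', 'Archeologist')
```

Under these assumptions the sample table yields one triple per row. Real-world HTML is often not well-formed XML, so a tolerant HTML parser would be needed in place of ElementTree for anything beyond this illustration.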
