Event detection from textual content using text mining is a well-researched field. Meanwhile, recent graph modeling and graph embedding techniques make it possible to represent textual content as graphs: text can be enriched with additional attributes, and complex relationships can be captured within the graph. In this paper, we focus on news prediction and model the problem as subgraph prediction. More specifically, we aim to predict the news skeleton in the form of a subgraph. To this end, we construct graph-based representations of news articles and propose a graph mining based pattern extraction method. The proposed method consists of three main steps. First, a graph representation of the news text is constructed. Next, frequent subgraph mining and sequential rule mining algorithms are adapted for pattern prediction on graph sequences. We assume that a subgraph captures the main story of the content and that the sequential rules indicate the temporal relationships among subgraph patterns. Finally, the extracted sequential patterns are used to predict the future news skeleton (i.e., the main features of the news). Graph embedding techniques are also employed to measure similarity. The proposed method is evaluated against baseline methods on both a collection of news from an online newspaper and a benchmark news dataset.
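
As a rough illustration of the three steps, the following Python sketch builds a toy version of the pipeline. It is not the paper's implementation and relies on several simplifying assumptions: each article is assumed to be already reduced to a list of entities (a hypothetical input format), single edges stand in for frequent subgraphs, and rules are derived only between consecutive articles. All function names are hypothetical, and the graph embedding similarity step is omitted.

```python
from collections import Counter
from itertools import combinations


def article_graph(entities):
    """Step 1 (simplified): connect every pair of entities in one article."""
    return {frozenset(pair) for pair in combinations(sorted(set(entities)), 2)}


def frequent_edges(graphs, min_support):
    """Step 2a (simplified): keep edges that occur in at least min_support graphs.

    Single edges are used here as a stand-in for frequent subgraphs.
    """
    counts = Counter(edge for graph in graphs for edge in graph)
    return {edge for edge, count in counts.items() if count >= min_support}


def sequential_rules(graphs, patterns, min_confidence):
    """Step 2b (simplified): rules a -> b between consecutive graphs.

    Confidence is the fraction of graphs containing a (and having a
    successor) whose immediate successor contains b.
    """
    antecedent = Counter()
    joint = Counter()
    for current, following in zip(graphs, graphs[1:]):
        for a in patterns & current:
            antecedent[a] += 1
            for b in patterns & following:
                joint[(a, b)] += 1
    rules = {}
    for (a, b), count in joint.items():
        confidence = count / antecedent[a]
        if confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def predict_skeleton(latest_graph, rules):
    """Step 3 (simplified): union of consequents whose antecedent is present."""
    return {b for (a, b) in rules if a in latest_graph}


# Toy chronological sequence of articles (made-up illustrative data).
articles = [
    ["storm", "coast"],
    ["flood", "coast"],
    ["storm", "coast"],
    ["flood", "coast"],
    ["storm", "coast"],
]
graphs = [article_graph(entities) for entities in articles]
patterns = frequent_edges(graphs, min_support=2)
rules = sequential_rules(graphs, patterns, min_confidence=0.6)
skeleton = predict_skeleton(graphs[-1], rules)
print(skeleton)
```

In this toy sequence the "storm, coast" edge is always followed by the "flood, coast" edge, so the predicted skeleton for the next article consists of that single edge. A fuller treatment would mine multi-edge subgraphs and compare predicted and observed skeletons with an embedding-based similarity measure, as the abstract describes.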