Conference paper (in proceedings)

ConvTab: A Context-Preserving, Convolutional Model for Ad-Hoc Table Retrieval

DOKPE

  • 13.01.2022
Published in:
  • 2021 IEEE International Conference on Big Data (Big Data). - IEEE. - 2022
English
Ad-hoc table retrieval, also known as table search, is the problem of finding tables relevant to a search query. The query can be a keyword or a table itself, giving keyword-based and table-based search, respectively. With the vast amounts of tabular data available online, it has become essential for users to identify relevant tables that meet their search criteria. A wide variety of research has addressed this problem using pure lexical features, semantic representations, embeddings, as well as intrinsic and extrinsic features of the tables. However, a significant limitation of most existing methods is that they do not keep the table's structure and the globalized context intact when building semantic representations of tabular data. Motivated by this limitation, we propose ConvTab, an effective approach based on Convolutional Neural Networks (CNNs), to train the embeddings of tabular data. Our approach is divided into two phases. First, we leverage the discriminating power of CNNs to train a table classifier. Next, the representations learned by this model are used to generate semantic features for query-table similarity. These query-table similarity features are then used as input to the learning algorithm. We evaluate our approach on the table retrieval task using the standard NDCG, MAP, and MRR metrics. Experiments reveal that ConvTab significantly outperforms the state of the art in ad-hoc table retrieval, by 16.9% and 8.37% in NDCG at cutoffs 5 and 20, respectively. For reproducibility purposes, we share our model as well as all details of our implementation.
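The abstract reports results in NDCG at cutoffs 5 and 20, alongside MAP and MRR. As a rough illustration of what these metrics measure for a single query, the following minimal sketch may help. It is not the authors' evaluation code: the function names are hypothetical, the exponential gain `2**rel - 1` is one common NDCG variant assumed here, and the average-precision helper assumes every relevant table appears somewhere in the ranked list.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k ranked tables.

    Assumption: exponential gain (2**rel - 1) with a log2 rank discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based here
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k divided by the DCG@k of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def reciprocal_rank(relevances: Sequence[float]) -> float:
    """1 / position of the first relevant table, or 0.0 if there is none."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def average_precision(relevances: Sequence[float]) -> float:
    """Mean of precision values taken at each relevant position.

    Assumption: all relevant tables are present in the ranked list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0
```

Each function takes the graded relevance labels of the retrieved tables in ranked order for one query. MAP and MRR are then the means of `average_precision` and `reciprocal_rank` over all queries, and the reported NDCG values would correspond to averaging `ndcg_at_k` with `k=5` and `k=20`.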
Faculty
Faculté des sciences et de médecine
Department
Département d'Informatique
Language
  • English
License
Rights reserved
Open access status
green
Identifiers
Persistent URL
https://folia.unifr.ch/unifr/documents/325396
Statistics

Document views: 15
File downloads:
  • agarwal2021bigdata_0.pdf: 27