Preliminary Exploration of Formula Embedding for Mathematical Information Retrieval: can mathematical formulae be embedded like a natural language?

29 Aug 2017  ·  Gao Liangcai, Jiang Zhuoren, Yin Yue, Yuan Ke, Yan Zuoyu, Tang Zhi

While neural network approaches are achieving breakthrough performance in natural-language-related fields, there have been few similar attempts at mathematical-language-related tasks. In this study, we explore the potential of applying neural representation techniques to Mathematical Information Retrieval (MIR) tasks. In more detail, we first briefly analyze the characteristic differences between natural language and mathematical language. Then we design a "symbol2vec" method to learn the vector representations of formula symbols (numbers, variables, operators, functions, etc.). Finally, we propose a "formula2vec"-based MIR approach and evaluate its performance. Preliminary experimental results suggest that formula embedding models hold promise for mathematical language representation and MIR tasks.
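
To make the idea of embedding symbols and then retrieving formulae more concrete, the following is a minimal sketch and not the authors' implementation. It assumes formulae are given as LaTeX-like strings, uses a hypothetical regex tokenizer, and swaps the learned neural embeddings for simple co-occurrence counts so that it runs with the standard library only. All function names (`tokenize`, `symbol_vectors`, `formula_vector`, `rank`) and the `window` parameter are illustrative assumptions.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical tokenizer (an assumption, not from the paper): splits a
# LaTeX-like string into commands, numbers, single letters, and other symbols.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|\d+(?:\.\d+)?|[A-Za-z]|[^\sA-Za-z\d]")


def tokenize(formula):
    """Return the list of symbol tokens found in a formula string."""
    return TOKEN_RE.findall(formula)


def symbol_vectors(formulas, window=2):
    """Count-based stand-in for learned symbol embeddings.

    Each symbol is represented by how often every other symbol appears
    within `window` positions of it across the corpus.
    """
    vectors = defaultdict(Counter)
    for formula in formulas:
        tokens = tokenize(formula)
        for i, tok in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][tokens[j]] += 1
    return vectors


def formula_vector(formula, vectors):
    """Represent a formula as the sum of its symbols' vectors."""
    total = Counter()
    for tok in tokenize(formula):
        total.update(vectors.get(tok, {}))
    return total


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, corpus, vectors):
    """Return (score, formula) pairs sorted from most to least similar."""
    q = formula_vector(query, vectors)
    scored = [(cosine(q, formula_vector(f, vectors)), f) for f in corpus]
    return sorted(scored, reverse=True)
```

With a small corpus such as `["x^2 + y^2", "a + b", r"\sin x"]`, calling `symbol_vectors(corpus)` and then `rank("x^2 + y^2", corpus, vectors)` returns the corpus ordered by similarity to the query. A neural model in the spirit of word2vec would replace `symbol_vectors` with trained dense vectors; the retrieval step by cosine similarity would stay structurally the same.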
