no code implementations • Findings (ACL) 2022 • Zi Lin, Jeremiah Zhe Liu, Jingbo Shang
Recent work in task-independent graph semantic parsing has shifted from grammar-based symbolic approaches to neural models, showing strong performance on different types of meaning representations.
1 code implementation • EMNLP 2020 • Dheeraj Mekala, Xinyang Zhang, Jingbo Shang
Based on seed words, we rank and filter motif instances to distill highly label-indicative ones as "seed motifs", which provide additional weak supervision.
no code implementations • Findings (ACL) 2022 • Zihan Wang, Jiuxiang Gu, Jason Kuen, Handong Zhao, Vlad Morariu, Ruiyi Zhang, Ani Nenkova, Tong Sun, Jingbo Shang
We present a comprehensive study of sparse attention patterns in Transformer models.
no code implementations • EMNLP 2021 • Zihan Wang, chengyu dong, Jingbo Shang
In this paper, we present an empirical property of these representations -- "average" approximates "first principal component".
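As a rough numerical illustration of this property (on synthetic anisotropic vectors sharing a dominant direction, not actual contextualized representations), the mean vector closely aligns with the first principal component of the uncentered embedding matrix:

```python
import math
import random

random.seed(0)
d, n = 50, 200
# Synthetic "embeddings": a strong shared direction plus small noise, mimicking
# the anisotropy of contextualized representations (illustrative, not real data).
shared = [random.gauss(0, 1) for _ in range(d)]
emb = [[s + 0.1 * random.gauss(0, 1) for s in shared] for _ in range(n)]

avg = [sum(col) / n for col in zip(*emb)]

# First principal component of the uncentered matrix via power iteration on X^T X.
def xtx_matvec(v):
    scores = [sum(x * vi for x, vi in zip(row, v)) for row in emb]
    return [sum(s * row[j] for s, row in zip(scores, emb)) for j in range(d)]

v = [random.gauss(0, 1) for _ in range(d)]
for _ in range(50):
    w = xtx_matvec(v)
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

# Cosine similarity between the average vector and the first principal component.
cos = abs(sum(a * b for a, b in zip(avg, v))) / math.sqrt(sum(a * a for a in avg))
```

With the shared direction dominating, the cosine similarity comes out close to 1.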
1 code implementation • ICML 2020 • chengyu dong, Liyuan Liu, Zichao Li, Jingbo Shang
Serving as a crucial factor, the depth of residual networks balances model capacity, performance, and training efficiency.
no code implementations • ACL 2022 • Xiaotao Gu, Yikang Shen, Jiaming Shen, Jingbo Shang, Jiawei Han
Recent studies have achieved inspiring success in unsupervised grammar induction using masked language modeling (MLM) as the proxy task.
1 code implementation • 13 May 2024 • Letian Peng, Jingbo Shang
We further leverage it as a reward system in direct preference optimization (DPO) for better AI characters.
no code implementations • 7 May 2024 • Yongqi Tong, Sizhe Wang, Dawei Li, Yifan Wang, Simeng Han, Zi Lin, Chengsong Huang, Jiaxin Huang, Jingbo Shang
Therefore, we present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales across various domains, such as brainteasers, puzzles, riddles, parajumbles, and critical reasoning tasks.
1 code implementation • 22 Apr 2024 • Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang
In this paper, we delve into the patent approval prediction task and unveil that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within the patent data.
1 code implementation • 16 Apr 2024 • Letian Peng, Jingbo Shang
In this paper, we aim to generate text classification data given arbitrary class definitions (i.e., user instruction), so one can train a small text classifier without any human annotation or raw corpus.
no code implementations • 10 Apr 2024 • Chenyang An, Zhibo Chen, Qihao Ye, Emily First, Letian Peng, Jiayun Zhang, Zihan Wang, Sorin Lerner, Jingbo Shang
Recent advances in Automated Theorem Proving have shown the effectiveness of leveraging a (large) language model that generates tactics (i.e., proof steps) to search through proof states.
1 code implementation • 2 Apr 2024 • Dawei Li, William Hogan, Jingbo Shang
This strategy enables a larger attack budget for entities and coaxes the model to leverage relational patterns embedded in the context.
no code implementations • 30 Mar 2024 • Alex Nguyen, Zilong Wang, Jingbo Shang, Dheeraj Mekala
The application of natural language processing models to PDF documents is pivotal for various business applications, yet the challenge of training models for this purpose persists in businesses due to specific hurdles.
1 code implementation • 30 Mar 2024 • Letian Peng, Zilong Wang, Feng Yao, Zihan Wang, Jingbo Shang
We construct the distillation dataset via sampling sentences from language model pre-training datasets (e.g., OpenWebText in our implementation) and prompting an LLM to identify the typed spans of "important information".
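The construction step above can be sketched as follows; the prompt wording and the sampled sentence are hypothetical, and the actual LLM call is omitted:

```python
# Sketch of the distillation-data construction: sample a pre-training sentence
# and ask an LLM to mark typed spans of "important information". The prompt
# wording is a hypothetical paraphrase, not the paper's exact template.
def make_extraction_prompt(sentence: str) -> str:
    return (
        "List the spans of important information in the sentence below, "
        "one per line, in the format type: span.\n"
        f"Sentence: {sentence}\n"
        "Spans:"
    )

sampled = "The concert in Berlin was rescheduled to June 5."
prompt = make_extraction_prompt(sampled)  # sent to the LLM in the real pipeline
```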
no code implementations • 29 Mar 2024 • Yongqi Tong, Dawei Li, Sizhe Wang, Yujia Wang, Fei Teng, Jingbo Shang
We conduct a series of experiments to show that LLMs can benefit from mistakes in both directions.
1 code implementation • 25 Feb 2024 • Lily Zhong, Zilong Wang, Jingbo Shang
Large language models (LLMs) are leading significant progress in code generation.
Ranked #2 on Code Generation on HumanEval
1 code implementation • 21 Feb 2024 • Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu
Teaching language models to use tools is an important milestone towards building general assistants, but remains an open problem.
1 code implementation • 16 Feb 2024 • Dheeraj Mekala, Alex Nguyen, Jingbo Shang
In this paper, we introduce a novel training data selection based on the learning percentage of the samples.
1 code implementation • 15 Feb 2024 • Letian Peng, Yuwei Zhang, Zilong Wang, Jayanth Srinivasa, Gaowen Liu, Zihan Wang, Jingbo Shang
This work aims to build a text embedder that can capture characteristics of texts specified by user instructions.
no code implementations • 7 Feb 2024 • Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley
We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently.
1 code implementation • 6 Feb 2024 • Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao
We then train MetaTree to produce the trees that achieve strong generalization performance.
no code implementations • 5 Feb 2024 • Zihan Wang, Yunxuan Li, Yuexin Wu, Liangchen Luo, Le Hou, Hongkun Yu, Jingbo Shang
Process supervision, using a trained verifier to evaluate the intermediate steps generated by a reasoner, has demonstrated significant improvements in multi-step problem solving.
1 code implementation • 2 Feb 2024 • Xiyuan Zhang, Ranak Roy Chowdhury, Rajesh K. Gupta, Jingbo Shang
Large Language Models (LLMs) have seen significant use in domains such as natural language processing and computer vision.
no code implementations • 9 Jan 2024 • Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Ranked #3 on Table-based Fact Verification on TabFact
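The core idea of using the evolving table itself as the intermediate "thought" can be sketched in a few lines; the operation names and toy table below are illustrative, not the framework's exact atomic operations:

```python
# Minimal sketch of a Chain-of-Table-style reasoning chain: the table is
# transformed step by step, and each intermediate table conditions the next step.
table = [
    {"player": "A", "team": "X", "goals": 3},
    {"player": "B", "team": "Y", "goals": 5},
    {"player": "C", "team": "X", "goals": 2},
]

def select_rows(t, col, value):
    return [row for row in t if row[col] == value]

def select_columns(t, cols):
    return [{c: row[c] for c in cols} for row in t]

# A chain answering e.g. "how many goals did team X players score?"
chain = [
    lambda t: select_rows(t, "team", "X"),
    lambda t: select_columns(t, ["player", "goals"]),
]
for op in chain:
    table = op(table)
```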
no code implementations • 6 Dec 2023 • Weitang Liu, Ying Wai Li, Tianle Wang, Yi-Zhuang You, Jingbo Shang
We propose a novel model-centric evaluation framework, OmniInput, to evaluate the quality of an AI/ML model's predictions on all possible inputs (including human-unrecognizable ones), which is crucial for AI safety and reliability.
no code implementations • 12 Nov 2023 • Xiyuan Zhang, Xiaohan Fu, Diyan Teng, chengyu dong, Keerthivasan Vijayakumar, Jiayun Zhang, Ranak Roy Chowdhury, Junsheng Han, Dezhi Hong, Rashmi Kulkarni, Jingbo Shang, Rajesh Gupta
By obviating the need for ground truth clean data, our method offers a practical denoising solution for real-world applications.
1 code implementation • 6 Nov 2023 • Letian Peng, Zihan Wang, Jingbo Shang
We study the named entity recognition (NER) problem under the extremely weak supervision (XWS) setting, where only one example entity per type is given in a context-free way.
no code implementations • 6 Nov 2023 • Dawei Li, Yaxuan Li, Dheeraj Mekala, Shuyao Li, Yulin Wang, Xueqi Wang, William Hogan, Jingbo Shang
DAIL leverages the intuition that large language models are more familiar with the content generated by themselves.
1 code implementation • 3 Nov 2023 • Letian Peng, Zilong Wang, Hang Liu, Zihan Wang, Jingbo Shang
With the rapid development of the internet, online social media welcomes people with different backgrounds through its diverse content.
no code implementations • 26 Oct 2023 • Zi Lin, Zihan Wang, Yongqi Tong, Yangkun Wang, Yuxin Guo, Yujia Wang, Jingbo Shang
This benchmark contains the rich, nuanced phenomena that can be tricky for current toxicity detection models to identify, revealing a significant domain difference compared to social media content.
no code implementations • 18 Oct 2023 • Yongqi Tong, Yifan Wang, Dawei Li, Sizhe Wang, Zi Lin, Simeng Han, Jingbo Shang
Chain-of-Thought (CoT) prompting and its variants explore equipping large language models (LLMs) with high-level reasoning abilities by emulating human-like linear cognition and logic.
no code implementations • 11 Oct 2023 • chengyu dong, Liyuan Liu, Hao Cheng, Jingbo Shang, Jianfeng Gao, Xiaodong Liu
Although ELECTRA offers a significant boost in efficiency, its potential is constrained by the training cost brought by the auxiliary model.
no code implementations • 7 Oct 2023 • Liangchen Luo, Zi Lin, Yinxiao Liu, Lei Shu, Yun Zhu, Jingbo Shang, Lei Meng
In the era of large language models (LLMs), this study explores the ability of LLMs to deliver accurate critiques across various tasks.
no code implementations • 4 Oct 2023 • An Yan, Yu Wang, Yiwu Zhong, Zexue He, Petros Karypis, Zihan Wang, chengyu dong, Amilcare Gentili, Chun-Nan Hsu, Jingbo Shang, Julian McAuley
Medical image classification is a critical problem for healthcare, with the potential to alleviate the workload of doctors and facilitate diagnoses of patients.
1 code implementation • ICCV 2023 • An Yan, Yu Wang, Yiwu Zhong, chengyu dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley
Recent advances in foundation models present new opportunities for interpretable visual recognition -- one can first query Large Language Models (LLMs) to obtain a set of attributes that describe each class, then apply vision-language models to classify images via these attributes.
1 code implementation • 14 Jul 2023 • Letian Peng, Yuwei Zhang, Jingbo Shang
Prompting large language models (LLMs) for data augmentation has recently become a common practice in few-shot NLP tasks.
no code implementations • 4 Jul 2023 • Zijie Huang, Daheng Wang, Binxuan Huang, Chenwei Zhang, Jingbo Shang, Yan Liang, Zhengyang Wang, Xian Li, Christos Faloutsos, Yizhou Sun, Wei Wang
We propose Concept2Box, a novel approach that jointly embeds the two views of a KG using dual geometric representations.
1 code implementation • AAAI Conference on Artificial Intelligence 2023 • Ranak Roy Chowdhury, Jiacheng Li, Xiyuan Zhang, Dezhi Hong, Rajesh K. Gupta, Jingbo Shang
In this work, we propose PrimeNet to learn a self-supervised representation for irregular multivariate time series.
no code implementations • 1 Jun 2023 • Hejie Cui, Rongmei Lin, Nasser Zalmout, Chenwei Zhang, Jingbo Shang, Carl Yang, Xian Li
Information extraction, e.g., attribute value extraction, has been extensively studied and formulated based only on text.
1 code implementation • 26 May 2023 • Liyan Xu, Chenwei Zhang, Xian Li, Jingbo Shang, Jinho D. Choi
We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention.
1 code implementation • 24 May 2023 • Dheeraj Mekala, Adithya Samavedhi, chengyu dong, Jingbo Shang
To address the annotation bottleneck, we introduce SELFOOD, a self-supervised OOD detection method that requires only in-distribution samples as supervision.
1 code implementation • 24 May 2023 • chengyu dong, Zihan Wang, Jingbo Shang
We show that the limited performance of seed matching is largely due to the label bias injected by the simple seed-match rule, which prevents the classifier from learning reliable confidence for selecting high-quality pseudo-labels.
1 code implementation • 24 May 2023 • Yuwei Zhang, Zihan Wang, Jingbo Shang
First, we prompt ChatGPT for insights on the clustering perspective by constructing hard triplet questions <does A better correspond to B than C>, where A, B, and C are similar data points that belong to different clusters according to a small embedder.
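Building such a triplet question can be sketched as below; the wording is a hypothetical paraphrase of the <does A better correspond to B than C> template, and the actual API call is omitted:

```python
# Sketch of a hard triplet question for ChatGPT. A, B, and C would be nearby
# points that a small embedder places in different clusters.
def triplet_prompt(a: str, b: str, c: str) -> str:
    return (
        "Considering the clustering perspective, does A better correspond to B "
        "than to C? Answer yes or no.\n"
        f"A: {a}\nB: {b}\nC: {c}"
    )

prompt = triplet_prompt(
    "book a flight to Paris",
    "reserve a plane ticket",
    "what's the weather in Paris",
)
```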
1 code implementation • 24 May 2023 • Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang
Recent advances in incorporating layout information, typically bounding box coordinates, into pre-trained language models have achieved significant performance gains in entity recognition from document images.
1 code implementation • 23 May 2023 • Zihan Wang, Jingbo Shang, Ruiqi Zhong
We propose a new task formulation, "Goal-Driven Clustering with Explanations" (GoalEx), which represents both the goal and the explanations as free-form language descriptions.
1 code implementation • 23 May 2023 • Jiacheng Li, Ming Wang, Jin Li, Jinmiao Fu, Xin Shen, Jingbo Shang, Julian McAuley
In this paper, we propose to model user preferences and item features as language representations that can be generalized to new items and datasets.
no code implementations • 23 May 2023 • Zilong Wang, Jingbo Shang
In this paper, we propose a new approach, ReXMiner, for zero-shot relation extraction in web mining.
1 code implementation • 22 May 2023 • Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang
Extremely Weakly Supervised Text Classification (XWS-TC) refers to text classification based on minimal high-level human guidance, such as a few label-indicative seed words or classification instructions.
no code implementations • 22 May 2023 • William Hogan, Jiacheng Li, Jingbo Shang
Motivated by these insights, we present a novel method called KNoRD (Known and Novel Relation Discovery), which effectively classifies explicitly and implicitly expressed relations from known and novel classes within unlabeled data.
1 code implementation • 21 May 2023 • Tianle Wang, Zihan Wang, Weitang Liu, Jingbo Shang
State-of-the-art weakly supervised text classification methods, while significantly reducing the required human supervision, still require the supervision to cover all the classes of interest.
no code implementations • 24 Mar 2023 • Xiyuan Zhang, Ranak Roy Chowdhury, Jingbo Shang, Rajesh Gupta, Dezhi Hong
We note that augmentation designed for forecasting requires diversity as well as coherence with the original temporal dynamics.
no code implementations • 19 Feb 2023 • Weitang Liu, Ying-Wai Li, Yi-Zhuang You, Jingbo Shang
We first draw the connection between the output distribution of a NN and the density of states (DOS) of a physical system.
1 code implementation • 26 Jan 2023 • Zi Lin, Jeremiah Liu, Jingbo Shang
Pre-trained seq2seq models excel at graph semantic parsing with rich annotated data, but generalize worse to out-of-distribution (OOD) and long-tail examples.
no code implementations • 1 Jan 2023 • Xiyuan Zhang, Ranak Roy Chowdhury, Jiayun Zhang, Dezhi Hong, Rajesh K. Gupta, Jingbo Shang
In this paper, we propose SHARE, a HAR framework that takes into account shared structures of label names for different activities.
1 code implementation • 1 Jan 2023 • Jiayun Zhang, Xiyuan Zhang, Xinyang Zhang, Dezhi Hong, Rajesh K. Gupta, Jingbo Shang
Traditional federated classification methods, even those designed for non-IID clients, assume that each client annotates its local data with respect to the same universal class set.
no code implementations • 27 Nov 2022 • Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu
In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features.
1 code implementation • 25 Oct 2022 • Sudhanshu Ranjan, Dheeraj Mekala, Jingbo Shang
Instead of training on the entire code-switched corpus at once, we create buckets based on the fraction of words in the resource-rich language and progressively train from resource-rich language dominated samples to low-resource language dominated samples.
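The progressive curriculum described above can be sketched as follows; the word-level language identification here is a toy vocabulary lookup, not a real identifier:

```python
# Sketch: bucket code-switched sentences by the fraction of words from the
# resource-rich language, then train from rich-dominated buckets toward
# low-resource-dominated ones.
RICH_VOCAB = {"the", "movie", "is", "good"}  # stand-in for the resource-rich language

def rich_fraction(sentence: str) -> float:
    words = sentence.split()
    return sum(w in RICH_VOCAB for w in words) / len(words)

def make_buckets(corpus, n_buckets=3):
    """Order sentences from resource-rich-dominated to low-resource-dominated."""
    ranked = sorted(corpus, key=rich_fraction, reverse=True)
    size = -(-len(ranked) // n_buckets)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

corpus = ["the movie is good", "the pelicula is buena", "pelicula muy buena"]
buckets = make_buckets(corpus)  # train on buckets[0] first, buckets[-1] last
```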
no code implementations • 5 Oct 2022 • Yufan Zhuang, Zihan Wang, Fangbo Tao, Jingbo Shang
Recent works show that learning attention in the Fourier space can improve the long sequence learning capability of Transformers.
1 code implementation • 28 Sep 2022 • Jiacheng Li, Zhankui He, Jingbo Shang, Julian McAuley
Then, to obtain personalized explanations under this framework of insertion-based generation, we design a method of incorporating aspect planning and personalized references into the insertion process.
no code implementations • 14 Jun 2022 • chengyu dong, Liyuan Liu, Jingbo Shang
How to conduct teacher training for knowledge distillation is still an open problem.
1 code implementation • 25 May 2022 • William Hogan, Jiacheng Li, Jingbo Shang
Recent relation extraction (RE) works have shown encouraging improvements by conducting contrastive learning on silver labels generated by distant supervision before fine-tuning on gold labels.
Ranked #36 on Relation Extraction on DocRED
1 code implementation • 25 May 2022 • Dheeraj Mekala, chengyu dong, Jingbo Shang
Weakly supervised text classification methods typically train a deep neural classifier based on pseudo-labels.
2 code implementations • 25 May 2022 • Dheeraj Mekala, Tu Vu, Timo Schick, Jingbo Shang
The ability of generative language models (GLMs) to generate text has improved considerably in the last few years, enabling their use for generative data augmentation.
no code implementations • 24 May 2022 • Lesheng Jin, Zihan Wang, Jingbo Shang
Inspired by this observation, in WeDef, we define the reliability of samples based on whether the predictions of the weak classifier agree with their labels in the poisoned training set.
1 code implementation • 24 May 2022 • Zihan Wang, Kewen Zhao, Zilong Wang, Jingbo Shang
Fine-tuning pre-trained language models has recently become a common practice in building NLP models for various tasks, especially few-shot tasks.
1 code implementation • 29 Apr 2022 • Xinyang Zhang, Chenwei Zhang, Xian Li, Xin Luna Dong, Jingbo Shang, Christos Faloutsos, Jiawei Han
Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arose from constantly changing data.
1 code implementation • Findings (ACL) 2022 • Zilong Wang, Jingbo Shang
To overcome the data limitation, we propose to leverage the label surface names to better inform the model of the target entity type semantics and also embed the labels into the spatial embedding space to capture the spatial correspondence between regions and labels.
no code implementations • 7 Oct 2021 • chengyu dong, Liyuan Liu, Jingbo Shang
We show that label noise exists in adversarial training.
no code implementations • 29 Sep 2021 • Zichao Li, Liyuan Liu, chengyu dong, Jingbo Shang
While this phenomenon is commonly explained as overfitting, we observe that it is a twin process: not only does the model catastrophically overfit to one type of perturbation, but the perturbation also deteriorates into random noise.
no code implementations • Findings (EMNLP) 2021 • Zichao Li, Dheeraj Mekala, chengyu dong, Jingbo Shang
To recognize the poisoned subset, we examine the training samples with these identified triggers as the most suspicious token, and check if removing the trigger will change the poisoned model's prediction.
no code implementations • EMNLP 2021 • Dheeraj Mekala, Varun Gangal, Jingbo Shang
Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases.
1 code implementation • EMNLP 2021 • Zilong Wang, Yiheng Xu, Lei Cui, Jingbo Shang, Furu Wei
Reading order detection is the cornerstone of understanding visually-rich documents (e.g., receipts and forms).
Ranked #2 on Reading Order Detection on ReadingBank
2 code implementations • ACL 2021 • Jiacheng Li, Haibo Ding, Jingbo Shang, Julian McAuley, Zhe Feng
We study the problem of building entity tagging systems by using a few rules as weak supervision.
no code implementations • NAACL 2021 • Jiaming Shen, Wenda Qiu, Yu Meng, Jingbo Shang, Xiang Ren, Jiawei Han
Hierarchical multi-label text classification (HMTC) aims to tag each document with a set of classes from a taxonomic class hierarchy.
2 code implementations • 28 May 2021 • Xiaotao Gu, Zihan Wang, Zhenyu Bi, Yu Meng, Liyuan Liu, Jiawei Han, Jingbo Shang
Training a conventional neural tagger based on silver labels usually faces the risk of overfitting phrase surface names.
Ranked #1 on Phrase Tagging on KPTimes
1 code implementation • 18 Apr 2021 • Xiuwen Zheng, Dheeraj Mekala, Amarnath Gupta, Jingbo Shang
Hashtag annotation for microblog posts has been recently formulated as a sequence generation problem to handle emerging hashtags that are unseen in the training set.
1 code implementation • 18 Apr 2021 • Zihan Wang, chengyu dong, Jingbo Shang
In this paper, we present an empirical property of these representations -- "average" approximates "first principal component".
1 code implementation • 18 Apr 2021 • Xianjie Shen, Yinghan Wang, Rui Meng, Jingbo Shang
Keyphrase generation aims to summarize long documents with a collection of salient phrases.
no code implementations • 23 Feb 2021 • Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han
Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.
1 code implementation • 15 Feb 2021 • chengyu dong, Liyuan Liu, Jingbo Shang
Specifically, we first propose a strategy to measure the data quality based on the learning behaviors of the data during adversarial training and find that low-quality data may not be useful and even detrimental to the adversarial robustness.
1 code implementation • Findings (ACL) 2021 • Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang
A sensor name, typically an alphanumeric string, encodes the key context (e.g., function and location) of a sensor needed for deploying smart building applications.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Yang Jiao, Jiacheng Li, Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang
Sensor metadata tagging, akin to the named entity recognition task, provides key contextual information (e.g., measurement type and location) about sensors for running smart building applications.
3 code implementations • NAACL 2021 • Zihan Wang, Dheeraj Mekala, Jingbo Shang
Finally, we pick the most confident documents from each cluster to train a text classifier.
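This final selection step can be sketched as follows (documents, cluster ids, and confidence scores are toy placeholders):

```python
from collections import defaultdict

# Sketch: keep the most confident documents in each cluster as pseudo-labeled
# training data for the text classifier.
def pick_confident(docs, clusters, confidences, k=1):
    by_cluster = defaultdict(list)
    for doc, cid, conf in zip(docs, clusters, confidences):
        by_cluster[cid].append((conf, doc))
    return {
        cid: [doc for _, doc in sorted(items, reverse=True)[:k]]
        for cid, items in by_cluster.items()
    }

selected = pick_confident(
    docs=["d1", "d2", "d3", "d4"],
    clusters=[0, 0, 1, 1],
    confidences=[0.9, 0.4, 0.2, 0.8],
)  # the classifier is then trained on these per-cluster picks
```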
2 code implementations • 15 Oct 2020 • Zichao Li, Liyuan Liu, chengyu dong, Jingbo Shang
Our goal is to understand why the robustness drops after conducting adversarial training for too long.
no code implementations • EMNLP 2020 • Jiaming Shen, Wenda Qiu, Jingbo Shang, Michelle Vanni, Xiang Ren, Jiawei Han
To facilitate the research on studying the interplays of these two tasks, we create the first large-scale Synonym-Enhanced Set Expansion (SE2) dataset via crowdsourcing.
1 code implementation • ACL 2020 • Dheeraj Mekala, Jingbo Shang
Weakly supervised text classification based on a few user-provided seed words has recently attracted much attention from researchers.
1 code implementation • 30 Apr 2020 • Peiran Li, Fang Guo, Jingbo Shang
Aspect classification, identifying aspects of text segments, facilitates numerous applications, such as sentiment analysis and review summarization.
1 code implementation • ACL 2020 • Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han
Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities.
1 code implementation • 17 Oct 2019 • Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, Jiawei Han
In this study, we propose a novel framework, SetExpan, which tackles this problem, with two techniques: (1) a context feature selection method that selects clean context features for calculating entity-entity distributional similarity, and (2) a ranking-based unsupervised ensemble method for expanding entity set based on denoised context features.
1 code implementation • 10 Oct 2019 • Wanzheng Zhu, Hongyu Gong, Jiaming Shen, Chao Zhang, Jingbo Shang, Suma Bhat, Jiawei Han
In this paper, we study the task of multi-faceted set expansion, which aims to capture all semantic facets in the seed set and return multiple sets of entities, one for each semantic facet.
1 code implementation • IJCNLP 2019 • Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao Lu, Jiacheng Liu, Jiawei Han
Therefore, we manually correct these label mistakes and form a cleaner test set.
Ranked #3 on Named Entity Recognition (NER) on CoNLL++ (using extra training data)
1 code implementation • 14 Aug 2019 • Liyuan Liu, Zihan Wang, Jingbo Shang, Dandong Yin, Heng Ji, Xiang Ren, Shaowen Wang, Jiawei Han
Our model neither requires the conversion from character sequences to word sequences, nor assumes a tokenizer can correctly detect all word boundaries.
no code implementations • WS 2019 • Liyuan Liu, Jingbo Shang, Jiawei Han
This paper presents the winning solution to the Arabic Named Entity Recognition challenge run by Topcoder.com.
1 code implementation • EMNLP 2018 • Jingbo Shang, Liyuan Liu, Xiang Ren, Xiaotao Gu, Teng Ren, Jiawei Han
Recent advances in deep neural models allow us to build reliable named entity recognition (NER) systems without handcrafting features.
1 code implementation • 29 Apr 2018 • Jiaming Shen, Jinfeng Xiao, Xinwei He, Jingbo Shang, Saurabh Sinha, Jiawei Han
Different from Web or general domain search, a large portion of queries in scientific literature search are entity-set queries, that is, multiple entities of possibly different types.
1 code implementation • 26 Apr 2018 • Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Ahmed El-Kishky, Jiawei Han
However, current Open IE systems focus on modeling local context information in a sentence to extract relation tuples, while ignoring the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions.
1 code implementation • EMNLP 2018 • Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
Many efforts have been made to facilitate natural language processing tasks with pre-trained language models (LMs), and brought significant improvements to various applications.
Ranked #47 on Named Entity Recognition (NER) on CoNLL 2003 (English)
1 code implementation • 21 Feb 2018 • Jingbo Shang, Tianhang Sun, Jiaming Shen, Xingbang Liu, Anja Gruenheid, Flip Korn, Adam Lelkes, Cong Yu, Jiawei Han
We build Maester based on the following two key observations: (1) relatedness can commonly be determined by keywords and entities occurring in both questions and articles, and (2) the level of agreement between the investigative question and the related news article can often be decided by a few key sentences.
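Observation (1) above can be sketched as a keyword-overlap (Jaccard) score between a question and an article; the tokenizer and stopword list below are toy stand-ins, not Maester's actual keyword/entity extraction:

```python
# Sketch: score question-article relatedness by the overlap of their keywords.
STOP = {"the", "a", "is", "of", "in", "did", "to"}

def keywords(text: str) -> set:
    return {w.lower().strip("?.,") for w in text.split()} - STOP

def overlap_score(question: str, article: str) -> float:
    q, a = keywords(question), keywords(article)
    return len(q & a) / len(q | a) if q | a else 0.0

score = overlap_score(
    "Did the mayor cut the school budget?",
    "The mayor announced the school budget cut",
)
```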
2 code implementations • 30 Jan 2018 • Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han
Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases.
1 code implementation • 19 Sep 2017 • Meng Qu, Jian Tang, Jingbo Shang, Xiang Ren, Ming Zhang, Jiawei Han
Existing approaches usually study networks with a single type of proximity between nodes, which defines a single view of a network.
3 code implementations • 13 Sep 2017 • Liyuan Liu, Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han
In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task.
Ranked #13 on Part-Of-Speech Tagging on Penn Treebank
no code implementations • 13 Mar 2017 • Meng Jiang, Jingbo Shang, Taylor Cassidy, Xiang Ren, Lance M. Kaplan, Timothy P. Hanratty, Jiawei Han
We propose an efficient framework, called MetaPAD, which discovers meta patterns from massive corpora with three techniques: (1) it develops a context-aware segmentation method to carefully determine the boundaries of patterns with a learnt pattern quality assessment function, which avoids costly dependency parsing and generates high-quality patterns; (2) it identifies and groups synonymous meta patterns from multiple facets -- their types, contexts, and extractions; and (3) it examines type distributions of entities in the instances extracted by each group of patterns, and looks for appropriate type levels to make discovered patterns precise.
4 code implementations • 15 Feb 2017 • Jingbo Shang, Jialu Liu, Meng Jiang, Xiang Ren, Clare R. Voss, Jiawei Han
As one of the fundamental tasks in text analysis, phrase mining aims at extracting quality phrases from a text corpus.
no code implementations • 31 Oct 2016 • Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng, Jiawei Han
In the literature, two series of models have been proposed to address prediction problems including classification and regression.
1 code implementation • 31 Oct 2016 • Jingbo Shang, Meng Qu, Jialu Liu, Lance M. Kaplan, Jiawei Han, Jian Peng
It models vertices as low-dimensional vectors to explore network structure-embedded similarity.
no code implementations • 22 Oct 2014 • Jingbo Shang, Tianqi Chen, Hang Li, Zhengdong Lu, Yong Yu
In this paper, we tackle this challenge with a novel parallel and efficient algorithm for feature-based matrix factorization.