Nguyễn Kiêm Hiếu (aka Kiem-Hieu NGUYEN)

Office: P.706 B1
Email:

For class information, see the Teaching tab. Remember to complete assignments and read the slides before class.

My research is presented in the Research, Publications, and R&D Projects tabs. Among others, BKTreebank is a project that I really like.

Drop me an email if you want to discuss supervision or collaboration.

Courses

[IT4772] Natural Language Processing

[IT4868] Web Mining L1 L2-1 L2-2 L2-3 L3 L4 L5-1 L5-2 L6-1 L6-2 L6-3 L7 L8 L9

[IT4853(Q)] Information Retrieval Intro L1 L2 L3 L4 L5 L6 L7 L8 L9 L10 L11 L12 L13 L14 L15 L16 L17 L18 L19 L20

[IT3210] C programming language Intro L1 L2 L3 L4 L5 L6 L7 L8 L9 L10 L11 L12 L13 L14

[IT3220] C programming intro L1 L2 L3 L4 L5 L6 L7 L8 L9 L10 L11 L12 L13 L14

[IT3230(E)] Basic C Programming Intro L1(en,vi) L2(en,vi) L3(en,vi) L4(en,vi) L5(en,vi) L6(en,vi) L7(en,vi) L8(en,vi) L9(en,vi) L10(en,vi) L11(en,vi) L12(en,vi) L13(en,vi) L14(en,vi) Data structure & algorithm basic lab L1 L2 L3 L4 L5 L6 L7 gdb L8 L9 L10 L11 L12 L13 L14

[IT3240(E)] Advanced C Programming, Data structure & algorithm advanced lab

Selected graduation projects

Research area

Natural Language Processing

Information Extraction

Research projects

Synonym discovery (Naver)

Graduate students

Alumni

VNLP NLP in specific domains
  • Spelling correction
  • Co-reference resolution
  • Relation extraction
  • Text summarization
  • Dependency parsing
  • News tag generation
  • Social part-of-speech tagging
  • Named entity recognition
  • (Samsung) SMS prediction
  • Information extraction: (Viettel) Cyber security, healthcare
  • Chatbots: E-commerce, telco, banking, healthcare
Google scholar

Conferences

Thi-Nhung Nguyen, Kiem-Hieu Nguyen, Tuan-Dung Cao and Young-In Song. An Uncertainty-Aware Encoder for Aspect Detection. Findings of ACL: EMNLP 2021

Thi-Thanh Ha, Van-Nha Nguyen, Kiem-Hieu Nguyen, Tien-Thanh Nguyen and Kim-Anh Nguyen. Utilizing Bert for Question Retrieval on Vietnamese E-commerce Sites. PACLIC 2020

Thi-Trang Nguyen, Huu-Hoang Nguyen and Kiem-Hieu Nguyen. A Study on Seq2seq for Sentence Compression in Vietnamese. PACLIC 2020

Anh-Duong Nguyen, Kiem-Hieu Nguyen and Van-Vi Ngo. Neural Sequence Labeling for Vietnamese POS Tagging and NER. 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF)

Ba-Long Bui, Thi-Trang Nguyen, Huu-Hoang Nguyen, Kiem-Hieu Nguyen. HMMs for Unsupervised Vietnamese Word Segmentation. 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF)

Kiem-Hieu Nguyen. BKTreebank: Building a Vietnamese Dependency Treebank. In Proceedings of 11th Language Resources and Evaluation Conference, LREC 2018 paper poster data demo

Viet-Trung Tran, Kiem-Hieu Nguyen and Duc-Hanh Bui. A Vietnamese Language Model based on Recurrent Neural Network. In Proceedings of 8th International Conference on Knowledge and System Engineering, KSE 2016 pdf slides dataset (email me)

Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret and Romaric Besancon. A Dataset for Open Event Extraction in English. In Proceedings of 10th Language Resources and Evaluation Conference, LREC 2016 pdf dataset (email me)

Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret and Romaric Besancon. Generative Event Schema Induction with Entity Disambiguation. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) pdf sourcecode (email me)

Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret and Romaric Besançon. Désambiguïsation d’entités pour l’induction non supervisée de schémas événementiels. 22ème Traitement Automatique des Langues Naturelles

Kiem-Hieu Nguyen, Xavier Tannier and Veronique Moriceau. Ranking Multidocument Event Descriptions for Building Thematic Timelines. Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers pdf poster

Kiem-Hieu Nguyen and Cheol-Young Ock. Semantic Relatedness for Biomedical Word Sense Disambiguation. Workshop Proceedings of TextGraphs-7: Graph-based Methods for Natural Language Processing

Kiem-Hieu Nguyen and Cheol-Young Ock. Margin perceptron for word sense disambiguation. Proceedings of the 2010 Symposium on Information and Communication Technology

Journals

Van-Hai Vu, Quang-Phuoc Nguyen, Kiem-Hieu Nguyen, Joon-Choul Shin, Cheol-Young Ock. Korean-Vietnamese Neural Machine Translation with Named Entity Recognition and Part-of-Speech Tags. IEICE Transactions on Information and Systems. 2020.

Thanh Thi Ha, Atsuhiro Takasu, Thanh Chinh Nguyen, Kiem Hieu Nguyen, Van Nha Nguyen, Kim Anh Nguyen, Son Giang Tran. Supervised attention for answer selection in community question answering. IAES International Journal of Artificial Intelligence. 2020

Thi-Thanh Ha, Thanh-Chinh Nguyen, Kiem-Hieu Nguyen, Van-Chung Vu and Kim-Anh Nguyen. Unsupervised Sentence Embeddings for Answer Summarization in Non-factoid CQA. Computación y Sistemas. 2018

Thi-Thanh Ha, Van-Chung Vu and Kiem-Hieu Nguyen. Towards Event Timeline Generation from Vietnamese News. CICLING 2018, Lecture Notes in Computer Science

Kiem-Hieu Nguyen and Cheol-Young Ock. Word Sense Disambiguation as a Traveling Salesman Problem. ARTIF. INTELL. REV. 2013 pdf res

Kiem-Hieu Nguyen and Cheol-Young Ock. Using Wiktionary to Improve Lexical Disambiguation in Multiple Languages. Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science

Kiem-Hieu Nguyen and Cheol-Young Ock. Diacritics restoration in Vietnamese: letter based vs. syllable based model. Zhang BT., Orgun M.A. (eds) PRICAI 2010: Trends in Artificial Intelligence. PRICAI 2010. Lecture Notes in Computer Science

PhD thesis

Kiem-Hieu Nguyen. Knowledge-based Word Sense Disambiguation using Information Retrieval and Topic Model. PhD Thesis. University of Ulsan pdf slide

08.2021: We have one paper accepted for EMNLP Findings.

05.2021: Congratulations to Tri for his outstanding GPA at SoICT-HUST!

04.2021: Congratulations to Trang for becoming a lecturer at the Computer Science Department, Faculty of Information Technology, Posts and Telecommunications Institute of Technology!

03.2021: I am on the PC of EMNLP 2021

02.2021: We start a short-term project with Viettel on cyber security information extraction

01.2021: We start a 1-year project with Naver on NLP for information retrieval

12.2020: Chinh (an alumnus) and Nha will present their system at VLSP 2020 ReINTEL challenge

12.2020: Congratulations to Nha for his 1st prize at the Zalo AI Voice verification challenge!

11.2020: Congratulations to Trang and Nha for successfully defending their Master's theses!

11.2020: I am on PC of Hackashop 2021 (a satellite event of EACL 2021).

10.2020: I am on PC of NAACL 2021.

09.2020: We have two papers accepted at PACLIC 2020.

15.12.2018: Our papers, "Neural sequence labeling for Vietnamese NER and POS tagging" and "HMM for Unsupervised Vietnamese Word Segmentation", were accepted for RIVF 2019.

02.02.2018: Our papers, "Unsupervised Sentence Embeddings for Answer Summarization in Non-factoid CQA" and "Towards Event Timeline Generation from Vietnamese News", were accepted for CICLING 2018.

03.01.2018: Our paper, "BKTreebank: Building a Vietnamese Dependency Treebank", was accepted for LREC 2018.

22.09.2017: I serve on the Scientific Committee for LREC 2018

22.09.2017: I serve as a PC member for SoICT 2017

26.03.2017: I serve as a PC member for KSE 2017

20.02.2017: Prof Antoine Doucet will give a talk on multilingual event detection at DS Lab.

07.12.2016: Please join us in contributing to the Vietnamese Language and Speech Processing (VLSP) Wiki!

28.11.2016: I review for PAKDD 2016

16.11.2016: I introduce NLP research at our lab in the meeting with VNPT slide

29.08.2016: I review for ACML 2016.

07.08.2016: Our paper "A Vietnamese Language Model based on Recurrent Neural Network" was accepted for KSE 2016.

25.05.2016: I serve on the committee of the SoICT Student Research Campaign 2016.

14.04.2016: I talk about data construction for open event at KDE lab slide

14.04.2016: I serve as a PC member of KSE 2016

05.04.2016: We have two positions for 6-month internships working on NLP/ML/deep learning

29.03.2016: yo!!! Our co-project with Samsung on NLP for Vietnamese has been accepted

22.03.2016: Some talks and photos from KDE open day

01.03.2016: I'm a subreviewer for IJCAI 2016

27.02.2016: I will talk at KDE Lab Open Day on 19 March

16.02.2016: On 23/2, Prof Ock Cheol Young will give a talk on semantic processing for Korean at SoICT, HUST

01.02.2016: I have a paper accepted at LREC 2016

03.12.2015: Research topics/projects for 2015-2016 are here. Feel free to contact me if you (undergraduate/graduate students or others from academia and/or industry) are interested in these topics. NOTE: All topic descriptions are in Vietnamese, so contact me directly or use Google Translate if you want an English version.

26.11.2015: I talk at Hanoi Univ. of Industry on thematic timeline generation (slides)

03.11.2015: I serve as an external reviewer for PAKDD 2016

28.10.2015: I serve as a member of the Scientific Committee for LREC 2016

22.09.2015: For those interested in doing a PhD in temporal information extraction in France (Paris region), the link is here

10.09.2015: The list of student projects for the 1st semester of 2015 is here

04.09.2015: I'm a reviewer for SoICT 2015

01.09.2015: I join the Department of Information Systems and the Knowledge & Data Engineering Lab, SoICT, HUST

16.08.2015: I talk at SoICT on a generative approach to information extraction (link, slide)

BKTreebank

INTRODUCTION

BKTreebank 1.0 contains 6,900 Vietnamese sentences annotated with POS tags and dependency structures. For more details on this version of the treebank, please refer to the paper:

Kiem-Hieu Nguyen. "BKTreebank: Building a Vietnamese Dependency Treebank". LREC 2018


BKPARSER LIBRARY

We also provide a Java library (requires JRE 8 or higher) with the vanilla POS tagger and dependency parser described in our paper. The download link is here.

ACCESS TO TREEBANK

The treebank is released for research purposes. To access the data, please fill in and submit the Google form below using an email address from an academic institution (undergraduate and graduate students, please ask your supervisors).