We designed a topic-aware model for QA systems.

In the second phase of the research program, my teammate and I, after discussions with Dr. Xiaodong GU, designed a simple topic-aware seq2seq QA model. We first generate a ground-truth topic class for each sentence by clustering sentence embeddings with k-means++, assuming the embedding vectors are well distributed. We then train an MLP as a topic classifier and use a VAE to encode the topic and the RNN-generated sentence vector into a latent variable, which is passed to a first-word generator. The generated first word, rather than ‘<SOS>’, is fed to the RNN decoder as its initial input, together with the sentence vector and the topic class. The project rests on the assumption that the first word of a sentence plays an important role in steering its meaning. It inspired Dr. GU’s subsequent work.
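The topic-labelling step can be illustrated with a minimal sketch. This is not our project code: it is a plain-Python k-means with k-means++ seeding, run on made-up 2-D vectors that stand in for real sentence embeddings. The names `kmeans_pp_init`, `kmeans`, and `toy_embeddings`, as well as the choice of two topics, are assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: draw each new centre with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [list(rng.choice(points))]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(d2) == 0:
            # Every point coincides with an existing centre; pick uniformly.
            centres.append(list(rng.choice(points)))
        else:
            centres.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centres


def kmeans(points, k, iters=20, seed=0):
    """Lloyd iterations started from k-means++ seeds.
    Returns (labels, centres); labels[i] is the pseudo topic class of points[i]."""
    rng = random.Random(seed)
    centres = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                  for p in points]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Hypothetical stand-ins for sentence embeddings: two well-separated groups.
toy_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [10.0, 10.1], [10.1, 10.0], [9.9, 10.0],
]
topic_ids, topic_centres = kmeans(toy_embeddings, k=2)
print(topic_ids)
```

In a pipeline like the one described, the resulting `topic_ids` would serve as training targets for the MLP topic classifier. The cluster indices themselves are arbitrary, so only the grouping carries meaning.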

You can view our report here.