DMI Discourse Mutual Information

Introducing DMI

[Figure: DMI model]

Abstract

Although many pretrained models exist for text or images, there have been relatively few attempts to train representations specifically for dialog understanding. Prior works usually relied on finetuned representations based on generic text representation models like BERT or GPT-2. But such language modeling pretraining objectives do not take the structural information of conversational text into consideration. Although generative dialog models can learn structural features too, we argue that structure-unaware word-by-word generation is not suitable for effective conversation modeling. We empirically demonstrate that such representations do not perform consistently across various dialog understanding tasks. Hence, we propose DMI (Discourse Mutual Information), a structure-aware, mutual-information-based loss function for training dialog-representation models, which additionally captures the inherent uncertainty in response prediction. Extensive evaluation on nine diverse dialog modeling tasks shows that our proposed DMI-based models outperform strong baselines by significant margins.
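To make the idea of a mutual-information-based objective over context-response pairs more concrete, here is a minimal sketch of an InfoNCE-style contrastive lower bound, one common way to estimate mutual information from paired samples. It is not the released DMI implementation: the function name `info_nce_bound`, the dot-product scoring, and the toy embeddings are all assumptions made for illustration, and real encoders would produce the vectors.

```python
import math


def info_nce_bound(context_vecs, response_vecs):
    """InfoNCE-style lower bound on the mutual information between paired
    context and response embeddings (illustrative sketch, not the DMI code).

    context_vecs[i] and response_vecs[i] are assumed to come from the same
    dialog; every other response in the batch serves as a negative sample.
    """
    n = len(context_vecs)
    if n == 0 or n != len(response_vecs):
        raise ValueError("need the same non-zero number of contexts and responses")

    total_log_prob = 0.0
    for i, context in enumerate(context_vecs):
        # Dot-product score of this context against every response in the batch
        # (the scoring function is an assumption).
        scores = [sum(a * b for a, b in zip(context, response))
                  for response in response_vecs]
        # Numerically stable log-sum-exp over all candidate responses.
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        # Log-probability of picking the true response among the candidates.
        total_log_prob += scores[i] - log_z

    # The bound is log(n) plus the mean log-probability; it cannot exceed log(n).
    return math.log(n) + total_log_prob / n


# Toy embeddings: aligned pairs versus the same responses in the wrong order.
contexts = [[5.0, 0.0], [0.0, 5.0]]
aligned = info_nce_bound(contexts, [[5.0, 0.0], [0.0, 5.0]])
shuffled = info_nce_bound(contexts, [[0.0, 5.0], [5.0, 0.0]])
print(aligned > shuffled)
```

Training would minimize the negative of this bound, so that each context scores its own response above the other responses in the batch.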

Paper

Getting Access to the Source Code or Pretrained Models

To get access to the source code or pretrained model checkpoints, please send a request to AcadGrants@service.microsoft.com and cc pawang [_at_] iitkgp.ac.in and bsantraigi [_at_] gmail.com.

Note

The requesting third party:

  1. Can download and use these deliverables for research as well as commercial use,
  2. Can modify them as they like, but should cite our work and include this readme, and
  3. Strictly cannot redistribute them to any other organization.

Cite As

@inproceedings{santra2022representation,
  title={Representation Learning for Conversational Data using Discourse Mutual Information Maximization},
  author={Santra, Bishal and Roychowdhury, Sumegh and Mandal, Aishik and Gurram, Vasu and Naik, Atharva and Gupta, Manish and Goyal, Pawan},
  booktitle={Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  year={2022}
}

Intuition

[Figure: DMI_Cover]

Results

[Figures: results_combined, results_ablations]

Error Analysis

[Figure: eintent-error]

Authors

Acknowledgements

This work was partially supported by the Microsoft Academic Partnership Grant (MAPG) 2021. The first author was also supported by the Prime Minister’s Research Fellowship (PMRF), India.