Zero-shot Generative Linguistic Steganography

1Tsinghua University, 2Nanyang Technological University
NAACL '24

Abstract

Generative linguistic steganography attempts to hide secret messages in covertext. Previous studies have generally focused on the statistical differences between the covertext and stegotext; however, ill-formed stegotext can readily be identified by humans. In this paper, we propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility. We also design several new metrics and reproducible language evaluations to measure the imperceptibility of the stegotext. Our experimental results indicate that our method produces \(1.926\times\) more innocent and intelligible stegotext than any other method.

Summary

ZGLS Overview

In summary, the contributions of our work are as follows:

  • We present a zero-shot approach for linguistic steganography based on in-context learning using samples of the covertext.
  • We improve both the binary coding process and the embedding process by introducing several novel techniques.
  • We design several metrics and language evaluations to assess both perceptual and statistical imperceptibility, and show that our method produces more innocent and intelligible stegotext than all previous methods.

Methodology

Overview of EF coding
Edge Flipping Coding

An efficient and effective coding method that increases the embedding rate of the hidden message.


Overview of stega-text generation process
In-context Stegotext Generation

In-context stegotext generation is guided by a Huffman tree constructed using the covert bitstream.
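
To make the idea of Huffman-guided embedding concrete, the following is a minimal sketch of generic Huffman-coding steganography, not the exact procedure used in this work. It assumes a hypothetical `toy_model` that returns the same small candidate distribution at every step; in practice the distribution would come from a language model conditioned on the in-context covertext samples. All function names here are illustrative.

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a Huffman code {token: bitstring} from {token: probability}."""
    assert len(probs) >= 2, "need at least two candidates to carry any bits"
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(tie), {tok: ""}) for tok, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {t: "0" + c for t, c in c0.items()}
        merged.update({t: "1" + c for t, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

# Hypothetical candidate distribution (assumption, for illustration only).
TOY_PROBS = {"the": 0.5, "a": 0.25, "an": 0.125, "one": 0.125}

def toy_model(prefix_tokens):
    """Stand-in for a language model: ignores the prefix."""
    return TOY_PROBS

def embed(bits, model):
    """Choose, at each step, the token whose Huffman code matches the next bits."""
    tokens, pos = [], 0
    while pos < len(bits):
        codes = huffman_codes(model(tokens))
        longest = max(len(c) for c in codes.values())
        # Pad with zeros so the final, possibly partial, code still matches.
        window = bits[pos:] + "0" * longest
        token = next(t for t, c in codes.items() if window.startswith(c))
        tokens.append(token)
        pos += len(codes[token])
    return tokens

def extract(tokens, model):
    """Rebuild the bitstream by re-deriving each step's Huffman code."""
    return "".join(
        huffman_codes(model(tokens[:i]))[tok] for i, tok in enumerate(tokens)
    )

secret = "1100111"
stego_tokens = embed(secret, toy_model)
recovered = extract(stego_tokens, toy_model)[: len(secret)]
```

Because a Huffman code is prefix-free, exactly one candidate matches the upcoming bits at each step, and more probable tokens receive shorter codes. The receiver must share the same model and know the message length, since the last code may include padding bits.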

Experiments

Experimental results under different BPWs and datasets

Our proposed method shows a reduced Perceptual-Statistical Imperceptibility Conflict Effect (Psic Effect) in comparison to prior methods as our JSDs rise with increasing BPW, whereas other methods see a decrease in JSDs.
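
As a rough illustration of the two quantities mentioned above, the sketch below assumes that JSD denotes the Jensen-Shannon divergence between two discrete distributions and that BPW denotes the number of embedded bits per word of stegotext. The exact distributions compared in the experiments are not specified here, so the inputs are hypothetical and the helper names are illustrative.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def bits_per_word(num_hidden_bits, stegotext):
    """Embedding rate: hidden bits divided by whitespace-separated words."""
    return num_hidden_bits / len(stegotext.split())

# Hypothetical word-frequency distributions over a shared 3-word vocabulary.
cover_dist = [0.5, 0.3, 0.2]
stego_dist = [0.4, 0.4, 0.2]
divergence = jsd(cover_dist, stego_dist)
rate = bits_per_word(12, "a short example sentence")
```

With base-2 logarithms the divergence is bounded between 0 and 1, which makes values comparable across datasets of different vocabulary sizes.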


Steganalysis

Each value is the classification accuracy of a steganalyzer. Lower accuracy \(\downarrow\) indicates better statistical imperceptibility.


Language Evaluation

Each value represents the ratio by which our stegotext is judged better than that of the given method.


Stegotext Examples

Compared with the baseline methods, our method generates more reasonable sentences.

Poster

BibTeX


@inproceedings{lin2024zgls,
  title = "Zero-shot Generative Linguistic Steganography",
  author = "Lin, Ke  and
    Luo, Yiyang  and
    Zhang, Zijian  and
    Ping, Luo",
  editor = "Duh, Kevin  and
    Gomez, Helena  and
    Bethard, Steven",
  booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
  month = jun,
  year = "2024",
  address = "Mexico City, Mexico",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2024.naacl-long.289",
  pages = "5168--5182",
}