Ke Lin

About Me

I am a third-year master's student in Software Engineering at Tsinghua University, supervised by Prof. Ping Luo. I received my bachelor's degree in Software Engineering from Tsinghua University, where I was advised by Prof. Lijie Wen.

Education

Tsinghua University

2022.09 - 2025.06 (expected)

Master of Engineering in Software Engineering

Tsinghua University

2018.09 - 2022.06

Bachelor in Software Engineering

Research Experience

Momenta

2021.01 - 2021.04

Research & Development Intern

Selected Publications

* denotes equal contribution.

Lost in Overlap: Exploring Logit-based Watermark Collision in LLMs

Yiyang Luo*, Ke Lin*, Chao Gu*, Jiahui Hou, Lijie Wen, Ping Luo

NAACL 2025 Findings

CCF B CORE A Watermark

Code | Abstract | Project Page

The proliferation of large language models (LLMs) in generating content raises concerns about text copyright. Watermarking methods, particularly logit-based approaches, embed imperceptible identifiers into text to address these challenges. However, the widespread use of watermarking across diverse LLMs has led to an inevitable issue known as watermark collision during common tasks like question answering and paraphrasing. This study focuses on dual watermark collisions, where two watermarks are present simultaneously in the same text. The research demonstrates that watermark collision poses a threat to detection performance for detectors of both upstream and downstream watermark algorithms.

Context-Aware Indoor Point Cloud Object Generation through User Instructions

Yiyang Luo*, Ke Lin*, Chao Gu

ACM MM 2024

CCF A CORE A* 3D Vision

Code | Abstract | Project Page

Indoor scene modification has emerged as a prominent area within computer vision, particularly for its applications in Augmented Reality (AR) and Virtual Reality (VR). Traditional methods often rely on pre-existing object databases and predetermined object positions, limiting their flexibility and adaptability to new scenarios. In response to this challenge, we present a novel end-to-end multi-modal deep neural network capable of generating point cloud objects seamlessly integrated with their surroundings, driven by textual instructions. Our work proposes a novel approach in scene modification by enabling the creation of new environments with previously unseen object layouts, eliminating the need for pre-stored CAD models. Leveraging Point-E as our generative model, we introduce innovative techniques such as quantized position prediction and Top-K estimation to address the issue of false negatives resulting from ambiguous language descriptions. Furthermore, we conduct comprehensive evaluations to showcase the diversity of generated objects, the efficacy of textual instructions, and the quantitative metrics, affirming the realism and versatility of our model in generating indoor objects. To provide a holistic assessment, we incorporate visual grounding as an additional metric, ensuring the quality and coherence of the scenes produced by our model. Through these advancements, our approach not only advances the state-of-the-art in indoor scene modification but also lays the foundation for future innovations in immersive computing and digital environment creation.

Zero-shot Generative Linguistic Steganography

Ke Lin, Yiyang Luo, Zijian Zhang, Ping Luo

NAACL 2024

CCF B CORE A Watermark

Code | Abstract | Project Page

Generative linguistic steganography attempts to hide secret messages in covertext. Previous studies have generally focused on the statistical differences between the covertext and stegotext; however, ill-formed stegotext can readily be identified by humans. In this paper, we propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility. We also design several new metrics and reproducible language evaluations to measure the imperceptibility of the stegotext. Our experimental results indicate that our method produces \(1.926\times\) more innocent and intelligible stegotext than any other method.

Full Publications

Low-Latency Privacy-Preserving Deep Learning Design via Secure MPC

Ke Lin, Yasir Glani, Ping Luo

IJCAI-2024 AISafety Workshop

CCF A CORE A* Workshop Security

Code | Abstract

Secure multi-party computation (MPC) facilitates privacy-preserving computation between multiple parties without leaking private information. While most secure deep learning techniques utilize MPC operations to achieve feasible privacy-preserving machine learning on downstream tasks, the overhead of the computation and communication still hampers their practical application. This work proposes a low-latency secret-sharing-based MPC design that reduces unnecessary communication rounds during the execution of MPC protocols. We also present a method for improving the computation of commonly used nonlinear functions in deep learning by integrating multivariate multiplication and coalescing different packets into one to maximize network utilization. Our experimental results indicate that our method is effective in a variety of settings, with a speedup in communication latency of \(10\sim20\%\).

ViRED: Prediction of Visual Relations in Engineering Drawings

Chao Gu, Ke Lin, Yiyang Luo, Jiahui Hou, Xiang-Yang Li

IEEE MSN 2024

CCF C 2D Vision

Code | Abstract | Project Page

To accurately understand engineering drawings, it is essential to establish the correspondence between images and their description tables within the drawings. Existing document understanding methods predominantly focus on text as the main modality, which is not suitable for documents containing substantial image information. In the field of visual relation detection, the structure of the task inherently limits its capacity to assess relationships among all entity pairs in the drawings. To address this issue, we propose a vision-based relation detection model, named ViRED, to identify the associations between tables and circuits in electrical engineering drawings. Our model mainly consists of three parts: a vision encoder, an object encoder, and a relation decoder. We implement ViRED using PyTorch and conduct a series of experiments to validate its efficacy. The experimental results indicate that, on the engineering drawing dataset, our approach attains an accuracy of 96% in the task of relation prediction, marking a substantial improvement over existing methodologies. The results also show that ViRED performs inference quickly even when there are numerous objects in a single engineering drawing.

MDASC: Advanced Dual-Layered Code Cloning Technique for Identifying Reused Malicious Code

Yasir Glani, Ping Luo, Ke Lin

SEKE 2024

CCF C CORE C Security

Abstract

Code cloning is a prevalent practice in software development, frequently exploited by adversaries to propagate malicious code, compromising user security and privacy. Recently proposed detection techniques often fail to identify complex clones, posing significant software integrity and security risks. In response, we introduce MDASC, an advanced dual-layered approach to detecting known Android malware through code cloning. MDASC employs method-based signature detection followed by sliding-window-based detection, effectively identifying modified and unmodified states of reused malware code, encompassing Type-I, Type-II, and Type-III (VST, ST, and MT) clones. Our comprehensive comparative study evaluates MDASC against recently proposed techniques across various code cloning detection methods, demonstrating its superiority in precision, recall, F1-score, accuracy, and efficiency. Leveraging intelligent retrieval and processing algorithms, MDASC maintains high efficiency, achieving an average detection time of 0.33 seconds for a database of 132,414 files and 13.5 million lines of code. This efficiency and effectiveness emphasize MDASC’s potential for real-time security applications. MDASC represents a significant advancement in malicious code reuse detection, offering unparalleled accuracy and efficiency, thereby enhancing software security practices within the rapidly evolving Android ecosystem.

AyatDroid: A Lightweight Code Cloning Technique Using Different Static Features

Yasir Glani, Ping Luo, Ke Lin, Syed Asad Shah

IEEE SEAI 2023

Security

Abstract

In recent decades, malicious code reuse has surged in number and sophistication. It is a common practice among adversaries to reuse malicious code, which significantly threatens user privacy and security. Several signature-based code clone detection techniques have been proposed to detect malicious clones in Android applications; these use the MD5 hash function to generate signatures. However, these techniques only retrieve signatures from Java files. Due to the 128-bit signature size of the MD5 hash function, these techniques take longer to generate signatures. In this article, we propose the AyatDroid technique, which efficiently identifies malicious chunks by retrieving signatures from Java and manifest files. The AyatDroid technique is tested on the reliable CiCMalDroid 2020 dataset. We have evaluated the AyatDroid technique against other cutting-edge code clone detection techniques. Our experimental results demonstrate that AyatDroid outperforms them in detection time and accuracy. AyatDroid is not only lightweight but also efficient, allowing it to be deployed at a large scale.

Services

Reviewer: ACL, EMNLP, NAACL, MM, IJCAI
Teaching Assistant:
- 2021 Web Front-end Technology. Worked with Prof. Lijie Wen.

Awards and Scholarships

2024 Tsinghua-Ubiquant Scholarship
2019 The Second Prize Scholarship

Page designed by Ankit Sultana and modified by K. Lin