Code Provenance Logger

A coding-process tracking and analysis tool for CS1 courses

Code Provenance Logger is a VS Code extension for CS1 (Introduction to Computer Science) education and research. It logs how students write, modify, execute, and debug code, while preserving privacy by design.

The extension focuses on coding behavior and process, not on storing or reconstructing source code content.

Features

  • Coding Behavior Logging

    • Edit events (insert / delete / replace)
    • Save events
    • Cursor movement and selection changes (debounced)
    • File open and close
    • Program execution
      • Task-based runs
      • Terminal command execution (when supported)
    • Diagnostics snapshots
      • Error, warning, info, and hint counts
      • Top diagnostics summarized by hashes
  • Privacy-Preserving by Default

    • No raw source code is stored
    • No raw error messages or command lines are stored
    • All text-based content is recorded as SHA-256 hashes only
    • Document identity is anonymized using hashed URIs
    • No personally identifiable information (PII) is collected
  • Efficient and Robust Logging

    • One JSONL file per coding session
    • Hybrid batching strategy
      • Time-based periodic flush
      • Event-count-based flush
      • Immediate flush on save, close, and run events
    • Local logging is always available
    • Server upload is strictly opt-in
  • Research-Ready Format

    • Structured JSONL records
    • Explicit session headers and batch boundaries
    • Batch-level integrity hashes
    • Designed for large-scale analysis and provenance tracking
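
The hybrid batching strategy listed above can be illustrated with a small sketch. This is not the extension's actual code: the class name `HybridBatcher`, the thresholds (50 events, 5 seconds), and the event kind strings are assumptions chosen for illustration. It checks the time trigger only when a new event arrives, whereas a real extension would more likely use a timer.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

# Event kinds that force an immediate flush (save, close, run per the feature list).
# The exact strings are hypothetical.
IMMEDIATE_KINDS = {"save", "close", "run"}


@dataclass
class HybridBatcher:
    max_events: int = 50                      # assumed count threshold
    max_age_s: float = 5.0                    # assumed time threshold in seconds
    clock: Callable[[], float] = time.monotonic
    _events: list = field(default_factory=list, init=False)
    _started: Optional[float] = field(default=None, init=False)

    def add(self, kind: str) -> Optional[dict]:
        """Buffer one event; return the flushed batch if a trigger fired, else None."""
        now = self.clock()
        if self._started is None:
            self._started = now
        self._events.append({"kind": kind, "ts": now})
        if kind in IMMEDIATE_KINDS:
            return self.flush(kind)
        if len(self._events) >= self.max_events:
            return self.flush("count")
        if now - self._started >= self.max_age_s:
            return self.flush("interval")
        return None

    def flush(self, reason: str) -> Optional[dict]:
        """Emit the buffered events as one batch, or None if the buffer is empty."""
        if not self._events:
            return None
        batch = {
            "flushReason": reason,
            "eventCount": len(self._events),
            "events": self._events,
        }
        self._events, self._started = [], None
        return batch
```

Checking the immediate trigger first means a save that also happens to hit the count threshold is still reported with the more informative reason.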

Usage

  1. Install the extension
     Install from a .vsix file: VS Code → Extensions → … → Install from VSIX…
     or from the command line: code --install-extension code-provenance-logger-0.0.6.vsix

  2. Restart VS Code
     The extension activates automatically on startup.

  3. Code as usual
     Write, save, and run programs normally. No additional interaction is required for local logging.

  4. Find local log files
     By default, logs are stored as JSONL files under the following directory (Windows):
     %APPDATA%\Code\User\globalStorage\HongwonJeong.code-provenance-logger\logs

Each file corresponds to a single VS Code session.
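
For offline analysis, the log directory can be read with a few lines of Python. The sketch below assumes only what is stated here, namely one JSON object per line and one file per session. The `.jsonl` file extension and the function names `read_session` and `summarize_logs` are assumptions.

```python
import json
from pathlib import Path


def read_session(path):
    """Yield one parsed JSON object per non-empty line of a JSONL log file."""
    with open(path, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # A session that ended abruptly may leave a truncated last line.
                print(f"{path}: skipping unparsable line {line_no}")


def summarize_logs(log_dir):
    """Return {file name: record count} for every .jsonl file in log_dir."""
    return {
        p.name: sum(1 for _ in read_session(p))
        for p in sorted(Path(log_dir).glob("*.jsonl"))
    }
```

On Windows, the default directory can be resolved with os.path.expandvars(r"%APPDATA%\Code\User\globalStorage\HongwonJeong.code-provenance-logger\logs") and passed to summarize_logs.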

Local Logging and Server Upload

  • Local Logging (Default, No Token Required)
    • Enabled by default (localLoggingEnabled = true)
    • Logs are saved only on the local machine
    • Suitable for personal use, offline analysis, and classroom deployment without server infrastructure

No configuration or token is required.

Server Upload (Optional, Token Required)

Server upload is disabled by default and requires explicit opt-in.

  • How to enable server upload

    1. Open the Command Palette
    2. Run: CPL: Enable Server Upload (Enter Token)
    3. Enter the upload token provided by the researcher
  • Once enabled:

    • The same batched logs are uploaded to the research server
    • Upload occurs automatically during flush events
  • Important guarantees

    • If no token is provided, no data is ever sent to the server
    • If server upload fails, logs are still preserved locally
    • Server upload can be disabled or cleared at any time using:
      • CPL: Disable Server Upload
      • CPL: Clear Server Token
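
The guarantees above amount to a simple ordering: write locally first, upload only when a token exists, and treat upload failure as non-fatal. A minimal sketch of that logic follows. The function `flush_batch` and the `upload` callable are hypothetical stand-ins, not the extension's API.

```python
import json


def flush_batch(batch, local_path, token=None, upload=None):
    """Append the batch to the local JSONL file, then upload only if a token is set.

    `upload` is a hypothetical callable (batch, token) -> None that may raise.
    Returns "local-only", "uploaded", or "upload-failed".
    """
    # Local write happens first, so a failed upload can never lose data.
    with open(local_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(batch) + "\n")

    # Without a token, nothing leaves the machine.
    if not token or upload is None:
        return "local-only"

    try:
        upload(batch, token)
    except Exception:
        # The batch is already on disk; report the failure and move on.
        return "upload-failed"
    return "uploaded"
```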

Data Schema Overview

This section provides a high-level overview of the data schema to clarify what is collected and what is not.

  • Session Header: recorded once per session, at the beginning of the log.

    • schemaVersion: schema version number
    • clientId: anonymized, persistent identifier for the local environment
    • sessionId: unique identifier for the current VS Code session
    • sessionStartTs: session start timestamp
    • workspaceName: workspace name (if available)
    • extensionVersion: extension version string
  • Batch Record: each batch groups multiple events flushed together.

    • batchId: unique batch identifier
    • batchStartTs, batchEndTs: batch time window
    • flushReason: trigger for the batch (interval, save, run, etc.)
    • eventCount: number of events in the batch
    • metrics:
      • Inter-event time intervals
      • Approximate typing speed
      • Edit burst count
    • batchHash: integrity hash for the batch
  • Event Types (Summary): the following event types may appear inside a batch.

    • Edit
      • Edit type (insert / delete / replace)
      • Text length, newline count
      • Character class distribution
      • Hashed text content
    • Save
      • Hashed full-document content
      • Line count
    • Selection
      • Cursor position
      • Selection ranges
    • Open / Close
      • Timestamped file access events
    • Run (Task / Terminal)
      • Run count and phase (start / end)
      • Exit code when available
      • Hashed command line (terminal only)
    • Diagnostics
      • Total diagnostic counts by severity
      • Top diagnostics summarized by hashes

Raw source code, raw messages, and raw commands are never stored.
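
The schema lists a batchHash field but does not say how it is computed. One plausible construction, shown here purely as an assumption, is SHA-256 over a canonical JSON serialization of the batch's events. The `events` key and both function names are hypothetical.

```python
import hashlib
import json


def batch_hash(events):
    """SHA-256 over a canonical (sorted-key, compact) JSON encoding of the events."""
    canonical = json.dumps(
        events, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_batch(batch):
    """Return True if the stored batchHash matches a hash recomputed from the events."""
    return batch.get("batchHash") == batch_hash(batch.get("events", []))
```

Sorting keys and fixing separators makes the hash independent of how a particular JSON writer orders or spaces the fields. A check like this only works if the analysis side uses the same serialization as the logger.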

Logged Data (Summary)

  • Edit statistics (length, newline count, character classes)
  • Save-time document hash and line count
  • Cursor position and selection ranges
  • File open and close timestamps
  • Program run attempts and exit codes (when available)
  • Diagnostic counts and summaries
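
To make concrete how edit and save records can carry statistics without carrying text, here is a minimal sketch. The field names (`editType`, `textLength`, and so on), the four character classes, and the line-counting rule are all assumptions. The source states only that SHA-256 hashes, lengths, newline counts, and character class distributions are recorded.

```python
import hashlib


def char_classes(text):
    """Count characters by coarse class (the class set here is an assumption)."""
    counts = {"letter": 0, "digit": 0, "space": 0, "other": 0}
    for ch in text:
        if ch.isalpha():
            counts["letter"] += 1
        elif ch.isdigit():
            counts["digit"] += 1
        elif ch.isspace():
            counts["space"] += 1
        else:
            counts["other"] += 1
    return counts


def summarize_edit(inserted, replaced_len):
    """Describe one edit using statistics and a hash, never the raw text."""
    if inserted and replaced_len:
        kind = "replace"
    elif inserted:
        kind = "insert"
    else:
        kind = "delete"
    return {
        "editType": kind,
        "textLength": len(inserted),
        "newlineCount": inserted.count("\n"),
        "charClasses": char_classes(inserted),
        "textHash": hashlib.sha256(inserted.encode("utf-8")).hexdigest(),
    }


def summarize_save(document):
    """Describe a saved document by hash and line count only.

    Assumes lines are counted as newline count plus one.
    """
    return {
        "docHash": hashlib.sha256(document.encode("utf-8")).hexdigest(),
        "lineCount": document.count("\n") + 1,
    }
```

Two saves with identical content produce the same docHash, so an analyst can detect that a file returned to an earlier state without ever seeing the content.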

Use Cases

  • Distinguishing typing-based coding from copy-and-paste behavior
  • Analyzing error–fix–run cycles in CS1 assignments
  • Studying students’ coding strategies and learning progress
  • Building datasets for educational data mining
  • Supporting research on programming process and code provenance
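
As an example of the first use case, a simple heuristic treats unusually large single insertions as paste-like. The sketch below assumes each edit event exposes hypothetical `editType` and `textLength` fields, and the 40-character threshold is arbitrary. Large insertions may also come from auto-completion or snippets, so the result is a signal to investigate rather than a verdict.

```python
def paste_share(events, min_chars=40):
    """Fraction of inserted characters that arrived in paste-like (large) inserts."""
    inserts = [e for e in events if e.get("editType") == "insert"]
    total = sum(e.get("textLength", 0) for e in inserts)
    if total == 0:
        return 0.0
    pasted = sum(
        e.get("textLength", 0)
        for e in inserts
        if e.get("textLength", 0) >= min_chars
    )
    return pasted / total
```

A session typed character by character yields a share near 0, while a session dominated by a few large insertions yields a share near 1.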

Notes

  • Intended for educational and research purposes
  • User consent is strongly recommended before classroom deployment
  • The extension is designed to minimize data collection by default
  • Server upload is strictly opt-in and token-gated

License

MIT

Author

Hongwon Jeong, Hanyang University (beatsbywoni@hanyang.ac.kr)
