CCT-VLM

A dedicated vision workbench for the Qwen 3.5 sidecar on DGX. Work with one or two images, compare labels, inspect last-token logits, and keep each result as a response card with a stable response id.

guided showcase

Verify the hypothesis, inspect the probabilities, then perturb the image

This page combines a real sidecar workbench with a guided reading flow. Load an example, show the exact instruction, verify entailment/contradiction/neutral, inspect the probability bars, and then test whether the judgment stays stable under prompt edits or Gaussian image noise.

Image A

Image B

1. Load a pair. Start from the SNLI-VE pair or switch to the mock pair.

2. Show the instruction. Keep the actual prompt visible so the evaluation is interpretable.

3. Check faithfulness. Compare the predicted label, its confidence, and the explanation.

4. Add noise or edit the prompt. See what changes and what remains robust.

Current example

Hypothesis

Instruction shown to model

Current verdict

Run a verification or pipeline step to populate the summary.

Demo flow

Load an e-SNLI style image pair (or the built-in mock images) and run label scoring with Gaussian noise to see stability curves.

Demo buttons set image paths on DGX. Replace them with your e-SNLI paths if needed.

Mode, Question

Retain context, Show chat log

Predict prompt, Explain prompt, Prompt flow, System Prompt, Labels

Labels are used for label scoring, comparison, and optional logits-side label scoring.

Top logits, Max new tokens, Temperature, Top p, Top k, Min p, Repetition penalty, Seed

Do sample, Return full last-token logits

`return_full_last_token_logits` can be very large. Keep it off unless you are inspecting raw vocab behavior.
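As a rough illustration of logits-side label scoring, the sketch below turns a last-token logit vector into probabilities over the three labels. It assumes each label maps to a single token id and that the softmax is restricted to those ids. The function name, the token ids, and the toy logits are all hypothetical, and the page does not say how the sidecar actually computes its scores.

```python
import math

def label_probs_from_logits(last_token_logits, label_token_ids):
    """Softmax restricted to the logit of each label's (assumed single) token."""
    scores = {label: last_token_logits[tid] for label, tid in label_token_ids.items()}
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - peak) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}

# Toy six-token vocabulary; the ids below are made up for illustration.
logits = [0.1, 2.0, -1.0, 0.5, 1.0, 0.0]
label_ids = {"entailment": 1, "contradiction": 2, "neutral": 4}
probs = label_probs_from_logits(logits, label_ids)
predicted = max(probs, key=probs.get)
```

Only the label logits are needed for this kind of scoring, which is one reason to leave the full-vocabulary dump off by default.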

Image A

Image A path (DGX) Image A preview

Image B

Image B path (DGX) Image B preview

`compare_images` uses both Image A and Image B. The other modes use only Image A.

Gaussian noise

Enable noise

Std

Mean

Seed

Noise is applied per image before the sidecar runs. Turn it on to stress-test label stability.
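A minimal sketch of what per-image Gaussian noise could look like, using the Std, Mean, and Seed fields above. It assumes 8-bit pixel values in a flat list and clips the result to [0, 255]. The function name, the pixel scale, and the clipping are assumptions for illustration, not the sidecar's documented behavior.

```python
import random

def add_gaussian_noise(pixels, std, mean=0.0, seed=None):
    """Return a noisy copy of a flat list of 8-bit pixel values."""
    rng = random.Random(seed)  # a fixed seed makes the perturbation repeatable
    noisy = []
    for value in pixels:
        perturbed = value + rng.gauss(mean, std)
        # Clip back into the valid 8-bit range (an assumption about the pipeline).
        noisy.append(min(255, max(0, round(perturbed))))
    return noisy

image_a = [0, 64, 128, 192, 255, 10, 245, 100]
noisy_a = add_gaussian_noise(image_a, std=25.0, mean=0.0, seed=7)
```

Fixing the seed keeps a run reproducible, so a changed label can be attributed to the noise level rather than to a different random draw.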

Noise sweep

Min std

Max std

Steps

Sweeps the label scores over multiple Gaussian noise levels and plots the curves.
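The sweep can be sketched as a loop over evenly spaced noise levels. The version below assumes linear spacing with both endpoints included, which the page does not confirm. `score_fn` is a hypothetical stand-in for a call that scores the labels at a given std, and the toy scorer exists only to make the example runnable.

```python
def noise_levels(min_std, max_std, steps):
    """Evenly spaced std values from min_std to max_std, inclusive (assumed spacing)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [min_std]
    width = (max_std - min_std) / (steps - 1)
    return [min_std + i * width for i in range(steps)]

def sweep(score_fn, min_std, max_std, steps):
    """Collect (std, label_scores) pairs, one per noise level, ready for plotting."""
    return [(std, score_fn(std)) for std in noise_levels(min_std, max_std, steps)]

def toy_score_fn(std):
    # Hypothetical scorer: confidence in "entailment" decays as noise grows.
    entail = max(0.0, 0.9 - 2.0 * std)
    rest = (1.0 - entail) / 2.0
    return {"entailment": entail, "contradiction": rest, "neutral": rest}

curve = sweep(toy_score_fn, min_std=0.0, max_std=0.2, steps=5)
```

Each entry in `curve` pairs one noise level with the label scores at that level, which is the shape a stability plot needs.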

Responses

Every request gets a local `response_id`. Use the cards to inspect output text, label scores, and token/logit distributions.

Chat timeline

Every attempt appears here. Turn on Retain context to chain the next prompt onto the previous exchange.
