Ivan Pavlov

Research Feb 2026 – Present

Finding Explanations for Emergent Misalignment in LLMs

Learning & Adaptive Systems (LAS) Group, ETH Zurich · Supervised by Cynthia Xin Chen & Lukas Fluri

Fine-tuned LLMs sometimes exhibit strikingly misaligned behavior, a phenomenon that still lacks an agreed mechanistic account. This project evaluates candidate explanations with linear probes, SAE features, and activation steering, asking which internal representations shift during fine-tuning and which of those shifts are causally responsible for misaligned outputs.
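As a rough illustration of how a linear probe and activation steering fit together, here is a minimal sketch using a difference-of-means probe on toy vectors. It is not the project's actual method: the function names, the toy "activations", and the choice of a mean-difference probe are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Column-wise mean of a list of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def fit_mean_difference_probe(
    pos: Sequence[Sequence[float]], neg: Sequence[Sequence[float]]
) -> Tuple[Vector, float]:
    """Fit a linear probe whose direction is mean(pos) - mean(neg).

    The bias places the decision boundary at the midpoint of the two means.
    """
    mu_pos, mu_neg = mean_vector(pos), mean_vector(neg)
    direction = [p - q for p, q in zip(mu_pos, mu_neg)]
    midpoint = [(p + q) / 2 for p, q in zip(mu_pos, mu_neg)]
    bias = -dot(direction, midpoint)
    return direction, bias


def probe_score(direction: Vector, bias: float, activation: Sequence[float]) -> float:
    """Positive score means the probe reads the activation as the 'pos' class."""
    return dot(direction, activation) + bias


def steer(activation: Sequence[float], direction: Vector, alpha: float) -> Vector:
    """Activation steering: shift an activation along the probe direction."""
    return [a + alpha * d for a, d in zip(activation, direction)]


# Toy data (hypothetical): 2-d stand-ins for hidden states of two classes.
pos_acts = [[1.0, 0.0], [3.0, 0.0]]
neg_acts = [[-1.0, 0.0], [-3.0, 0.0]]
direction, bias = fit_mean_difference_probe(pos_acts, neg_acts)
steered = steer([-1.0, 0.0], direction, alpha=1.0)
```

In this toy setup the probe only tests whether a feature is linearly readable. Steering along the same direction and checking whether the readout (or, in a real model, the behavior) changes is what gives evidence about causal relevance.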

mechanistic interpretability LLMs AI safety SAE linear probes activation steering

Research Feb 2025 – Jun 2025

Post-training for SwissAI Apertus: SFT → DPO

CLAIRE Lab, EPFL

swiss-alignment olmes

Reproduced the Tülu3 post-training pipeline on Llama 3.1 8B and adapted it to the SwissAI Apertus (8B/70B) models: full SFT run followed by DPO. Discovered that DPO degraded MMLU despite clean training metrics — a concrete example of optimization signal diverging from downstream capability. Extended the olmes evaluation harness with vLLM + tensor parallelism, seven standard benchmarks, and automated SLURM job generation; productionized the DPO trainer in the SwissAI codebase with full W&B logging.
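For reference, the standard per-pair DPO objective can be sketched in a few lines. This is the textbook form of the loss, not the SwissAI trainer: the function names, argument names, and the default beta value are illustrative assumptions.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x)), computed as -softplus(-x)."""
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, chosen only for illustration
) -> float:
    """DPO loss for a single (chosen, rejected) pair of responses.

    Each argument is the summed log-probability of a full response under
    the trainable policy or the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    return -log_sigmoid(chosen_reward - rejected_reward)


# When the policy equals the reference, the loss is log(2).
loss_at_init = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Raising the chosen log-prob and lowering the rejected one reduces the loss.
loss_after_update = dpo_loss(-9.0, -13.0, -10.0, -12.0)
```

The loss depends only on the relative margin between chosen and rejected responses, so it can keep decreasing without saying anything about benchmark accuracy. That is consistent with the need to track downstream evaluations such as MMLU separately from training metrics.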

RLHF / DPO SFT Llama 3.1 vLLM DeepSpeed MMLU benchmarking

CS-503 · Visual Intelligence 2025

Auxiliary Evidence Routing for Egocentric Video QA

EPFL

Project page

Tests whether targeted external evidence helps a frozen video-native VLM (Qwen2-VL-7B) answer HD-EPIC multiple-choice questions under compute constraints. Each question is posed under one of three conditions: native video alone, tool-selected evidence alone, or native video augmented with evidence. Six evidence-selection tools were evaluated, including CLIP retrieval, a motion+CLIP cascade, OCR crop, and object tracking, alongside routing strategies ranging from keyword rules to TF-IDF classifiers. Key finding: a hand-crafted rule-based router achieved +3.9 pp over fixed-tool baselines, while learned classifiers consistently underperformed, pointing to a hard within-category gap of 16 pp that cheap visual features cannot close.
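A keyword-rule router of the kind described can be sketched as follows. The tool identifiers mirror the tools named above, but the keyword sets, the rule order, and the fallback tool are hypothetical and are not taken from the project.

```python
import re
from typing import List, Set, Tuple

# Hypothetical rules: (tool name, trigger keywords), checked in order.
RULES: List[Tuple[str, Set[str]]] = [
    ("ocr_crop", {"read", "text", "label", "written"}),
    ("object_tracking", {"where", "moved", "put", "location"}),
    ("motion_clip_cascade", {"doing", "action", "motion"}),
]
DEFAULT_TOOL = "clip_retrieval"  # assumed fallback when no rule fires


def route(question: str) -> str:
    """Return the evidence tool for a question using whole-word keyword matches."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    for tool, keywords in RULES:
        if tokens & keywords:
            return tool
    return DEFAULT_TOOL


choices = [
    route("What is written on the label?"),
    route("Where was the knife put?"),
    route("What is the person doing with the pan?"),
    route("How many eggs are used?"),
]
```

Matching on whole tokens rather than substrings avoids accidental hits (for example "put" inside "computer"), and the ordered rule list makes the priority between tools explicit.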

VLM egocentric video QA CLIP evidence routing multimodal Qwen2-VL

MATH-454 · HPC 2025

LBM 2D — Serial, MPI & CUDA

EPFL · Parallel and High Performance Computing

Project page

A 2D Lattice Boltzmann solver (D2Q9, BGK collision) for flow past one or two circular cylinders in a rectangular channel. The repository contains three solvers: a serial reference implementation, a distributed-memory MPI version, and a hand-written CUDA kernel — plus the profiling and scaling studies behind them. Covers domain decomposition, halo exchange, GPU memory layout, and strong/weak scaling analysis.
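The core of a D2Q9 BGK update for a single lattice cell can be sketched as below. This is a plain-Python illustration of the standard scheme, not code from the repository: the velocity ordering, function names, and the relaxation time used in the example are assumptions.

```python
from typing import List, Tuple

# D2Q9 lattice: rest particle, four axis directions, four diagonals.
VELOCITIES: List[Tuple[int, int]] = [
    (0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
    (1, 1), (-1, 1), (-1, -1), (1, -1),
]
WEIGHTS: List[float] = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho: float, ux: float, uy: float) -> List[float]:
    """Second-order equilibrium distribution in lattice units (c_s^2 = 1/3)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(VELOCITIES, WEIGHTS):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def macroscopic(f: List[float]) -> Tuple[float, float, float]:
    """Density and velocity as moments of the populations."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy


def bgk_collide(f: List[float], tau: float) -> List[float]:
    """BGK collision: relax each population toward equilibrium at rate 1/tau."""
    rho, ux, uy = macroscopic(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]


# Example: perturb an equilibrium state, then collide (tau is an assumed value).
f0 = equilibrium(1.0, 0.1, 0.0)
f0[1] += 0.01
f1 = bgk_collide(f0, tau=0.6)
```

Collision conserves mass and momentum in each cell. A full solver follows it with a streaming step that moves each population to the neighboring cell along its velocity, which is where halo exchange and GPU memory layout come into play.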

CUDA MPI C++ Lattice Boltzmann HPC fluid simulation

Course Author

Discrete Mathematics: Beginner Level

Author

Cogniterra (EN) Stepik (RU)

A modern interactive textbook covering combinatorics, graph theory, probability, logic, and algorithmic complexity. 300+ auto-graded problems with Python examples. The Russian edition has the larger following.

discrete mathematics combinatorics graph theory probability

Course Author

Algorithmic Interview Problems: Beginner Level

Co-creator & Teaching Assistant

Stepik (RU)

Covers greedy algorithms, divide-and-conquer, and dynamic programming through 55 interactive interview-style problems with hints and detailed solutions. 91 lessons, 8 h/week pace.

algorithms dynamic programming interview prep

Publications

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Projects

Finding Explanations for Emergent Misalignment in LLMs

Post-training for SwissAI Apertus: SFT → DPO

Auxiliary Evidence Routing for Egocentric Video QA

LBM 2D — Serial, MPI & CUDA

Discrete Mathematics: Beginner Level

Algorithmic Interview Problems: Beginner Level

Education

EPFL — École Polytechnique Fédérale de Lausanne

Saint Petersburg State University