Research

AI systems,
stress-tested.

Research-driven work exploring safety, reliability, benchmarking, and behavior under stress in large-scale software and AI systems.

Selected entries

Papers & investigations.

11 entries

N01 · 2024–2025

AgentLens

Decision-level observability for AI agents: when an agent breaks, AgentLens pinpoints which decision caused the failure and what to change. Works with Anthropic, OpenAI, LangGraph, CrewAI, and AutoGen.

Python · AI Agents · Observability · Developer Tools

Read →

N02 · 2024–2025

LLM-CODEGEN

Code generation system built on large language models.

LLM · Generative AI · NLP

Read →

N03 · 2024–2025

Large-Scale Bug Prediction

Benchmarking bug prediction across 700k Python projects.

Big Data · Research · Machine Learning

Read →

N04 · 2024–2025

AI Recommender

Intelligent recommendation engine for personalized content.

Recommender Systems · Python · ML

Read →

N05 · 2024–2025

AI in Software Engineering

Research on applying AI techniques to software engineering problems.

Research · AI · SE

Read →

N06 · 2024–2025

Research Paper

Academic research paper repository.

Research · LaTeX

Read →

N07 · 2024–2025

VLM Failure Modes

Analysis of failure modes in Vision-Language Models.

VLM · AI Safety · Research

Read →

N08 · 2024–2025

VLM Adversarial Defense

Defense mechanisms against adversarial attacks on VLMs.

VLM · Adversarial ML · Python

Read →

N09 · 2024–2025

Amnesic VLM Defense

Amnesic defense techniques for Vision-Language Models.

VLM · Defense · Python

Read →

N10 · 2024–2025

AI Impact on Jobs

Analysis of AI's impact on the job market.

Data Analysis · Research

Read →

N11 · 2024–2025

Sentiment Steering GPT

Steering the sentiment of GPT output.

LLM · Python · NLP

Read →
