ECCV Redux: Day 4 - Nov 22

Name: ECCV Redux: Day 4 - Nov 22
Start: 2024-11-22T12:00:00-05:00
End: 2024-11-22T13:30:00-05:00

Network event

15 attendees from 16 groups hosting

Hosted By Ann Arbor AI, Machine Learning and Computer Vision Meetup


Details

Missed the European Conference on Computer Vision (ECCV) last month? Have no fear: we have collected some of the best research from the conference into a series of online events.

Register for the Zoom

Zero-shot Video Anomaly Detection: Leveraging Large Language Models for Rule-Based Reasoning

Video Anomaly Detection (VAD) is critical for applications such as surveillance and autonomous driving. However, existing methods lack transparent reasoning, limiting public trust in real-world deployments. We introduce a rule-based reasoning framework that leverages Large Language Models (LLMs) to induce detection rules from few-shot normal samples and apply them to identify anomalies, incorporating strategies such as rule aggregation and perception smoothing to enhance robustness. The abstract nature of language enables rapid adaptation to diverse VAD scenarios, ensuring flexibility and broad applicability.
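
To make the idea above concrete, here is a minimal toy sketch of applying rules after smoothing noisy per-frame descriptions. It is not the speakers' implementation: the `majority_smooth` and `flag_anomalies` helpers, the activity labels, and the set of normal activities (standing in for rules an LLM might induce from normal samples) are all assumptions made for illustration, and a simple majority vote stands in for perception smoothing.

```python
from collections import Counter

def majority_smooth(labels, window=3):
    """Replace each label with the most common label in a centered window.

    Assumption: a sliding majority vote is used here only as a simple
    stand-in for perception smoothing.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        smoothed.append(Counter(labels[lo:hi]).most_common(1)[0][0])
    return smoothed

def flag_anomalies(labels, normal_activities):
    """Flag every frame whose activity is not covered by the normal rules."""
    return [label not in normal_activities for label in labels]

# Hypothetical per-frame activity descriptions (e.g. from a vision-language model).
frames = [
    "walking", "walking", "cycling", "walking",
    "walking", "skateboarding", "skateboarding", "skateboarding",
]

# Hypothetical rule set: activities considered normal in this scene.
normal = {"walking"}

smoothed = majority_smooth(frames, window=3)
flags = flag_anomalies(smoothed, normal)
print(list(zip(smoothed, flags)))
```

In this toy run the isolated "cycling" frame is treated as perception noise and smoothed away, while the sustained "skateboarding" segment is flagged as anomalous.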

ECCV 2024 Paper

Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models

About the Speaker

Yuchen Yang is a Ph.D. candidate in the Department of Computer Science at Johns Hopkins University. Her research aims to deliver functional, trustworthy solutions for machine learning and AI systems.

Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models

In this talk, I will introduce our recent work on open-vocabulary 3D semantic understanding. We propose a novel method, Diff2Scene, which leverages frozen representations from text-to-image generative models for open-vocabulary 3D semantic segmentation and visual grounding tasks. Diff2Scene requires no labeled 3D data and effectively identifies objects, appearances, locations and their compositions in 3D scenes.
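
As a rough illustration of what "open-vocabulary" labeling means in general, the sketch below assigns each 3D point the class name whose text embedding is most similar to the point's feature vector. This is a generic toy example, not Diff2Scene itself: the `label_points` helper, the three-dimensional vectors, and the class names are all made up, and in practice the features would come from pretrained models.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def label_points(point_features, text_embeddings):
    """Give each point the class name with the most similar text embedding."""
    labels = []
    for feature in point_features:
        best = max(
            text_embeddings,
            key=lambda name: cosine(feature, text_embeddings[name]),
        )
        labels.append(best)
    return labels

# Hypothetical text embeddings; any class name can be added at query time,
# which is what makes the vocabulary "open".
text_embeddings = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}

# Hypothetical per-point features living in the same embedding space.
point_features = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.0, 0.3, 0.7],
]

labels = label_points(point_features, text_embeddings)
print(labels)
```

Because labels are chosen by similarity to text rather than from a fixed classifier head, no labeled 3D data is needed to add a new category in this toy setup.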

ECCV 2024 Paper

Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models

About the Speaker

Xiaoyu Zhu is a Ph.D. student at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Her research interests are computer vision, multimodal learning, and generative models.
