Vision-Language Models Tutorial

Context is king: How Avride uses cloud VLMs as a safety net for delivery robots

It is important to clarify: we do not use VLMs to drive the robot. Using a heavy cloud model to steer in real time would ...

IEEE

Mitigating Object Hallucination in Large Vision-Language Models via Visual Attention Direct Preference Optimization

Abstract: Large Vision-Language Models (LVLMs) suffer from severe object hallucinations, leading them to frequently generate outputs that do not correspond to the image content, significantly reducing ...

IEEE

Task Planning for a Factory Robot Using Large Language Model

Abstract: In recent years, automation has significantly advanced the automobile manufacturing industry. However, many tasks still involve human intervention, so there is a demand for the development ...
