MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Models (MLLMs)

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Models (MLLMs)