Multimodal
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Paper
• 2505.02567
• Published
• 80
OmniGen2: Exploration to Advanced Multimodal Generation
Paper
• 2506.18871
• Published
• 78
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
Paper
• 2506.17202
• Published
• 10
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Paper
• 2506.18095
• Published
• 66
Paper
• 2506.23044
• Published
• 61
A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
Paper
• 2507.01925
• Published
• 39
Pixels, Patterns, but No Poetry: To See The World like Humans
Paper
• 2507.16863
• Published
• 69