MultiNet-v2.0 Benchmark: evaluating multimodal perception and action-taking capabilities across different domains