Can LLMs Simulate Target Users in Visualization Case Studies?

Authors: Satkunarajan, Jena, Abdelaal, Moataz, Koch, Steffen, Kurzhals, Kuno, Weiskopf, Daniel

  • Real users solving tasks <-> evidence that a visualization is good
  • User study classification types (Isenberg et al.) - esp. relevant in domain-specific applications, where experts are scarce…
  • Type III/IV studies are good candidates, as we can safeguard them and they are not too specific!
  • Evaluation: by replication!
    • both unpublished and published…

Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization

Authors: Li, X., Zhang, P., Wang, X., Shen, H., Hu, Y.

  • Checks which graph is most preferred and tries to align the LM with that preference
  • But results seem inconsistent? - very subjective task!

Do Graph Drawing Aesthetics Matter for AI? A Replication of Foundational Studies in Graph Readability

Authors: Di Bartolomeo, Sara, Schicho, Johann Sebastian, Traversini, Aurora, Fink, Simon Dominik, Didimo, Walter, Montecchiani, Fabrizio

  • Aesthetics of graphs - RL for GIBBER?
    • Edge crossings / …
  • How do these translate to (V)LM readability?
    • How can robots / AI interpret charts (e.g. maps, accessibility, papers)?
  • Their approach: replicate three task-based experiments
  • Old readability tests: the problem was low-quality images - solution: contact the authors of the 20-year-old paper!
  • Good approach: SoTA models + self-hosted models!
  • Findings:
    • Humans perform much better without crossings
    • LMs do not really care about crossings, much more about symmetry + orthogonal layouts
    • LMs perform best on force-directed layouts
  • Limits:
    • limited scope, but more faithful to original study
    • becomes outdated quickly!
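
The edge-crossing aesthetic noted above can be made concrete with a small counter. This is only a minimal sketch, assuming a straight-line drawing; the `pos` dictionary, the `edges` list, and the function names are hypothetical and are not taken from the paper or its replication material.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1 left turn, -1 right turn, 0 collinear."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)


def count_crossings(pos, edges):
    """Count pairs of edges that properly cross in a straight-line drawing.

    pos   -- hypothetical mapping from node id to (x, y) coordinates
    edges -- hypothetical list of (u, v) node-id pairs
    """
    count = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges that share an endpoint are not counted as crossing
        p1, p2, p3, p4 = pos[a], pos[b], pos[c], pos[d]
        # Two segments properly cross when each one separates the other's endpoints.
        if (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
                and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0):
            count += 1
    return count


# Unit square: the outer cycle has no crossings, the two diagonals cross once.
square = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
diagonals = [(0, 2), (1, 3)]
print(count_crossings(square, cycle + diagonals))
```

The pairwise check is quadratic in the number of edges, which is fine for the small graphs typical of readability experiments; collinear overlaps are deliberately not counted in this sketch.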

How Do LLMs See Charts? A Comparative Study on High-Level Visualization Comprehension in Humans and LLMs

Authors: Jeon, Hyotaek, Lee, Hyunwook, Shin, Minjeong, Pandey, Tapendra, Kim, Joohee, Seon, Shinwook, Jeong, Daeun, Ko, Sungahn, Quadri, Ghulam Jilani

  • Stability, Reading Strategy, and Intent alignment?
  • How do they evaluate ‘sameness’? - matching code
  • Takeaway: LMs are strong at decoding technicalities, but cannot judge effectiveness