
Syed Talal Wasim

Computer Vision PhD Student
University of Bonn
wasimtalal(atsign)gmail.com


About Me

I am a PhD student, currently affiliated with the Computer Vision Group at the University of Bonn, Germany. I am supervised by Professor Dr. Jürgen Gall, and am working in the domain of Long-Term Multimodal Video Understanding.

Previously I was an Associate Researcher in computer vision, affiliated with the Intelligent Visual Analytics Lab (IVAL) at the Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI). I was supervised by Dr. Salman Khan.

I completed my master’s degree in Image Processing and Computer Vision (IPCV) funded by the Erasmus Mundus Joint Master’s Degree (EMJMD) scholarship program. During the master’s program, I was fortunate to have interned at the Empathic Computing Lab supervised by Dr. Mark Billinghurst. I completed my master’s thesis in the CVLAB at EPFL supervised by Dr. Mathieu Salzmann.

I hold an undergraduate degree in Electrical Engineering, with a minor in computer science, from Habib University in Karachi, Pakistan.

My previous website, which lists my high-school, undergraduate and graduate courses and projects, can be found at talalwasim.weebly.com.

Research Interests

News

  • [Feb. 2025] Three of our papers (Video-Panda, GroupMamba, and STING-BEE) have been accepted in CVPR 2025.
  • [Dec. 2024] Our paper titled "Efficient Video Object Segmentation via Modulated Cross-Attention Memory" is accepted in WACV 2025.
  • [Oct. 2024] New preprint released titled "Distillation-free Scaling of Large SSMs for Images and Videos".
  • [Mar. 2024] Our paper titled "Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding" is accepted in CVPR 2024.
  • [Feb. 2024] My student Muhammad Zain Yousuf's bachelor thesis titled "AR-VPT: Simple Auto-Regressive Prompts for Adapting Frozen ViTs to Videos" is accepted in VISAPP 2024.
  • [Jan. 2024] I started a PhD at the University of Bonn, Germany working on Long-Term Multimodal Video Understanding, under the supervision of Professor Dr. Juergen Gall.
  • [Oct. 2023] Our paper titled "Hardware Resilience Properties of Text-Guided Image Classifiers" is accepted in NeurIPS 2023.
  • [Aug. 2023] Our paper titled "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" is accepted in ICCV 2023.
  • [Aug. 2023] Our paper titled "Self-regulating Prompts: Foundational Model Adaptation without Forgetting" is accepted in ICCV 2023.
  • [Jun. 2023] Our paper titled "Toward Automatic Typography Analysis: Serif Classification and Font Similarities" is accepted in the Journal of Data Mining in Digital Humanities (JDMDH).
  • [Mar. 2023] Our paper titled "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" is accepted in CVPR 2023.
  • [Jun. 2022] Our paper titled "Using Facial Micro-Expressions in Combination With EEG and Physiological Signals for Emotion Recognition" is accepted in Frontiers in Psychology.
  • [Apr. 2022] I started working as a researcher at MBZUAI. I was supervised by Dr. Salman Khan, working on multimodal video understanding.
  • [Jul. 2021] I was accepted in the ETH Robotics Summer School and Symposium.
  • [Jun. 2021] I defended my master's thesis and graduated from the IPCV master's program.
  • [May 2021] Our paper on synthetic data for object detection is accepted to the CVPR 2021 CV4Animals workshop.
  • [Feb. 2021] I started my master's thesis in the CVLAB at EPFL supervised by Dr. Mathieu Salzmann. I worked on automated typography analysis on figurative content.
  • [Jul. 2020] I started a remote research internship at the Empathic Computing Lab supervised by Dr. Mark Billinghurst.
  • [Sep. 2019] I started my master's degree in Image Processing and Computer Vision (IPCV) funded by the Erasmus Mundus Joint Master's Degree (EMJMD) scholarship program.
  • [Jun. 2019] I completed my undergraduate degree in Electrical Engineering with a Minor in computer science. Graduated first in class with the Dean's Medal.

Publications

  1. CVPR
    Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
    Jinhui Yi*, Syed Talal Wasim*, Yanan Luo*, Muzammal Naseer and Juergen Gall
    CVPR, 2025
    @inproceedings{yi2024vpanda,
      title={Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models},
      author={Jinhui Yi* and Syed Talal Wasim* and Yanan Luo* and Muzammal Naseer and Juergen Gall},
      booktitle={CVPR},
      year={2025}}
            

  2. CVPR
    GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model
    Abdelrahman Shaker, Syed Talal Wasim, Salman Khan, Juergen Gall and Fahad Shahbaz Khan
    CVPR, 2025
    @inproceedings{shaker2024groupmamba,
      title={GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model},
      author={Abdelrahman Shaker and Syed Talal Wasim and Salman Khan and Juergen Gall and Fahad Shahbaz Khan},
      booktitle={CVPR},
      year={2025}}
            

  3. CVPR
    STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
    D. Velayudhan, A. Ahmed, M. Alansari, N. Gour, A. Behouch, T. Hassan, Syed Talal Wasim, N. Maalej, M. Naseer, J. Gall, M. Bennamoun, E. Damiani and N. Werghi
    CVPR, 2025 (Preprint will be released soon)
    @inproceedings{velayudhan2024sting,
      title={STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection},
      author={Divya Velayudhan and Abdelfatah Ahmed and Mohamad Alansari and Neha Gour and Abderaouf Behouch and Taimur Hassan and Syed Talal Wasim and Nabil Maalej and Muzammal Naseer and Juergen Gall and Mohammed Bennamoun and Ernesto Damiani and Naoufel Werghi},
      booktitle={CVPR},
      year={2025}}
            

  4. WACV
    Efficient Video Object Segmentation via Modulated Cross-Attention Memory
    Abdelrahman Shaker, Syed Talal Wasim, Martin Danelljan, Salman Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
    WACV, 2025
    @inproceedings{shaker2025mavos,
      title={Efficient Video Object Segmentation via Modulated Cross-Attention Memory},
      author={Abdelrahman Shaker and Syed Talal Wasim and Martin Danelljan and Salman Khan and Ming-Hsuan Yang and Fahad Shahbaz Khan},
      booktitle={WACV},
      year={2025}}
            

  5. Under Review
    Distillation-free Scaling of Large SSMs for Images and Videos
    Hamid Suleman*, Syed Talal Wasim*, Muzammal Naseer and Juergen Gall
    Under Review
    @article{suleman2024stablemamba,
      title={Distillation-free Scaling of Large SSMs for Images and Videos},
      author={Hamid Suleman* and Syed Talal Wasim* and Muzammal Naseer and Juergen Gall},
      journal={arXiv preprint arXiv:2409.11867},
      year={2024}}
            

  6. CVPR
    Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
    Syed Talal Wasim, Muzammal Naseer, Salman Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
    CVPR, 2024
    @inproceedings{wasim2024vgdino,
      title={Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding},
      author={Syed Talal Wasim and Muzammal Naseer and Salman Khan and Ming-Hsuan Yang and Fahad Shahbaz Khan},
      booktitle={CVPR},
      year={2024}}
            

  7. VISAPP
    AR-VPT: Simple Auto-Regressive Prompts for Adapting Frozen ViTs to Videos
    Muhammad Zain Yousuf, Syed Talal Wasim, Syed Nouman Hasany and Muhammad Farhan
    VISAPP, 2024
    @inproceedings{yousuf2024arvpt,
      title={AR-VPT: Simple Auto-Regressive Prompts for Adapting Frozen ViTs to Videos},
      author={Muhammad Zain Yousuf and Syed Talal Wasim and Syed Nouman Hasany and Muhammad Farhan},
      booktitle={VISAPP},
      year={2024}}
            

  8. NeurIPS
    Hardware Resilience Properties of Text-Guided Image Classifiers
    Syed Talal Wasim, Kabila Haile Soboka, Abdulrahman Mahmoud, Salman Khan, David Brooks and Gu-Yeon Wei
    NeurIPS, 2023
    @inproceedings{wasim2023textres,
      title={Hardware Resilience Properties of Text-Guided Image Classifiers},
      author={Syed Talal Wasim and Kabila Haile Soboka and Abdulrahman Mahmoud and Salman Khan and David Brooks and Gu-Yeon Wei},
      booktitle={NeurIPS},
      year={2023}}
            

  9. ICCV
    Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
    Syed Talal Wasim*, Muhammad Uzair Khattak*, Muzammal Naseer, Salman Khan, Mubarak Shah and Fahad Shahbaz Khan
    ICCV, 2023
    @inproceedings{wasim2023vfn,
      title={Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition},
      author={Syed Talal Wasim* and Muhammad Uzair Khattak* and Muzammal Naseer and Salman Khan and Mubarak Shah and Fahad Shahbaz Khan},
      booktitle={ICCV},
      year={2023}}
            

  10. ICCV
    Self-regulating Prompts: Foundational Model Adaptation without Forgetting
    Muhammad Uzair Khattak*, Syed Talal Wasim*, Muzammal Naseer, Salman Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
    ICCV, 2023
    @inproceedings{khattak2023promptsrc,
      title={Self-regulating Prompts: Foundational Model Adaptation without Forgetting},
      author={Muhammad Uzair Khattak* and Syed Talal Wasim* and Muzammal Naseer and Salman Khan and Ming-Hsuan Yang and Fahad Shahbaz Khan},
      booktitle={ICCV},
      year={2023}}
            

  11. CVPR
    Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
    Syed Talal Wasim, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan and Mubarak Shah
    CVPR, 2023
    @inproceedings{wasim2023vita,
      title={Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting},
      author={Syed Talal Wasim and Muzammal Naseer and Salman Khan and Fahad Shahbaz Khan and Mubarak Shah},
      booktitle={CVPR},
      year={2023}}
            

  12. JDMDH
    Toward Automatic Typography Analysis: Serif Classification and Font Similarities
    Syed Talal Wasim, Romain Collaud, Lara Défayes, Nicolas Henchoz, Mathieu Salzmann and Delphine Ribes
    Journal of Data Mining in Digital Humanities, 2023
    @article{wasim2023gest,
      title={Toward automatic typography analysis: serif classification and font similarities},
      author={Syed Talal Wasim and Romain Collaud and Lara Défayes and Nicolas Henchoz and Mathieu Salzmann and Delphine Ribes},
      journal={Journal of Data Mining in Digital Humanities (JDMDH)},
      year={2023}}
            

  13. Frontiers
    Using Facial Micro-Expressions in Combination With EEG and Physiological Signals for Emotion Recognition
    Nastaran Saffaryazdi, Syed Talal Wasim, Kuldeep Dileep, Alireza Farrokhi Nia, Suranga Nanayakkara, Elizabeth Broadbent and Mark Billinghurst
    Frontiers in Psychology, 2022
    @article{wasim2022ecl,
      title={Using facial micro-expressions in combination with EEG and physiological signals for emotion recognition},
      author={Nastaran Saffaryazdi and Syed Talal Wasim and Kuldeep Dileep and Alireza Farrokhi Nia and Suranga Nanayakkara and Elizabeth Broadbent and Mark Billinghurst},
      journal={Frontiers in Psychology},
      year={2022}}
            
  14. CVPRW
    Sim-to-Real Transfer for Object Detection and Localization on Animals
    Syed Talal Wasim, Syed N. Hasany, Kainat Abbasi, Huda Feroz, Anisa A. Ahmed, Mudasir H. Shaikh and Muhammad Farhan
    CV4Animals Workshop, CVPR 2021
    @inproceedings{wasim2021cv4animals,
      title={Sim-to-Real Transfer for Object Detection and Localization on Animals},
      author={Syed Talal Wasim and Syed N. Hasany and Kainat Abbasi and Huda Feroz and Anisa A. Ahmed and Mudasir H. Shaikh and Muhammad Farhan},
      booktitle={CV4Animals CVPR Workshop},
      year={2021}}
            

Services

Journal Reviewers

Conference Reviewers

Project Supervision

