Georgia Gkioxari

I am an Assistant Professor of Computing + Mathematical Sciences at Caltech and a William H. Hurt scholar. I am also a visiting researcher at Meta AI in the Embodied AI team. From 2016 to 2022, I was a research scientist at Meta's FAIR team. I received my PhD from UC Berkeley, where I was advised by Jitendra Malik. I completed my bachelor's degree in ECE at NTUA in Athens, Greece, where I worked with Petros Maragos.

I am the recipient of the PAMI Young Researcher Award (2021). My teammates and I received the PAMI Mark Everingham Award (2021) for the Detectron Library Suite. I was named one of 30 influential women advancing AI in 2019 by ReWork and was nominated for the Women in AI Awards in 2020 by VentureBeat. Read more about me and my work in this Q&A.

Research Highlights

The goal of my work is to design visual perception models that bridge the gap between 2D imagery and our 4D world. My research interests lie in computer vision and machine learning. I want to build intelligent systems that perceive the world from as little as a single image, just like humans do! Our world is complex: it is three-dimensional and dynamic. Computational models observe this world through imagery, but only partially, as visual data does not fully capture the richness of the world we live in. Below I highlight work that attempts to transform visual data into semantic scene representations in 2D and 3D.

Caltech students (undergrads and grads): If you are at Caltech and wish to work with me, please read the information in this doc.

Prospective postdocs: If you are interested in a postdoc position and want to conduct research in computer vision, 3D understanding and visual perception, please contact me directly with your CV and a short research statement.

Prospective PhD students: I am looking for PhD students to join my group. If you are interested, please apply directly to the CMS department and mention my name in your statement of purpose. There is no need to email me.


Sabera Talukder
NSF GR & Chen Fellow

Guanzhi Wang

Ilona Demler
NSF GR & EAS Scholar

Damiano Marsili

Aadarsh Sahoo
Kortschak Scholar

Raphi Kang

Ziqi Ma
Kortschak Scholar


Teaching at Caltech


EE/CS 148 - Spring 2023: Large Language and Vision Models

EE/CS 148 - Spring 2024: Large Language and Vision Models


CS 101 - Winter 2024: Learning & 3D


Selected Publications

Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection
Yiming Xie, Huaizu Jiang, Julian Straub*, Georgia Gkioxari*
International Conference on Computer Vision (ICCV), 2023

arxiv / project page / code / bibtex
  title={Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection},
  author={Xie, Yiming and Jiang, Huaizu and Gkioxari, Georgia and Straub, Julian},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},


Multiview Compressive Coding for 3D Reconstruction
Chao-Yuan Wu, Justin Johnson, Jitendra Malik, Christoph Feichtenhofer, Georgia Gkioxari
Computer Vision and Pattern Recognition (CVPR), 2023

arxiv / project page / code / bibtex
  author    = {Wu, Chao-Yuan and Johnson, Justin and Malik, Jitendra and Feichtenhofer, Christoph and Gkioxari, Georgia},
  title     = {Multiview Compressive Coding for 3{D} Reconstruction},
  journal   = {CVPR},
  year      = {2023},


Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari
Computer Vision and Pattern Recognition (CVPR), 2023

arxiv / project page / code / bibtex
  title={{Omni3D}: A Large Benchmark and Model for {3D} Object Detection in the Wild},
  author={Garrick Brazil and Abhinav Kumar and Julian Straub and Nikhila Ravi and Justin Johnson and Georgia Gkioxari},


Learning 3D Object Shape and Layout without 3D Supervision
Georgia Gkioxari, Nikhila Ravi, Justin Johnson
Computer Vision and Pattern Recognition (CVPR), 2022

arxiv / project page / video / bibtex
  title={Learning 3D Object Shape and Layout without 3D Supervision},
  author={Georgia Gkioxari and Nikhila Ravi and Justin Johnson},


Differentiable Stereopsis: Meshes from multiple views using differentiable rendering
Shubham Goel, Georgia Gkioxari, Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2022

arxiv / project page / code / bibtex
  title={Differentiable Stereopsis: Meshes from multiple views using differentiable rendering},
  author={Shubham Goel and Georgia Gkioxari and Jitendra Malik},

Recognizing Scenes from Novel Viewpoints
Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari

arxiv / project page / code / bibtex
  title={Recognizing Scenes from Novel Viewpoints},
  author={Shengyi Qian and Alexander Kirillov and Nikhila Ravi and Devendra Singh Chaplot and Justin Johnson and David Fouhey and Georgia Gkioxari},
  journal={arXiv preprint arXiv:2112.01520},


Accelerating 3D Deep Learning with PyTorch3D
Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson*, Georgia Gkioxari*

blog / arxiv / code / project page / bibtex
  title={Accelerating 3D Deep Learning with PyTorch3D},
  author={Ravi, Nikhila and Reizenstein, Jeremy and Novotny, David and Gordon, 
          Taylor and Lo, Wan-Yen and Johnson, Justin and Gkioxari, Georgia},
  journal={arXiv preprint arXiv:2007.08501},


3D Shape Reconstruction from Vision and Touch
Edward J. Smith, Roberto Calandra, Adriana Romero, Georgia Gkioxari, David Meger, Jitendra Malik, Michal Drozdzal
Conference on Neural Information Processing Systems (NeurIPS), 2020

arxiv / code / bibtex
  title={3D Shape Reconstruction from Vision and Touch},
  author={Smith, Edward J and Calandra, Roberto and Romero, Adriana and Gkioxari, 
          Georgia and Meger, David and Malik, Jitendra and Drozdzal, Michal},


SynSin: End-to-end View Synthesis from a Single Image
Olivia Wiles, Georgia Gkioxari, Richard Szeliski, Justin Johnson
Computer Vision and Pattern Recognition (CVPR), 2020 (oral)

arxiv / code / project page / bibtex
Title={{SynSin}: {E}nd-to-end View Synthesis from a Single Image},
Author={Olivia Wiles and Georgia Gkioxari and Richard Szeliski and Justin Johnson},


Mesh R-CNN
Georgia Gkioxari, Jitendra Malik, Justin Johnson
International Conference on Computer Vision (ICCV), 2019

arxiv / code / project page / examples / bibtex
Title={Mesh R-CNN},
Author={Georgia Gkioxari and Jitendra Malik and Justin Johnson},


Embodied Question Answering in Photorealistic Environments with Point Cloud Perception
Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra
Computer Vision and Pattern Recognition (CVPR), 2019 (oral)

arxiv / project page / bibtex
Title={Embodied Question Answering in Photorealistic Environments with Point Cloud Perception},
Author={Erik Wijmans and Samyak Datta and Oleksandr Maksymets and Abhishek Das and Georgia Gkioxari
        and Stefan Lee and Irfan Essa and Devi Parikh and Dhruv Batra},


Multi-Target Embodied Question Answering
Licheng Yu, Xinlei Chen, Georgia Gkioxari, Mohit Bansal, Tamara Berg, Dhruv Batra
Computer Vision and Pattern Recognition (CVPR), 2019

arxiv / project page / bibtex
Title={Multi-Target Embodied Question Answering},
Author={Licheng Yu and Xinlei Chen and Georgia Gkioxari and Mohit Bansal and Tamara Berg and Dhruv Batra},


Neural Modular Control for Embodied Question Answering
Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
Conference on Robot Learning (CoRL), 2018

arxiv / project page / bibtex
Title={{N}eural {M}odular {C}ontrol for {E}mbodied {Q}uestion {A}nswering},
Author={Abhishek Das and Georgia Gkioxari 
        and Stefan Lee and Devi Parikh and Dhruv Batra},


Building Generalizable Agents With a Realistic And Rich 3D Environment
Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian
International Conference on Learning Representations (ICLR), Workshop Track, 2018

arxiv / code / bibtex
Author    = {Yi Wu and Yuxin Wu and 
            Georgia Gkioxari and Yuandong Tian},
Title     = {Building Generalizable Agents With a Realistic And Rich 3D Environment},
Journal   = {arXiv preprint arXiv:1801.02209},
Year      = {2018}}


Detecting and Recognizing Human-Object Interactions
Georgia Gkioxari, Ross Girshick, Piotr Dollár and Kaiming He
Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight)

arxiv / project page / bibtex
Author    = {Georgia Gkioxari and Ross Girshick and 
             Piotr Doll\'{a}r and Kaiming He},
Title     = {Detecting and Recognizing Human-Object Interactions},
Booktitle = {CVPR},
Year      = {2018}}


Embodied Question Answering
Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
Computer Vision and Pattern Recognition (CVPR), 2018 (oral)

arxiv / project page / code / bibtex
Title={{E}mbodied {Q}uestion {A}nswering},
Author={Abhishek Das and Samyak Datta and 
          Georgia Gkioxari and Stefan Lee and 
          Devi Parikh and Dhruv Batra},

Detect-and-Track: Efficient Pose Estimation in Videos
Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri and Du Tran
Computer Vision and Pattern Recognition (CVPR), 2018

arxiv / code / bibtex
Title={Detect-and-Track: Efficient Pose Estimation in Videos},
Author={Rohit Girdhar and Georgia Gkioxari and 
        Lorenzo Torresani and Manohar Paluri and Du Tran},


Data Distillation: Towards Omni-Supervised Learning
Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari and Kaiming He
Computer Vision and Pattern Recognition (CVPR), 2018

arxiv / bibtex
Title={Data Distillation: Towards Omni-Supervised Learning},
Author={Ilija Radosavovic and Piotr Doll\'{a}r and
        Ross Girshick and Georgia Gkioxari and 
        Kaiming He},


Mask R-CNN
Kaiming He, Georgia Gkioxari, Piotr Dollár and Ross Girshick
International Conference on Computer Vision (ICCV), 2017 (oral)
Best Paper Award (Marr Prize)  

arxiv / code / bibtex
Author    = {Kaiming He and Georgia Gkioxari and
         Piotr Doll\'{a}r and Ross Girshick},
Title     = {Mask R-CNN},
Booktitle   = {ICCV},
Year      = {2017}}


Learn2Smile: Learning Non-verbal Interaction through Observation
Will Feng, Anitha Kannan, Georgia Gkioxari, Larry Zitnick
International Conference on Intelligent Robots and Systems (IROS), 2017 (oral)
Finalist for the JTCF Novel Technology Paper Award For Amusement Culture  

Author    = {Will Feng and Anitha Kannan and Georgia Gkioxari and Larry Zitnick},
Title     = {Learn2Smile: Learning Non-verbal Interaction through Observation},
Booktitle = {IROS},
Year      = {2017}}

Chained Predictions Using Convolutional Neural Networks
Georgia Gkioxari, Alexander Toshev and Navdeep Jaitly
European Conference on Computer Vision (ECCV), 2016

arxiv / project page / bibtex
Author = {G. Gkioxari and A. Toshev and N. Jaitly},
Title = {Chained Predictions Using 
       Convolutional Neural Networks},
Booktitle = {ECCV},
Year = {2016}}


Contextual Action Recognition with R*CNN
Georgia Gkioxari, Ross Girshick and Jitendra Malik
International Conference on Computer Vision (ICCV), 2015

arxiv / code / bibtex
Author = {G. Gkioxari and R. Girshick and J. Malik},
Title = {Contextual Action Recognition with R*CNN},
Booktitle = {ICCV},
Year = {2015}}


Actions and Attributes from Wholes and Parts
Georgia Gkioxari, Ross Girshick and Jitendra Malik
International Conference on Computer Vision (ICCV), 2015

arxiv / bibtex
Author = {G. Gkioxari and R. Girshick and J. Malik},
Title = {Actions and Attributes from Wholes and Parts},
Booktitle = {ICCV},
Year = {2015}}


Finding Action Tubes
Georgia Gkioxari and Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2015  

project page / arxiv / code / negative results / UCF Sports Benchmark / bibtex
Author = {G. Gkioxari and J. Malik},
Title = {Finding Action Tubes},
Booktitle = {CVPR},
Year = {2015}}


R-CNNs for Pose Estimation and Action Detection
Georgia Gkioxari, Bharath Hariharan, Ross Girshick and Jitendra Malik

project page / arxiv / bibtex
Author = {G. Gkioxari and B. Hariharan
    and R. Girshick and J. Malik},
Title = {R-CNNs for Pose Estimation and Action Detection},
ArchivePrefix = {arXiv},
Eprint = {1406.5212},
PrimaryClass = {cs.CV},
Year = {2014}}


Using k-poselets for detecting people and localizing their keypoints
Georgia Gkioxari*, Bharath Hariharan*, Ross Girshick and Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2014  
* authors contributed equally

project page / code / github / spotlight / bibtex
Author = {G. Gkioxari and B. Hariharan 
      and R. Girshick and J. Malik},
Title = {Using k-poselets for detecting people and
      localizing their keypoints},
Booktitle = {CVPR},
Year = {2014}}


Articulated Pose Estimation using Discriminative Armlet Classifiers
Georgia Gkioxari, Pablo Arbelaez, Lubomir Bourdev and Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2013  

slides / bibtex
Author = {G. Gkioxari and P. Arbelaez 
      and L. Bourdev and J. Malik},
Title  = {Articulated Pose Estimation using 
      Discriminative Armlet Classifiers},
Booktitle = {CVPR},
Year  = {2013}}

Teaching at UC Berkeley


CS 188 - Fall 2011 (GSI - GSI Outstanding Award)


CS 280 - Fall 2012 (GSI)

Stolen from Jon Barron