Welcome
Hi, I am Alex Falcon. I am a Post-doc researcher at the University of Udine (AI Lab). I completed my PhD in Computer Science, Mathematics and Physics, jointly held at Fondazione Bruno Kessler (Technology of Vision - TeV) and the University of Udine, under the supervision of Oswald Lanz (Free University of Bolzano) and Giuseppe Serra (University of Udine). My research currently focuses on multimedia, video and language understanding, and deep learning. Before my PhD, I completed my Bachelor’s and Master’s degrees in Computer Science at the University of Udine. During my Master’s, I started working with AI, machine learning, and deep learning, with a focus on Predictive Maintenance.
E-mail: falcon.alex ‘at’ spes.uniud.it / Google Scholar / Github / LinkedIn / CV
News
- I was nominated Outstanding Reviewer (198 out of almost 7300 reviewers!) at ECCV 2024!!! [link]
- one paper accepted at Ecological Informatics!!
- EQAI 2024 was featured in a local newspaper! Everything is ready for this September!
- I was invited as a speaker at the AI-DLDA summer school, where I talked about retrieving complex 3D scenarios using text! A notebook will become available later on!
- one paper accepted at Computer Vision and Image Understanding!!! [code]
- one paper accepted as a Poster at ACM ICMR 2024! See you in Phuket, Thailand!
- We are organizing the 3rd edition of the CV4Metaverse workshop at ECCV 2024 and hosting the Metaverse Apartment Retrieval Challenge!
- I am part of the local organization committee for EQAI 2024 (European Summer School on Quantum AI)!
- I am a guest editor for the Special Issue on Text-Multimedia Retrieval: Retrieving Multimedia Data by Means of Natural Language at ACM TOMM! Check the call for papers! (deadline: June 30, 2024)
- one paper accepted as an Oral at IRCDL 2024!
- 2023
- one paper accepted as an Oral at MMM 2024! code by Ali
- one paper accepted at MMIR@ACM MM 2023 and one paper accepted at CV4Metaverse@ICCV 2023! Codebases by Ali: code1 code2
- I had a great experience at the ELLIS Summer School on Large-Scale AI for Research and Industry in Modena, Italy!
- one paper accepted at ICIAP 2023!
- our solution (report), trained with only 25% of the data, got 3rd place in the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge @ CVPR 2023!
- I am part of the local organization committee for ICIAP 2023!
- I delivered a seminar on "Deep Learning for Multimedia understanding" at the University of Udine!
- I am part of the local organization committee for the 2nd edition of the European Summer School on Quantum AI!
- March 13th, 2023: I successfully defended my PhD thesis cum laude!
- one paper accepted at Multimedia Tools and Applications! code
- 2022
- one paper (pdf) accepted as an Oral at AIABI@AIxIA 2022!
- one paper accepted at Computers in Industry! code
- one paper accepted as an Oral at ACM MM 2022! code
- I delivered two talks at the University of Bolzano: "Data-driven approaches for the Remaining Useful Life Estimation problem" and "Learning video retrieval models with relevance-aware online mining"!
- I was featured in the FBK magazine (Italian)!
- our solution (report) got 1st place in the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge @ CVPR 2022!
- I attended the fantastic International Computer Vision Summer School (ICVSS) in Scicli, Italy and presented a poster titled "Relevance-aware Online Mining for Video Retrieval"!
- one paper accepted as an Oral at ICMR 2022! code
- one paper accepted as an Oral at ICIAP 2021! code
- I delivered a seminar on "Data-driven approaches for the remaining useful life estimation problem" at FBK!
- 2021
- our solution (report) got 3rd place in the EPIC-Kitchens-100 Action Recognition Challenge @ CVPR 2021!
- I completed the "Fundamentals of Deep Learning for Multi-GPUs" course held by NVIDIA Deep Learning Institute!
- we organized the VIQA workshop @ ICPR 2020 (later merged into the VTIUR workshop)!
- 2020
- one paper accepted as an Oral at EPIC@ECCV 2020!
- I attended the "Machine Learning for non-matrix data" summer school at Politecnico di Milano!
- one paper accepted as an Oral at PHME 2020!
- one paper accepted as an Oral at ICPHM 2020!
- 2019
- one paper accepted at IRCDL 2019!
- October 2019: I started my PhD under the supervision of Oswald Lanz and Giuseppe Serra!
- July 2019: I successfully completed the Master's Degree in Computer Science cum laude!
Main projects and lines of research
Text-Video Retrieval
Topics: multimedia, cross-modal understanding, vision and language
TL;DR: Text-video retrieval is the task of ranking a collection of videos by their relevance to a user's textual query, with low ranks (i.e., positions near the top) assigned to highly relevant videos. State-of-the-art video retrieval systems are trained with contrastive learning techniques. However, these techniques enforce constraints at training time which neglect that multiple videos may be relevant to the same caption, and that multiple captions may describe the same video. In this line of research, we focus on introducing semantic knowledge into the training process to overcome these limitations. The results confirm our hypotheses, and the learning strategies we proposed are effective (e.g., improving nDCG from about 36% to almost 60% on the challenging EPIC-Kitchens-100 dataset).
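As a rough illustration of the nDCG metric quoted above, the following sketch scores a ranked list of videos given a graded relevance value for each. The function names and relevance values are hypothetical, and the linear-gain DCG formulation is an assumption: the evaluation protocol used on EPIC-Kitchens-100 may differ in its details.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance values given in ranked order."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance of each retrieved video to one query, in ranked order.
# Several videos are partially relevant, not just a single "correct" one.
good_ranking = [1.0, 0.5, 0.5, 0.0]
poor_ranking = [0.0, 0.5, 0.5, 1.0]

print(f"good ranking nDCG: {ndcg(good_ranking):.3f}")
print(f"poor ranking nDCG: {ndcg(poor_ranking):.3f}")
```

Because the metric uses graded relevance, a model is rewarded for placing partially relevant videos near the top, which a strict one-video-per-caption evaluation would not capture.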
Selected publications/endeavors
- Improving semantic video retrieval models by training with a relevance-aware online mining strategy.
A. Falcon, G. Serra, O. Lanz. Computer Vision and Image Understanding 245, 104035. 2024. [pdf]
- Semantics for vision-and-language understanding.
A. Falcon. PhD Thesis, 2023. [pdf]
- A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval.
A. Falcon, G. Serra, O. Lanz. ACM MM 2022. [pdf]
- UniUD-FBK-UB-UniBZ Submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2022. (ranked 1st)
A. Falcon, G. Serra, S. Escalera, O. Lanz. EPIC@CVPR 2022. [pdf]
- Relevance-based margin for contrastively-trained video retrieval models.
A. Falcon, S. Sudhakaran, G. Serra, S. Escalera, O. Lanz. ACM ICMR 2022. [pdf]
- Learning video retrieval models with relevance-aware online mining.
A. Falcon, G. Serra, O. Lanz. ICIAP 2021. [pdf]
Ranking complex 3D scenes
Topics: multimedia, cross-modal understanding, vision and language
TL;DR: Nowadays, many enticing experiences are available in the Metaverse. In fact, there are so many that it is difficult to find those which are relevant to the user. Can we formalize this as a ranking problem? We introduced and evaluated state-of-the-art techniques in various scenarios involving complex 3D scenes, either composed of multiple furnished rooms (e.g., apartments) or containing many multimedia items (e.g., paintings in museums), both of which can affect relevance.
Selected publications/endeavors
- AdOCTeRA: Adaptive Optimization Constraints for improved Text-guided Retrieval of Apartments.
A. Abdari, A. Falcon, G. Serra. ACM ICMR 2024. [pdf]
- Paving the Way for Personalized Museums Tours in the Metaverse.
A. Falcon, B. Portelli, A. Abdari, G. Serra. IRCDL 2024. [pdf]
- A Language-based solution to enable Metaverse Retrieval.
A. Abdari, A. Falcon, G. Serra. MMM 2024. [pdf]
- FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests.
A. Abdari, A. Falcon, G. Serra. CV4Metaverse@ICCV 2023. [pdf]
- Metaverse Retrieval: Finding the Best Metaverse Environment via Language.
A. Abdari, A. Falcon, G. Serra. MMIR@ACM MM 2023. [pdf]
Video Question Answering
Topics: multimedia, cross-modal understanding, vision and language
TL;DR: Given a video and a question about its visual contents, can a model automatically provide the correct answer to that question? We investigated and introduced data augmentation techniques to achieve better accuracy while avoiding costly manual annotations, and proposed customized learning strategies leveraging the contents of the question itself.
Selected publications/endeavors
- Video question answering supported by a multi-task learning objective.
A. Falcon, G. Serra, O. Lanz. Multimedia Tools and Applications 82 (25), 38799-38826. 2023. [pdf]
- Semantics for vision-and-language understanding.
A. Falcon. PhD Thesis, 2023. [pdf]
- Data augmentation techniques for the video question answering task.
A. Falcon, G. Serra, O. Lanz. EPIC@ECCV 2020. [pdf]
Remaining Useful Life Estimation
Topics: predictive maintenance
TL;DR: We deal with the problem of estimating the remaining useful life (RUL) of mechanical engines (aircraft engines, in particular) by using neural sequence models. The RUL can be seen as a measure of how long it will take for the device under analysis to reach a failure (or a situation in which a failure is very likely). We introduced to this field of research neural models inspired by Neural Turing Machines and evaluated them under different scenarios. Experimental results confirm their robustness and precision compared to several other neural sequence models.
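To make the RUL definition above concrete, here is a minimal sketch that derives per-cycle RUL targets from a single run-to-failure sequence. It assumes that the last recorded cycle is the failure point and that RUL is measured in cycles; the function name is hypothetical and this is not necessarily how the targets were built in the publications below.

```python
def rul_targets(num_cycles):
    """Remaining cycles until failure for each cycle of one run-to-failure sequence.

    Assumes the final recorded cycle (index num_cycles - 1) is the failure point.
    """
    return [num_cycles - 1 - t for t in range(num_cycles)]


# A hypothetical engine observed for 5 cycles before failing.
print(rul_targets(5))
```

A sequence model would then be trained to predict these values from the sensor readings recorded up to each cycle.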
Selected publications/endeavors
- Neural turing machines for the remaining useful life estimation problem.
A. Falcon, G. D'Agostino, O. Lanz, G. Brajnik, C. Tasso, G. Serra. Computers in Industry 143, 103762. 2022. [pdf]
- Estimating the Remaining Useful Life via Neural Sequence Models: a Comparative Study.
G. D'Agostino, A. Falcon, O. Lanz, G. Brajnik, C. Tasso, G. Serra. AIABI@AIxIA 2022. [pdf]
- A Dual-Stream architecture based on Neural Turing Machine and Attention for the Remaining Useful Life Estimation problem.
A. Falcon, G. D'Agostino, G. Serra, G. Brajnik, C. Tasso. PHME 2020. [pdf]
- A neural turing machine-based approach to remaining useful life estimation.
A. Falcon, G. D'Agostino, G. Serra, G. Brajnik, C. Tasso. ICPHM 2020. [pdf]
- Remaining Useful Life Estimation using LSTM Networks and Attentive mechanisms.
A. Falcon. MSc Thesis, 2019.
Service
- Proceedings Chair: IRCDL 2023
- Part of the Organization Committee: IRCDL 2025, EQAI 2024, EQAI 2023, ICIAP 2023, AIxIA 2022
- Organizer: CV4Metaverse 2024 at ECCV, VIQA 2020/VTIUR 2020
- Guest associate editor: SI on Text-Multimedia Retrieval (ACM TOMM)
- Journal Reviewing: IJCV, IEEE TMM, CVIU, IET Computer Vision, ACM TOMM, IEEE Trans Hum Mach Syst.
- Conference Reviewing: MMM 2025, ECCV 2024 (Outstanding Reviewer!), ACM MM 2024, ACM MM 2023, CCISP 2023, IRCDL 2023, ICIAP 2023, ICPR 2022, ICIAP 2021, EMNLP 2021, ICPR 2020.
- Co-Supervision: 3 Bachelor's and 5 Master's students in the Computer Science or the IoT, Big Data, and ML degree programs at UniUD, on topics related to Video&Language and Predictive Maintenance.
- Gallegos Carvajal Ian Marco, MSc Enhancing text-to-textured 3D mesh generation with training-free adaptation for textual-visual consistency using spatial constraints and quality assurance: a case study on Text2Room. 2024.
- Rodaro Edoardo, BSc Rilevamento del flusso di materiale su nastro trasportatore attraverso le reti neurali. 2023.
- Bianchi Carlo, MSc L'Intelligenza Artificiale a supporto del metaverso. 2023.
- Bruni Pierfrancesco, MSc Circulant matrices lead to an improved baseline for question-driven video moment localization. 2022. (published at IRCDL 2023!)
- De Martin Federica, MSc Ricerca di un nuovo modello video e transfer learning nell’ambito del Multi-Instance video-text retrieval. 2022.
- De Reggi Paolo, MSc Generazione automatica di domande e risposte per il problema del video question answering. 2021.
- Ferroli Daniele, BSc Implementazione di un sistema di Video Question Answering. 2020.
- Rosso Giovanni, BSc Utilizzo di reti neurali convolutive per la manutenzione predittiva. 2020.