Welcome

Hi, I am Alex Falcon. I am a Post-doc researcher at the University of Udine (AI Lab). I completed my PhD in Computer Science, Mathematics and Physics, jointly held at Fondazione Bruno Kessler (Technology of Vision - TeV) and the University of Udine, under the supervision of Oswald Lanz (Free University of Bolzano) and Giuseppe Serra (University of Udine). My research currently focuses on multimedia, video and language understanding, and deep learning. Before that, I completed my Bachelor’s and Master’s degrees in Computer Science at the University of Udine. During my Master’s, I started working with AI, machine learning, and deep learning, with a focus on Predictive Maintenance.

E-mail: falcon.alex ‘at’ spes.uniud.it / Google Scholar / GitHub / LinkedIn / CV

News

two journal papers on multidisciplinary collaborations were recently accepted in Annals of GIS and Nutrients!!
one paper accepted at ICIAP!
two papers (one full paper based on work of my MSc students + one reproducibility) accepted at ACM ICMR 2025!
won an MIT GSF grant (about $25k) and one NRRP grant for €250k!!!
one paper accepted at IEEE Transactions on Multimedia!! [open access paper soon]
one paper accepted at Artificial Intelligence in Agriculture!!
check this report we published at ACM SIGIR Forum!
I'm part of the organization for the 4th ed. of CV4Metaverse at CVPR 2025! See you in Nashville, Tennessee! [CFP (DL: March 22!)]
one paper accepted at IRCDL! [pdf]
one paper accepted at MMM 2025! See you in Nara, Japan! [link] [code]
I was nominated Outstanding Reviewer (198 out of almost 7300 reviewers!) at ECCV 2024!!! [link]
one paper accepted at Ecological Informatics!!
I'm part of the organization committee for IRCDL 2025!
EQAI 2024 was featured in a local newspaper! Everything is ready for this September!
I was invited as a speaker at the AI-DLDA summer school, where I talked about retrieving complex 3D scenarios using text! A notebook will become available later on!
one paper accepted at Computer Vision and Image Understanding!!! [code]
one paper accepted as Poster at ACM ICMR 2024! See you in Phuket, Thailand!
We are organizing the 3rd ed. of CV4Metaverse workshop at ECCV 2024 and hosting the Metaverse Apartment Retrieval Challenge!
I am part of the local organization committee for EQAI 2024 (European Summer School on Quantum AI)!
I am a guest editor for the Special Issue on Text-Multimedia Retrieval: Retrieving Multimedia Data by Means of Natural Language at ACM TOMM! Check the call for papers! (deadline: June 30, 2024)
one paper accepted as an Oral at IRCDL 2024!


2023

one paper accepted as an Oral at MMM 2024! code by Ali
one paper accepted at MMIR@ACM MM 2023 and one paper accepted at CV4Metaverse@ICCV 2023! Codebases by Ali: code1 code2
I had a great experience at the ELLIS Summer School on Large-Scale AI for Research and Industry in Modena, Italy!
one paper accepted at ICIAP 2023!
our solution (report), trained with only 25% of the data, got 3rd place in the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge @ CVPR 2023!
I am part of the local organization committee for ICIAP 2023!
I delivered a seminar on "Deep Learning for Multimedia understanding" as the speaker at University of Udine!
I am part of the local organization committee for the 2nd edition of the European Summer School on Quantum AI!
March 13th, 2023: I successfully defended my PhD thesis cum laude!
one paper accepted at Multimedia Tools and Applications! code

2022

one paper (pdf) accepted as an Oral at AIABI@AIxIA 2022!
one paper accepted at Computers in Industry! code
one paper accepted as an Oral at ACM MM 2022! code
I delivered two talks at University of Bolzano: "Data-driven approaches for the Remaining Useful Life Estimation problem" and "Learning video retrieval models with relevance-aware online mining"
I was featured in the FBK magazine (Italian)!
our solution (report) got 1st place in the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge @ CVPR 2022!
I attended the fantastic International Computer Vision Summer School (ICVSS) in Scicli, Italy and presented a poster titled "Relevance-aware Online Mining for Video Retrieval"!
one paper accepted as an Oral at ICMR 2022! code
one paper accepted as an Oral at ICIAP 2021! code
I delivered a seminar on "Data-driven approaches for the remaining useful life estimation problem" as the speaker at FBK!

2021

our solution (report) got 3rd place in the EPIC-Kitchens-100 Action Recognition Challenge @ CVPR 2021!
I completed the "Fundamentals of Deep Learning for Multi-GPUs" course held by NVIDIA Deep Learning Institute!
we organized the VIQA workshop @ ICPR 2020 (later merged into the VTIUR workshop)!

2020

one paper accepted as an Oral at EPIC@ECCV 2020!
I attended the "Machine Learning for non-matrix data" summer school at Politecnico di Milano!
one paper accepted as an Oral at PHME 2020!
one paper accepted as an Oral at ICPHM 2020!

2019

one paper accepted at IRCDL 2019!
October 2019: I started my PhD under the supervision of Oswald Lanz and Giuseppe Serra!
July 2019: I successfully completed the Master's Degree in Computer Science cum laude!

Main projects and lines of research

Text-Video Retrieval

topics: multimedia, cross-modal understanding, vision and language

Overview of the algorithm

TL;DR: Text-video retrieval is the task of ranking a collection of videos by their relevance to a user's textual query, so that highly relevant videos receive low ranks (i.e., appear near the top of the list) and vice versa. State-of-the-art video retrieval systems are trained with contrastive learning techniques. However, these techniques enforce constraints at training time which neglect that multiple videos may be relevant to the same caption, and vice versa. In this line of research, we focus on introducing semantic knowledge into the training process to overcome this limitation. The results confirm our observations and hypotheses, and the learning strategies we proposed effectively address the limitation (e.g., from about 36% nDCG to almost 60% on the challenging EPIC-Kitchens-100 dataset).
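To make the nDCG figures above more concrete, here is a minimal sketch of how nDCG rewards a ranking when several videos are partially relevant to the same query. It is an illustration only, not the evaluation code used in the papers: the function names, the linear gain, and the toy relevance values are all assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance values given in ranked order."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(scores, relevances):
    """nDCG of the ranking induced by `scores` (higher score = ranked earlier).

    `relevances[i]` is the graded relevance of video i to the query
    (assumed here to lie in [0, 1]; the papers may define it differently).
    """
    # Rank the videos by decreasing similarity score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    # The ideal ranking places the most relevant videos first.
    ideal_dcg = dcg(sorted(relevances, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy query with three videos: one fully relevant, one partially, one not.
relevances = [1.0, 0.5, 0.0]
good = ndcg([0.9, 0.5, 0.1], relevances)  # ranks the videos in the ideal order
bad = ndcg([0.1, 0.5, 0.9], relevances)   # ranks them in the reverse order
```

Unlike a metric that only checks where the single "ground truth" video lands, this one gives credit for placing partially relevant videos near the top, which is the behaviour the relevance-aware training strategies aim for.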

Selected publications/endeavors

Improving semantic video retrieval models by training with a relevance-aware online mining strategy.
A. Falcon, G. Serra, O. Lanz. Computer Vision and Image Understanding 245, 104035. 2024. [pdf]
Semantics for vision-and-language understanding.
A. Falcon. PhD Thesis, 2023. [pdf]
A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval.
A. Falcon, G. Serra, O. Lanz. ACM MM 2022. [pdf]
UniUD-FBK-UB-UniBZ Submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2022. (ranked 1st)
A. Falcon, G. Serra, S. Escalera, O. Lanz. EPIC@CVPR 2022. [pdf]
Relevance-based margin for contrastively-trained video retrieval models.
A. Falcon, S. Sudhakaran, G. Serra, S. Escalera, O. Lanz. ACM ICMR 2022. [pdf]
Learning video retrieval models with relevance-aware online mining.
A. Falcon, G. Serra, O. Lanz. ICIAP 2021. [pdf]

Ranking complex 3D scenes

topics: multimedia, cross-modal understanding, vision and language

Overview of the algorithm

TL;DR: Many enticing experiences are now available in the Metaverse, so many that it is difficult to find the ones relevant to a given user. Can we formalize this as a ranking problem? We introduced and evaluated state-of-the-art techniques in various scenarios involving complex 3D scenes, which are composed of multiple furnished rooms (e.g., in apartments) or contain many multimedia items (e.g., paintings in museums) that can affect relevance.

Selected publications/endeavors

Hierarchical Vision-Language Retrieval of Educational Metaverse Content in Agriculture.
A. Abdari, A. Falcon, G. Serra. ICIAP 2025. [pdf soon]
HM3: Hierarchical Modeling of Multimedia Metaverses on 10000 Thematic Museums via Theme-aware Contrastive Loss Function.
G. Macrì, L. Bazzana, A. Falcon, G. Serra. ACM ICMR 2025. [pdf]
Reproducibility Companion Paper: AdOCTeRA - Adaptive Optimization Constraints for Improved text-guided Retrieval of Apartments.
A. Abdari, A. Falcon, G. Serra. ACM ICMR 2025. [pdf]
ALCER3D: Adaptive Learning Constraints for Enhanced Retrieval of Complex Indoor 3D Scenarios.
A. Falcon, A. Abdari, G. Serra. IEEE TMM 2025 (accepted for publication in March). [pdf soon] [code]
AgriMus: Developing Museums in the Metaverse for Agricultural Education.
A. Abdari, A. Falcon, G. Serra. IRCDL 2025. [pdf]
HierArtEx: Hierarchical Representations and Art Experts Supporting the Retrieval of Museums in the Metaverse.
A. Falcon, A. Abdari, G. Serra. MMM 2025. [pdf] [code]
AdOCTeRA: Adaptive Optimization Constraints for improved Text-guided Retrieval of Apartments.
A. Abdari, A. Falcon, G. Serra. ACM ICMR 2024. [pdf]
Paving the Way for Personalized Museums Tours in the Metaverse.
A. Falcon, B. Portelli, A. Abdari, G. Serra. IRCDL 2024. [pdf]
A Language-based solution to enable Metaverse Retrieval.
A. Abdari, A. Falcon, G. Serra. MMM 2024. [pdf]
FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests.
A. Abdari, A. Falcon, G. Serra. CV4Metaverse@ICCV 2023. [pdf]
Metaverse Retrieval: Finding the Best Metaverse Environment via Language.
A. Abdari, A. Falcon, G. Serra. MMIR@ACM MM 2023. [pdf]

Applied AI for Multidisciplinary Problems

topics: real world data, machine learning for ecological problems, sustainability, computer vision for nutrition, computer vision for agriculture

Example of pipeline for dealing with forestry data

TL;DR: Dealing with multidisciplinary problems means dealing with real-world data. This raises additional challenges (e.g., data becomes scarcer or of lower quality) and requires a different mindset to solve the problem at hand. At the same time, it means working on problems that can have an impact on real issues, such as more sustainable farming techniques. I've had the opportunity to collaborate on several projects with experts from heterogeneous fields, including forestry, wine production, agriculture, and nutrition.

Selected publications/endeavors

Evaluating organic carbon in living and dead trees using GLCM features and explainable machine learning: insights from Italian national forest.
M. Fasihi, A. Falcon, G. Alberti, L. Cadez, F. Giannetti, A. Tomao, G. Serra. Annals of GIS, 2025. [pdf]
2D Prediction of the Nutritional Composition of Dishes from Food Images: Deep Learning Algorithm Selection and Data Curation Beyond the Nutrition5k Project.
R. Bianco, S. Coluccia, M. Marinoni, A. Falcon, F. Fiori, G. Serra, M. Ferraroni, V. Edefonti, M. Parpinel. Nutrients 17(13), 2025. [pdf]
Boosting grapevine phenological stages prediction based on climatic data by pseudo-labeling approach.
M. Fasihi, M. Sodini, A. Falcon, F. Degano, P. Sivilotti, G. Serra. Artificial Intelligence in Agriculture 15(3). [pdf]
Assessing ensemble models for carbon sequestration and storage estimation in forests using remote sensing data.
M. Fasihi, B. Portelli, L. Cadez, A. Tomao, A. Falcon, G. Alberti, G. Serra. Ecological Informatics 83, 2024. [pdf]

Video Question Answering

topics: multimedia, cross-modal understanding, vision and language

Overview of the algorithm

TL;DR: Given a video and a question about its visual contents, can a model automatically provide the correct answer to that question? We investigated and introduced data augmentation techniques to achieve better accuracy while avoiding costly manual annotations, and proposed customized learning strategies leveraging the contents of the question itself.

Selected publications/endeavors

Video question answering supported by a multi-task learning objective.
A. Falcon, G. Serra, O. Lanz. Multimedia Tools and Applications 82 (25), 38799-38826. 2023. [pdf]
Semantics for vision-and-language understanding.
A. Falcon. PhD Thesis, 2023. [pdf]
Data augmentation techniques for the video question answering task.
A. Falcon, G. Serra, O. Lanz. EPIC@ECCV 2020. [pdf]

Remaining Useful Life Estimation

topics: predictive maintenance

Overview of the algorithm

TL;DR: We deal with the problem of estimating the remaining useful life (RUL) of mechanical engines (of aeroplanes, in particular) by using neural sequence models. The RUL can be seen as a measure of how long it will take for the device under analysis to reach a failure (or a situation in which a failure is very likely). We introduced to this field of research a neural model inspired by Neural Turing Machines and evaluated it under different scenarios. Experimental results confirm its robustness and precision compared to several neural sequence models.
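As a rough illustration of the quantity being estimated, the sketch below builds per-cycle RUL targets for a single run-to-failure sequence. It is not taken from the papers: the function name `rul_labels` is hypothetical, the last recorded cycle is assumed to be the failure point, and the optional `cap` (clipping early-life targets to a constant) is a common practice in this field that is assumed here rather than confirmed by the source.

```python
def rul_labels(num_cycles, cap=None):
    """Return one RUL target per cycle of a run-to-failure sequence.

    Assumes the last recorded cycle is the failure point, so the RUL at
    cycle t (0-indexed) is the number of cycles still to come.
    `cap` is an optional upper bound on the target (an assumption, see above).
    """
    if num_cycles < 0:
        raise ValueError("num_cycles must be non-negative")
    # Linearly decreasing target that reaches 0 at the last cycle.
    labels = [num_cycles - 1 - t for t in range(num_cycles)]
    if cap is not None:
        # Clip the targets of the early, healthy part of the sequence.
        labels = [min(rul, cap) for rul in labels]
    return labels


# An engine observed for 5 cycles before failing.
plain = rul_labels(5)           # [4, 3, 2, 1, 0]
capped = rul_labels(5, cap=2)   # [2, 2, 2, 1, 0]
```

A neural sequence model would then be trained to predict these targets from the sensor readings recorded up to each cycle.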

Selected publications/endeavors

Neural Turing machines for the remaining useful life estimation problem.
A. Falcon, G. D’Agostino, O. Lanz, G. Brajnik, C. Tasso, G. Serra. Computers in Industry 143, 103762. 2022. [pdf]
Estimating the Remaining Useful Life via Neural Sequence Models: a Comparative Study.
G. D'Agostino, A. Falcon, O. Lanz, G. Brajnik, C. Tasso, G. Serra. AIABI@AIxIA 2022. [pdf]
A Dual-Stream architecture based on Neural Turing Machine and Attention for the Remaining Useful Life Estimation problem.
A. Falcon, G. D'Agostino, G. Serra, G. Brajnik, C. Tasso. PHME 2020. [pdf]
A neural Turing machine-based approach to remaining useful life estimation.
A. Falcon, G. D'Agostino, G. Serra, G. Brajnik, C. Tasso. ICPHM 2020. [pdf]
Remaining Useful Life Estimation using LSTM Networks and Attentive mechanisms.
A. Falcon. MSc Thesis, 2019.

Academic Service

Proceedings Chair: IRCDL 2023
Organizer or Part of the Organization Committee:
- CV4Metaverse at CVPR 2025 and ECCV 2024,
- IRCDL 2025,
- EQAI 2025, 2024, 2023,
- ICIAP 2023,
- AIxIA 2022,
- VIQA 2020/VTIUR 2020
Invited speaker or lecturer:
- "Text-to-Metaverse Retrieval: A New Frontier in Search" at AI-DLDA summer school (2024),
- "Deep Learning for Multimedia understanding" at University of Udine (2023),
- "Data-driven approaches for the Remaining Useful Life Estimation problem" at University of Bolzano (2022),
- "Learning video retrieval models with relevance-aware online mining" at University of Bolzano (2022),
- "Data-driven approaches for the remaining useful life estimation problem" at FBK (2022).
Guest associate editor: SI on Text-Multimedia Retrieval (ACM TOMM)
Journal Reviewing: IJCV, IEEE TMM, CVIU, IET Computer Vision, ACM TOMM, IEEE Trans Hum Mach Syst.
Conference Reviewing: ACM MM 2023-2025, CVPR 2025, ICCV 2025, ACM ICMR 2025, MMM 2025, ECCV 2024 (Outstanding Reviewer!), CCISP 2023, IRCDL 2023, ICIAP 2021-2023, EMNLP 2021, ICPR 2020-2022.
Co-Supervision: 3 Bachelor and 6 Master students of Computer Science Degree or IoT, Big Data, and ML Degree at UniUD on topics related to Video&Language, 3D Scenes Retrieval, and Predictive Maintenance.
- Bazzana Lorenzo, MSc Enhancing Metaverse Retrieval Effectiveness through Hierarchical Room-Aware Representations. 2025.
- Fedrigo Mattia, MSc Automating Vegetation Cover Estimation with Deep Learning: A Transfer Learning-Based Semantic Segmentation Approach. 2025.
- Lavarone Stefano, MSc Design and Evaluation of a Multimodal Retrieval System on a Novel Dataset of Automatically Generated Virtual Museums. 2025.
- Macrì Gianluca, MSc IA per il retrieval di esibizioni d'arte multimediale per il Metaverso. 2024. (published at ACM ICMR 2025!)
- Gallegos Carvajal Ian Marco, MSc Enhancing text-to-textured 3D mesh generation with training-free adaptation for textual-visual consistency using spatial constraints and quality assurance: a case study on Text2Room. 2024.
- Rodaro Edoardo, BSc Rilevamento del flusso di materiale su nastro trasportatore attraverso le reti neurali. 2023.
- Bianchi Carlo, MSc L'Intelligenza Artificiale a supporto del metaverso. 2023.
- Bruni Pierfrancesco, MSc Circulant matrices lead to an improved baseline for question-driven video moment localization. 2022. (published at IRCDL 2023!)
- De Martin Federica, MSc Ricerca di un nuovo modello video e transfer learning nell’ambito del Multi-Instance video-text retrieval. 2022.
- De Reggi Paolo, MSc Generazione automatica di domande e risposte per il problema del video question answering. 2021.
- Ferroli Daniele, BSc Implementazione di un sistema di Video Question Answering. 2020.
- Rosso Giovanni, BSc Utilizzo di reti neurali convolutive per la manutenzione predittiva. 2020.