Video Thumbnail Selector: Combining Adversarial and Reinforcement Learning

This software can be used for training a deep learning architecture for video thumbnail selection, taking into consideration the representativeness and the aesthetic quality of the video frames. Training is fully unsupervised, based on a combination of adversarial and reinforcement learning. After being trained on a collection of videos, the Video Thumbnail Selector can select a set of representative thumbnails for unseen videos.
- Related publications [icmr21]
- Software package [download source code]

SmartVidCrop: A Fast Cropping Method for Video Retargeting

This is an implementation of our fast video cropping method, which allows adapting an input video to a different aspect ratio while staying focused on the original video's main subject. Our method utilizes visual saliency to find the regions of attention in each frame, and employs a filtering-through-clustering technique as well as temporal filters to select the main region of focus and produce a smooth sequence of cropped frames.
- Related publications [icip21]
- Software package [download source code]
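
The last stage described above (turning a per-frame focus point into a smooth sequence of crop windows) can be sketched as follows. This is a minimal illustration and not the SmartVidCrop implementation: the function names, the centered moving-average filter and the window size are assumptions made for the example, the saliency and clustering steps are taken as already done, and the target aspect ratio is assumed to be narrower than the source.

```python
def crop_width_for_aspect(frame_w, frame_h, target_aspect):
    """Width of a full-height crop with the given aspect ratio (width / height)."""
    return min(frame_w, int(round(frame_h * target_aspect)))


def smooth_centers(centers, window=5):
    """Centered moving average over the per-frame x-coordinates of the focus point.

    Hypothetical stand-in for the temporal filters mentioned in the text.
    """
    half = window // 2
    smoothed = []
    for i in range(len(centers)):
        lo = max(0, i - half)
        hi = min(len(centers), i + half + 1)
        smoothed.append(sum(centers[lo:hi]) / (hi - lo))
    return smoothed


def crop_windows(centers, frame_w, frame_h, target_aspect, window=5):
    """Return one (left, top, right, bottom) crop box per frame."""
    crop_w = crop_width_for_aspect(frame_w, frame_h, target_aspect)
    boxes = []
    for cx in smooth_centers(centers, window):
        left = int(round(cx - crop_w / 2))
        # Keep the crop box inside the frame.
        left = max(0, min(left, frame_w - crop_w))
        boxes.append((left, 0, left + crop_w, frame_h))
    return boxes
```

For a 1920x1080 input and a 1:1 target, `crop_windows([960] * 5, 1920, 1080, 1.0)` yields five identical 1080-pixel-wide boxes centered on the frame.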

ObjectGraphs: Video Events Recognition and Explanation

This is an implementation of ObjectGraphs, our novel bottom-up video event recognition and explanation approach. It combines object detection, a graph convolutional network (GCN) and a long short-term memory (LSTM) network for performing video event recognition as well as for identifying the objects that contributed the most to the event recognition decisions, thus providing explanations for the latter.
- Related publications [cvprw21]
- Software package [download source code]

AC-SUM-GAN for Unsupervised Video Summarization

This is an implementation of our latest video summarization method, presented in our paper "AC-SUM-GAN: Connecting Actor-Critic and Generative Adversarial Networks for Unsupervised Video Summarization", IEEE Trans. on Circuits and Systems for Video Technology (IEEE TCSVT), 2020 (early access). This is, to date, our most complete and best-performing method for video summarization.
- Related publications [csvt20]
- Software package [download source code]

Structured Pruning of LSTMs

We provide the code for our paper "Structured Pruning of LSTMs via Eigenanalysis and Geometric Median for Mobile Multimedia and Deep Learning Applications", Proc. 22nd IEEE Int. Symposium on Multimedia (ISM), Dec. 2020. This code can be used for generating more compact LSTMs, which is very useful for mobile multimedia applications and deep learning applications in other resource-constrained environments.
- Related publications [ism20]
- Software package [download source code]

Video Summarization Evaluation: Performance over Random

We provide an implementation of our video summarization evaluation method presented in our publication "Performance over Random: A Robust Evaluation Protocol for Video Summarization Methods", Proc. 28th ACM Int. Conf. on Multimedia (ACM MM '20). This software can be used for evaluating automatically-generated video summaries using the Performance over Random (PoR) evaluation protocol.
- Related publications [acmmm2020]
- Software package [download source code]
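
As a rough illustration of the idea behind the protocol, the sketch below expresses a method's score relative to a random summarizer evaluated on the same data split. It assumes that PoR is the ratio of the method's F-score to the mean F-score of random summaries, given as a percentage; the function name and this exact formula are assumptions for the example, so the paper and the released code remain the reference for the actual definition.

```python
from statistics import mean


def performance_over_random(method_fscore, random_fscores):
    """Score of a method relative to a random summarizer, as a percentage.

    method_fscore:  F-score of the evaluated method on one data split.
    random_fscores: F-scores of randomly generated summaries on the same split.
    """
    baseline = mean(random_fscores)
    if baseline <= 0:
        raise ValueError("random baseline must be positive")
    return 100.0 * method_fscore / baseline
```

Under this assumption, a value of 100 means the method does no better than random on that split, and values above 100 indicate an improvement over random.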

Dual Encoding Attention Network for ad-hoc Video Search

We provide an implementation of our extended dual encoding network for ad-hoc video search, presented at ACM ICMR 2020. This network uses multiple encodings of the visual and textual content, as well as two different attention mechanisms.
- Related publications [icmr2020]
- Software package [download source code]

Fractional Step Discriminant Pruning for DCNNs

This is an implementation of our filter pruning framework for DCNNs, presented at the IEEE ICME 2020 Mobile Multimedia Computing Workshop. This framework compresses noisy or less discriminant filters in small fractional steps, utilizing a class-separability criterion and an asymptotic schedule for the pruning rate and scaling factor, so that the selected filters' weights are gradually reduced to zero.
- Related publications [icme2020]
- Software package [download source code]
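
The gradual weight reduction described above can be pictured with the following sketch. It is not the published framework: filters are plain lists of weights, the set of selected filters is taken as given (the class-separability criterion is not implemented), and a constant per-step scaling factor stands in for the asymptotic schedule, which is a simplifying assumption for the example.

```python
def shrink_selected_filters(filters, selected, scale):
    """Multiply the weights of the selected filters by `scale`; copy the rest."""
    return [
        [w * scale for w in f] if i in selected else list(f)
        for i, f in enumerate(filters)
    ]


def fractional_step_prune(filters, selected, steps=10, scale=0.5):
    """Apply small shrinking steps so that selected filters decay toward zero.

    `steps` and `scale` are hypothetical parameters chosen for illustration.
    """
    for _ in range(steps):
        filters = shrink_selected_filters(filters, selected, scale)
    return filters
```

After enough steps the selected filters contribute almost nothing and can be removed, while the remaining filters are left untouched by the shrinking itself.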

On-line video summarization

This web service lets you submit videos in various formats, and uses a variation of our state-of-the-art deep-learning-based summarization methods (relying on Generative Adversarial Networks) to automatically generate video summaries that are customized for use in various social media channels. Watch a 2-minute tutorial video for this service, and make the most of your content by generating your own video summaries.

On-line video smart-cropping

This web service is similar to the summarization one, but it adapts only the aspect ratio of the input video and leaves the video's duration unchanged. The input video can be of any aspect ratio, and various popular target aspect ratios are supported. Try the service with your own videos.

On-line video fragmentation and reverse image search

This web service lets you extract a set of representative keyframes from a video, and use these keyframes to perform fine-grained reverse video search with the help of the Google Image Search functionality, in order to find out whether this video has appeared before on the Web. To submit a video for analysis, you can either provide its URL (several online video sources are supported) or upload a local copy of it from your PC. Try the service yourself.

On-line video analysis and annotation services

We have developed several on-line services for the analysis and annotation of audio-visual material. Our latest interactive web service (v5.0) lets you upload videos in various formats, and performs shot and scene segmentation as well as visual concept detection with the YouTube-8M concepts. The processing is fast (several times faster than real-time). The results are displayed in an interactive user interface that offers various fragment-level navigation, playback and search functionalities. Watch a tutorial video for the service, and try the latest version (v5.0) of the service, or the previous one (v4.0), which uses a different concept set, with your own videos.

Misinformation on the internet: Video and AI. Presentation available in Slideshare. Delivered at the "Age of misinformation: an interdisciplinary outlook on fake news" workshop/webinar, in Dec. 2020.

Structured Pruning of LSTMs via Eigenanalysis and Geometric Median for Mobile Multimedia and Deep Learning Applications. Presentation available in Slideshare. Delivered at the 22nd IEEE Int. Symposium on Multimedia (ISM), Dec. 2020.

Performance over Random: A robust evaluation protocol for video summarization methods. Presentation available in Slideshare. Delivered at ACM Multimedia 2020 (ACM MM), Seattle, WA, USA, Oct. 2020.

Migration-Related Semantic Concepts for the Retrieval of Relevant Video Content. Presentation available in Slideshare. Delivered at INTAP 2020, in Oct. 2020.

GAN-based video summarization. Presentation available in Slideshare. Delivered at AI4Media Workshop on GANs for Media Content Generation, in Oct. 2020.

Video Summarization and Re-use Technologies and Tools: Automatic video summarization. Tutorial delivered at the IEEE Int. Conf. on Multimedia and Expo (ICME), 6-10 July 2020. Slides available in Slideshare.

Fractional Step Discriminant Pruning: A Filter Pruning Framework for Deep Convolutional Neural Networks. Presentation available in Slideshare. Delivered at the 7th IEEE Int. Workshop on Mobile Multimedia Computing (MMC2020), IEEE Int. Conf. on Multimedia and Expo (ICME), London, UK, July 2020.