If you did not already know

Dynamic Temporal Pyramid Network (DTPN) Recognizing instances at different scales simultaneously is a fundamental challenge in visual detection problems. While spatial multi-scale modeling has been well studied in object detection, how to effectively apply a multi-scale architecture to temporal models for activity detection is still under-explored. In this paper, we identify three unique challenges that need to be specifically handled for temporal activity detection compared to its spatial counterpart. To address all these issues, we propose Dynamic Temporal Pyramid Network (DTPN), a new activity detection framework with a multi-scale pyramidal architecture featuring three novel designs: (1) We sample input video frames dynamically with varying frames per second (FPS) to construct a natural pyramidal input for video of an arbitrary length. (2) We design a two-branch multi-scale temporal feature hierarchy to deal with the inherent temporal scale variation of activity instances. (3) We further exploit the temporal context of activities by appropriately fusing multi-scale feature maps, and demonstrate that both local and global temporal contexts are important. By combining all these components into a uniform network, we end up with a single-shot activity detector involving single-pass inferencing and end-to-end training. Extensive experiments show that the proposed DTPN achieves state-of-the-art performance on the challenging ActivityNet dataset. …
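
To make design (1) more concrete, here is a minimal sketch of how a fixed-size pyramidal input could be built from a video of arbitrary length by sampling at several effective frame rates. It is an illustration only, not the authors' implementation: the function name `pyramid_frame_indices`, the level lengths (64, 32, 16), and the choice of uniform spacing are all assumptions that do not come from the abstract.

```python
import numpy as np


def pyramid_frame_indices(num_frames, level_lengths=(64, 32, 16)):
    """Return one array of frame indices per pyramid level.

    Hypothetical helper: every level covers the whole video, but with a
    different number of uniformly spaced samples, so the effective FPS
    varies with the video length while the input size stays fixed.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    pyramid = []
    for length in level_lengths:
        # Evenly spaced positions from the first to the last frame.
        positions = np.linspace(0, num_frames - 1, num=length)
        pyramid.append(np.round(positions).astype(int))
    return pyramid


if __name__ == "__main__":
    for n in (1000, 10):
        levels = pyramid_frame_indices(n)
        print(n, [len(level) for level in levels])
```

With this scheme a long video is sampled sparsely and a short one densely (frames repeat when the video has fewer frames than a level asks for), which is one simple way to obtain inputs of identical shape regardless of duration.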

Infobesity Information overload (also known as infobesity or infoxication) refers to the difficulty a person can have understanding an issue and making decisions when too much information is present. The term was popularized by Alvin Toffler in his bestselling 1970 book Future Shock, but it was mentioned earlier in a 1964 book by Bertram Gross, The Managing of Organizations. Speier et al. (1999) stated: ‘Information overload occurs when the amount of input to a system exceeds its processing capacity. Decision makers have fairly limited cognitive processing capacity. Consequently, when information overload occurs, it is likely that a reduction in decision quality will occur.’ In recent years, the term ‘information overload’ has evolved into phrases such as ‘information glut’ and ‘data smog’ (Shenk, 1997). What was once a term grounded in cognitive psychology has evolved into a rich metaphor used outside the world of academia. In many ways, the advent of information technology has increased the focus on information overload: information technology may be a primary reason for information overload due to its ability to produce more information more quickly and to disseminate this information to a wider audience than ever before (Evaristo, Adams, & Curley, 1995; Hiltz & Turoff, 1985). …

Generic Diffusion Process (genericDP) Image restoration problems are typical ill-posed problems where the regularization term plays an important role. A regularization term learned via generative approaches is easy to transfer to various image restoration problems, but offers inferior restoration quality compared with one learned via discriminative approaches. In contrast, a regularization term learned via discriminative approaches is usually trained for a specific image restoration problem, and fails on problems for which it was not trained. To address this issue, we propose a generic diffusion process (genericDP) to handle multiple Gaussian denoising problems based on the Trainable Non-linear Reaction Diffusion (TNRD) models. In TNRD, each Gaussian denoising problem has its own model, consisting of a diffusion term and a reaction term; instead, we force multiple TNRD models to share one diffusion term. The trained genericDP model can provide both promising denoising performance and high training efficiency compared with the original TNRD models. We also transfer the trained diffusion term to non-blind deconvolution, which is unseen in the training phase. Experimental results show that the trained diffusion term for multiple Gaussian denoising can be transferred to image non-blind deconvolution as an image prior and provide competitive performance. …
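
The idea of sharing one diffusion term across several denoising problems can be sketched as a single reaction-diffusion update in which only the reaction weight changes with the noise level. This is a rough illustration under stated assumptions, not the paper's model: it works on 1-D signals, the kernels are hand-picked rather than learned, `np.tanh` stands in for the learned influence functions, and the names `diffusion_term` and `generic_dp_step` are hypothetical.

```python
import numpy as np


def diffusion_term(u, kernels, influence=np.tanh):
    """Shared diffusion term: sum over kernels of k_flipped * phi(k * u).

    Assumption: `influence` is a fixed placeholder for the learned
    nonlinearities, and `kernels` are illustrative 1-D filters.
    """
    u = np.asarray(u, dtype=float)
    out = np.zeros_like(u)
    for k in kernels:
        k = np.asarray(k, dtype=float)
        response = np.convolve(u, k, mode="same")
        # Convolving with the flipped kernel plays the role of the adjoint filter.
        out += np.convolve(influence(response), k[::-1], mode="same")
    return out


def generic_dp_step(u, f, kernels, reaction_weight):
    """One update: u - (diffusion(u) + reaction_weight * (u - f)).

    The diffusion term is the same for every noise level; only
    `reaction_weight` (hypothetical, one value per noise level) differs.
    """
    u = np.asarray(u, dtype=float)
    f = np.asarray(f, dtype=float)
    return u - (diffusion_term(u, kernels) + reaction_weight * (u - f))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = rng.normal(size=16)            # noisy observation
    u = f.copy()                       # start from the observation
    kernels = [[1.0, -1.0], [1.0, -2.0, 1.0]]
    for weight in (0.1, 0.5):          # two noise levels, one shared diffusion term
        print(weight, generic_dp_step(u, f, kernels, weight)[:3])
```

Because the diffusion term does not depend on the noise level, it can in principle be reused as a prior in a different data term, which is the kind of transfer the abstract describes for non-blind deconvolution.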
