Multimedia

All posts under category "Multimedia"

4 posts total
Sorted by date
Exploring World Maps in Virtual Reality: Comparing 3D Exocentric Globes, Flat Maps, Egocentric 3D Globes, and Curved Maps

This paper explores different ways to render worldwide geographic maps in virtual reality (VR). We compare (a) a 3D exocentric globe, where the user's viewpoint is outside the globe; (b) a flat map (rendered to a plane in VR); (c) an egocentric 3D globe, with the viewpoint inside the globe; and (d) a curved map, created by projecting the map onto a section of a sphere which curves around the user. In all four visualisations the geographic centre can be smoothly adjusted with a standard handheld VR controller, and the user, through a head-tracked headset, can physically move around the visualisation. For distance comparison, the exocentric globe is more accurate than the egocentric globe and the flat map. For area comparison, the exocentric and egocentric globes require more time than the flat and curved maps. For direction estimation, the exocentric globe is more accurate and faster than the other visual presentations. Our study participants had a weak preference for the exocentric globe. Generally, the curved map had benefits over the flat map. In almost all cases the egocentric globe was found to be the least effective visualisation. Overall, our results provide support for the use of exocentric globes for geographic visualisation in mixed reality.
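As a rough illustration of the curved map in (d), the sketch below places a point of a flat map onto a section of a sphere centred on the viewer. It is not taken from the paper: the function name `curved_map_point`, the normalised coordinates `u` and `v`, the axis convention, and the default angular spans are all assumptions chosen for the example.

```python
import math


def curved_map_point(u, v, radius=1.0, h_span_deg=180.0, v_span_deg=90.0):
    """Place a flat-map point on a sphere section around the viewer.

    u, v are normalised map coordinates in [0, 1] (assumed convention).
    The viewer sits at the origin looking down -z; x is right, y is up.
    The angular spans are illustrative defaults, not values from the paper.
    """
    yaw = math.radians((u - 0.5) * h_span_deg)    # left/right angle
    pitch = math.radians((v - 0.5) * v_span_deg)  # up/down angle
    x = radius * math.cos(pitch) * math.sin(yaw)
    y = radius * math.sin(pitch)
    z = -radius * math.cos(pitch) * math.cos(yaw)
    return (x, y, z)


# The map centre lands straight ahead of the viewer, one radius away.
centre = curved_map_point(0.5, 0.5)
# The right-hand edge of a 180-degree map lands directly to the viewer's right.
right_edge = curved_map_point(1.0, 0.5)
```

Every mapped point lies at the same distance `radius` from the viewer, which is the property that distinguishes the curved map from the flat one in this sketch.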

paper research
Quality-of-Experience Driven Coupled Uplink and Downlink Rate Adaptation for 360-Degree Video Live Streaming

360-degree video provides an immersive viewing experience and has been widely used in many areas. 360-degree video live streaming systems involve capture, compression, and both uplink (camera to video server) and downlink (video server to user) transmission. However, few studies have jointly investigated such complex systems, especially rate adaptation for the coupled uplink and downlink in 360-degree video streaming under limited bandwidth. In this letter, we propose a quality of experience (QoE)-driven 360-degree video live streaming system, in which a video server performs rate adaptation based on the uplink and downlink bandwidths and information concerning each user's real-time field-of-view (FOV). We formulate it as a nonlinear integer programming problem and propose an algorithm, which combines the Karush-Kuhn-Tucker (KKT) conditions and the branch-and-bound method, to solve it. The numerical results show that the proposed optimization model can improve users' QoE significantly in comparison with other baseline schemes.
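To give a feel for what coupling the uplink and downlink means, here is a toy sketch that is not the letter's formulation. It assumes a discrete bitrate ladder, a single shared downlink budget, a weighted logarithmic QoE, and the rule that no user can receive more than the uplink delivers. It also swaps the KKT plus branch-and-bound algorithm for plain exhaustive search, which is only practical for tiny instances. The name `best_rates` and all parameters are hypothetical.

```python
import itertools
import math


def best_rates(ladder, uplink_bw, downlink_bw, weights):
    """Exhaustively pick an uplink rate and per-user downlink rates.

    ladder      -- candidate bitrates, all positive (assumed discrete ladder)
    uplink_bw   -- uplink bandwidth limit
    downlink_bw -- shared downlink budget (assumption)
    weights     -- per-user QoE weights, e.g. derived from FOV (assumption)
    Returns (qoe, uplink_rate, downlink_rates) or None if infeasible.
    """
    best = None
    for up in ladder:
        if up > uplink_bw:
            continue
        # Coupling: a user cannot receive more than the server received.
        options = [r for r in ladder if r <= up]
        for down in itertools.product(options, repeat=len(weights)):
            if sum(down) > downlink_bw:
                continue
            qoe = sum(w * math.log(r) for w, r in zip(weights, down))
            if best is None or qoe > best[0]:
                best = (qoe, up, down)
    return best


result = best_rates([1, 2, 4, 8], uplink_bw=5, downlink_bw=6, weights=[2.0, 1.0])
```

In this toy instance the higher-weighted user gets the larger share of the downlink budget, and the uplink rate caps what either user can receive.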

paper research
Indian EmoSpeech Command Dataset: A Dataset for Emotion-Based Speech Recognition in Real-World Scenarios

Speech emotion analysis is an important task that enables several application use cases. The non-verbal sounds within speech utterances also play a pivotal role in emotion analysis. With the widespread use of smartphones, it has become viable to analyze speech commands captured by microphones for emotion understanding using on-device machine learning models. The non-verbal information includes environmental background sounds that describe the type of surroundings, the current situation, and the activities being performed. In this work, we consider both verbal (speech commands) and non-verbal sounds (background noises) within an utterance for emotion analysis in real-life scenarios. We create an indigenous dataset for this task, namely the Indian EmoSpeech Command Dataset. It contains keywords with diverse emotions and background sounds, presented to explore new challenges in audio analysis. We compare extensively against various baseline models for emotion analysis on speech commands across several performance metrics. We demonstrate a significant average gain of 3.3% in top-one score over a subset of the speech command dataset for keyword spotting.

paper research
CALPA-NET: A Channel Pruning Assisted Deep Residual Network for Digital Image Steganalysis

Over the past few years, detection performance improvements of deep-learning based steganalyzers have usually been achieved through structure expansion. However, an excessively expanded structure results in huge computational cost and storage overhead, and consequently difficulty in training and deployment. In this paper we propose CALPA-NET, a ChAnneL-Pruning-Assisted deep residual network architecture search approach to shrink the network structure of existing vast, over-parameterized deep-learning based steganalyzers. We observe that the broad inverted-pyramid structure of existing deep-learning based steganalyzers might contradict the well-established model diversity oriented philosophy, and therefore is not suitable for steganalysis. Then a hybrid criterion combined with two network pruning schemes is introduced to adaptively shrink every involved convolutional layer in a data-driven manner. The resulting network architecture presents a slender bottleneck-like structure. We have conducted extensive experiments on the BOSSBase+BOWS2 dataset, the more diverse ALASKA dataset, and even a large-scale subset extracted from the ImageNet CLS-LOC dataset. The experimental results show that the model structure generated by our proposed CALPA-NET can achieve comparable performance with less than two percent of the parameters and about one third of the FLOPs of the original steganalytic model. The new model possesses even better adaptivity, transferability, and scalability.
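The abstract does not spell out the hybrid criterion or the two pruning schemes, so the sketch below uses a stand-in: the common magnitude-based rule that ranks a layer's output channels by the L1 norm of their filter weights and keeps the strongest ones. The function name `prune_channels`, the flat-list filter representation, and the `keep_ratio` parameter are assumptions made for illustration only.

```python
def prune_channels(filters, keep_ratio):
    """Keep the output channels whose filters have the largest L1 norm.

    filters    -- one flat list of weights per output channel (assumed layout)
    keep_ratio -- fraction of channels to keep, in (0, 1]
    Returns (kept_indices, kept_filters), preserving the original order.
    This is a generic L1-norm stand-in, not CALPA-NET's hybrid criterion.
    """
    scores = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, round(len(filters) * keep_ratio))
    # Rank channel indices from largest to smallest L1 norm.
    ranked = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [filters[i] for i in kept]


layer = [[0.1, -0.1], [1.0, -2.0], [0.0, 0.05], [0.5, 0.5]]
kept_indices, kept_filters = prune_channels(layer, keep_ratio=0.5)
```

Applying a rule like this layer by layer, with a ratio chosen per layer, is one way a wide network could be narrowed; how the paper actually selects per-layer widths is not described in the abstract.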

paper research
