Omar's blog: November 2008

Last week I participated in the international conference on Multimedia, held in Vancouver, B.C. (search for ACM Multimedia 2008 on Google) from 27 to 31 October. I went there with a few colleagues from my institute (IRIT, Toulouse, France).

I went there to present a short paper on new visualizations for mobile guide applications, and another student presented a full article which won the "BEST PAPER" award!!! So I suppose we rock :)

Well, in this post I want to present a little of what I saw there.

Monday

I attended a tutorial on Multimedia Ambient Intelligence, presented by one of the organizers, Rosa Iglesias. It was very interesting; we actually had more discussion than presentation, but I suppose that is a good thing.

After the first presentation we had a great lunch at the Pan Pacific hotel, with a great view of the ocean. Actually we had lunch in the same place every day, of course with different meals :).

Tuesday

This is actually the first day of the conference itself, with a keynote speech and the presentation of the best papers.

Keynote speech from Raj Jain

http://cse.wustl.edu/~jain/talks/in3_acm.htm

There is a need for a new internet architecture, and the US is investing a lot in this.
GENI and FIND are the 2 big projects involved in the development of the next generation of the internet.

First version (1.0) between 1969 and 1989.
Second version (2.0) between 1989 and present, included security, commerce, inter-domain routing (RFC 1108).

Current problems:

- security: internet wasn't designed with security in mind
- no location or ID-related IP
- no way to say who we are trying to reach, e.g. the person and not his laptop
- most ideas and protocols assume that machines are always "awake"
- IP protocol is stateless, no help for QoS

Best Papers

1. Streaming plants in distributed network environments
This paper won the "best paper award". Very good, well explained, and it covers the application from start to end.

2. I saw this thing and thought to you
An application to retrieve video content and attach metadata to it, so that people can annotate and add comments to videos without actually modifying the file.

3. Trade size and definition on mobile devices
UNIC project, University College London
This paper studied the size and resolution needed on mobile devices to get the best image: what is needed and what is too much.
Result -> people do NOT want high resolution on small devices.

4. Flickr Distance
Paper from Microsoft Research Asia.
This paper described a new way to search images across a large database like Flickr. It defined a new formula for relating tags, called "Flickr distance".

Camera based mobile data channel

Xu Liu from the University of Maryland

A new data channel for mobile devices. Instead of using GPRS/CDMA/USB stick or other means, this paper proposes a new channel -> the camera channel.

This is a great new idea. The author says you can transfer data using the video camera on a mobile device: a screen (e.g. a wide screen in a city) displays some weird-looking video that encodes data bits, and the mobile records it with its camera and then decodes it to get the data inside.

Technology is called "VCode Symbology". Features:
* calibration area
* bounding box
* error checksum area

Problems with color detection, resolution, frame dropping.

Ideas:
* good temporal error correction method
* color selection as different as possible, so there is only a small chance that a color will be misinterpreted by various cameras
* result of color set has 16 colors => 4 bits per square in the image. 7.2 bits/pixel possible.
* solving perspective distortion without floating-point support (which mobile devices often lack) - works 8 times better than the usual floating-point method
* optimal density of 50x40x4 (50 x 40 squares of 4 bits each on the data channel)
* calibration pattern used to learn the colors used in transmission
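
To make the "16 colors => 4 bits per square" and calibration ideas more concrete, here is a minimal Python sketch. It is not the actual VCode implementation: the palette, the simulated camera distortion and all function names are made up for illustration. It only shows the general principle of mapping 4-bit values to colors and decoding them by comparing against the colors seen in a calibration pattern.

```python
GRID_W, GRID_H, BITS_PER_SQUARE = 50, 40, 4  # density quoted in the talk

def make_palette():
    """Hypothetical palette: 2 red x 2 green x 4 blue levels = 16 colors."""
    return [(r, g, b) for r in (0, 255) for g in (0, 255) for b in (0, 85, 170, 255)]

def encode(data, palette):
    """Split each byte into two 4-bit symbols and map each symbol to a color."""
    squares = []
    for byte in data:
        squares.append(palette[byte >> 4])    # high 4 bits
        squares.append(palette[byte & 0x0F])  # low 4 bits
    return squares

def nearest_symbol(color, observed_palette):
    """Index of the calibration color closest to `color` (squared RGB distance)."""
    return min(
        range(len(observed_palette)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(color, observed_palette[i])),
    )

def decode(squares, observed_palette):
    """Turn observed square colors back into bytes, two symbols per byte."""
    symbols = [nearest_symbol(s, observed_palette) for s in squares]
    return bytes((hi << 4) | lo for hi, lo in zip(symbols[0::2], symbols[1::2]))

def fake_camera(color):
    """Made-up camera distortion: each channel is scaled and shifted."""
    r, g, b = color
    return (int(r * 0.8) + 20, int(g * 0.9) + 10, int(b * 0.7) + 30)

palette = make_palette()
# The calibration pattern shows every palette color once, so the receiver
# learns how its own camera sees each of them.
observed_palette = [fake_camera(c) for c in palette]

sent = encode(b"hello", palette)
# Data squares go through the same camera, plus a little extra noise.
received = [(r + 5, g - 5, b + 5) for r, g, b in (fake_camera(c) for c in sent)]
decoded = decode(received, observed_palette)

bits_per_frame = GRID_W * GRID_H * BITS_PER_SQUARE
print(decoded, bits_per_frame)
```

With a 50 x 40 grid of 4-bit squares, one frame would carry 8000 raw bits, before any space taken by the calibration area, bounding box or checksums.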

Application:
* one hand operation
* signal and speed rate bars
* average bit rate of 15.4kbps

Q: Did you actually try floating-point computations on Nokia? They work just fine.
A: It seems his method is faster.
Q: How do you know when the application download is completed? What is the output? .jar, .sis, etc.?
A: He has start/end marks in the protocol, plus metadata / a header to know what type of application is being downloaded, as well as automatic launch/install.

Enhancing QoE

This is about enhancing QoE (Quality of Experience) by combining Packet Skipping and Packet Switching for network transmission (when a frame is not received or is corrupted, we can use an old frame).
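
The "use an old frame" fallback can be sketched in a few lines of Python. This is only an illustration of that one idea, not the method from the paper; the function name and the use of `None` to mark a lost or corrupted frame are assumptions.

```python
def conceal_missing_frames(frames):
    """Replace every missing frame (None) with the last good frame seen so far.

    Frames missing before the first good frame stay None, because there is
    nothing older to fall back on.
    """
    shown = []
    last_good = None
    for frame in frames:
        if frame is not None:
            last_good = frame
        shown.append(last_good)
    return shown

received = ["f0", None, "f2", None, None]
print(conceal_missing_frames(received))
```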

Localization in surveillance cameras

Similar work:
* View Finder
* RKG + 07
* QT07

Ideas:
* mapping between 2D map and camera view
* some conversions and projection/rotation transformations involved
* 2D to 3D and 3D to 2D transformations presented
* only for static cameras
* future work on mobile cameras
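
As a rough illustration of the map <-> camera view mapping for a static camera, here is a small Python sketch using a standard pinhole camera model. The camera pose, focal length and image center are invented values, and this is not the method from the paper, just the usual textbook projection and its inverse for points on the ground plane.

```python
# Assumed static camera: 10 m above the ground, looking straight down.
CAMERA_POS = (0.0, 0.0, 10.0)
# Rows are the camera's x, y and z axes expressed in world coordinates.
ROTATION = [
    [1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0, 0.0, -1.0],
]
FOCAL = 800.0          # focal length in pixels (made-up value)
CX, CY = 320.0, 240.0  # image center in pixels (made-up value)

def world_to_pixel(point):
    """3D -> 2D: project a world point into the camera image."""
    rel = [p - c for p, c in zip(point, CAMERA_POS)]
    xc, yc, zc = (sum(row[i] * rel[i] for i in range(3)) for row in ROTATION)
    return (CX + FOCAL * xc / zc, CY + FOCAL * yc / zc)

def pixel_to_ground(u, v):
    """2D -> 3D: intersect the viewing ray of pixel (u, v) with the ground Z = 0."""
    ray_cam = ((u - CX) / FOCAL, (v - CY) / FOCAL, 1.0)
    # Rotate the ray back into world coordinates (multiply by the transpose).
    ray_world = [sum(ROTATION[i][j] * ray_cam[i] for i in range(3)) for j in range(3)]
    scale = -CAMERA_POS[2] / ray_world[2]
    return tuple(CAMERA_POS[k] + scale * ray_world[k] for k in range(3))

pixel = world_to_pixel((2.0, 3.0, 0.0))
ground = pixel_to_ground(*pixel)
print(pixel, ground)
```

Going from a pixel back to a 3D point only works here because the point is assumed to lie on the ground; a single image alone does not give depth.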

Search Trails using User Feedback to Improve Video Search

Frank Hopfgartner from University of Glasgow

Main ideas:

* represent user interactions as a graph (query and document nodes as well as action nodes)
* recommended techniques:
* neighborhood queries
* related documents
* interaction process
* application created to register user input and solution functionality
* test was made to see results
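
Here is a small Python sketch of how such a graph could be used. It is a simplification and not the authors' system: actions are stored as edge weights instead of separate action nodes, and the action names, weights and example queries are all made up.

```python
from collections import defaultdict

# Hypothetical weights: stronger actions count as stronger feedback.
ACTION_WEIGHT = {"view": 1, "play": 2, "bookmark": 3}

class TrailGraph:
    def __init__(self):
        self.query_to_docs = defaultdict(lambda: defaultdict(int))
        self.doc_to_queries = defaultdict(set)

    def record(self, query, doc, action):
        """Log one user interaction: `action` performed on `doc` after `query`."""
        self.query_to_docs[query][doc] += ACTION_WEIGHT[action]
        self.doc_to_queries[doc].add(query)

    def neighbour_queries(self, query):
        """Other queries that led users to at least one of the same documents."""
        found = set()
        for doc in self.query_to_docs.get(query, {}):
            found |= self.doc_to_queries[doc]
        found.discard(query)
        return sorted(found)

    def related_documents(self, query):
        """Documents reached from the query or its neighbours, best first."""
        scores = defaultdict(int)
        for q in [query] + self.neighbour_queries(query):
            for doc, weight in self.query_to_docs.get(q, {}).items():
                scores[doc] += weight
        return sorted(scores, key=lambda d: (-scores[d], d))

graph = TrailGraph()
graph.record("vancouver harbour", "video1", "play")
graph.record("vancouver harbour", "video2", "view")
graph.record("stanley park", "video2", "bookmark")
graph.record("stanley park", "video3", "play")

print(graph.neighbour_queries("vancouver harbour"))
print(graph.related_documents("vancouver harbour"))
```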

Wednesday

Online advertising inside images (from the Brave New Topics session)

* Deliver ads inside web images in a non-intrusive, visually pleasant way, etc.
* Deliver ads as part of the images
* Search-based image annotation

social signal processing; state of the art

Presenting State of the art and future perspectives of the area

What is it? :
* Another kind of communication, without using words between humans
* capturing signals from humans to perceive/understand messages/situations/behaviors
* Domain that aims at automatically detecting and analyzing human signals to see/understand what happens in society between 2 or more people

State of art:
* if you look good, people think you are smarter and more interesting
* gestures and postures can say a lot
* great conference/community: "face and gestures"

Main applications:
* broadcast material
* meeting recordings
* role recognition (what role each participant plays in a social event)
* predict the result of the action (e.g. a customer will buy the product or not)

Important things:
* working on real data, as much as possible for realistic results
* apply multimodal approach
* get psychology and engineering closer
* identify relevant applications

Other:
* SSPNet -> research funds for research in the SSP (social signal processing) area
* 2 groups: human interaction and machine interaction
* address: http://www.sspnet.eu (official version in august 2009). Ambition is to make this portal THE reference of SSP.

Rest of the day - posters and demos

Some of the most interesting (for me of course) posters and demos that I saw were:

* Map-based music interfaces for Mobile Devices: a new way to listen to music, instead of having a textual playlist you can use a visual map having regions of interest representing different genres of music. You can create a playlist drawing a curve around the map
* HOTPAPER: this is a cool application that uses the camera from mobile devices to recognize a piece of text as a bar code. It actually creates the connection between physical paper (a book, magazine, etc...) and the digital world, e.g. you scan a book/article/whatever and the application returns a link or description on the internet of it
* Testbed for Mobile TV (DVB-H): these guys have open source software that you can use to broadcast DVB-H content. All you need is an antenna; it only costs a few thousand dollars :)
* What did you do today?: this is a mobile application used to track the user's position during the day. After using the application for a long time, it can produce some interesting statistics about the user's presence (where the user spends most of their life: home, work, other)
* IntentSearch: a very cool plugin for search engines to look for specific images. First you search normally using text, but after the first search you select an image and the plugin keeps only those images that are similar to the one selected. It works really well

After a long day we had a very crappy banquet (sorry but it was like that), where the only interesting thing was the announcement of the best paper winner, our colleague Sebastian Mondet :)

Thursday

contextual in-image advertising

Paper submitted by Microsoft Research Asia.

This paper describes an algorithm/script to embed advertising logos/names in images. For example, if a user has a website with images, he can add the script to the website so that companies can add their logos to these images. Generally the algorithm looks for representative images to match with corresponding advertising (e.g. a picture of a car gets a Lamborghini logo).

Interactive Spatial Multimedia Communication of Art in museums

Presented by Karen, University of Aarhus, Denmark

This was a thesis presentation, with different ways of enhancing the visit to museums, having visitors interact with the different art objects, using sound, image and always the human body.

posters and demos

This was my presentation day, so I missed some of the demos and presentations. I also missed the open-source presentation, which many people said was very interesting. Anyway, here is what I saw:

* Free viewpoint video generation: a very cool application that uses multiple cameras to record a scene/event and then allows you to play it back in 3D, that is, you can move around the scene as if it were a 3D scene
* ImageSense: an application that detects the position of the face in different images so it can provide a set of the best images to fit in different web applications that need your face to do some mixing, e.g. to put your face in place of a monkey's face, just for fun


Saturday, November 29, 2008

Blog Scanned :)

Monday, November 3, 2008

ACM Multimedia 2008
