Monday, 15 August 2011

What are the next steps for augmented reality?

Now that Web 2.0 and the mobile internet are widely accepted, the question is what will come next. I think the answer is augmented reality.

Currently, augmented reality can be experienced through a mobile device with a display, GPS and a camera: the user points the camera at some point of interest, and the software overlays the camera picture on the display with information from Wikipedia or other sources.
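To make the "point the camera at something" step a bit more concrete, here is a minimal sketch in Python of how an app might decide whether a point of interest lies inside the camera's view. It is only an illustration under assumptions of mine: the device is assumed to also report a compass heading (the function names `bearing_deg` and `in_view` and the 60-degree field of view are hypothetical and not taken from any real AR app).

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def in_view(user_lat, user_lon, heading_deg, poi_lat, poi_lon, fov_deg=60.0):
    """True if the point of interest falls inside the horizontal field of view.

    heading_deg is the compass direction the camera points at (assumed input);
    fov_deg is an assumed horizontal field of view of the camera.
    """
    b = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)
    # Signed angular difference, normalised to [-180, 180)
    diff = (b - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0
```

An app would then draw a label only for those points of interest where `in_view(...)` is true, and use the signed difference to place the label horizontally on the display.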

Now imagine a device that looks like a pair of glasses. The lenses are transparent, and next to them there is a small camera. The glasses have a wired (or Bluetooth) connection to a mobile device, which connects to the Internet over LTE. The user looks through the glasses, and information about what he sees is overlaid transparently. This is how our real world gets covered by the digital world. Several questions arise.

What is the technology behind the glasses? Transparent LED screens have been in development for at least 10 years now. I saw some at CeBIT when I was a student, roughly 10 years ago. Some high-end BMW cars also show the driver information on the windshield. So it should be possible to pack this technology into glasses that are not too heavy. The remaining electronics can be integrated into a separate device that is very similar to today's mobile phones.

What does the input look like? Actions that are not too complex should be doable by closing and opening the eyelids and moving the eyes in different directions. That means a second camera is needed to watch the user's eyes. More complex actions, like dialing a number, should be possible by pointing with the fingers at a virtual keyboard that appears through the glasses. Microsoft uses such technology in its Kinect products. It is probably neither possible nor desirable to type longer texts with such techniques, but as with all earlier input devices, the new one will not replace the old ones, so touchscreens or even small keyboards will still be used.

The resolution of GPS is not good enough. Even in combination with GLONASS or Galileo the accuracy will be several meters, so GPS alone will not make it possible to determine what the user is looking at. The user might also want information about non-stationary objects, for which a stored position has no value. So pattern recognition is very important. GPS can help by giving a rough estimate of what the user is seeing in his current environment, so that imagery from, e.g., Google Street View can be used to match the exact point the user is looking at. One question is how much preprocessing of the image happens on the device and how much is offloaded to a server. This determines the required speed of the internet connection and the processing power of the mobile device.
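The idea of using GPS only as a rough estimate could look like the following sketch: the GPS fix narrows a large database down to a handful of nearby candidates, and only those are handed to the (much more expensive) image matching. Everything here is an assumption for illustration: the record layout with `lat`/`lon` keys, the function name `candidates_near` and the 50 m radius are hypothetical, and the image matching itself is not shown.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in meters (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def candidates_near(fix, objects, radius_m=50.0):
    """Return the objects within radius_m of the GPS fix (lat, lon).

    The radius should be generous enough to absorb the GPS error;
    50 m is an assumed value, not a measured one.
    """
    lat, lon = fix
    return [o for o in objects
            if haversine_m(lat, lon, o["lat"], o["lon"]) <= radius_m]

# Hypothetical sample data: two known objects and one GPS fix
objects = [
    {"name": "town hall", "lat": 48.13720, "lon": 11.57550},
    {"name": "far away museum", "lat": 48.20000, "lon": 11.60000},
]
nearby = candidates_near((48.13710, 11.57540), objects)
```

Only the entries in `nearby` would then be compared against the camera image, which keeps both the processing on the device and the traffic to the server small.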

But what kind of information can a user get? Well, it starts with information about buildings (their current value, when they were built, who the architect was, how they looked in the past), goes on to information about public signs and arrival times of public transport, and extends to botanical information about trees and flowers, zoological names of animals, car makes and so on. So expect huge databases that are filled by volunteers, just like OpenStreetMap. Of course there will be navigation software available.

Now to more sensitive or commercially interesting information. If you're in front of a store, you can see its current offers; in front of a restaurant you see the lunch menu, opening hours and ratings from other users. Lots of people will mark where they live, their window, their car, just to earn a badge from Foursquare or whatever such a service will be called.

Now to the most sensitive information. If you know everything about the buildings, the animals, the trees and the cars, the only white spots that remain are the people you meet on your way. Do you really think it will stay like that? It will not take long until the software is able to recognize faces automatically and show the profiles of the persons. I don't think this is avoidable, even if a person is against it: it is hardly possible to control all the photos on the internet, and lots of them are tagged, so the databases contain enough information to calibrate the recognition for every face. The recognition will probably never be 100% correct, and there will be countermeasures like big sunglasses, strong make-up and all the other tricks of celebrities, but the success rate will be pretty high.

Currently it is a horror for most people to imagine that they will be recognized on the street by complete strangers, but I can imagine this will change. People will have to live with it, so they will adapt their behaviour, and moral norms will change. There are only very few internet trends that were not accepted by society. One is the sharing of child pornography, but this was illegal before the internet as well. Other forbidden trends, like illegal copying of software or media sharing, were declared illegal at the urging of the industry. Face recognition and augmented reality in general will be a multibillion-dollar market, so I cannot imagine that any industry will feel its business disturbed by them. There will always be enough people who have nothing against being recognized in public, so the initial resistance won't last too long.

So what is needed for this vision to be realized? The glasses and the mobile device are probably the main technological problems, but they should be solvable within the next 5 years, maybe sooner. LTE should provide enough bandwidth for transferring the pattern recognition data and the information about the recognized objects. Massive databases are required to provide information about every object, along with an army of volunteers who feed these databases. A new operating system with new input and output possibilities for the glasses is necessary, as is an effective system for creating the data in the databases. But none of these issues is unsolvable, so I expect this to happen in the next couple of years.