Showing posts with label photosynth. Show all posts

18 July 2007

Seeing the Power of the Visual Commons

I've written before about Microsoft's Photosynth, which draws on the Net's visual commons - Flickr, typically - to create three-dimensional images. Here's another research project that's just as cool - and just as good a demonstration of why every contribution to a commons enriches us all:

What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user.
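To make the retrieval idea a little more concrete, here's a minimal sketch of the "find similar scenes" step only. It is not the paper's method: it assumes each photo has already been reduced to a short list of numbers (a hypothetical "descriptor"), and it uses plain Euclidean distance and random toy data in place of real photographs. The name `nearest_scenes` is made up for illustration.

```python
import math
import random


def nearest_scenes(query, database, k=3):
    """Return indices of the k database descriptors closest to the query.

    Hypothetical helper: Euclidean distance stands in for whatever
    scene-similarity measure a real system would use.
    """
    distances = [(math.dist(query, d), i) for i, d in enumerate(database)]
    distances.sort()
    return [i for _, i in distances[:k]]


# Toy "commons": 1,000 random 8-number descriptors (assumed, not real image data).
random.seed(0)
database = [[random.random() for _ in range(8)] for _ in range(1000)]

# A query that is a slightly perturbed copy of entry 42,
# standing in for a photo of a very similar scene.
query = [x + 0.01 for x in database[42]]

matches = nearest_scenes(query, database, k=3)
print(matches)
```

A real system would then have to cut a suitable fragment out of the best matches and blend it into the hole, which this sketch doesn't attempt.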

One of the most interesting discoveries was the following:

It takes a large amount of data for our method to succeed. We saw dramatic improvement when moving from ten thousand to two million images. But two million is still a tiny fraction of the high quality photographs available on sites like Picasa or Flickr (which has approximately 500 million photos). The number of photos on the entire Internet is surely orders of magnitude larger still. Therefore, our approach would be an attractive web-based application. A user would submit an incomplete photo and a remote service would search a massive database, in parallel, and return results.

In other words, the bigger the commons, the more everyone benefits.

Moreover:

Beyond the particular graphics application, the deeper question for all appearance-based data-driven methods is this: would it be possible to ever have enough data to represent the entire visual world? Clearly, attempting to gather all possible images of the world is a futile task, but what about collecting the set of all semantically differentiable scenes? That is, given any input image can we find a scene that is “similar enough” under some metric? The truly exciting (and surprising!) result of our work is that not only does it seem possible, but the number of required images might not be astronomically large. This paper, along with work by Torralba et al. [2007], suggest the feasibility of sampling from the entire space of scenes as a way of exhaustively modelling our visual world.

But that is only feasible if that "space of scenes" is a commons. (BTW, do check out the paper's sample images - they're amazing.)

20 June 2007

Crowdsourcing Sousveillance

I wrote recently about Microsoft's amazing Photosynth demo, which shows pictures of Notre-Dame taken from Flickr stitched together automatically to produce a three-dimensional model that you can zoom into in just about any way.

Then I read this:

Madeleine McCann's parents will appeal to Irish tourists to check holiday snaps for clues - while the flat their child was abducted from reportedly sold for half price.

Madeleine's parents Kate McCann, 38, and Gerry, 39, will appear on television to ask anyone who took a trip to Portugal in early May to send photos to British investigators.

It occurred to me that what we really need is a system that can take these holiday snaps and stitch them together in time as well as space, creating a four-dimensional model that the police could explore - a new kind of crowdsourced sousveillance.

Given that Photosynth is still experimental, we're probably some way off this. I'd also have concerns about handing over all this information to the authorities without better controls on what would be done with it (look at what's happening with the UK's DNA database).

12 June 2007

Great, Microsoft - But What About the Commons?

Photosynth is undoubtedly amazing. But this video indicates that it's even more powerful than previously suggested; specifically, it talks about using public pictures on Flickr not only to create detailed, three-dimensional images of the world, but also to turn any tags they have into transferable metadata. In other words, it's a product of collective intelligence that builds on the work of the many.

That's all well and good, but I do wonder whether Microsoft has given any thought to its responsibility to the commons it is making free with here....

25 April 2007

Quakr, the New Quake?

Well, no, not really: actually, it's much more impressive:

Quakr is a project to build a 3-dimensional world from user contributed photos.

30 November 2006

Finding Our Way to a Third Life

Talking of geography, here's geograph, which "aims to collect geographically representative photographs and information for every square kilometre of the UK and Eire". That's nice, but I'd like to see this go further.

Imagine if pix were available for a much finer mesh - say, every ten metres (or something). Imagine, then, using something like Photosynth, a seriously cool piece of software that is sadly closed source (and Microsoft's, to boot), to stitch all those images together into a complete, three-dimensional world - our world - that you could navigate through, while seeing everyone else there doing the same.

Third Life, anyone? (Via Open (finds, minds, conversations)...)