Saturday 30 June 2012

Google's 'brain simulator': 16,000 computers to identify a cat


                                      Inside Google's secretive X laboratory, known for inventing self-driving cars and augmented reality glasses, a small group of researchers began working several years ago on a simulation of the human brain. There Google scientists created one of the largest neural networks for machine learning by connecting 16,000 computer processors, which they turned loose on the internet to learn on its own.

Stanford computer scientist Andrew Ng next to an image of a cat that a neural network taught itself to recognise. Photo: The New York Times
Presented with 10 million digital images found in YouTube videos, what did Google's brain do? What millions of humans do with YouTube: looked for cats.

The neural network taught itself to recognise cats, which is actually no frivolous activity. This week the researchers will present the results of their work at a conference in Edinburgh, Scotland.
The Google scientists and programmers will note that while it is hardly news that the internet is full of cat videos, the simulation nevertheless surprised them. It performed far better than any previous effort by roughly doubling its accuracy in recognising objects in a challenging list of 20,000 distinct items.
The research is representative of a new generation of computer science that is exploiting the falling cost of computing and the availability of huge clusters of computers in giant data centers. It is leading to significant advances in areas as diverse as machine vision and perception, speech recognition and language translation.
Although some of the computer science ideas that the researchers are using are not new, the sheer scale of the software simulations is leading to learning systems that were not previously possible.
And Google researchers are not alone in exploiting the techniques, which are referred to as "deep learning" models. Last year Microsoft scientists presented research showing that the techniques could be applied equally well to build computer systems to understand human speech.
"This is the hottest thing in the speech recognition field these days," said Yann LeCun, a computer scientist who specialises in machine learning at the Courant Institute of Mathematical Sciences at New York University.

And then, of course, there are the cats.
To find them, the Google research team, led by the Stanford University computer scientist Andrew Y. Ng and the Google fellow Jeff Dean, used an array of 16,000 processors to create a neural network with more than one billion connections. They then fed it random thumbnails of images, one each extracted from 10 million YouTube videos.

The videos were selected randomly, and that in itself is an interesting comment on what interests humans in the internet age. The research is striking for another reason: the software-based neural network created by the researchers appeared to closely mirror theories developed by biologists suggesting that individual neurons are trained inside the brain to detect significant objects.

Currently much commercial machine vision technology is done by having humans "supervise" the learning process by labeling specific features. In the Google research, the machine was given no help in identifying features.

"The idea is that instead of having teams of researchers trying to find out how to find edges, you instead throw a ton of data at the algorithm and you let the data speak and have the software automatically learn from the data," Dr. Ng said.

"We never told it during the training, 'This is a cat,' " said Dr. Dean, who originally helped Google design the software that lets it easily break programs into many tasks that can be computed simultaneously. "It basically invented the concept of a cat. We probably have other ones that are side views of cats."

The Google brain assembled a dreamlike digital image of a cat by employing a hierarchy of memory locations to successively cull out general features after being exposed to millions of images. The scientists said it appeared they had developed a cybernetic cousin to what takes place in the brain's visual cortex.

Neuroscientists have discussed the possibility of what they call the "grandmother neuron": specialised cells in the brain that fire when they are exposed repeatedly, or "trained", to recognise the face of a particular individual.

"You learn to identify a friend through repetition," said Gary Bradski, a neuroscientist at Industrial Perception, in Palo Alto, Calif.
While the scientists were struck by the parallel emergence of the cat images, as well as human faces and body parts in specific memory regions of their computer model, Dr. Ng said he was cautious about drawing parallels between his software system and biological life.

"A loose and frankly awful analogy is that our numerical parameters correspond to synapses," said Dr. Ng. He noted that one difference was that despite the immense computing capacity that the scientists used, it was still dwarfed by the number of connections found in the brain.

"It is worth noting that our network is still tiny compared to the human visual cortex, which is 106 times larger in terms of the number of neurons and synapses," the researchers wrote.

Despite being dwarfed by the immense scale of biological brains, the Google research provides new evidence that existing machine learning algorithms improve greatly as the machines are given access to large pools of data.

"The Stanford/Google paper pushes the envelope on the size and scale of neural networks by an order of magnitude over previous efforts," said David A. Bader, executive director of high-performance computing at the Georgia Tech College of Computing. He said that rapid increases in computer technology would close the gap within a relatively short period of time: "The scale of modeling the full human visual cortex may be within reach before the end of the decade."

Google scientists said that the research project had now moved out of the Google X laboratory and was being pursued in the division that houses the company's search business and related services. Potential applications include improvements to image search, speech recognition and machine language translation.

Despite their success, the Google researchers remained cautious about whether they had hit upon the holy grail of machines that can teach themselves.

"It'd be fantastic if it turns out that all we need to do is take current algorithms and run them bigger, but my gut feeling is that we still don't quite have the right algorithm yet," said Dr. Ng.

The New York Times

Saturday 16 June 2012

Windows 8 Release Preview (Evaluation copy)

It looks like Microsoft has also begun distributing free apps and letting users develop their own, much as Linux distributions do. Microsoft clearly plans to take on both Android and Linux by giving away apps, open-source style!

Friday 8 June 2012

The Amazon Mechanical Turk (MTurk)



The Amazon Mechanical Turk (MTurk) is a crowdsourcing Internet marketplace that enables computer programmers (known as Requesters) to co-ordinate the use of human intelligence to perform tasks that computers are currently unable to do. It is part of the Amazon Web Services suite. Requesters are able to post tasks known as HITs (Human Intelligence Tasks), such as choosing the best among several photographs of a store-front, writing product descriptions, or identifying performers on music CDs. Workers (called Providers in Mechanical Turk's Terms of Service) can then browse among existing tasks and complete them for a monetary payment set by the Requester. To place HITs, requesting programs use an open Application Programming Interface (API) or the more limited MTurk Requester site. Requesters are restricted to US-based entities.

Requesters can require that Workers hold certain Qualifications before accepting a task, and they can set up a test to verify a Qualification. They can also accept or reject the result sent by a Worker, which reflects on the Worker's reputation. Currently, Workers can have an address anywhere in the world. Payments for completed tasks can be redeemed on Amazon.com via gift certificate or transferred to a Worker's U.S. bank account. Requesters, which are typically businesses, pay Amazon 10 percent of the price of each successfully completed HIT.
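The payment flow described above is simple to model. This toy Python sketch (illustrative numbers only, and no connection to the real MTurk API) computes what a Requester pays for a batch of completed HITs, including Amazon's 10 percent fee:

```python
def requester_cost(reward_per_hit, completed_hits, fee_rate=0.10):
    """Total charge to the Requester: Worker payments plus Amazon's fee.

    fee_rate reflects the 10 percent commission on successfully
    completed HITs described in the article.
    """
    payments = reward_per_hit * completed_hits   # paid out to Workers
    fee = payments * fee_rate                    # Amazon's commission
    return payments + fee

# e.g. a hypothetical batch of 1,000 image-labeling HITs at $0.05 each:
total = requester_cost(0.05, 1000)
print(f"${total:.2f}")   # $50 to Workers + $5 fee = $55.00
```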

Sunday 3 June 2012

CERN Officials May Have Witnessed ‘God Particle’


The Higgs boson is a hypothetical elementary particle predicted by the Standard Model (SM) of particle physics. It belongs to a class of particles known as bosons, characterized by an integer value of their spin quantum number. The Higgs field is a quantum field with a non-zero value that fills all of space, and explains why fundamental particles such as quarks and electrons have mass. The Higgs boson is an excitation of the Higgs field above its ground state.

One possible signature of a Higgs boson from a simulated proton-proton collision. It decays almost immediately into two jets of hadrons and two electrons, visible as lines.
The existence of the Higgs boson is predicted by the Standard Model to explain how spontaneous breaking of electroweak symmetry (the Higgs mechanism) takes place in nature, which in turn explains why other elementary particles have mass. Its discovery would further validate the Standard Model as essentially correct, as it is the only elementary particle predicted by the Standard Model that has not yet been observed in particle physics experiments. The Standard Model completely fixes the properties of the Higgs boson, except for its mass. It is expected to have no spin and no electric or color charge, and it interacts with other particles through weak interaction and Yukawa interactions. Alternative sources of the Higgs mechanism that do not need the Higgs boson are also possible and would be considered if the existence of the Higgs boson were ruled out. They are known as Higgsless models.

Experiments to determine whether the Higgs boson exists are currently being performed using the Large Hadron Collider (LHC) at CERN, and were performed at Fermilab's Tevatron until its closure in late 2011. Mathematical consistency of the Standard Model requires that any mechanism capable of generating the masses of elementary particles become visible at energies above 1.4 TeV; therefore, the LHC (designed to collide two 7-TeV proton beams) is expected to be able to answer the question of whether or not the Higgs boson actually exists. In December 2011, Fabiola Gianotti and Guido Tonelli, spokespersons of the two main experiments at the LHC (ATLAS and CMS), independently reported that their data hinted at the possibility that the Higgs may exist with a mass around 125 GeV/c² (about 133 proton masses, on the order of 10⁻²⁵ kg). They also reported that the original search range had been narrowed down considerably and that a mass outside approximately 115–130 GeV/c² was almost ruled out. No conclusive answer yet exists, although the LHC is expected to provide sufficient data for a definite answer by the end of 2012.
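The unit conversions in the reported mass are easy to check. A quick Python sketch (the proton mass and GeV-to-kilogram conversion are standard physical constants, not taken from the article) confirms that 125 GeV/c² is about 133 proton masses and on the order of 10⁻²⁵ kg:

```python
# Check the figures quoted for the candidate Higgs mass.
higgs_mass_gev = 125.0            # reported mass, GeV/c^2
proton_mass_gev = 0.938272        # proton rest mass, GeV/c^2
gev_to_kg = 1.78266e-27           # 1 GeV/c^2 expressed in kilograms

proton_masses = higgs_mass_gev / proton_mass_gev   # ~133
mass_kg = higgs_mass_gev * gev_to_kg               # ~2.2e-25 kg

print(round(proton_masses))   # 133
print(f"{mass_kg:.1e}")       # 2.2e-25
```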


"Discovery or exclusion of the Higgs particle, as predicted by the Standard Model, is getting ever closer," CERN Director for Research and Scientific Computing, Sergio Bertolucci, said in a statement. "Both occurrences will be great news for physics, the former allowing us to start the detailed study of the Higgs particle, the latter being the first proof of the incompleteness of the Standard Model, requiring new phenomena to be happening within the reach of the LHC."

"We’re taking our first steps in this new physics landscape," added CMS experiment spokesman Guido Tonelli, "and it is great to see how fast we are producing new results. I am confident that soon there will be only a few regions left where the Higgs boson, as postulated by the Standard Model, might still be hiding."

Source: redOrbit (http://s.tt/160nr)