At yesterday's Big Data Gurus meetup, hosted at the Samsung R&D center in San Jose, Jimmy Retzlaff from Yelp gave a talk on big data at Yelp.
By the end of March 2013, Yelp had 36 million user reviews. These reviews cover everything from restaurants to hair salons and other local businesses. The number of reviews on the Yelp website has grown exponentially in the last few years.
Yelp also sees high traffic now. In January 2013, the site had 100 million unique visitors. The website records 2 terabytes of log data and another 2 terabytes of derived logs every day. While this data size is still small compared to eBay or LinkedIn, it calls for big data infrastructure and data mining methods.
Yelp uses MapReduce extensively and builds its infrastructure on the Amazon cloud.
Yelp’s log data contain ad displays, user clicks, and so on. Data mining helps Yelp design its search system, show ads, and filter fake reviews. In addition, data mining enables products such as "review highlights" and "people who viewed this also viewed...".
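To make the "people who viewed this also viewed..." idea concrete, here is a minimal sketch of co-view counting written as a map step and a reduce step in plain Python. It is not Yelp's implementation, which the talk summary does not describe. The log format (session id, business id), the sample records, and the function names map_sessions, reduce_counts, and also_viewed are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical page-view log records: (session_id, business_id).
VIEWS = [
    ("s1", "cafe"), ("s1", "bakery"),
    ("s2", "cafe"), ("s2", "bakery"), ("s2", "salon"),
    ("s3", "cafe"), ("s3", "salon"),
]


def map_sessions(views):
    """Map step: emit ((a, b), 1) for each pair of businesses seen in one session."""
    by_session = defaultdict(set)
    for session_id, business_id in views:
        by_session[session_id].add(business_id)
    for businesses in by_session.values():
        # Sorting makes (a, b) and (b, a) collapse into a single key.
        for a, b in combinations(sorted(businesses), 2):
            yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def also_viewed(counts, business_id):
    """Rank the businesses co-viewed with business_id, most frequent first."""
    related = []
    for (a, b), n in counts.items():
        if a == business_id:
            related.append((b, n))
        elif b == business_id:
            related.append((a, n))
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(related, key=lambda item: (-item[1], item[0]))


CO_VIEWS = reduce_counts(map_sessions(VIEWS))
print(also_viewed(CO_VIEWS, "salon"))
```

At the log volumes mentioned above, the same two steps would run as a distributed MapReduce job rather than in a single process, and raw counts would likely need normalizing so that very popular businesses do not dominate every list.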
Yelp is one example of a company starting to tackle big data and taking advantage of data mining to create better services.