DIY with Clarifai: Building your own “smart” user generated content solution à la Yelp

Artificial intelligence improves Yelp’s user experience and boosts engagement by automatically organizing and curating millions of user-uploaded images.

Yelp is a site that crowdsources local business reviews. They rely on users to contribute not only first-hand (and sometimes brutally scathing) reviews, but also millions of images.

But, you know what they say – more content, more problems. The good news for sites and brands that rake in user generated content is they’re probably getting some great customer engagement. The bad news is figuring out what to do with all those images and videos.

There are two major challenges with user generated content:

1. HIGH VOLUMES: How do you know what content you’re getting?
Yelp receives tens of thousands of user uploads every day – some of the images they receive have captions, but most come with no information or unreliable metadata attached. At such volumes, it’s nearly impossible to manually go through each image and categorize it appropriately.

2. VALUE EXTRACTION: Once you know what the content is, what do you do with it?
Knowing is only half the battle (thanks, G.I. Joe!) – once you have an understanding of your content, you need to put your knowledge into action. For Yelp, that meant finding a way to organize and surface relevant and diverse photos for their users.

Yelp solved its UGC problem with artificial intelligence

Yelp had the resources to build its own in-house deep learning classification system to address its UGC challenges. That’s a great option when you have a billion dollars and an army of data scientists – if you don’t, you can use the Clarifai API to do the same thing Yelp did for way, way less time and money.

The first thing Yelp had to do was solve for its lack of data around user uploaded images. They built a photo classifier that allowed them to “see” what was in each image and sort the images into buckets. To save money and lighten the load on their servers, they opted to run the classifier as a batch job at the end of each day.

Photo credit: Yelp Blog

Once Yelp had their photo classification service, they were able to tackle the challenge of surfacing the right images at the right time. The first feature they improved was their business page “cover photos”. A business page on Yelp shows a set of cover photos that are recommended based on Yelp’s user feedback. These cover photos typically lacked diversity – for example, all the cover photos would usually be related to a single class (e.g. food). After implementing deep learning, Yelp was able to ensure a diverse set of cover photos to help users get a more holistic view of the business.


The second feature Yelp added was tabbed browsing on business pages to make it easier for users to jump to the information they wanted to find. Before, users could only see an unsorted grid of all user uploaded photos. Now, users are able to browse different tabs representing different classes of images, making it super easy for users to find the content they are looking for.

*References: Yelp blog

DIY with Clarifai

Now that you’ve been inspired by Yelp’s “smart” user generated content application, it’s time to build your own. Clarifai’s core model includes tags for over 11,000 concepts you can apply to your business. All it takes is three simple lines of code – sign up for a developer API account to get started for free!

Once you’ve signed up for a developer account, head over to Applications and make a new one.  Make sure you nab that Client ID and Client Secret:


Now, head over to our GitHub repo. There, you’ll find our Node.js client, which makes this process even easier. To set up your environment, download the clarifai_node.js file and stick it in your project.

Boo yah. You’re set up. Now head over to your Node project and just require the Clarifai client:

var Clarifai = require('./YOUR_PATH_HERE/clarifai_node.js');


Remember that Client ID and Client Secret you nabbed earlier? We’re gonna use those now. You can either paste them directly into the client’s initialization call, or save them in environment variables.
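Here’s a minimal sketch of the environment variable route. The variable names CLIENT_ID and CLIENT_SECRET and the loadCredentials helper are our own assumptions for illustration, and the initAPI call in the trailing comment is how we’d expect the client to take the credentials; check clarifai_sample.js for the exact call.

```javascript
// Hypothetical helper: pulls the credentials out of an environment object
// (process.env by default) and fails loudly if either one is missing.
function loadCredentials(env) {
  env = env || process.env;
  var clientId = env.CLIENT_ID;         // assumed variable name
  var clientSecret = env.CLIENT_SECRET; // assumed variable name
  if (!clientId || !clientSecret) {
    throw new Error('Set CLIENT_ID and CLIENT_SECRET before starting the app');
  }
  return { clientId: clientId, clientSecret: clientSecret };
}

// Assumed usage with the client required above:
//   var creds = loadCredentials();
//   Clarifai.initAPI(creds.clientId, creds.clientSecret);
```

Keeping the secret in the environment rather than in the source file means you can commit your code without leaking your credentials.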



Now for the fun part. You can easily tag an image with just 3 lines of code:

var imageURL = 'MY_IMAGE_URL';
var ourId = 'my great image'; // any string that identifies the image to your system
Clarifai.tagURL(imageURL, ourId, handler); // "handler" is your callback for handling the result or an error


You’re all set! Now you can easily make like Yelp and tag and sort your images to your heart’s desire. If you’d like to see a more in-depth example, check out clarifai_sample.js in the GitHub repo.
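To make the “sort” half a bit more concrete, here’s a rough sketch of a handler that files each image into a bucket based on its tags. Treat it as an illustration under assumptions, not Yelp’s method: the bucket names and keyword lists are made up, bucketFor and makeHandler are hypothetical helpers, and we’re assuming the callback receives (err, res) with tags at res.results[0].result.tag.classes, most relevant first.

```javascript
// Hypothetical buckets and the tags that map to them; tune these for your own content.
var BUCKETS = {
  food: ['food', 'meal', 'dish', 'pizza'],
  drink: ['drink', 'coffee', 'cocktail', 'beer'],
  interior: ['indoors', 'room', 'furniture', 'table']
};

// Returns the first bucket whose keyword list contains one of the image's tags.
// Tags are assumed to arrive most-relevant first, so earlier tags win.
function bucketFor(tags) {
  for (var i = 0; i < tags.length; i++) {
    for (var name in BUCKETS) {
      if (BUCKETS[name].indexOf(tags[i]) !== -1) {
        return name;
      }
    }
  }
  return 'other';
}

// Builds an (err, res) callback that files imageURL under its bucket in `albums`.
// Assumes the response shape results[0].result.tag.classes.
function makeHandler(imageURL, albums) {
  return function (err, res) {
    if (err) {
      console.error('Tagging failed for ' + imageURL, err);
      return;
    }
    var tags = res.results[0].result.tag.classes;
    var bucket = bucketFor(tags);
    (albums[bucket] = albums[bucket] || []).push(imageURL);
  };
}

// Stand-in response so the sketch runs without calling the API.
var albums = {};
makeHandler('MY_IMAGE_URL', albums)(null, {
  results: [{ result: { tag: { classes: ['restaurant', 'pizza', 'table'] } } }]
});
console.log(albums); // { food: [ 'MY_IMAGE_URL' ] }
```

In a real project you’d pass makeHandler(imageURL, albums) as the third argument to Clarifai.tagURL instead of calling it by hand.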

Clarifai Featured Hack: Get addicted to Snap Tag, an app to test your photo taking skills

Our team has been busy traveling across North America and Europe, (willingly) forgoing sleep, showers, and sanity to take part in some of the best university hackathons around. These are chances for us to connect with new and existing users, recruit for our open positions, and watch teams build really amazing things with our API. Our new Clarifai Featured Hack series will share the coolest of the cool, so hold onto your butts.

Update: As of Nov. 15, 2017, we have transitioned our API to V2. Please read our V1 to V2 transition guide before implementing the code in this blog.

Remember when you were a kid and you’d play tag on the playground? And some big kid would always catch you, say “Tag, you’re it!”, and rub your face in the dirt? No? Ok, maybe it’s just me. Anyway, Snap Tag is kind of like an endless game of tag, only with photos (and without the dirt).

Snap Tag is an app that allows users to send pictures to each other – with a catch. When you send a picture to your friend, it is automatically tagged with three words that are relevant to the content in the photo. When your friend receives the photo, they have to respond with a picture that matches at least one of the three tags you sent, thus returning the challenge with their own photo and set of tags.



We get to play Snap Tag at the office and call it work, so that’s pretty neat. It’s kind of like a photo scavenger hunt but really fun and social. Sometimes it’s easy and straightforward (take a picture of a tree!), but sometimes it’s a bit harder to find right away (Mexican Restaurant!). And, sometimes it’s borderline trolling your boring life (hot air balloon?! Ugh, I wish).


We asked Joey Li of the Snap Tag team to tell us the deep, dark, sinister secrets to creating the most addicting game we’ve played in a while. Here’s what he had to say!

Clarifai: What inspired your idea for a photo “tag” game?

Joey: At DubHacks 2015, my team (Kevin Wong, Lisa Li, and Luxi Xu from the University of British Columbia and Malcolm Daigle from the University of Washington) wanted to make an app that would be cool enough to use ourselves. We challenged ourselves to make use of the seemingly magical Clarifai API for generating tags for images.

How does the game work?

Our game is simple: send a picture to a friend and the app will automatically generate three tags for it. The receiver is only allowed to respond with a picture that has at least one of those three tags, but not all three. Here’s the catch: the receiver can’t see the tags, and the number of attempts is passed along with the picture. One can liken it to a game of telephone with pictures. Or a game of tag … Snap Tag – that’s pretty catchy. You can find it on GitHub. Not everything works yet, and it likes to randomly crash sometimes, but we’re excited to continue improving it!
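For the curious, the matching rule Joey describes could be sketched like this. It’s our own illustration rather than Snap Tag’s actual code: countMatches and isValidReply are hypothetical names, and we’re assuming each photo’s tags are a plain array of unique strings.

```javascript
// Counts how many of the sender's tags also appear in the reply's tags.
function countMatches(sentTags, replyTags) {
  return sentTags.filter(function (tag) {
    return replyTags.indexOf(tag) !== -1;
  }).length;
}

// A reply is valid when it shares at least one tag with the sender's photo,
// but not all of them.
function isValidReply(sentTags, replyTags) {
  var matches = countMatches(sentTags, replyTags);
  return matches >= 1 && matches < sentTags.length;
}

var sent = ['tree', 'sky', 'grass'];
console.log(isValidReply(sent, ['sky', 'cloud', 'bird']));  // true (one match)
console.log(isValidReply(sent, ['car', 'road', 'city']));   // false (no match)
console.log(isValidReply(sent, ['grass', 'tree', 'sky']));  // false (all three match)
```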

What’s the magic sauce behind your Clarifai implementation?

The key here is parsing the JSON response from the /v1/tag endpoint. By default, the API returns the top 20 tags for each image. In Snap Tag’s case, 20 tags would be too easy! So, to restrict the valid photo responses to only the top three tags, you can select top-n from results[0].result.tag.classes. The tags are sorted by decreasing probability, so “top-3” will give you the 3 most applicable tags.
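As a small sketch of that top-n step: the topTags helper below is a hypothetical name, the sample response is made up, and we’re assuming the response has already been parsed from JSON into an object with the results[0].result.tag.classes shape described above.

```javascript
// Returns the first n tags from a /v1/tag-style response.
// Assumes tags live at results[0].result.tag.classes, sorted by decreasing probability.
function topTags(response, n) {
  var classes = response.results[0].result.tag.classes;
  return classes.slice(0, n);
}

// Stand-in response with made-up tags, so the sketch runs without calling the API.
var sampleResponse = {
  results: [{ result: { tag: { classes: ['tree', 'sky', 'grass', 'park', 'summer'] } } }]
};
console.log(topTags(sampleResponse, 3)); // [ 'tree', 'sky', 'grass' ]
```

Because the list is already sorted, a plain slice is enough; there’s no need to look at the probabilities themselves.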

Thanks for sharing, Joey!

To learn more, check out our documentation and sign up for a free Clarifai account to start using our API – all it takes is three lines of code to get up and running! We’re super excited to share all the cool things built by our developer community, so don’t forget to tweet @Clarifai to show us your apps.

And give Joey, Kevin, Lisa, Luxi, and Malcolm some props in the comments below. Until next time!