Azure Face API and Vision for Face Recognition

13 Mar, 2019 | 4 minutes read

Face recognition?! Yeah, it is nothing new… we saw plenty of face recognition applications even before Facebook came on the market. Most of us know how face recognition works, but here is a quick recap: face recognition systems are built on an algorithm whose purpose is to extract facial features like eyebrow length, eye position, lip width, and so on. The first phase is to detect the face in a photo, then create a face map and save it for future analysis. Face recognition technology also brings privacy concerns: we don’t know when and how our faces will be used or abused, so we must be careful when granting permissions across social networks and mobile apps. Face recognition apps nowadays are used in fields like retail, advertising, security, air travel, access control, and many more. Now you might ask why we talk about face recognition when it is nothing new. In fact, this post is not about face recognition itself, but about using Azure Face API and Vision to get a lot of details about the people attending events.

We replaced our feedback forms for internal events with something unique. During the year, we have different types of events such as team-building events, company birthday parties, New Year parties, technical presentations, and many more. At the end of the day, we needed answers to questions like:

  • Did the employees have fun during the games?
  • What is the average age?
  • How many males and females?
  • Did they enjoy the food?
  • Were colleagues surprised by the awards they won?

We wanted to go a step further and get answers for questions like:

  • What are their faces and emotions saying?
  • What is the total percentage of HAPPINESS?
  • What are the accent colors of the event?

After we sat down to review the available options for collecting employee feedback, the conclusion was clear: nobody wants to use forms with tons of questions, radio buttons, check boxes and mandatory text boxes – people simply don’t want to spend time on that anymore. One idea came to mind: let’s look at the images from the event and see whether we can gather all that information without disturbing our employees.

We made a few calls, Azure Cognitive Services API calls, in order to get answers to the questions above. Azure Cognitive Services consist of Vision, Speech, Language, Knowledge, and Search services. For our project we used two APIs from the Vision section: the Face API and the Vision API.

Usage of the Face API

The Face API helped us find unique faces and determine their age and gender. By finding unique faces, we were able to count the total number of attendees at our events. When it comes to emotions, the Face API returns the following emotion scores for each face: anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. That is how we measured the happiness of the people participating in the events, both for specific parts of the day and during the games. The surprise score, in turn, answered the question: were colleagues surprised by the awards they won?
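
To give an idea of what such a call looks like, here is a minimal Python sketch against the Face API detect endpoint, requesting the age, gender, and emotion attributes. The endpoint, key, and image URL are placeholders for your own resource, and the happiness calculation is just one way to aggregate the scores:

# Minimal sketch, assuming a hypothetical Face API resource and image URL.
import requests

FACE_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
FACE_KEY = "<your-face-api-key>"

def detect_faces(image_url):
    # Call the Face API detect operation, asking for age, gender and emotion.
    response = requests.post(
        FACE_ENDPOINT + "/face/v1.0/detect",
        params={"returnFaceAttributes": "age,gender,emotion"},
        headers={
            "Ocp-Apim-Subscription-Key": FACE_KEY,
            "Content-Type": "application/json",
        },
        json={"url": image_url},
    )
    response.raise_for_status()
    return response.json()

# Aggregate the happiness scores (0.0 - 1.0) of all faces in one event photo.
faces = detect_faces("https://example.com/event-photo.jpg")
if faces:
    avg_happiness = sum(f["faceAttributes"]["emotion"]["happiness"] for f in faces) / len(faces)
    print("Faces found:", len(faces), "- average happiness:", round(avg_happiness * 100), "%")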

Usage of Vision API

A picture is worth a thousand words, and the Vision API “reads” those words for us. The Vision API can also find faces in an image and much more: it detects adult content and image type, returns a one-sentence description of the image, and lists the names of the objects it recognizes, known as tags. In our case, we used the Vision API to extract the accent color of each event photo, along with the dominant background and foreground colors. Here is a sample response:

{
    "color": {
        "dominantColorForeground": "Brown",
        "dominantColorBackground": "Grey",
        "dominantColors": [
            "Brown",
            "Grey"
        ],
        "accentColor": "1A427C",
        "isBwImg": false,
        "isBWImg": false
    },
    "description": {
        "tags": [
            "person",
            "building",
            "outdoor",
            "people",
            "man",
            "group",
            "standing",
            "woman",
            "posing",
            "holding",
            "young",
            "table",
            "boy",
            "walking",
            "board"
        ],
        "captions": [
            {
                "text": "a group of people standing next to a building",
                "confidence": 0.92497622887208164
            }
        ]
    },
    "requestId": "6b9f4e21-f2c7-452c-afcc-7e05eb9ee243",
    "metadata": {
        "width": 1800,
        "height": 1200,
        "format": "Jpeg"
    }
}
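
For reference, a response like the one above can be obtained with a call similar to this minimal Python sketch against the Vision API analyze endpoint, requesting the Color and Description features. The endpoint, key, and image URL are placeholders for your own resource:

# Minimal sketch, assuming a hypothetical Vision API resource and image URL.
import requests

VISION_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
VISION_KEY = "<your-vision-api-key>"

def analyze_image(image_url):
    # Call the Vision API analyze operation for the Color and Description features.
    response = requests.post(
        VISION_ENDPOINT + "/vision/v2.0/analyze",
        params={"visualFeatures": "Color,Description"},
        headers={
            "Ocp-Apim-Subscription-Key": VISION_KEY,
            "Content-Type": "application/json",
        },
        json={"url": image_url},
    )
    response.raise_for_status()
    return response.json()

analysis = analyze_image("https://example.com/event-photo.jpg")
print("Accent color:", analysis["color"]["accentColor"])
print("Caption:", analysis["description"]["captions"][0]["text"])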

Privacy concerns 

Privacy always comes first. Attendees at the events should sign a consent form agreeing that their data may be used to improve our events. They should also be informed about what data will be used, how long it will be kept, and who has access to it.

Conclusion  

Living in the era of smart devices and new technologies, we are constantly seeking challenges and implementing solutions to improve our everyday work and life. This time we used Azure Cognitive Services from the Vision section to get event feedback without filling in a single feedback form. At the same time, we learnt something new. Win-win!

Stay tuned for our future work in the Cognitive Services field.

Are you interested in exploring these technologies? Contact us!