GEES

A brilliantly unique business card digitization system

Combines micro-tasking and multi-sourcing in a proprietary system for digitizing
contact data

DSOC built this one-of-a-kind system and named it GEES after the four words that describe its capabilities.

Global

Effectively utilizes resources from around the world

(Meets local requirements and handles languages other than Japanese)

Elastic

Built to absorb both quiet and busy periods flexibly

Efficient

Roles are assigned where they are most effective

Scalable

Can be scaled and adapted

From the start of development, GEES was designed around one guiding goal: to digitize business cards using resources beyond the operators our company employs.

This was realized by combining micro-tasking (work that anyone can do, anywhere and at any time they are free, with no advance preparation) and multi-sourcing (operators based in operations centers, working from home, working overseas, and via crowdsourcing).

Automatically segmenting business card images into small work units ensures both accuracy and security while letting a limited pool of operators process the data efficiently. DSOC plans to keep reducing the steps that rely on people, with the aim of fully automated business card digitization.
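As a rough illustration of how micro-tasking and multi-sourcing fit together, the sketch below breaks one card into independent work units and spreads them across the four kinds of operator sources named above. It is a minimal sketch under stated assumptions, not DSOC's implementation: the `MicroTask` fields, the source labels, and the round-robin policy are all hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle

# Hypothetical labels for the four sourcing channels described in the text.
SOURCES = ["operations_center", "home", "overseas", "crowdsourcing"]


@dataclass(frozen=True)
class MicroTask:
    card_id: str      # internal identifier, assumed never shown to operators
    field: str        # category of the fragment, e.g. "email"
    fragment_id: int  # index of the image fragment within that field


def make_micro_tasks(card_id, fragments_per_field):
    """Turn one card into independent micro-tasks, one per image fragment."""
    return [
        MicroTask(card_id, field, i)
        for field, count in fragments_per_field.items()
        for i in range(count)
    ]


def assign_round_robin(tasks, sources=SOURCES):
    """Spread tasks evenly across sources (an assumed, simplistic policy)."""
    assignment = {source: [] for source in sources}
    for task, source in zip(tasks, cycle(sources)):
        assignment[source].append(task)
    return assignment


tasks = make_micro_tasks("card-1", {"name": 2, "email": 3, "phone": 1})
assignment = assign_round_robin(tasks)
```

Because each task carries only one fragment, any source can pick it up without context, which is what lets capacity grow or shrink with demand. A real scheduler would presumably weigh load, language, and operator skill rather than rotate blindly.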

Business card digitization through GEES

  1. Business card images captured by mobile phones and scanners are first trimmed along their four vertices to separate the card from the background. Shadows are removed, and processing such as whitening makes the text more prominent. This prevents shadows from the mobile device or the photographer's hand, and the color of the room lighting, from degrading digitization quality, which allows a constant level of quality to be maintained.

  2. Next, the image is segmented by category. DSOC's own algorithms, which incorporate deep learning, identify the category of each segment, which simplifies the work into micro-tasks.

  3. To maintain security, images that contain last names, phone numbers, and email addresses are segmented further so that no single fragment retains usable information. The images end up so finely segmented that the operators who perform data input in the following stages cannot tell that they came from business cards.

  4. Operators input data based on the segmented images. To prevent input mistakes, each category is sent to multiple operators and entered multiple times, and the results are compared to increase digitization accuracy. Human operators make consistent, recurring mistakes, so deep learning is used to analyze these mistakes and further refine accuracy.

  5. Operators then visually check the input content for each category. Finally, the segmented data is re-aggregated into a single set of business card information that is presented to the end user. Approximately 20 processes are performed per business card, which adds up to billions of task-level processes every month.
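The comparison of multiple inputs and the final re-aggregation in steps 4 and 5 can be sketched roughly as follows. The source only says that results are compared, so the majority-vote rule, the `min_agreement` threshold, and the function names here are assumptions made for illustration. The deep learning analysis of operator mistakes is not modeled.

```python
from collections import Counter


def reconcile(entries, min_agreement=2):
    """Return the value most operators agreed on, or None if unresolved.

    A value is accepted only if at least `min_agreement` operators typed it
    and no other value is tied with it (assumed rule, not from the source).
    """
    if not entries:
        return None
    normalized = [entry.strip() for entry in entries]
    ranked = Counter(normalized).most_common(2)
    top_value, top_count = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    if top_count < min_agreement or tied:
        return None
    return top_value


def aggregate_card(field_entries, min_agreement=2):
    """Rebuild one card record; unresolved fields are flagged for a visual check."""
    record, needs_review = {}, []
    for field, entries in field_entries.items():
        value = reconcile(entries, min_agreement)
        if value is None:
            needs_review.append(field)
        else:
            record[field] = value
    return record, needs_review


record, needs_review = aggregate_card({
    "company": ["Sansan, Inc.", "Sansan, Inc. ", "Sansan Inc"],
    "email": ["a@example.com", "a@exarnple.com"],
})
```

In this example the company name is accepted because two of three entries match after trimming whitespace, while the email field has no agreement and is routed to review instead of being guessed.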

Data Strategy &
Operation Center

6F One Omotesando 3-5-29 Kita-Aoyama,
Minato-ku, Tokyo, 107-0061

© Sansan, Inc.
