AI data collecting & labeling platform - SAMSUNG C-LAB

Mar 13, 2023

Khang Minh
Transcript
Page 1: AI data collecting & labeling platform - SAMSUNG C-LAB

AI data collecting & labeling platform

Copyright © 2021 SELECTSTAR. All Rights Reserved.

Page 2: AI data collecting & labeling platform - SAMSUNG C-LAB

Founded in: 2018.11

Clients: 194

Sales in 2020: 6M USD

Crowd workers: 13K+

Processed data: 70M

Members: 80

Page 3: AI data collecting & labeling platform - SAMSUNG C-LAB

Aug. 2020

Raised $4 million in Series A investment

Page 4: AI data collecting & labeling platform - SAMSUNG C-LAB

Nominated for the 2021 Forbes 30 Under 30 Asia list, Enterprise Technology

https://www.forbes.com/30-under-30/2021/asia/enterprise-technology

Page 5: AI data collecting & labeling platform - SAMSUNG C-LAB

AI models are trained and developed with data prepared by humans.

Page 6: AI data collecting & labeling platform - SAMSUNG C-LAB

AI models are trained and developed with data prepared by humans.

[Figure: human labor turns an image into an image annotated with the location and type of objects (labels: car, car, car, person), which is used to train an automated vehicle.]

Page 7: AI data collecting & labeling platform - SAMSUNG C-LAB

AI models are trained and developed with data prepared by humans.

[Figure: human labor performs labeling, turning an image into training data annotated with the location and type of objects (labels: car, car, car, person), which is used to train an automated vehicle.]

Page 8: AI data collecting & labeling platform - SAMSUNG C-LAB

Problem

No AI can exist without training data, and data quality defines AI quality.

Page 9: AI data collecting & labeling platform - SAMSUNG C-LAB

Problem

Data scientists spend 80% of their time on collecting and preparing data.

Continuous manual labor

Page 10: AI data collecting & labeling platform - SAMSUNG C-LAB

Solution

Division of process

Page 11: AI data collecting & labeling platform - SAMSUNG C-LAB

Solution

Crowd-sourcing = Crowd + Sourcing

= The practice of obtaining information or input into a task or project from a large number of people.

Page 12: AI data collecting & labeling platform - SAMSUNG C-LAB

Crowd-sourcing

Cashmission, a mobile crowd-sourcing platform with 130K users.

Available on iOS, Android, and Web

Downloads: 270K

Rating: 4.2★ (1K+ reviews)

Page 13: AI data collecting & labeling platform - SAMSUNG C-LAB

The data for smarter AI

SELECTSTAR provides consistently accurate, high-quality data so that you can focus on the essence of your AI research.

Page 14: AI data collecting & labeling platform - SAMSUNG C-LAB

Customer cases

Page 15: AI data collecting & labeling platform - SAMSUNG C-LAB

Customer cases

Samsung AI Development Group

Audio data of professional voice actors collected to improve the performance of an audio generation AI model. (Data inspection via crowd-sourcing)

“ In search of voice data of professional voice actors, we contacted SELECTSTAR. SELECTSTAR was very helpful in providing samples during the early process of selecting a suitable voice. The quality of the final data we received was very satisfying as well. We hope to work with SELECTSTAR again in the future.”

Page 16: AI data collecting & labeling platform - SAMSUNG C-LAB

Customer cases

LG CNS AI & Big Data Research Center

Korean Q&A

KorQuAD 2.0 dataset, built wholly through crowd-sourcing. The dataset serves as a benchmark for the machine comprehension AI models of industry leaders such as Kakao and Naver.

“ With SELECTSTAR, we were able to efficiently collect KorQuAD 2.0, a Question-Answer dataset in Korean. We loved the quality and diversity of the data, collected from a broad range of workers. SELECTSTAR's user guideline for our task was especially impressive, capturing our expectations in clear explanations for the workers.”

Page 17: AI data collecting & labeling platform - SAMSUNG C-LAB

Customer cases

SK Telecom

Hourly labeling of a large amount of media content data

“ SELECTSTAR’s platform allowed us to swiftly and accurately complete data labeling for a large amount of media content. Accurate labeling was carried out in a timely manner, earlier than we had anticipated. Had it not been for SELECTSTAR, it would have taken much longer for us to employ workers independently. Instant feedback was also possible via constant communication. As a result, we received the high-quality data we had wished for, and developed and advanced the AI model according to our purposes.”

Page 18: AI data collecting & labeling platform - SAMSUNG C-LAB

Customer cases

Lotte Data Communication

OCR dataset built via crowd-sourcing

“ We were able to collect a large amount of accurately processed OCR data very promptly using the SELECTSTAR platform. SELECTSTAR delivered the data in less than a month, while our own in-house workers took almost six months to complete. Continuous communication and instant feedback updates were especially satisfactory. We loved the cost, quality, and delivery time, and we were happy to save many resources.”

Page 19: AI data collecting & labeling platform - SAMSUNG C-LAB

Customer cases

Leading Korean NLP benchmark dataset covering 8 types of Korean NLP tasks. Many major Korean research labs participated.

Dataset includes complex Machine Reading Comprehension, Topic Classification, Semantic Textual Similarity, Natural Language Inference, and others.

“ While building the KLUE dataset with SELECTSTAR, we were most impressed by their data quality assurance system. Despite the intricacy of the data and the tight deadline, SELECTSTAR was able to provide specific guidelines for the workers to guarantee data consistency. They also made sure to train and select qualified workers, and inspect the entire dataset. We believe that KLUE, the representative Korean NLP benchmark dataset, was able to come into the world owing to SELECTSTAR’s capability and passion.”

Page 20: AI data collecting & labeling platform - SAMSUNG C-LAB

Page 21: AI data collecting & labeling platform - SAMSUNG C-LAB

Collaboration case

Biz unit network

Samsung Research Samsung C-Lab

Page 22: AI data collecting & labeling platform - SAMSUNG C-LAB

Collaboration case

Usability test on renewed Cashmission

• Usability test carried out through Focus Group Discussions (FGD) based on previous design mockups

• Gathered users interested in cash-back apps and their opinions on app design and usability via FGD

• Utilized FGD results for qualitative research to analyze psychological motivations, tendencies, and preferences of target users

• Usability testing report

Page 23: AI data collecting & labeling platform - SAMSUNG C-LAB

Collaboration case

Global Marketing Consulting (May-Aug. 2021)

Assessment

1. Assess growth rate and goals

2. Assess marketing analysis tool

3. Determine growth hacking funnel

4. Analyze problems via market research and conduct customized consultation

Solution

1. Strengthen SEO (focus on Google-friendly content) - Professional consultation including tips on settings

2. Recommend marketing utilization tools (similar-target strategy) - Pipedrive, Wappalyzer, Clearbit, LAL, etc.

3. Recommend ways and tools for CRM utilization - Intercom, Drift, Channeltalk, etc.

4. Invite professionals for advice and insights based on experience

Application

1. Apply SEO strengthening tips (multi-language, tags, titles, etc.)

2. Analyze recommended marketing tool > Choose one > Apply (in process - Clearbit)

3. Analyze recommended CRM tool > Choose one > Apply (in process - Intercom)

Page 24: AI data collecting & labeling platform - SAMSUNG C-LAB

SELECTSTAR Members

Page 25: AI data collecting & labeling platform - SAMSUNG C-LAB

SELECTSTAR members

Members: 80

Co-founders: All co-founders are KAIST alumni

R&D: Consists of members with doctorates in mathematics and other post-graduate degrees from KAIST

Page 26: AI data collecting & labeling platform - SAMSUNG C-LAB

From model-centric to data-centric AI

The most important feature in building an AI model is data.

Andrew Ng (Founder & CEO of Landing AI)

Page 27: AI data collecting & labeling platform - SAMSUNG C-LAB

We aim to be the world’s best all-in-one data platform, solving all data problems in the AI lifecycle.

Page 28: AI data collecting & labeling platform - SAMSUNG C-LAB

Contact E-mail [email protected]

https://SELECTSTAR.ai/

Tel. +82-1666-3282

CEO