Artificial Intelligence 101 · 2019-08-04 · •Learned patterns can be used to make predictions about new data instances. •Patterns constitute a “model” of the real world


Artificial Intelligence 101

Arup Das, Founder and Chief Executive Officer of Alphaserve Technologies

What is Artificial Intelligence and Machine Learning?

Data Machine Patterns

Artificial Intelligence systems are computer systems exhibiting some form of human intelligence. Machine Learning stems from the field of Artificial Intelligence, first seen in the 1950s.

• Computer systems incorporating Machine Learning technologies have the ability to learn from real world observational data by applying various types of algorithms.

• Extract patterns from data or classify data instances to make sense of them.

• Learned patterns can be used to make predictions about new data instances.

• Patterns constitute a “model” of the real world to help computers act automatically without being explicitly programmed.

• Related fields are statistics and probability theory, and the theory of optimization.



Understanding Analytics

Descriptive Analytics: How have the monthly sales for the past twelve months been? Who are the most valuable customers?

Predictive Analytics: What are the projected sales for the next six months? Which customers are likely to leave?

Prescriptive Analytics: What actions could be taken to increase sales? What incentives can be offered to keep customers from leaving?

Company Offers Services Sales Transaction

Database

“… leverage data in a particular functional process (or application) to enable context-specific insight that is actionable.” – Gartner

• Descriptive analytics – Current and historical look at organizational performance.

• Predictive analytics – Predicts future trends, behavior and events for decision support.

• Prescriptive analytics – Also known as decision support. Determines alternative courses of actions or decisions given the historical, current and projected situations, and a set of objectives, requirements, and constraints.
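The three kinds of analytics can be illustrated on one small data set. The following is a minimal sketch, not a production method: the sales figures, the candidate actions, their uplift and cost values, and the names `fit_trend` and `best_action` are all assumptions made up for illustration.

```python
from statistics import mean

# Hypothetical monthly sales for the past twelve months (illustrative numbers only).
monthly_sales = [100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140, 144]

# Descriptive: summarize what has already happened.
average_sales = mean(monthly_sales)

# Predictive: fit a least-squares trend line and extrapolate six months ahead.
def fit_trend(values):
    months = range(len(values))
    m_bar, v_bar = mean(months), mean(values)
    slope = (sum((m - m_bar) * (v - v_bar) for m, v in zip(months, values))
             / sum((m - m_bar) ** 2 for m in months))
    return slope, v_bar - slope * m_bar

slope, intercept = fit_trend(monthly_sales)
forecast = [intercept + slope * month for month in range(12, 18)]

# Prescriptive: choose the action with the best projected sales within a budget.
# Each hypothetical action maps to (assumed sales uplift, assumed cost).
actions = {
    "do nothing": (0.00, 0),
    "discount campaign": (0.05, 300),
    "loyalty program": (0.08, 900),
}

def best_action(forecast, actions, budget):
    affordable = {name: uplift for name, (uplift, cost) in actions.items() if cost <= budget}
    return max(affordable, key=lambda name: sum(forecast) * (1 + affordable[name]))

print(average_sales, forecast, best_action(forecast, actions, budget=500))
```

The prescriptive step is where objectives and constraints enter: raising the budget changes which action is recommended even though the forecast is unchanged.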

AI in Analytics at a Glance


Analytics Examples

Customer Relationship Management: How best to classify customers by profitability into categories A (most valuable), B, and C (descriptive). How to predict the probability that a customer will be lost within two years (predictive).

Marketing: How to compute, for each existing customer, the likelihood of purchasing a product, in order to launch an advertising campaign for that product.

Call Center: How to assign the most able agent to an incoming call requiring specialized expertise.

Analytics Examples

Insurance: How to estimate the probability of a claim (e.g. car accident) by an existing customer or a new applicant using historical personal data.

Telecommunication: How to cluster customers on the basis of collected historical data points (e.g. calls, text and multimedia messages, navigation, mail exchange) and then make tailored offers.

Banking: How to determine the creditworthiness of new clients on the basis of historical data of past clients. How to detect credit card fraud based on usage patterns.


Analytics Examples

Medical and Pharmaceutical: How to determine possible side effects of a drug given to a patient and the associated factors. How to determine the current and future clinical state of a subject.

Logistics and Supply Chain: How to predict the number of goods consumed at different places.

Human Resources: How to predict the financial impact of fundamental strategies such as pay differentiation, pay-at-risk, total rewards mix, and organizational structure.

Structured vs. Unstructured Data

• Structured data is highly organized information that loads neatly into a relational database and is easily searchable by basic algorithms.

• Structured data is usually displayed in titled columns and rows, which can easily be ordered and processed by data mining tools.

• Examples: text files, spreadsheets, data from machine sensors

• Unstructured data has no predefined, identifiable structure and is more like human language.

• Unstructured data may have its own internal structure, but does not fit neatly into a spreadsheet or database.

• Examples: emails, video, audio, social media posts

• Text is sometimes categorized as semi-structured.


Structured Data Types

Textual Data Example

Credit Card Agreement

Non-Disclosure Agreement

Appellate Court Decision


Forms of Variables

• Quantitative or numerical variables
  • Observations are measured on a numerical scale (e.g. temperature, height, weight, number of incoming calls, number of customers visiting)
  • Discrete and continuous numerical variables are often differentiated by determining whether the variable is a count or a measurement

• Qualitative or categorical variables
  • Observations take values from a discrete set (e.g. occupation, day of the week, gender, size)
  • Nominal categorical variables have no inherent order (e.g. M, F) whereas ordinal categorical variables have an inherent rank or order (e.g. short, medium, tall)
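The distinction between variable forms matters in practice because each form is usually encoded differently before modeling. The sketch below is one possible illustration; the function names and category lists are hypothetical, and the count-versus-measurement check is only a rough heuristic (a measurement rounded to whole numbers would be mislabeled).

```python
# Ordered levels for an ordinal variable (hypothetical example values).
ORDINAL_SIZE = ["short", "medium", "tall"]

def encode_ordinal(value, ordered_levels):
    """Map an ordinal category to its rank (0, 1, 2, ...), preserving order."""
    return ordered_levels.index(value)

def encode_nominal(value, levels):
    """One-hot encode a nominal category so that no order is implied."""
    return [1 if value == level else 0 for level in levels]

def numeric_kind(values):
    """Heuristic: whole-number values look like counts (discrete),
    anything else looks like a measurement (continuous)."""
    return "discrete" if all(float(v).is_integer() for v in values) else "continuous"

print(encode_ordinal("medium", ORDINAL_SIZE))
print(encode_nominal("Tue", ["Mon", "Tue", "Wed"]))
print(numeric_kind([3, 7, 12]), numeric_kind([36.6, 37.1]))
```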

The 7 Vs of Big Data

1. Volume: Enormous volumes of data

2. Velocity: Pace at which data flows in from sources like business processes, machines, networks, human interaction with social media, mobile devices, etc.

3. Variety: Many sources and types of data, both structured and unstructured

4. Veracity: Biases, noise, and abnormality in data

5. Value: Refers to the ability to turn data into value

6. Validity: Is the data correct and accurate for the intended usage?

7. Volatility: How long do you need to store this data?


Growing Importance of Big Data and Business Analytics

• Organizations amass tremendous amounts of data through day-to-day business operations.

• Individual business departments such as sales, marketing, research, human resources, and finance & accounting now generate exponentially growing quantities of data as well.

• This growing digital footprint creates enormous opportunity.

• Data can be sold, or harnessed to gain insights to render business functions more efficient, enhance customer satisfaction, and reveal strengths and weaknesses.

Major Types of Machine Learning

Source: McKinsey


Supervised Machine Learning

Unsupervised Machine Learning


Regression and Classification

Supervised Learning - Classification vs. Regression

Two ways of thinking about supervised learning problems:

• Regression: One of your variables depends on some or all of the others. Can you predict the value of the dependent variable based on the values of the independent variables?

• Example: predict insurance risk based on an applicant’s background and medical conditions, or the price of a car based on its weight, horsepower, and other characteristics

• Example algorithms: Linear Regression, Decision Tree

• Classification: Each data point falls into some category or class. Can you predict, without looking at the category value, which category each point is in?

• Example: predict which applications should be accepted or rejected, or which court case is likely to be won

• Example algorithms: Logistic Regression, Neural Network, SVM
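The difference between regression and classification can be sketched with two tiny models: a linear regression that predicts a number and a logistic regression that predicts a class probability. This is a minimal illustration in plain Python, assuming one input feature; the car prices, application scores, function names, learning rate, and epoch count are all made up for the example.

```python
import math
from statistics import mean

def fit_linear(xs, ys):
    """Least-squares line through (x, y) pairs; returns a prediction function."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_logistic(xs, labels, lr=0.1, epochs=500):
    """One-feature logistic regression trained by gradient descent.
    Returns a function giving the probability of class 1."""
    center = mean(xs)  # centering the feature helps gradient descent converge
    w = b = 0.0
    for _ in range(epochs):
        errors = [1 / (1 + math.exp(-(w * (x - center) + b))) - y
                  for x, y in zip(xs, labels)]
        w -= lr * mean(e * (x - center) for e, x in zip(errors, xs))
        b -= lr * mean(errors)
    return lambda x: 1 / (1 + math.exp(-(w * (x - center) + b)))

# Regression: predict a car's price (a number) from its horsepower.
price_model = fit_linear([100, 150, 200], [20000, 27500, 35000])
print(price_model(120))

# Classification: predict accept (1) or reject (0) from an application score.
accept_model = fit_logistic([1, 2, 3, 4, 6, 7, 8, 9], [0, 0, 0, 0, 1, 1, 1, 1])
print(accept_model(2) < 0.5, accept_model(8) > 0.5)
```

The regression output is a value on a continuous scale, while the classification output is thresholded (here at 0.5) to pick a category.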


Unsupervised Learning - Clustering

In clustering, the goal is not to predict a target class as in classification. Instead, it groups similar items so that items in the same group are similar to one another and items in different groups are dissimilar.
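One common way to form such groups is k-means, which is used here only as an example; the slides do not name a specific clustering algorithm. The sketch below assumes points given as tuples, takes the first k points as starting centroids to stay deterministic, and uses made-up data.

```python
import math

def k_means(points, k, iterations=10):
    """Plain k-means: alternate between assigning points to the nearest
    centroid and moving each centroid to the mean of its cluster."""
    centroids = [points[i] for i in range(k)]  # naive deterministic start
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep the old centroid if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids

# Hypothetical two-dimensional data with two visible groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
clusters, centroids = k_means(points, k=2)
print(clusters)
```

No labels are supplied: the groups emerge from distances alone, which is what makes the method unsupervised.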

Next Generation Information Flow


Putting It All Together

Evolution of Artificial Intelligence and Law

1960s: The phrase “computers and law” was coined

1968: Law and Computer Technology published its first issue in January

1970s: First legal expert systems were created

1970: Some Speculation about AI and Legal Reasoning, Stanford AI Laboratory, Buchanan and Headrick

1974: Mackaay and Robillard, Predicting judicial decisions

1987: Susskind published Expert Systems in Law

1987: The first international conference on law and artificial intelligence was held in Boston

1992: Artificial Intelligence and Law journal first published

1990s: Machine Learning and language processing for eDiscovery


Current Trends

Relevant AI Disciplines

AI & LAW


Legal Expert Systems

• Reasoning Types
  • Deductive
  • Inductive
  • Abductive or analogical

• Architectures
  • Rule-based system
  • Case-based reasoning
  • Neural networks
  • Fuzzy logic
  • Bayesian networks
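As an illustration of the rule-based architecture with deductive reasoning, the sketch below applies if-then rules to known facts until nothing new can be derived (forward chaining). The rules and fact names are hypothetical toy examples, not statements of law, and real legal expert systems are considerably richer.

```python
# Each rule pairs a set of required facts with the conclusion they support.
# These rules are invented for illustration only.
RULES = [
    ({"offer", "acceptance", "consideration"}, "contract_formed"),
    ({"contract_formed", "breach"}, "remedy_available"),
]

def forward_chain(facts, rules):
    """Repeatedly fire every rule whose conditions are all known
    until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

derived = forward_chain({"offer", "acceptance", "consideration", "breach"}, RULES)
print(sorted(derived))
```

Because the second rule depends on the conclusion of the first, the loop needs more than one pass, which is the essence of chaining.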

Data Driven Law Components

• Categorization of AI solutions – Client facing vs. Internal

• Single repository for knowledge extraction – Extract current information from firm systems into a single repository (Data Hub)

• External Data – Merge external relevant data with firm’s data hub

• Execution Strategy – In-house vs. co-sourced

• Build vs. Buy or Build, Buy and Integrate

• Risk Management – Total cost of ownership, Intellectual property transfer from vendors

• Culture and Adoption of AI based tools - Innovation hours

• Innovation metrics – KPIs to measure success


Challenges and Opportunities

• Law firms have vast and growing volumes of information containing their know-how and experience, as well as client data, most of which is unstructured textual data such as documents and emails.

• It is important for lawyers not only to leverage the experience captured in these documents, for obvious competitive advantage and efficiency, but also to classify and mine the content to extract and interpret relevant information.

• The growing volume of unstructured data has been accompanied by a rise in legislative changes, where failure to comply carries major penalties.

• Traditionally, these tasks are either carried out by junior associates or temporary staff, or are passed to the legal process outsourcing industry in order to meet tight deadlines.

• This is not only costly and time-consuming; the extracted data may also carry risk, because human error can introduce inconsistencies and incorrect values.

Technical Challenges

• Artificial Intelligence (AI) and Machine Learning (ML) have the potential to mitigate these challenges with their powerful foundational technologies for handling textual data and structured relational data.

• However, traditional tools for handling textual data rely on simple Boolean search and retrieval.

• Better tools are needed to truly understand data, summarize and infer meaning, predict outcome, classify the various types of ideas present, and help get to the result fast—even if that result didn’t involve the keywords you used.

• Any discussion of AI must note that tasks involving reasoning or perception, such as language understanding, are by far the most difficult for AI and have yet to be fully solved.

• AI can also automate due diligence tasks to highlight areas of contracts that need to be addressed due to changes in legislation, mitigating risk to the firm.


Specific Technical Solutions

• Automated information extraction from legal databases and texts.

• Automatic legal text classification and summarization.

• Conceptual or model-based legal information retrieval.

• Identifying sections in legal briefs.

• Help litigants understand their case in light of precedent and file it properly.

• Develop an interactive search engine over a litigant’s case history that provides a clear timeline and references to past actions.

• Give litigants an idea of their chances of winning the case and of whether it is worth pursuing.

• Translate exchanges in courts in different languages, for both text and verbal communication.

Business Cases within Law Firms

Business Case: Customer Churn Prediction / Customer Classification
Description: Predictive analytics to determine which customers are at risk. Classification of customers on attributes other than revenue to uncover hidden patterns.
Benefit: Reduces risk of customer loss; supports customer upsell

Business Case: Matter Pricing
Description: Pattern extraction from the current scope of work to determine accurate revenue, cost, and profit for scoping future cases
Benefit: Accurate pricing of matters to maintain profitability for the firm

Business Case: Patent Search
Description: Ability to search USPTO data together with internal data sources
Benefit: Gain critical insight, such as prior art search

Business Case: Competitive Intelligence
Description: Automatic extraction of competitor presence from websites and legal filings
Benefit: Provides metrics on social presence, practice areas, and star lawyers


Business Cases within Law Firms

Business Case: Semantic Search and Retrieval
Description: Interactive search and retrieval of large quantities of structured (spreadsheets) and unstructured (text-based) documents of legal case histories based on keywords, phrases, and short case descriptions
Benefit: Saves the time and resources spent manually extracting vital information for transactions and litigation; enhanced search accuracy

Business Case: Prediction and Decision Support
Description: Pattern extraction for automated assisted decision-making and prediction from early case and litigation assessments
Benefit: Gives litigants an idea of their chances of prevailing in the matter, for litigation risk analysis purposes

Business Case: Text Classification
Description: Automatic legal text classification to identify relevant documents from large document repositories
Benefit: Provides critical document information for transactions and litigation applications

Business Cases within Law Firms

Business Case: Document Summarization
Description: Automatic summarization of legal documents to a desired length
Benefit: Provides critical document information for transactions and litigation applications

Business Case: Information Extraction
Description: Identifying, collecting and producing electronically stored information in response to a request, highlighting relevant sections in legal briefs
Benefit: Rapid and accurate production in a litigation or investigation, or review for a potential transaction

Business Case: Contract Review
Description: Identifying and classifying contract sections, identifying missing or differing terms based on baseline contract language
Benefit: Automates the contract review process, provides transactional support, and optimizes review time for contract language that differs significantly from expectations

Business Case: Identifying Star Personnel / Lateral Partners
Description: Identifying the future potential of an employed lawyer to be placed on the partner track
Benefit: Helps the firm enhance its reputation and increase revenue


Possible Business Cases within Law Firms

• Chatbots for customer interactions – Status of case

• Customer clustering to see which customers are alike and where upsell opportunities exist – Customer 360 dashboards

• Customer platform – Providing detailed knowledge about a country or region based on legal issues

• GDPR – Risk Analysis utilizing graph

• Recommendation engine suggesting which associates a partner should work with

• M&A analysis for customer – sentiment analysis and background analysis

• AI powered document management system vs. utilizing traditional meta tags

Solutions Utilizing AI: eDiscovery

• Text analytics software can do the following for identifying, collecting and producing electronically stored information in response to a request for production in a lawsuit or investigation:

• Highlight certain phrases, named entities and patterns of exchanges more thoroughly and effectively, at significantly lower cost

• Identify relevant documents (eliminating the irrelevant ones) to build the current case

• Summarize pages of documents into a desired length to avoid reading irrelevant content


Solutions Utilizing AI: Contract Review

• We can "classify" paragraphs in a contract (e.g. NDA) into various clauses (e.g. confidentiality, accuracy, privacy, indemnity). In that way you will know what's missing before even checking the validity of an individual clause. We can demo this feature with a different corpus.

• If you have a standard set of desired clauses or certain terms (expressed in one or more sentences) then we can check if those are "missing" from a newly arrived contract. Even if the desired clause is present we can check how "similar" the clause is to each of the sentences that occur in a contract.

• There is a very powerful technique called "textual entailment" with which we will be able to determine if clauses are semantically equivalent or one entails the other.
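The missing-clause check described above can be sketched with a simple word-overlap measure. This is an assumption-laden illustration: it uses bag-of-words cosine similarity as a crude stand-in for clause similarity, it does not implement textual entailment, and the clause texts, function names, and the 0.5 threshold are invented for the example.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Bag of lowercase words with their counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def review(contract_paragraphs, standard_clauses, threshold=0.5):
    """For each desired clause, report whether any paragraph is similar enough."""
    report = {}
    for name, reference in standard_clauses.items():
        best = max((cosine(vectorize(reference), vectorize(p))
                    for p in contract_paragraphs), default=0.0)
        report[name] = ("present" if best >= threshold else "missing", round(best, 2))
    return report

# Hypothetical baseline clauses and a newly arrived contract.
standard_clauses = {
    "confidentiality": "The receiving party shall keep all confidential information secret",
    "indemnity": "The supplier shall indemnify the customer against all claims and losses",
}
contract = [
    "The receiving party shall keep confidential information of the disclosing party secret.",
    "This agreement is governed by the laws of New York.",
]
report = review(contract, standard_clauses)
print(report)
```

Word overlap cannot tell whether one clause entails another, since two clauses can share most words and still differ in meaning, which is why a semantic technique such as textual entailment is needed for that step.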

Solutions Utilizing AI: Citation Analysis

• Many efforts in computational law are focused on the empirical analysis of legal decisions, and their relation to legislation by making use of citation analysis, which examines patterns in citations between works.

• It is possible to construct citation indices and large graphs of legal precedent, called citation networks.

• Citation networks allow the use of graph traversal algorithms in order to relate cases to one another, as well as the use of various distance metrics to find mathematical relationships between them.
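A citation network and a graph traversal over it can be sketched as follows. The case names and citation links are hypothetical, and the two measures shown (citation hops via breadth-first search, and a simple cited-by count) are just examples of the traversal algorithms and metrics mentioned above.

```python
from collections import deque

# Hypothetical citation network: each case maps to the cases it cites.
CITES = {
    "Case D": ["Case B", "Case C"],
    "Case C": ["Case A"],
    "Case B": ["Case A"],
    "Case A": [],
}

def citation_distance(graph, source, target):
    """Breadth-first search: fewest citation hops from source to target,
    or None if target is not reachable."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        case, hops = queue.popleft()
        if case == target:
            return hops
        for cited in graph.get(case, []):
            if cited not in seen:
                seen.add(cited)
                queue.append((cited, hops + 1))
    return None

def cited_by_count(graph):
    """Count how often each case is cited, a very simple citation index."""
    counts = {case: 0 for case in graph}
    for cited_cases in graph.values():
        for cited in cited_cases:
            counts[cited] = counts.get(cited, 0) + 1
    return counts

print(citation_distance(CITES, "Case D", "Case A"))
print(cited_by_count(CITES))
```

Citations are directed, so the distance from a later case to an earlier one can exist while the reverse does not.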


Q & A
