• Exam (comprehensive, with a focus on material since the midterm), Thurs 5:30-7:30pm, in this room; two pages of notes and a simple calculator (log, e, *, /, +, -) allowed
• The Turing Test
• Strong vs. Weak AI Hypotheses
• Searle’s Chinese Room Story
• High-Level Recap of Topics since Midterm
• Final List of Topics Covered this Term (to various levels of depth, of course)
• Review of Fall 2014 Final (Recall: another review tomorrow of Spring
• Says intelligence is judged based on behavior (rather than by inspecting internal data structures)
• Focus on cognition, rather than perception, so use a simple ‘ASCII’ interface
• If human judge interacting (via a teletype) with two ‘entities’ cannot accurately say which is the human and which is the computer, then the computer is intelligent (visualized on next slide)
• Is the room + the human intelligent?
– After all, no one part of an airplane has the property ‘flies’, but the whole thing does
– This is called ‘the systems reply’
(see http://plato.stanford.edu/entries/chinese-room/)
• The ‘robot reply’ says that the problem is that the person doesn’t sense/interact with the real world – ‘symbols’ would be grounded to actual physical things and thereby become meaningful
Problem-solving methods based on the biophysical world
Genetic algorithms
Simulated annealing
Neural networks
Reinforcement learning
Philosophical aspects
Turing test
Searle's Chinese Room thought experiment
The coming singularity
Strong vs. weak AI
Societal impact and future of AI
Learning from labeled data
Experimental methodologies for choosing parameter settings and estimating future accuracy
Decision trees and random forests
Probabilistic models
Nearest-neighbor methods
Genetic algorithms
Neural networks
Support vector machines
Reinforcement learning (reinforcements are ‘indirect’ labels)
Inductive logic programming
Computational learning theory
Variations: incremental, active, and transfer learning
Learning from unlabeled data
K-means
Expectation maximization
Auto-association neural networks
Searching for solutions
Heuristically finding shortest paths
Algorithms for playing games like chess
Simulated annealing
Genetic algorithms
Suggestions
• Be sure to carefully review all the HW solutions, especially HWs 3, 4, & 5
• Imagine a HW 6 on MLNs and RL (and see worked examples in lec notes)
• My old cs540 exams are highly predictive of my future cs540 exams
• ILP: understand the search space when predicates have k arguments
• Some things to only know at the ‘2 pt’ level
– Calculus: have an intuitive sense of slope in non-linear curves (only need to know well: algebra, exp & log’s, arithmetic, (weighted) sums and products: Σ and Π)
– Matrices and using linear programming to solve SVMs (do know dot product well)
– Active, transfer, and incremental learning
– ‘Generalizing across state’ in RL
– ‘Covering’ algorithms for learning a set of rules (covered in ILP lecture)
– Won’t be on final: Using variable types to control search in ILP
– COLT: only need to understand the role of epsilon and delta (ε and δ)
– How to build your own walking-talking robot :-)
12/15/15
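To illustrate what "the role of epsilon and delta" means in COLT, the standard PAC sample-complexity bound for a consistent learner over a finite hypothesis space H says that m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice for error at most ε with probability at least 1 − δ. A minimal sketch (the specific values of |H|, ε, and δ below are made-up for illustration):

```python
import math

# PAC bound for a consistent learner over a finite hypothesis space:
#   m >= (1/epsilon) * (ln|H| + ln(1/delta))
# guarantees that, with probability at least 1 - delta, the learned
# hypothesis has true error at most epsilon.
def pac_sample_size(h_size, epsilon, delta):
    return math.ceil((1 / epsilon) * (math.log(h_size) + math.log(1 / delta)))

# Hypothetical numbers: |H| = 2^10 hypotheses, epsilon = 0.1, delta = 0.05.
m = pac_sample_size(h_size=2**10, epsilon=0.1, delta=0.05)
print(m)  # → 100
```

Note how m grows only linearly in 1/ε and logarithmically in both |H| and 1/δ, which is why the bound stays manageable even for large hypothesis spaces.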
An “On Your Own” RL HW (Solution)
Consider the deterministic reinforcement environment drawn below. Let γ=0.5. Immediate rewards are indicated inside nodes. Once the agent reaches the ‘end’ state the current episode ends and the agent is magically transported to the ‘start’ state.
(a) A one-step, Q-table learner follows the path Start → B → C → End. On the graph below, show the Q values that have changed, and show your work. Assume that for all legal actions (i.e., for all the arcs on the graph), the initial values in the Q table are 4, as shown above (feel free to copy the above 4’s below, but somehow highlight the changed values).
(b) Starting with the Q table you produced in Part (a), again follow the path Start → B → C → End and show the Q values below that have changed from Part (a). Show your work.
(c) What would the final Q values be in the limit of trying all possible arcs ‘infinitely’ often? I.e., what is the Bellman-optimal Q table? Explain your answer.
(d) What is the optimal path between Start and End? Explain.
Start → B → C → End. The policy is: take the arc with the highest Q out of each node.
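The one-step update used in Part (a) can be sketched in code. Since the graph itself did not survive the conversion to text, the arcs and immediate rewards below are hypothetical placeholders; only γ = 0.5, the initial Q of 4 on every legal arc, and the deterministic update rule Q(s, a) ← r + γ · max over a′ of Q(s′, a′) come from the problem statement:

```python
# One-step Q-table updates along the path Start -> B -> C -> End.
# NOTE: the arc structure and the rewards (attached to the node entered)
# are made-up examples; the original figure is not reproduced here.
GAMMA = 0.5
arcs = {"Start": ["B"], "B": ["C"], "C": ["End"], "End": []}
reward = {"B": 1, "C": 2, "End": 0}   # hypothetical immediate rewards

# Every legal arc starts at Q = 4, as in the problem statement.
Q = {(s, nxt): 4.0 for s, succs in arcs.items() for nxt in succs}

# Deterministic one-step Q-learning along the episode's path:
#   Q(s, a) <- r(s') + gamma * max_a' Q(s', a')   (terminal states contribute 0)
for s, s_next in [("Start", "B"), ("B", "C"), ("C", "End")]:
    best_next = max((Q[(s_next, a)] for a in arcs[s_next]), default=0.0)
    Q[(s, s_next)] = reward[s_next] + GAMMA * best_next

print(Q)
```

With these placeholder rewards, the first pass changes Q(Start, B) to 1 + 0.5·4 = 3, Q(B, C) to 2 + 0.5·4 = 4, and Q(C, End) to 0, mirroring the kind of work Parts (a) and (b) ask you to show by hand.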
Given these three rules, what is the prob of P?
wgt = 2:  P ∧ R
wgt = 3:  R → Q  [same as ¬R ∨ Q]
wgt = 1:  Q  [shorthand for the rule: ‘true → Q’]
P  Q  R  Unnormalized Prob
F  F  F  exp(0 + 3 + 0)
F  F  T  exp(0 + 0 + 0)
F  T  F  exp(0 + 3 + 1)
F  T  T  exp(0 + 3 + 1)
T  F  F  exp(0 + 3 + 0)
T  F  T  exp(2 + 0 + 0)
T  T  F  exp(0 + 3 + 1)
T  T  T  exp(2 + 3 + 1)
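The table above can be checked mechanically: each row's unnormalized probability is exp of the summed weights of the satisfied rules, and P(P) is the mass of the P = True rows divided by the normalizer Z. A minimal sketch (variable and function names are my own):

```python
import math
import itertools

# Sum the weights of the satisfied rules for one truth assignment:
#   wgt 2: P AND R    wgt 3: R -> Q    wgt 1: Q
def total_weight(p, q, r):
    w = 0.0
    if p and r:
        w += 2
    if (not r) or q:      # R -> Q is equivalent to (not R) or Q
        w += 3
    if q:
        w += 1
    return w

# Normalizer: sum of exp(weight) over all 8 truth assignments.
Z = sum(math.exp(total_weight(p, q, r))
        for p, q, r in itertools.product([False, True], repeat=3))

# Numerator: the four rows with P = True.
num = sum(math.exp(total_weight(True, q, r))
          for q, r in itertools.product([False, True], repeat=2))

print(num / Z)  # ≈ 0.788
```

This reproduces the table row-by-row (e.g. the T F T row satisfies only the weight-2 rule, giving exp(2 + 0 + 0)) and yields P(P) ≈ 0.79.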
• How much time did we, as a group, spend on me teaching you a fraction of what I (and the authors of the assigned readings) know about AI?
– 50 hrs of class time × 70 humans ≈ 150 days
– Plus, no doubt, 10x that time outside of class :-)
• How long will it take one robot that has learned ‘a lot’ to teach 70 robots? 7M robots? 7B?
– A few seconds?
– Or will robots + owners have ‘individual differences’ that preclude direct brain-to-brain copying?
– Remember: predictions (a) for nuclear power leading to “electricity too cheap to meter” and (b) ‘the war to end all wars’
Societal Impact of AI? (2)
When will owning a car be a hobby?
When will communication between human speakers of any two natural (and non-rare) languages be as easy as communication in the same one?
When will our 'digital doubles' and robots do all our Travel planning? Entertainment planning? Financial decision making? Medical decision making? Shopping? Cooking? Cleaning?
When will the average human life span grow faster than one year per year? (Will AI drive medicine?) Robot care and engagement in nursing homes?
AI and war? AI and privacy? AI and income distribution?
Others? Comments or Questions?
What is the prob we will all look back at these questions in 25 years and see them as naively optimistic? Seems likely (but other things will happen faster than we expect).