Artificial Intelligence Questions (AI)

(Spring 2015 - Abbeel, Klein, Darrell, Bajcsy):
Part 1 questions; Part 2 is currently based on previous years' CS 189 finals (pdf)

(Spring 2014 - Bajcsy):

Questions were regarding:
AI: Constraint satisfaction search;
Belief networks and dynamic networks;
Principles of stereo vision.

(Spring 2013 - Abbeel* & Bartlett):

1. MDPs
What is an MDP? Formulate elevator control as an MDP. Describe either value iteration or policy iteration. Explain how partial observability can be handled.
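
For reference, the value-iteration part of this question can be sketched in a few lines; the two-state MDP below (states, actions, transitions, rewards, discount) is invented purely for illustration.

```python
# A minimal value-iteration sketch; the toy MDP below is made up.
# T[s][a] = list of (probability, next_state, reward) outcomes
T = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9

def q_value(V, s, a):
    # expected one-step return of action a in state s under values V
    return sum(p * (r + gamma * V[s2]) for p, s2, r in T[s][a])

V = {s: 0.0 for s in T}
for _ in range(200):  # apply the Bellman optimality update until (near) convergence
    V = {s: max(q_value(V, s, a) for a in T[s]) for s in T}

# extract the greedy policy from the converged values
policy = {s: max(T[s], key=lambda a: q_value(V, s, a)) for s in T}
```

Policy iteration instead alternates policy evaluation (a linear solve) with greedy policy improvement; partial observability replaces states with belief states.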

2. Kernel methods
Formulate linear regression as a maximum likelihood learning problem. How would you compute a MAP estimate with a Gaussian prior on the parameters? Explain how these two regression methods can be kernelized.
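
For the kernelization part, a possible sketch is kernel ridge regression: ridge regression is the MAP estimate under a Gaussian prior on the weights, and the representer theorem lets the predictor be written purely in terms of kernel evaluations. The toy data and RBF kernel below are made up.

```python
# Kernel ridge regression sketch (MAP estimate with a Gaussian weight prior,
# kernelized via the representer theorem). Toy 1-D data is invented.
import numpy as np

def rbf(X1, X2, gamma=1.0):
    # Gaussian (RBF) kernel matrix between two sets of 1-D inputs
    d = X1[:, None] - X2[None, :]
    return np.exp(-gamma * d ** 2)

rng = np.random.default_rng(0)
X = np.linspace(0.0, 2 * np.pi, 30)
y = np.sin(X) + 0.1 * rng.standard_normal(30)

lam = 0.1  # ridge penalty (plays the role of the prior precision)
alpha = np.linalg.solve(rbf(X, X) + lam * np.eye(len(X)), y)  # dual coefficients

def predict(x_new):
    # f(x) = sum_i alpha_i k(x_i, x): no explicit feature vectors needed
    return rbf(np.atleast_1d(np.asarray(x_new, dtype=float)), X) @ alpha

err = np.max(np.abs(predict(X) - np.sin(X)))  # fit quality on the training range
```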

3. Vision
Explain how to recover depth from stereo images. Explain an algorithm for camera calibration. Explain an algorithm for object recognition.

4. Statistical language processing
What is a PCFG? Explain how you could estimate a PCFG from data. What are N-gram language models and how can they be estimated from data?
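
The N-gram estimation part can be sketched directly as relative-frequency counting; the two-sentence corpus below is made up.

```python
# Maximum-likelihood estimation of a bigram language model:
# P(w | prev) = count(prev, w) / count(prev). Corpus is invented.
from collections import Counter

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]

unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence + ["</s>"]  # sentence-boundary markers
    unigrams.update(tokens[:-1])            # contexts (everything but </s>)
    bigrams.update(zip(tokens[:-1], tokens[1:]))

def p(word, prev):
    # relative-frequency estimate; real systems smooth (add-k, Kneser-Ney, ...)
    return bigrams[(prev, word)] / unigrams[prev]
```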

5. Graphical models
Consider a directed graphical model with discrete variables. Describe an algorithm for parameter estimation and discuss its computational complexity. How could you deal with missing data and parameter sharing?

(Spring 2012 Jordan and Abbeel):

1. Reinforcement Learning
• What is an MDP? Give an example.

• What is value iteration? How do you handle large state spaces---too large to enumerate, or even continuous?

• How would you handle a utility function U = max_t R_t [rather than the usual \sum_t \gamma^t R_t]?

• How would you handle it if the agent, instead of simply choosing an action at each time step, must first flip a coin: if heads, a random action is taken; if tails, the agent can choose the action.

2. Computer Vision
• Describe image formation equations under the pinhole camera model

• How can one perform edge detection?

• What is the Gaussian pyramid, what is it used for?

• Explain aliasing (1-D signals is fine). Talk both about time/spatial domain and frequency domain.

3. Graphical models
• Draw an undirected graphical model over 3 variables, A, B, C, where there is an edge between A and B, and an edge between B and C. What (conditional) independencies are implied by this model? What does that conditional independence mean --- is it true for all distributions that can be encoded by this model? Are there distributions that can be encoded with this model that have more independencies than just the one you already listed? Give an example of a distribution that matches this model. What if we marginalize out B --- are A and C independent in the resulting distribution?

• Draw a graphical model for a mixture of Gaussians. Write the equation for the probability density. Write the likelihood for this model. Is the likelihood bounded?
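
For the last part, a small numerical sketch (1-D, made-up parameters): the mixture density is p(x) = sum_k pi_k N(x; mu_k, sigma_k), and the likelihood is not bounded, since shrinking one component's variance onto a data point drives it to infinity.

```python
# Mixture-of-Gaussians density in 1-D (parameters invented). Shrinking one
# component's sigma onto a data point makes the density there blow up, which
# is why the GMM likelihood is unbounded and EM needs safeguards.
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x, params):
    # params: list of (weight, mean, std); weights must sum to 1
    return sum(w * normal_pdf(x, mu, s) for w, mu, s in params)

params = [(0.3, -2.0, 1.0), (0.7, 3.0, 0.5)]

# shrink one component's std onto the point x0: the density at x0 diverges
x0 = -2.0
collapsing = [mixture_pdf(x0, [(0.5, x0, s), (0.5, 3.0, 1.0)]) for s in (1.0, 0.1, 0.001)]
```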

4. NLP
• What is a stochastic context-free grammar? Say you are given the Penn Treebank; how do you learn such a grammar? What if you want to learn the grammar from data that is not annotated with parses?

• What is compositional semantics?

(Fall 2010 - Klein & Malik):
"Language:
-- Express the ASR problem as an HMM.  What are the features and state space?
-- Imagine you have a sequence of words and a PCFG in CNF.  Define the quantity computed by the CKY algorithm and give the recurrence used by the algorithm.
-- How would you jointly do the ASR and parsing?  Explain how to merge the two algorithms.
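
A probabilistic-CKY sketch for the second item, on a made-up CNF grammar: chart[i][j][X] is the probability of the best parse of words[i:j] rooted at X, and the recurrence maximizes over split points and rules.

```python
# Probabilistic CKY on a toy PCFG in Chomsky normal form (grammar and
# sentence invented for illustration).
lexical = {"NP": {"dogs": 0.5, "cats": 0.5}, "V": {"chase": 1.0}}
binary = {"S": [("NP", "VP", 1.0)], "VP": [("V", "NP", 1.0)]}

def cky(words):
    n = len(words)
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):  # base case: lexical rules X -> w
        for X, probs in lexical.items():
            if w in probs:
                chart[i][i + 1][X] = probs[w]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for X, rules in binary.items():
                    for Y, Z, p in rules:      # rule X -> Y Z with prob p
                        if Y in chart[i][k] and Z in chart[k][j]:
                            cand = p * chart[i][k][Y] * chart[k][j][Z]
                            if cand > chart[i][j].get(X, 0.0):
                                chart[i][j][X] = cand
    return chart[0][n]

best = cky("dogs chase cats".split())  # best["S"] is the Viterbi parse probability
```

Replacing the max over split points with a sum gives the inside probability instead of the Viterbi score.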

Vision:
-- Define texture.
-- Explain how to segment an image by texture.  What is the key reason simple methods like convolving with basic filters don't work? What is the solution?  How does segmentation work?
-- Explain how to classify a region given reference examples of various textures.  What would the features be, etc?

Graphical Models / Search:
-- [Example GM, questions about what is conditionally independent of what]
-- State the Bayes ball (or d-separation) algorithm as a state-space search problem.  What are the states, successor function, and so on?
-- Give an admissible heuristic for telling whether a source and target variable are conditionally independent given evidence.

Reinforcement Learning:
-- State Blackjack as an MDP.  What are the states, rewards, etc?
-- What is a value?  How do you compute the values in a known MDP.
-- Imagine you did not know the transition function for Blackjack (or didn't want to compute it).  Describe how to determine values of a policy.
-- What is q-learning?  What is the difference between on- and off-policy learning?  What is the purpose of an epsilon-greedy policy?"
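
A tabular Q-learning sketch for the last item, on a made-up 4-state chain where "right" moves toward a terminal state paying reward 1. The max in the update target tracks the greedy policy even while the epsilon-greedy behavior policy explores, which is what makes Q-learning off-policy.

```python
# Tabular Q-learning with an epsilon-greedy behavior policy (toy chain MDP
# invented for illustration).
import random

random.seed(0)
N_STATES, ACTIONS = 4, ("left", "right")
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == "right" else max(s - 1, 0)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

for _ in range(500):  # episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: explore with probability eps, otherwise act greedily
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r = step(s, a)
        not_done = s2 != N_STATES - 1
        target = r + gamma * not_done * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])  # temporal-difference update
        s = s2

greedy = [max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(N_STATES - 1)]
```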

(Spring 2009 - Klein & Jordan):
"1) What is a CSP?  Define formally what a constraint is, etc.  What is
arc consistency and why would we want it?  Give an algorithm for
establishing it.
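
An AC-3 sketch for the arc-consistency part, on a made-up CSP (three integer variables with A < B and B < C). Revising an arc (x, y) deletes values of x with no supporting value in y's domain; any deletion requeues the arcs pointing at x.

```python
# AC-3 on a toy CSP (variables, domains, and constraints invented).
from collections import deque

domains = {"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {1, 2, 3}}
constraints = {("A", "B"): lambda a, b: a < b,
               ("B", "C"): lambda b, c: b < c}

def relation(x, y):
    # the constraint between x and y, oriented as (value of x, value of y)
    if (x, y) in constraints:
        return constraints[(x, y)]
    if (y, x) in constraints:
        c = constraints[(y, x)]
        return lambda a, b: c(b, a)
    return None

def ac3():
    arcs = deque((x, y) for x in domains for y in domains
                 if x != y and relation(x, y) is not None)
    while arcs:
        x, y = arcs.popleft()
        rel = relation(x, y)
        # revise: drop values of x with no support in y's current domain
        dead = {a for a in domains[x] if not any(rel(a, b) for b in domains[y])}
        if dead:
            domains[x] -= dead
            for z in domains:  # re-examine neighbors whose support may be gone
                if z not in (x, y) and relation(z, x) is not None:
                    arcs.append((z, x))

ac3()  # on this particular CSP, arc consistency alone solves it outright
```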

2) What is PCA?  What is kernel PCA?  Why would we use it?  What are
the inputs and outputs?  Give the algorithm.

3) What is the EM algorithm?  What is it for?  What does it do,
formally?  How does it work for a bivariate Gaussian with missing
components?  What sufficient statistics?

4) Describe how a statistical machine translation system works.  What
data does it require?  How are the models learned?  How does decoding
work?

5) Describe how depth can be recovered from an image.  What about
without stereo or motion?"

(Spring 2007 - Bartlett & Jordan):
learning, vision, language and graphical models:

1. What is an MDP?
Explain how to formulate the game of Tetris as an MDP.
Describe an algorithm for finding the optimal policy for this game.
Suggest an efficient approximate approach that might be more effective in practice.

2. What is image segmentation, and why might one want to segment an image?

3. What is part-of-speech tagging?
Suggest two approaches to the part-of-speech tagging problem.

4. What is the junction tree algorithm?
Given a parameterized graphical model, and given observations of a subset of
nodes of the model, describe a general approach to maximum likelihood parameter
estimation for the model.  What role might the junction tree algorithm play in
this procedure?"

(Fall 2006 - Klein & Malik):
"General AI:
-- What is an MDP?  Define it formally.
-- What are the Bellman equations
-- What is the value iteration algorithm?
-- What is Q-learning?

Learning:
-- Define linear regression as an optimization problem.
-- What is a kernel and why are they used?
-- Can you kernelize linear regression?  How?

Language:
-- Describe a chain of processing from a wave file to first-order logic.
-- How does ASR work?  What is an HMM in general?  What is the Viterbi algorithm?
-- How does parsing work?  What is an efficient algorithm for probabilistic parsing?
-- How might semantic translation to first-order logic work?
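
A Viterbi sketch for the ASR item, on a toy HMM (states, probabilities, and observations all made up): the dynamic program keeps, for each state, the probability of the best path ending there, plus backpointers to recover the argmax sequence.

```python
# Viterbi decoding on a toy HMM (all parameters invented).
states = ("Rainy", "Sunny")
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def viterbi(obs):
    delta = {s: start[s] * emit[s][obs[0]] for s in states}  # best-path probs
    back = []
    for o in obs[1:]:
        prev = {s: max(states, key=lambda p: delta[p] * trans[p][s]) for s in states}
        delta = {s: delta[prev[s]] * trans[prev[s]][s] * emit[s][o] for s in states}
        back.append(prev)
    best = max(states, key=lambda s: delta[s])
    path = [best]
    for ptr in reversed(back):  # follow backpointers from the end
        path.append(ptr[path[-1]])
    return path[::-1], delta[best]

path, prob = viterbi(["walk", "shop", "clean"])
```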

Vision:
-- Define the object recognition problem.  What makes it hard?
-- How might you segment and classify a zip code from a picture of an envelope?
-- What is edge detection, and how does it work?"

(Fall 2005 - Klein & Jordan):
"General AI:
1. Robot motion planning
a.  Draw a configuration space for a 2D robot arm with rotational shoulder and elbow joints.
Is the mapping from configurations to workspace positions of the actuator invertible?
b.  If the shoulder joint is suspended from the ceiling, how does the free space change?
c.  Assume now there's a floor with a hole in it that the arm can move obstacles into.  Why is
the free space no longer convex?
2. Search
a.  How could I use state-space search to plan paths for the robot discussed before, assuming
there is no uncertainty in the robot's movements and the obstacles are fully known?  What
are the drawbacks to the method you proposed?
b.  How could I use A* search?  What would be an admissible heuristic?
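
An A* sketch for part b, on a small made-up grid: with unit step costs, Manhattan distance to the goal never overestimates the true cost, so it is an admissible heuristic and A* returns shortest paths.

```python
# A* on a toy grid with obstacles (layout invented).
import heapq

grid = ["....",
        ".##.",
        "....",
        "...."]  # '#' marks an obstacle

def astar(start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    best_g = {start: 0}
    while frontier:
        f, g, (r, c) = heapq.heappop(frontier)
        if (r, c) == goal:
            return g  # cost of a shortest path
        for r2, c2 in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= r2 < len(grid) and 0 <= c2 < len(grid[0])
                    and grid[r2][c2] != "#"
                    and g + 1 < best_g.get((r2, c2), float("inf"))):
                best_g[(r2, c2)] = g + 1
                heapq.heappush(frontier, (g + 1 + h((r2, c2)), g + 1, (r2, c2)))
    return None  # goal unreachable

cost = astar((0, 0), (3, 3))
```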

Statistical Learning Theory:
1. Kernels
a.  What is Fisher discriminant analysis (aka linear discriminant analysis)?
b.  Can it be kernelized?  What does that mean?  How do you know that the
parameter vector will be a linear combination of transformed data points?
2. Parameter estimation
a.  Show how logistic regression can be motivated from a generative
perspective.  In particular, what assumptions on the class-conditional
densities do you need?
b.  Given a specific generative model, describe two ways to estimate
parameters for the resulting logistic regression, using maximum likelihood
in both cases.  Will they give the same results?
3. Graphical models
a.  What is the treewidth of a graph?  What does it have to do with the
complexity of inference?
b.  What is a junction tree?  How is it different from an arbitrary tree of cliques?

Language and Speech
1. PCFGs
a.  What is a PCFG?  How could one be used in speech recognition?
b.  Sketch an algorithm for finding the best parse or the sum of all parses for a sentence.
What are the time and space complexities of your algorithm and why?
2. Semantic Translation
a.  Write a reasonable translation of "Tom likes Amy" into FOPL.  How could you derive
this expression compositionally from a parse tree?
b.  How about "Everyone likes Amy"?  "Everyone likes someone"?
c.  How might we represent tense and aspect in this process?

Vision
1. Shape detection
a.  What is a Hough transform?  How can it be used for segmentation?
2. Depth
a.  What are some techniques for finding depth in images?"
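
A Hough-transform sketch for the shape-detection question: each point votes for the quantized (theta, rho) parameters of every line through it, and peaks in the vote accumulator correspond to detected lines. The point set below (a diagonal line plus two outliers) is made up.

```python
# Hough transform for line detection on an invented point set.
import math
from collections import Counter

points = [(i, i) for i in range(10)] + [(3, 7), (8, 1)]  # line y = x, plus noise

acc = Counter()
for x, y in points:
    for t in range(0, 180, 5):  # theta in degrees, coarse bins
        theta = math.radians(t)
        rho = round(x * math.cos(theta) + y * math.sin(theta))  # quantized distance
        acc[(t, rho)] += 1

(theta_deg, rho), votes = acc.most_common(1)[0]  # strongest line
```

The line y = x satisfies x cos(135°) + y sin(135°) = 0, so the peak lands at (135, 0) with one vote per collinear point.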

(Spring 2005 - Russell & Wilensky):
"1. Vision
Define object recognition and explain why it's hard.
How might learning help?
Why might it be important for higher-level visual
hypotheses to inform lower-level processes such
as edge detection?

2. KR
What are the principal differences between propositional and
first-order formalisms?
Can you write the rules of chess in propositional logic?
Are graphical models propositional? Can we make them first-order?

3. Planning
How might actions be described in first-order logic?
What are the benefits and drawbacks to such a representation?
What are some alternatives?
Are there domain-independent heuristics for planning?

4. Language, speech
What are some important types of ambiguity that occur in
natural language use?
Pick one of these and explain some approaches to addressing it.
What are some examples of semantic ambiguity other than
lexical semantic ambiguity?
What sorts of ambiguity arise in speech? How do speech systems
resolve those ambiguities?

5. Robotics
How can a robot get its hand from here to there?
What are some of the problems that one would encounter in designing
a robot to perform some task in a new environment, such as finding a
telephone in an unfamiliar house? Solutions?

6. Learning
Suppose you wanted to get a machine to play Tetris.
How might you formulate and solve this as a learning problem?
Ditto for Battleship."

(Fall 2004 - Feldman & Klein):
"Language:
1. What are some levels of structure in natural languages, and what formal models
are used for them?
2. Why might we want to model syntax as something other than a CFG?  Name some
options and reasons.

Search:
1. What is state-space search?  Describe a general heuristic approach to search.
2. What is A* search, what is admissibility, and why is it important? What would
happen if we used an inadmissible heuristic?
3. Contrast state-space search and dynamic programming.  Can they be combined?

Learning:
1. Describe your favorite classifier.  What objective function does it maximize,
and what are its properties?  What is its mechanism for preventing overfitting?
2. When might your classifier be inappropriate?  What are its weaknesses? What would

Vision:
1. For 2D images, suggest a domain where automatic visual recognition is successful.
What representations and techniques are used?
2. For 3D, do the same."

(Spring 2004 - Jordan & Russell):
"1. What is a planning problem and how could one solve such problems using
a satisfiability algorithm?

2. What is boosting and how does it work? What formal properties
of boosting can be shown?

3. Consider the problem of moving a robot arm from one place to another.
How can such problems be formulated and solved?

4. What is the part-of-speech tagging problem and how can it be solved?

5. Explain the EM algorithm for hidden Markov models. Is forward-backward
an elimination algorithm?"

(Fall 2003 - Feldman & Wilensky):
"1. Vision
Motion is a major problem for visual systems.
Describe how Kalman filters can be used in tracking moving objects.
How would you handle the problem of data association?
To what extent is the problem of tracking people solved?
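
A scalar Kalman-filter sketch for the tracking question (model and noise levels made up): each step predicts (the variance grows by the process noise) and then corrects toward the measurement with the Kalman gain, so filtered estimates beat raw measurements on average.

```python
# Scalar Kalman filter tracking a slowly drifting 1-D position (toy model).
import random

random.seed(1)
q, r = 0.01, 1.0                 # process / measurement noise variances
x_true, x_est, p = 0.0, 0.0, 1.0
err_raw, err_filt = [], []

for _ in range(200):
    x_true += random.gauss(0.0, q ** 0.5)     # true state: random-walk drift
    z = x_true + random.gauss(0.0, r ** 0.5)  # noisy measurement
    p += q                                    # predict: uncertainty grows
    k = p / (p + r)                           # Kalman gain in [0, 1]
    x_est += k * (z - x_est)                  # correct toward the measurement
    p *= 1.0 - k                              # posterior variance shrinks
    err_raw.append(abs(z - x_true))
    err_filt.append(abs(x_est - x_true))

mean_raw = sum(err_raw) / len(err_raw)
mean_filt = sum(err_filt) / len(err_filt)
```

Data association (deciding which measurement belongs to which track) is a separate problem layered on top, e.g. nearest-neighbor gating on the predicted position.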

2. Language
What is the problem of reference resolution?
Can it be divided into subproblems that may be subject to different constraints?
Does the need to address this problem have representational consequences?
What are some possible algorithms to address this problem?

3. Speech recognition
What are hidden Markov models, and why and how are they used in speech recognition?
How are HMMs trained?
What other components are needed in a speech recognition system?

A current speech recognition system, given two utterances, transcribed them as follows:
1.	“One carrot cake, one carat diamond, one karat gold ring”
2.	“My cake is made with carrots. Diamonds are weighed in carrots.
Gold is weighing the in carrots.”

The sound variously transcribed as “carrot”, “carat” and “karat” is identical.
How might one account for this variation in performance?

4. Logic and planning
Assuming a STRIPS-like representation, how might we represent a simple
operator, e.g., "Go" (i.e., the robot goes from one location to another).
Such a representation is generally preferred to a "pure" logical formulation.
Why?  Compare state-based and plan-based approaches to problem solving.
Suppose that our robot could pick something up and take it with it.
How might we reason now about the consequences of actions?
How is the planning process complicated?

5. Search
Propose an approach to solving cryptarithmetic problems.
Within this approach, what are some different strategies one can exploit?

6. Learning
Is there a class of problems for which learning is not appropriate?"

(Spring 2003 - Feldman & Russell):
"1. Vision
What is segmentation and what role does it play in the overall
vision problem? [Explain with reference to a specific image.]
How is segmentation done?
What is the connection between segmentation and edge detection?
How is edge detection done?

2. Language
What is unification grammar? What basic advantage does it
offer over standard context-free grammars?
Give an example of a unification grammar rule that
incorporates number agreement.
What are some limitations of unification grammar?

3. Speech recognition
What is a language model and how is it used in speech?
Describe the probability models used in statistical speech recognition.
Describe the EM algorithm and how it is used to train these models.

4. Logic and planning
Describe an efficient SAT algorithm. Is your algorithm complete?
Explain how to use a SAT algorithm to solve a STRIPS planning
problem such as the following:
Go(x,y):
Preconditions: At(x)
Effects: ~At(x), At(y)
Initial state: At(A)
Goal state: At(B)
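
A sketch of a complete SAT procedure (DPLL with unit propagation), applied to a hand-rolled propositional encoding of this one-step Go instance; the variable numbering and the exact axioms below are invented for illustration.

```python
# DPLL: a complete backtracking SAT solver with unit propagation. Literals
# are signed ints; the encoding of the Go(A,B) instance is invented:
#   1 = At(A)@0,  2 = Go(A,B)@0,  3 = At(B)@1,  4 = At(A)@1

def simplify(clauses, lit):
    # assume lit is true: drop satisfied clauses, shorten clauses with -lit
    out = []
    for c in clauses:
        if lit in c:
            continue
        if -lit in c:
            c = [l for l in c if l != -lit]
            if not c:
                return None  # empty clause: contradiction
        out.append(c)
    return out

def dpll(clauses, assignment=()):
    while True:  # unit propagation
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
        assignment += (unit,)
    if not clauses:
        return assignment  # every clause satisfied
    lit = clauses[0][0]  # branch: trying both polarities keeps this complete
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, assignment + (choice,))
            if result is not None:
                return result
    return None

cnf = [[1],          # initial state: At(A)@0
       [3],          # goal: At(B)@1
       [-2, 3],      # action effect: Go -> At(B)@1
       [-2, -4],     # action effect: Go -> not At(A)@1
       [-2, 1],      # action precondition: Go -> At(A)@0
       [-3, 2]]      # explanatory frame axiom: At(B)@1 only if Go occurred
model = dpll(cnf)    # a satisfying assignment encodes the plan (Go is true)
```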

5. Search and robotics
Describe an algorithm for finding the shortest path to a
known goal location in a deterministic grid world with obstacles.
What if the map of the grid world is not available?
How can a real robot build a map of a real environment?

6. Learning
Explain decision tree learning
Explain how to combine this with boosting
Explain the problems involved in learning Bayesian networks
from data, distinguishing between Bayesian and maximum-likelihood
approaches."

(Spring 2002 - Jordan & Russell):
"See (http://www.cs.berkeley.edu/~daf/AI-prelim/AI-prelim.html) for the questions."

(Fall 2001 - Feldman & Wilensky):
"1) What are computational methods for going from 2D images to
models of the 3D world?

3) Discuss state-space and plan-space planning techniques.

4) Compare Neural Networks and Bayesian Networks.

5) Discuss the feasibility of Context Free Grammars for Natural
Language processing."

(Fall 1999 - Jordan & Russell):
"Q1: search as a solution to robot path planning
b. heuristics
c. what if you can't assume polygonal obstacles (configuration space)?

Q2: belief networks
a. are diagnostic or causal queries easier?
b. can we remove a node from the network for a given query?
c. complexity of variable elimination for diagnostic/causal

Q3: MDPs: what are two ways of solving them / pros-cons?
a. what if they are not observable?
b. what is the size of the search space (unobservable)?
c. why should we assume the Markov property holds in belief space?

Q4: MDPs vs. DBNs
a. complexity of inference (forward algorithm)
b. what if the model is a mixture of gaussians?  complexity of forward
chaining then?  what if the gausisian nodes have discrete parents?  what
then?

Q5: MDPs vs. reinforcement learning
a. what are the differences?
b. what are some strategies for reinforcement learning?"

(Fall 1998 - Malik & Wilensky):
"Q1: Draw a belief network with two causal nodes A and B, and one
symptom node C.  What are the independence relationships in the network?
Number of parameters in the CPT?  Write a joint probability.  What will
happen if we reverse some arrows (can we still capture the same
independence relationships)?

Q2: Draw a belief network for "smoking", "lung cancer", "coughing",
"positive x-ray", and "black lung".  How do you do diagnostic query?
What do you do when the polytree algorithm doesn't work here?  Suppose we use
clustering.  Is there any systematic way of deciding which nodes to
merge?

Q3: Construct a neural network for the example in the last question.
Compare the neural nets with belief nets, both domain independent and
domain specific properties.  Why do people use sum-of-square error
instead of sum-of-absolute error in neural net training?

Q4: What is non-linear planning?  Why is that better than linear
planning?  Illustrate resolving threats with promotion and demotion.
What other problems does planning have in real-world situations? How do we
cope with uncertainties?  Can you incorporate probability into POP? Any
other ways of using probability?  Do you get a plan out of a Markov
decision process? How do you find such a policy?

Q5: What is version space? Compare version space to neural nets,