NLP 2021 @ BGU.CS
Lecture #10 / Quiz #09
Consider the basic Viterbi algorithm:

Input: a sentence x_1 ... x_n, parameters q(s | u, v) and e(x | s).

Definitions: Define K to be the set of possible tags. Define K_{-1} = K_0 = {*}, and K_k = K for k = 1 ... n.

Initialization: Set π(0, *, *) = 1.

Algorithm:
For k = 1 ... n, for u ∈ K_{k-1}, v ∈ K_k:
    π(k, u, v) = max_{w ∈ K_{k-2}} ( π(k − 1, w, u) × q(v | w, u) × e(x_k | v) )

Return max_{u ∈ K_{n-1}, v ∈ K_n} ( π(n, u, v) × q(STOP | u, v) )

What is the complexity of this algorithm, which produces a sequence of n tags for a sentence of n words over a tag set of size |K|?
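For reference, a minimal Python sketch of the recurrence above (not part of the quiz): the callables q and e stand in for q(v | w, u) and e(x | v), K is the tag set, and the back-pointer bookkeeping for recovering the tag sequence is an added assumption.

```python
# Minimal sketch of the trigram Viterbi recurrence above.
# q(v, w, u) ~ q(v | w, u); e(x, v) ~ e(x | v); K is the set of tags.
def viterbi(xs, K, q, e):
    n = len(xs)

    def tags(k):                          # K_k: {*} for k <= 0, K otherwise
        return {"*"} if k <= 0 else K

    pi = {(0, "*", "*"): 1.0}             # Initialization: pi(0, *, *) = 1
    bp = {}                               # back-pointers for sequence recovery
    for k in range(1, n + 1):
        for u in tags(k - 1):
            for v in tags(k):
                # pi(k, u, v) = max over w in K_{k-2}
                best_w, best = max(
                    ((w, pi[(k - 1, w, u)] * q(v, w, u) * e(xs[k - 1], v))
                     for w in tags(k - 2)),
                    key=lambda pair: pair[1],
                )
                pi[(k, u, v)] = best
                bp[(k, u, v)] = best_w
    # Return step: max over u in K_{n-1}, v in K_n of pi(n, u, v) * q(STOP | u, v)
    (u, v), score = max(
        (((u, v), pi[(n, u, v)] * q("STOP", u, v))
         for u in tags(n - 1) for v in tags(n)),
        key=lambda pair: pair[1],
    )
    seq = [u, v]
    for k in range(n, 2, -1):             # walk back-pointers to position 1
        seq.insert(0, bp[(k, seq[0], seq[1])])
    return seq[-n:], score                # trim leading '*' when n == 1
```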
In the Maximum Entropy Markov Model (MEMM), we model the sequential tagging problem of finding a sequence of tags given a sequence of words with the following independence assumption:

Chain rule: p(t_1, t_2, ..., t_n | x_1, ..., x_n) = Π_{i=1..n} p(t_i | t_1, ..., t_{i−1}, x_1, ..., x_n)

Independence assumption: p(t_i | t_1, ..., t_{i−1}, x_1, ..., x_n) = p(t_i | t_{i−1}, x_1, ..., x_n)

In addition, we model the conditional distribution using a log-linear distribution with a feature extraction function Φ and parameters w.

What are the parameters of the feature extraction function Φ?
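To make the local, per-position nature of Φ concrete, here is a toy feature extractor in the usual MEMM style; the signature and the feature templates are illustrative assumptions, not the quiz's answer key.

```python
# Toy MEMM-style local feature extractor: one sparse feature map per
# tagging decision. The argument list is an assumption for illustration.
def phi(words, i, t_prev, t):
    return {
        f"word={words[i]}+tag={t}": 1.0,          # emission-like feature
        f"prev_tag={t_prev}+tag={t}": 1.0,        # transition-like feature
        f"suffix3={words[i][-3:]}+tag={t}": 1.0,  # lexical-shape feature
    }
```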
Write the formula of the log-linear model (using the softmax function) with feature function Φ and parameters w:

p(t_i | t_{i−1}, x_1, ..., x_n) = softmax(__________________________________________)
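As a reminder, the softmax itself is a generic normalizer over per-tag scores (this is its definition, not the filled-in blank); a minimal Python version:

```python
import math

# Generic softmax: turns a vector of real-valued scores (one per
# candidate tag) into a probability distribution over the tags.
def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [z / total for z in exps]
```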
Is decoding necessary in the MEMM approach? Why?
In the CRF model, we model the sequence tagging problem with a global feature function and parameters w as follows:

x = (x_1, ..., x_n)
t = (t_1, t_2, ..., t_n)

p(t | x, w) = softmax( w · Φ(x, t) )

What are the two computational challenges that must be overcome to solve such a model?
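To see where the computational difficulty comes from, note that the softmax normalizer in p(t | x, w) sums over every possible tag sequence. The brute-force sketch below (function and variable names are assumptions for illustration) makes the |K|^n blow-up explicit:

```python
import itertools
import math

# Brute-force CRF log-normalizer: log Z(x) = log of the sum over all
# tag sequences t of exp(w . Phi(x, t)). Enumerates |K|**len(x)
# sequences, so it is feasible only for toy inputs; w is a dict of
# feature weights and phi_global returns a sparse global feature map.
def log_Z_bruteforce(x, K, w, phi_global):
    return math.log(sum(
        math.exp(sum(w.get(f, 0.0) * v for f, v in phi_global(x, t).items()))
        for t in itertools.product(K, repeat=len(x))
    ))
```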