NLP. Quiz 1
Intro to NLP and Deep Learning. Word embeddings.
Some questions may not be mentioned explicitly in the lecture, but you can still use logic and Google.
* Indicates required question
Email
*
Github
*
What are the advantages of deep learning approach over classical machine learning approach?
2 points
It works well with almost raw data and requires much less feature engineering
Deep learning models have higher capacity
It works better with complex feature representations: many categorical and continuous variables
It always performs better given the same dataset
Models are faster
Other:
Should one have domain-specific knowledge in, say, pharmacology to predict possible drugs for a given disease using deep learning?
1 point
yes, one should have Ph.D. in pharmacology
no, it's not necessary
What is the main difficulty of processing natural language?
1 point
The need to build complex formal models
Because one cannot use gradient descent methods (cost function is not differentiable)
Because resolving ambiguity in language requires understanding the context
How many verbs are in the sentence: "Can you can a can as a canner can can a can?"
1
2
3
4
What is the most likely solution of the equation word2vec("king") + word2vec("woman") - word2vec("man") = x?
2 points
word2vec("queen")
a vector that is close, but not equal, to word2vec("queen")
Other:
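The second option can be checked on a toy vocabulary: the analogy vector almost never coincides with an existing embedding, so in practice one searches for the word whose vector is nearest to x by cosine similarity. A minimal sketch with made-up 4-dimensional vectors (the numbers are illustrative, not real word2vec output):

```python
import numpy as np

# Toy embeddings (illustrative values, not real word2vec output)
emb = {
    "king":  np.array([0.8, 0.6, 0.1, 0.9]),
    "man":   np.array([0.7, 0.1, 0.0, 0.8]),
    "woman": np.array([0.6, 0.2, 0.9, 0.7]),
    "queen": np.array([0.7, 0.7, 0.95, 0.85]),
    "apple": np.array([0.0, 0.9, 0.1, 0.1]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# x = king - man + woman; x itself is NOT the embedding of any word
x = emb["king"] - emb["man"] + emb["woman"]

# Nearest neighbour among words not used in the query
best = max(
    (w for w in emb if w not in ("king", "man", "woman")),
    key=lambda w: cosine(x, emb[w]),
)
print(best)  # "queen" -- closest to x, yet emb["queen"] != x
```

Note that the result is the nearest neighbour of x, not x itself, which is why the hedged answer is the correct one.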
Let the vector representation of the word "jungle" be [-0.123 0.432 1.453 -0.003]. Which of these vectors are likely to be representations of the word "forest"?
1 point
[0 0 0 0 0 0 0 1]
[-0.120 0.410 1.312 -0.012]
[0 0 0 1 0 1]
[-0.140 0.5 1.479 0.002]
[-1.453 0.002 0.132 -0.231]
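One way to reason about this question: a plausible "forest" vector must have the same dimensionality as "jungle" (which rules out the 8- and 6-dimensional candidates outright) and point in a similar direction, which cosine similarity measures. A sketch over the three 4-dimensional candidates from the question:

```python
import numpy as np

jungle = np.array([-0.123, 0.432, 1.453, -0.003])

# The 4-dimensional candidates; the one-hot-style 8- and 6-dimensional
# vectors cannot live in the same embedding space at all
candidates = {
    "b": np.array([-0.120, 0.410, 1.312, -0.012]),
    "d": np.array([-0.140, 0.5, 1.479, 0.002]),
    "e": np.array([-1.453, 0.002, 0.132, -0.231]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

for name, v in candidates.items():
    print(name, round(cosine(jungle, v), 3))
```

The first two candidates come out almost parallel to "jungle", while the last one (the same numbers permuted) points in a quite different direction, showing that componentwise closeness, not just shared values, is what matters.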
What are the advantages of using small dense vector representations (e.g. word2vec) over large sparse vectors (e.g. TF-IDF)?
2 points
Faster to train
Better semantic and syntactic properties
More information in the vector
Better gradient flow
Linear models perform better with dense representations in practice
Other:
Check all true statements about Negative Sampling
2 points
It speeds up computation by simplifying the normalization coefficient of the softmax in the CBOW model
It greatly reduces number of iterations required to reach convergence in Skip-Gram model
It works better to sample words from the unigram distribution (word frequency) raised to the power 3/4 than to the power 1
It works best to sample words with uniform distribution
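The 3/4-power distribution from the third option is easy to sketch: raise raw unigram counts to the power 0.75 and renormalise. This flattens the distribution, so frequent words are drawn somewhat less often than their raw frequency and rare words somewhat more. A minimal illustration with made-up counts:

```python
import numpy as np

# Raw unigram counts for a toy vocabulary (illustrative numbers)
counts = {"the": 1000, "cat": 50, "sat": 40, "mat": 10}

words = list(counts)
freq = np.array([counts[w] for w in words], dtype=float)

# Negative-sampling distribution: unigram frequency ** 0.75, renormalised
p = freq ** 0.75
p /= p.sum()

# Draw 5 negative samples from the flattened distribution
rng = np.random.default_rng(0)
negatives = rng.choice(words, size=5, p=p)
print(dict(zip(words, p.round(3))))
```

Comparing p against the raw frequencies freq / freq.sum() shows the effect: the probability of "the" drops below its raw share, while the probability of "mat" rises above its raw share.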
Which of the following tasks can be used for *intrinsic* word vector evaluation?
2 points
Sentiment analysis
Semantic analogies
Part of speech tagging
Named entity recognition
Syntactic analogies
Correlation with human evaluation of word similarity
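The last option is typically measured with Spearman rank correlation between human similarity judgements and the model's cosine similarities over the same word pairs. A self-contained sketch with hypothetical scores (both arrays are made up for illustration; Spearman is computed as the Pearson correlation of ranks, ignoring ties):

```python
import numpy as np

# Hypothetical human similarity judgements for 5 word pairs (scale 0-10)
# and the model's cosine similarities for the same pairs
human = np.array([9.1, 7.5, 3.2, 1.0, 8.0])
model = np.array([0.85, 0.70, 0.30, 0.05, 0.74])

def ranks(x):
    # Rank positions from smallest to largest (no tie handling needed here)
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(len(x))
    return r

def spearman(a, b):
    # Spearman = Pearson correlation of the rank vectors
    ra, rb = ranks(a), ranks(b)
    return np.corrcoef(ra, rb)[0, 1]

print(round(spearman(human, model), 3))  # 1.0: identical orderings
```

Only the ordering matters: the two arrays here rank the pairs identically, so the correlation is a perfect 1.0 even though the scales differ.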
Your questions about the lecture (if any, you may write in Russian as well)
1 point
Any suggestions on how to make this course better?