Overview

  • After talking about human language and word meaning, we’ll introduce the ideas of the word2vec algorithm for learning word meaning.
  • From there, we’ll concretely work through how you can work out the gradients of the word2vec objective function.
  • Next, we’ll look at how optimization works.
  • Finally, we’ll develop a sense of how these word vectors work and what you can do with them.
  • Key takeaway
    • The really surprising result is that word meaning can be represented, not perfectly but rather well, by a large vector of real numbers (see the sketch below). This gives a sense of how amazing deep learning word vectors are.
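
As a taste of that takeaway, here’s a minimal sketch with made-up toy vectors (real word2vec vectors are learned from data and typically have 100–300 dimensions, so both the numbers and the words here are purely illustrative): once word meanings live in a vector space, similarity of meaning becomes geometric closeness, which we can measure with cosine similarity.

import numpy as np

# Toy, hand-made 4-dimensional "word vectors" purely for illustration;
# learned embeddings would place similar words close together like this.
vectors = {
    "fantastic": np.array([0.9, 0.1, 0.8, 0.2]),
    "great":     np.array([0.8, 0.2, 0.7, 0.3]),
    "mediocre":  np.array([0.1, 0.9, 0.2, 0.8]),
}

def cosine(u, v):
    # Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal ones.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(vectors["fantastic"], vectors["great"]))     # high, ~0.99
print(cosine(vectors["fantastic"], vectors["mediocre"]))  # low,  ~0.33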

What do you hope to learn in this course?

  • Okay, so quickly: what do we hope to teach in this course? We’ve got three primary goals.
  • The first is to teach you the foundations: a good, deep understanding of the effective modern methods for deep learning applied to NLP. We’ll start with and go through the basics, and then go on to key methods that are used in NLP: recurrent networks, attention, transformers, and things like that.
  • We want to do something more than just that. We’d also like to give you some sense of a big-picture understanding of human languages and the reasons why they’re actually quite difficult to understand and produce, even though humans seem to acquire them easily. Now, obviously, if you really want to learn a lot about this topic, you should go and start doing some classes in the linguistics department. But nevertheless, for a lot of you, this is the only human language content you’ll see during your master’s degree or whatever, and so we do hope to spend a bit of time on it, starting today.
  • And then finally, we want to give you an understanding of, and an ability to build, systems in PyTorch for some of the major problems in NLP, so we’ll look at learning word meanings, dependency parsing, machine translation, and question answering.

Neural machine translation

  • In the last decade or so, and especially in the last few years, machine translation has advanced by leaps and bounds thanks to neural networks. Machine translation powered by neural nets is referred to as “neural machine translation”.
  • This is outright amazing because, for thousands of years, learning languages was a human task that required real effort.
  • But now we’re in a world where you could just hop on your web browser and think, “Oh, I wonder what the news is in Kenya today”. You can head off over to a Kenyan website and you can see something like this (cf. image above). You can then ask Google to translate it for you from Swahili. While the translation isn’t quite perfect, it’s reasonably good.

GPT-3: A first step on the path to universal models

  • The single biggest development in NLP in the last year was GPT-3, which was a huge new model that was released by OpenAI.
  • This is an exciting development because it’s a step on the path to universal models, where you can train up an extremely large model on an extensive dataset. Such models have knowledge of how to do a variety of tasks and can be easily applied to the specific task at hand. So, we’re no longer building a model to detect spam or a model to detect foreign-language content. Instead of building separate supervised classifiers for every different task, we’ve now just built up a model that understands language and can do a variety of NLP-related tasks.

  • In the image above, on the left, the model is being prompted to write about Elon Musk in the style of Dr. Seuss. We prompted the model with some text, which led it to generate more text. The way it generates more text is literally by just predicting one word at a time (a minimal sketch of this loop appears after this list).
  • This setup yields something very powerful because what you can do with GPT-3 is you can give it examples of what you’d like it to do.
  • In the image above, on the upper right, the model is being prompted with a conversation between two people, starting with “I broke the window”. Now, change it into a question: “what did I break?”. Next, “I gracefully saved the day” - change it into a question - “what did I gracefully save?”. Feed in this prompt to GPT-3 so it understands what you’re trying to do.
  • Now, if we give it another statement like “I gave John flowers”, I can then say: GPT-3, predict what words come next, and it will follow my prompt and produce “who did I give flowers to?”. Or you can say “I gave her a rose and a guitar”, and it will follow the idea of the pattern and output “who did I give a rose and a guitar to?”.
  • This one model can do an amazing range of NLP tasks given an example of the task.
  • Let’s take the task of translating human language sentences into SQL as another example (cf. image above, lower right). You can give GPT-3 a prompt saying “how many users have signed up since the start of 2020?” and GPT-3 turns it into SQL! Or you can give it another query, “the average number of influencers each user subscribes to”, and again, GPT-3 converts it into SQL. This shows that GPT-3 learns patterns within language and is versatile across a range of tasks.
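
To make “predicting one word at a time” concrete, here’s a minimal sketch of few-shot prompting with a greedy decoding loop. GPT-3 itself is only available through OpenAI’s API, so this sketch substitutes GPT-2 from the Hugging Face transformers library, an openly available model with the same autoregressive design; the prompt text is an assumption modeled on the question-rewriting example above.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-3 isn't openly downloadable; GPT-2 is a much smaller model with the
# same autoregressive design, so it illustrates the mechanics.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# A few-shot prompt in the style of the example above.
prompt = ("I broke the window. Q: What did I break?\n"
          "I gracefully saved the day. Q: What did I gracefully save?\n"
          "I gave John flowers. Q:")

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(12):                       # generate one token at a time
        logits = model(input_ids).logits      # scores for every possible next token
        next_id = logits[0, -1].argmax()      # greedy: take the most likely one
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))

GPT-2 is far weaker than GPT-3, so it may not complete the pattern as cleanly, but the mechanics are identical: each iteration appends the single most probable next token and feeds the extended sequence back into the model.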

How do we represent the meaning of a word?

  • How do we represent the meaning of a word? Well, let’s start with: what is meaning? We can look it up in the Webster dictionary, and here’s what it says:
    • The idea that is represented by a word, phrase, etc.
    • The idea that a person wants to express by using words, signs, etc.
    • The idea that is expressed in a work of writing, art, etc.
  • The common way of thinking about word meaning is as a pairing between a word, which is a signifier or symbol, and the thing that it signifies (the signified), which is an idea or thing. So the meaning of the word “chair” is the set of things that are chairs.
  • This is referred to as denotational semantics, a term that’s also used for, and similarly applied to, the semantics of programming languages.
  • But this model isn’t really something you can directly implement: how do I go from the idea that “chair” means the set of chairs in the world to something I can use to manipulate meaning in my computer?
  • Traditionally, the way that meaning has been handled in natural language processing systems is to make use of resources like dictionaries and thesauri. In particular, a popular one is WordNet, which organizes words and terms into both synonym sets (words that can mean the same thing) and hypernyms, which correspond to “is a” relationships.
  • For the “is a” relationships, we can look at the hypernyms of “panda”: a panda is a kind of procyonid (whatever those are; the family that probably includes red pandas), which is a kind of carnivore, which is a kind of placental, which is a kind of mammal, and so you head up this hypernym hierarchy (see the sketch after this list).
  • So WordNet has been a great resource for NLP, but it’s also been highly deficient:
    • It lacks a lot of nuance. For example, in WordNet, “proficient” is listed as a synonym for “good”. Maybe that’s sometimes true, but in a lot of contexts it’s not: you mean something rather different when you say “proficient” versus “good”.
    • It’s limited as a human-constructed thesaurus. In particular, there are lots of words and lots of uses of words that just aren’t there, including anything that’s more current terminology: “wicked” is there for the Wicked Witch, but not for its more modern colloquial uses, and “ninja” certainly isn’t there for the kind of description some people apply to programmers. It’s impossible to keep up to date, since it requires a lot of human labor.
    • While it has sets of synonyms, it doesn’t really have a good sense of words that mean something similar: “fantastic” and “great” mean something similar without really being synonyms. This idea of meaning similarity is something that would be really useful to make progress on, and it’s where deep learning models excel.
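
As a concrete illustration of these resources, here’s a minimal sketch of querying WordNet through NLTK (this assumes the nltk package is installed and the WordNet corpus has been fetched once via nltk.download("wordnet")):

from nltk.corpus import wordnet as wn

# Synonym sets: every sense (synset) of "good", grouped by part of speech.
poses = {"n": "noun", "v": "verb", "s": "adj (satellite)", "a": "adj", "r": "adv"}
for synset in wn.synsets("good"):
    lemmas = ", ".join(lemma.name() for lemma in synset.lemmas())
    print(f"{poses[synset.pos()]}: {lemmas}")

# Hypernyms ("is a" relationships): walk up the hierarchy from "panda".
panda = wn.synset("panda.n.01")
hyper = lambda s: s.hypernyms()
print(list(panda.closure(hyper)))
# procyonid -> carnivore -> placental -> mammal -> vertebrate -> ... -> entity

The first query surfaces exactly the “proficient” issue described above (it appears among the synsets of “good”), and the second prints the panda hypernym chain up the hierarchy.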

Citation

If you found our work useful, please cite it as:

@article{Chadha2021Distilled,
  title   = {},
  author  = {Chadha, Aman},
  journal = {Distilled Notes for Stanford CS224n: Natural Language Processing with Deep Learning},
  year    = {2021},
  note    = {\url{https://aman.ai}}
}