One of the biggest buzzwords around Google and the overall technology market is machine learning. Google uses it with RankBrain for search and in other ways. We asked Gary Illyes from Google in part two of our interview with him how Google uses machine learning with search.
Illyes said that Google uses it mostly for “coming up with new signals and signal aggregations.” So they may look at two or more existing, non-machine-learning signals and see whether using machine learning to aggregate them can help improve search rankings and quality.
He also said “RankBrain, where… which re-ranks based on historical signals,” is another way they use machine learning. He later explained how RankBrain works and said that Penguin doesn’t really use machine learning.
Here is the full transcript:
Danny Sullivan: These days it seems like it’s really cool for people to just say machine learning is being used in everything.
Gary Illyes: and then people freak out.
Danny Sullivan: Yeah. What is it, what are you doing with machine learning? Like, so when you say it’s not being used in the core algorithm. So no one’s getting fired. The machines haven’t taken over the algorithm, you guys are still using an algorithm. You still have people trying to figure out the best way to process signals, and then what you do with the machine learning is part of that?
Gary Illyes: They are typically used for coming up with new signals and signal aggregations. So basically, let’s say that this is a random example and I don’t know if this is real, but let’s say that I would want to see if combining PageRank with Panda and whatever else, I don’t know, token frequency.
If combining those three in some way would result in better ranking, then for that, for example, we could easily use machine learning and then create the new composite signal. That would be one example.
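To make the composite-signal idea concrete, here is a minimal sketch of learning how to combine three existing signals into one. Everything in it is an assumption for illustration: the signal names, the toy training data, and the choice of logistic regression are hypothetical and are not described in the interview as Google's method.

```python
import math

# Hypothetical per-document signal values, each scaled to [0, 1].
# Tuple order: (link_score, quality_score, term_frequency).
# Label 1 means "judged relevant", 0 means "judged not relevant".
# All numbers are invented for illustration.
TRAINING = [
    ((0.9, 0.8, 0.7), 1),
    ((0.8, 0.9, 0.4), 1),
    ((0.7, 0.6, 0.9), 1),
    ((0.2, 0.3, 0.8), 0),
    ((0.3, 0.1, 0.2), 0),
    ((0.1, 0.4, 0.3), 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def composite_signal(signals, weights, bias):
    """Combine several existing signals into one learned score in (0, 1)."""
    z = bias + sum(w * s for w, s in zip(weights, signals))
    return sigmoid(z)


def fit(data, epochs=2000, lr=0.5):
    """Learn the combination weights with plain stochastic gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for signals, label in data:
            error = composite_signal(signals, weights, bias) - label
            weights = [w - lr * error * s for w, s in zip(weights, signals)]
            bias -= lr * error
    return weights, bias


weights, bias = fit(TRAINING)
strong_doc = composite_signal((0.85, 0.85, 0.6), weights, bias)
weak_doc = composite_signal((0.2, 0.2, 0.5), weights, bias)
print(f"strong: {strong_doc:.3f}  weak: {weak_doc:.3f}")
```

The learned weights play the role of the "combining those three in some way" step: the model, rather than a person, decides how much each input signal should count.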
The other example would be RankBrain, where… which re-ranks based on historical signals.
But that also is, if you, if you think about it, it’s also a composite signal.
It’s using several signals to come up with a new multiplier for the results that are already ranked by the core algorithm.
What else?
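The re-ranking described above, where several signals produce a multiplier applied to results the core algorithm has already ranked, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Result` type, the signal names, the weights, and the clamping range are all hypothetical and are not taken from the interview.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    core_score: float         # score from a hypothetical core ranking step
    historical_signals: dict  # hypothetical signal name -> value in [0, 1]


def multiplier(signals, weights):
    """Turn several signals into one multiplier; 1.0 means "no change"."""
    adjustment = sum(
        weights.get(name, 0.0) * (value - 0.5)
        for name, value in signals.items()
    )
    # Clamp so the re-ranking step can nudge, but not overwhelm, the core score.
    return max(0.5, min(1.5, 1.0 + adjustment))


def rerank(results, weights):
    """Re-order already-ranked results by core score times the multiplier."""
    return sorted(
        results,
        key=lambda r: r.core_score * multiplier(r.historical_signals, weights),
        reverse=True,
    )


weights = {"past_satisfaction": 0.6, "freshness": 0.2}
results = [
    Result("example.com/a", 0.80, {"past_satisfaction": 0.2, "freshness": 0.5}),
    Result("example.com/b", 0.75, {"past_satisfaction": 0.9, "freshness": 0.5}),
    Result("example.com/c", 0.60, {"past_satisfaction": 0.5, "freshness": 0.5}),
]
reranked = rerank(results, weights)
print([r.url for r in reranked])
```

In this toy data, the second result overtakes the first because its historical signals yield a multiplier above 1.0, while the first result's multiplier falls below 1.0.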
Barry Schwartz: Didn’t you first use it as a query refinement? Right? That’s the main thing?
Gary Illyes: I don’t know that?
Barry Schwartz: Wasn’t RankBrain all about some type of query understanding and…
Gary Illyes: Well making sure that for the query we are the best possible result basically, it is re-ranking in a way.
Barry Schwartz: Danny, did you understand RankBrain to mean, maybe it was just me, to mean, alright, someone searched for X but RankBrain really makes it into X-ish? And then the queries would be the results.
Danny Sullivan: When it first came out, my understanding was that RankBrain was being used for longtail queries, to correspond them to short answers. So somebody comes along and says, “Why is the tide super high sometimes when I don’t understand, the moon seemed to be very big,” and that’s a very unusual query, right? And Google might be going, OK, there’s a lot going on here. How do we unpack this, and to where, and then getting the confidence and using typical things where you’d be like, OK, we’ll see if we have all these words, you have a link to whatever. Meanwhile, really what the person is saying is, “Why is the tide high when the moon is full?” And that is a more common query. And Google probably has much more confidence in what it’s ranking when it deals with that, and my understanding is RankBrain helped Google better understand that these longer queries correspond basically to the shorter queries where it had a lot of confidence about the answers.
That was then, that was like, what, a year ago or so. At this point, Gary, when you start talking about re-ranking, is that the kind of re-ranking you’re talking about?
Gary Illyes: Yeah.
Danny Sullivan: OK.
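The long-query-to-common-query mapping Sullivan describes can be illustrated with a very simple stand-in. The sketch below uses token overlap (Jaccard similarity) to pick the closest well-known query; this technique, the stopword list, and the candidate queries are assumptions chosen for clarity, and the interview does not say how RankBrain actually does this.

```python
import re

# A small, hypothetical stopword list; a real system would use something richer.
STOPWORDS = {"a", "be", "how", "i", "is", "the", "to", "very", "what", "when", "why"}


def tokens(query):
    """Lowercase the query and keep its distinct non-stopword words."""
    return {t for t in re.findall(r"[a-z]+", query.lower()) if t not in STOPWORDS}


def jaccard(a, b):
    """Share of distinct words the two token sets have in common."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_common_query(long_query, common_queries):
    """Pick the common query whose wording overlaps most with the long one."""
    long_tokens = tokens(long_query)
    return max(common_queries, key=lambda c: jaccard(long_tokens, tokens(c)))


common = [
    "why is the tide high when the moon is full",
    "how big is the moon",
    "what time is high tide today",
]
long_query = "Why is the tide super high sometimes when the moon seems to be very big"
match = closest_common_query(long_query, common)
print(match)
```

Word overlap is a crude proxy for meaning; it is used here only to show the shape of the idea, in which an unusual query is resolved to a common one where the ranking is already trusted.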
Barry Schwartz: All right. So we shouldn’t be classifying all these things as RankBrain or should we? Like it could be other machine learning.
Gary Illyes: RankBrain is one component in our ranking system. There are over 200, as we said in the beginning, signals that we use, and each of them might become, like, machine learning based.
But I don’t expect that any time soon or in the foreseeable future all of them would become machine learning based. Or that what we call the core algorithm would become machine learning based. The main reason for that is that debugging machine learning decisions or AI decisions, if you want, if you like, is incredibly hard, especially when you have multiple layers of neural networks. It becomes close to impossible to debug a decision. And that’s very bad for us. And for that we try to develop new ways to track back decisions. But it can easily obfuscate issues, and that would limit our ability to improve search in general.
Barry Schwartz: So when people say Penguin is now all machine learning based.
Gary Illyes: Penguin is not ML.
Barry Schwartz: OK, there’s a lot of people saying that Penguin is machine learning based.
Gary Illyes: Of course they do. I mean if you think about it it’s a very sexy word. Right. And if you publish it…
Danny Sullivan: People use it in bars and online all the time. Like hey machine learning. Oh yeah.
Gary Illyes: But basically if you publish an article with a title like machine learning is now in Penguin or Penguin generated by machine learning it’s like…. But if you publish an article with that title it’s much more likely that people could click on that title and well probably come up with the idea that you are insane or something like that. But it’s much more likely they would visit your site than if you publish something with a title Penguin has launched.
Note: This article was pre-written and scheduled to be published today.
The post How Google uses machine learning in their search algorithms appeared first on Search Engine Land.