Dec 20, 2012

Last week Bill Slawski covered a patent that was just granted to Google, named “Determining spam in information collected by a source.” As is immediately obvious from its name, the patent discusses methods for discovering spammy information coming from third-party sources. It seems to cover mostly (but is apparently not limited to) business entities. Therefore, it would be safe to say that the patent discusses local citations and how a search engine might determine whether a citation carries deliberately incorrect information (spam). The two main ways, according to the patent (a toy sketch of how they might combine follows the list), are:

- By measuring the “frequency of occurrence” of each phrase/element of the citation

- By measuring the “trustworthiness” of each source
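
To make the idea concrete, here is a minimal, purely illustrative sketch of how these two signals might combine. All source names, trust values, and the acceptance threshold are my own assumptions; the patent discloses no concrete numbers.

```python
# Toy illustration of the patent's two signals: how often a phrase occurs
# across sources, weighted by how trusted each source is. All names, trust
# values, and the threshold below are invented for illustration.

from collections import defaultdict

# (source, business name) pairs, as if collected from third-party feeds
citations = [
    ("localeze.com", "Miami Printing"),
    ("yellowpages.ca", "Miami Printing"),
    ("spammy-directory.net", "Miami Printing Cheap Discount"),
]

# Hypothetical per-source trust scores in [0, 1]
source_trust = {
    "localeze.com": 0.9,
    "yellowpages.ca": 0.8,
    "spammy-directory.net": 0.2,
}

def phrase_support(citations, source_trust):
    """Sum the trust of every source reporting each distinct name."""
    support = defaultdict(float)
    for source, name in citations:
        # Unknown sources contribute only a small default amount of trust
        support[name] += source_trust.get(source, 0.1)
    return support

for name, score in sorted(phrase_support(citations, source_trust).items(),
                          key=lambda kv: -kv[1]):
    verdict = "likely genuine" if score >= 1.0 else "possible spam"
    print(f"{score:.2f}  {name!r}  -> {verdict}")
```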

I will not delve further into how each of these two factors is calculated or how the whole system works, but would rather focus on the practical implications.

What It Tackles

Google obviously would like to present the most accurate and complete information to its users. This is possible only if it obtains the information from as many sources as possible. However, some of these sources might sometimes provide inaccurate or even spammy information, so Google needs methods to detect it. Some obvious examples of information Google would rather disregard completely are a telephone number in the business name, or mentions of words such as “discount”, “sales”, etc. In other cases it might be more difficult for Google to judge whether particular content is correct. An example would be “[Business Name] in [City]“, or even worse – “[City] [Business Name]” (as in Miami Printing, which is the actual name of a real business).
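
The “obvious” cases are easy to imagine as simple pattern checks. Here is a hedged sketch of what such a filter might look like; the regular expression and the word list are my own assumptions, not anything disclosed by Google.

```python
# Sketch of the "obvious" checks mentioned above: a phone number or
# promotional wording embedded in a business name. The regex and the
# word list are invented assumptions for illustration only.

import re

PROMO_WORDS = {"discount", "sale", "sales", "cheap", "free"}
PHONE_RE = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

def obvious_name_spam(business_name):
    """Return the reasons a name looks spammy; an empty list if none."""
    reasons = []
    if PHONE_RE.search(business_name):
        reasons.append("contains a phone number")
    hits = PROMO_WORDS & set(business_name.lower().split())
    if hits:
        reasons.append("contains promotional words: %s" % sorted(hits))
    return reasons

print(obvious_name_spam("Joe's Plumbing 305-555-0123"))  # phone number in name
print(obvious_name_spam("Discount Carpet Cleaning"))     # promo word in name
print(obvious_name_spam("Miami Printing"))               # [] -- the hard case
```

Note that the last call returns nothing: a city name alone proves nothing either way, which is exactly the hard case the frequency and trust signals are meant to address.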

The patent gives examples mostly related to the category of the business, but I believe the practical implications mostly concern the business name. It is a known and widely accepted fact that the business name plays a role in how Google ranks local search results (see factors #15 and #22 here). That is why, over the years, many have adopted the bad practice (intentionally or not) of adding extraneous keywords to the business name in their Google local listings. When Google started getting stricter, the “practitioners” (predominantly black hat SEOs) got smarter and started creating citations with the keywords included in the business name. That way the third-party data would support the information the “business owner” submits via Google Places.

The Threats

Reading through the patent, two major threats come to mind:

1) The main one is that Google seems to rely heavily (probably too heavily) on information coming from “trusted sources” (according to the patent, a source can be designated “as a trusted source based on, for example, a reputation of the source or previous dealings with the source or combinations of them”). This means it is theoretically possible that, if a source is trustworthy enough, Google might take its information for granted and never disregard it or check its accuracy. Examples of such sources would be LocalEze and Infogroup/CityGrid in the USA, and YellowPages.ca in Canada. Translated into local SEO language, this means that if a listing is added to LocalEze (for example) and the same business information is not found anywhere else on the web, Google might still create a new listing using this information. This obviously opens up a big hole in the described system, because Google would be very dependent on such third-party trusted sources. It is also important to mention that many of these potentially trusted sources have close to no mechanisms for checking the authenticity of the business information added to their databases (other than phone verification, which is an insufficiently reliable method). (A toy sketch of this loophole follows below, after point 2.)

2) My other concern relates to businesses that actually do have such words (regarded as spam) in their business names, websites, or even physical addresses. Historically, Google has dealt with such situations by withholding the activation of a listing that contains such words (the biggest publicly available list of these is here) and manually verifying the accuracy of the information. However, according to the patent, even words such as city names could be considered spammy, which opens up a broad field for false positives.
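
As promised, here is a minimal sketch of the trusted-source loophole from point 1. The threshold and trust values are invented purely for illustration; the patent gives no such numbers.

```python
# Sketch of the trusted-source loophole: data from a sufficiently trusted
# source is accepted with no corroboration at all. The threshold and
# trust values below are invented assumptions.

TRUST_THRESHOLD = 0.85  # assumed cutoff above which data is taken for granted

def accept_listing(source_trust, corroborating_sources):
    """Accept if a highly trusted source reports it, or enough others agree."""
    if source_trust >= TRUST_THRESHOLD:
        return True  # no further checks -- this is the hole in the system
    return corroborating_sources >= 2  # otherwise require independent agreement

# A listing seen only on one highly trusted aggregator still gets in:
print(accept_listing(source_trust=0.90, corroborating_sources=0))  # True
# The same lone listing from an unknown directory does not:
print(accept_listing(source_trust=0.30, corroborating_sources=0))  # False
```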

What This Means from a Local SEO Point of View

As mentioned above, there are two main factors taken into account: trustworthiness and frequency. While the patent doesn’t discuss these factors in regard to organic search rankings, it could be assumed that a similar methodology is used when determining the value of citations and how business listings are ranked. This means that we could distinguish between two types of local citation sources:

1. Qualitatively-important – such as the aforementioned LocalEze, Infogroup, CityGrid, Yellowpages, etc.

2. Quantitatively-important – sources that are either less authoritative or less likely to serve as citation sources.

To have a strong “citation profile”, you must first cover the first type of sources, and only then proceed to look for further volume and opportunities. At the same time, while type 2 could be useless without type 1, type 1 would in many cases be insufficient without type 2. A rough model of this relationship follows.
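
One purely illustrative way to model this quality-first, quantity-second relationship (the formula and every number are my own assumptions):

```python
# Illustrative model of the quality-first, quantity-second idea: coverage
# of the key aggregators carries most of the weight, and extra low-authority
# citations add diminishing returns on top. All numbers are assumptions.

import math

def profile_score(trusted_covered, trusted_total, extra_citations):
    quality = trusted_covered / trusted_total      # coverage of key aggregators
    quantity = math.log1p(extra_citations) / 10.0  # diminishing returns on volume
    return quality * (1.0 + quantity)              # volume amplifies quality

# Covering the major aggregators first dominates the score:
print(profile_score(trusted_covered=4, trusted_total=4, extra_citations=0))    # 1.0
# Hundreds of minor citations without the basics remain worthless:
print(profile_score(trusted_covered=0, trusted_total=4, extra_citations=300))  # 0.0
# Both together is where the profile gets strong:
print(profile_score(trusted_covered=4, trusted_total=4, extra_citations=300))  # ~1.57
```

The multiplication captures the point above: volume amplifies a solid foundation but cannot substitute for it.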

Concluding Words

I believe Google has been using this (or a similar) method to detect and disregard spammy information coming from third-party sources for quite some time. Nevertheless, the patent sheds further light on the systems Google uses to find, compile, and process business information.

Comments
  1. Nyagoslav, this is a really interesting post. It’s been a while since I’ve read an interesting local post.

    I also remember when Google Places used to show citation sources in the listings, and they showed anything that had your phone # on it. It was amazing, all the low-quality crap they’d display as a citation mention.

    Reverse phone look-ups, crappy profile citations, etc. And those citations still work today. Although, it’s best you focus on getting all the top citations that have the most domain trust. Then target city- and niche-based sites for citations. Then, once you’ve done all that, you can get into those crappy citations: reverse phone, profile citations, etc.

    Even though GP does not show those citation sources any more in the GP listings, we all know they still count and that Google can see ALL citation mentions of your business… but it’s interesting to think that now they may start to judge the citations more.

    This would make sense and should not be that hard if they are using a similar algo as they do for links. (Maybe they already do.) I would think they’d judge it more at the site or page level, as opposed to the info associated with the citation.

    But interesting food for thought.

    • Hey Matt, thanks for the kind words!

      I think the value of reverse phone search types of citations is greatly discounted nowadays. As discussed in the article, I would put much more effort into covering the basics before doing anything extra. You cannot imagine how many of our clients have spent significant amounts of money tackling hundreds of citations while at the same time having unresolved wrong duplicate listings or inconsistent data on sites like Yelp, Citysearch, Merchant Circle, even the data aggregators.

  2. Hey Nyagoslav,
    This is a good breakdown of the possible ramifications of this patent. Bill also does a solid job of finding patents relevant to the industry and making everyone aware of them.

    This does intrigue me as to how this patent (or similar future patents) could affect the weight of citations with inaccurate information in cases where the business is not obviously spamming but simply has bad data. For example, a business may have moved and still have a number of citations pointing to a previous address, or it may have changed phone numbers. Both happen frequently and are not intentional misrepresentations.

    • Hey Ryan, thanks for the comment!

      I think this patent specifically is mostly aimed at discounting the value of citations that might have been intentionally created to include keywords in the business name (for instance), rather than at tackling the problem of inconsistent business information.

      • I totally agree. It just makes me wonder if they may start focusing on other aspects of citation quality once they refine their crawling of intentionally placed incorrect NAP info.

        • It is possible that they might start penalizing for this, but I doubt it. They would have to be very sure that the information is intentionally misleading to do that, which would be hard to achieve.

  3. My main beef with the Places Business Name is that Google, in the guidelines, speaks against including Categories and Locations.

    And then the annual ranking review shows ranking points for having them in there.

    Also, the ‘Lucky 1 Pack’ is driven primarily by the Business Name.

    I have been complaining for a long time in the Google and Your Business Forum, and its Whitewashed antecedents, that no ranking signal should be derived from the Business Name. It would then simply be a human-readable key and a hint for the merge algorithm.

    SQLPerformance

    • Unfortunately, this is the reality we live in. Google doesn’t place as much value on keywords in the business name as it did previously, but it still bugs me why it counts them at all.

  4. Hey Nyagoslav,

    Characteristically excellent stuff – thanks for posting!

    I think it all comes down to “threat #1”: Google’s heavy reliance on “trusted” data providers. There is a certain threshold of spammability (not that that’s a word), if you will, below which one can get away with quite a lot. Part of that threshold is the length of the business name: InfoGroup and LocalEze cut you off at around 30 characters, so you’re out of luck if you’re trying to stuff in a *lot* of keywords. But even if your name is totally bogus yet not blatantly spammy (like a gibberish name that just happens to include a city name), if you’re under that ~30-character threshold you’ll probably be OK – for better or for worse.

    Similar story on Yelp: your name is unlikely to pass Yelp’s sniff test if it’s long, but even a bogus name can get published if it’s short enough and not blatantly spammy (e.g. an exact-match, city-specific local search term).

    I think there should be a certain amount of “play” or “give” on Google’s part in accepting or trusting a given business name, simply so that businesses with horrible “official” names have a chance to brand themselves online with their real-world business names (e.g. the ones on their business cards). Which is legitimate, and which Google generally doesn’t seem to have a problem with. But obviously this does leave the door open for spammers. It just seems like a balance that Google has yet to strike.
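
    To put the threshold idea in concrete terms, here is a toy version of that sniff test; the ~30-character cutoff comes from the comment above, while the blatant-term list is an assumption.

    ```python
    # Toy version of the sniff test described above: a length cutoff plus
    # a blatant-spam check. The ~30-character figure comes from the
    # comment; the term list is an invented assumption.

    BLATANT_TERMS = {"cheap", "discount", "best", "#1"}
    MAX_NAME_LENGTH = 30  # approximate cutoff attributed to InfoGroup/LocalEze

    def passes_sniff_test(name):
        if len(name) > MAX_NAME_LENGTH:
            return False  # too long: likely keyword stuffing
        return not (BLATANT_TERMS & set(name.lower().split()))

    print(passes_sniff_test("Miami Printing"))                         # True -- short, not blatant
    print(passes_sniff_test("Best Cheap Miami Carpet Cleaning Pros"))  # False -- long and blatant
    ```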

    • Hey Phil, thanks for the kind words!

      Isn’t it paradoxical – where is the balance between not filtering out “Miami Printing” (which is a real business) and filtering out “Miami Carpet Cleaning” (which is very possibly not a real business, or at least not its real name)? It is really hard to say, and I don’t think a balance can ever be struck. Look, for example, at the case of locksmiths – Google had such big problems dealing with this industry that it decided to wipe it out completely, including a lot of honest small business owners who played entirely by the rules (I know at least a few such owners).

  5. Great post Nyagoslav! I think Google polices some verticals harder than others. Like Matt, I see a lot of spam ranking well in both local and organic SERPs. The key is of course having strong structured citations, but I agree that ancillary citations (e.g. NAP on video sites, article marketing, etc.) are gradually becoming less and less useful.

  6. Nice post Nyagoslav! What do you think about citations with the same phone number and address but with different names, tags, and descriptions? Are they considered spammy?

  7. I agree with Andrew. There should be absolutely no ranking benefit for using a category or geographical modifier in the business name. There will be manipulation as long as Google leaves it as a factor. Getting a DBA from the state is simple, cheap, and regrettably beneficial in some instances because of this issue.

    The “trusted sources” do a pretty poor job of QA on business listings’ info. “Trust” is definitely relative.

    Thanks Nyagoslav!

  8. Good article about local businesses and the citation directories. Although Yellowpages.ca is considered an authority site, they are very difficult to deal with when it comes to getting a free listing posted. I truly wonder how long they will keep their authority when they are this difficult to work with compared to other directories in Canada.

    Local and small businesses are seeing that more of their new prospective clients are coming from the internet. They struggle with understanding how to claim their listings and optimize them to reach the top of the local search results. Thus there is a significant opportunity to provide small business consulting specializing in online marketing.

  9. It is probably a good article, but that damned sharing thing just keeps coming down the page obscuring the text. Too annoying to put up with.

    • Thanks for the feedback, Stu. An overall redesign of the website is planned, and one of the changes to be implemented is an improvement to the scrolling social bar. Sorry for the inconvenience.
