Last week I read a post on Search Engine Land discussing a study by Implied Intelligence. The study looked at business data accuracy and compared a number of the biggest players in the UK market. As the SEL article didn’t offer much detail and I was very interested to learn more, I contacted Marc Brombert, who was kind enough to provide me with the complete research.
Which Sites Were Researched
The full list of sites that were researched includes: Foursquare, Thomsonlocal, Scoot/TouchLocal, Bing Local, DNB, ThePhoneBook, Yell, Google Maps, Yelp, 118.com, 192.com, HotFrog, Yahoo Local UK, Qype UK. These sites were chosen based on traffic and general industry prestige (see here for reference). It is also worth mentioning that:
1) These are all different types of websites. For instance, Yelp and Qype are review-focused social networks, Foursquare is a check-in social network, Yell, Scoot/TouchLocal and Thomsonlocal are traditional IYPs, and Google Maps, Bing Local and Yahoo Local are a mix of these. Thus, it is to be expected that business data accuracy would vary.
2) Many of these sites exchange data with each other, or get data from the same source. For instance, My118information.co.uk provides data to 118.com, 192.com, TouchLocal, Yahoo, Bing, and others (according to Sarah Shepherd, Head of Customer Service at the company). LocalDataCompany.com provides data to Google, Yell, Thomsonlocal, TouchLocal, Qype, 192.com, and others (according to their website). At the same time, 192.com gets data from 118.com.
Methodology of the Research
Implied Intelligence researched “1,400 hand-checked records from a random UK geography.” The records were all extracted from the websites of the businesses themselves. Only records with one address per homepage were chosen, so chains and franchises were largely excluded and most of the records belong to small businesses. The study looks into several aspects of the data, going beyond simple data accuracy. These include: coverage (how many of the records were present), number of duplicates (a listing was judged a duplicate of another based on a robust match between the name, address, phone number and URL), accuracy (of the main business data), and richness (presence of additional details).
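The study does not describe how its "robust match" works. Purely as an illustration, the sketch below assumes that two listings count as duplicates when at least three of the four fields (name, address, phone, URL) agree after simple normalisation. The function names, the `min_matches` threshold, and the sample records are all hypothetical and are not taken from the study.

```python
import re
from urllib.parse import urlparse


def normalize_text(text):
    """Lower-case and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", (text or "").lower()).strip()


def normalize_phone(phone):
    """Keep digits only and rewrite a UK country code (44/0044) as a leading 0."""
    digits = re.sub(r"\D", "", phone or "")
    if digits.startswith("0044"):
        digits = "0" + digits[4:]
    elif digits.startswith("44"):
        digits = "0" + digits[2:]
    return digits


def normalize_url(url):
    """Reduce a URL to its host name, without scheme, path or 'www.'."""
    if not url:
        return ""
    parsed = urlparse(url if "//" in url else "//" + url)
    host = parsed.netloc.lower()
    return host[4:] if host.startswith("www.") else host


# Assumption: one normaliser per field the study says it compared.
FIELD_NORMALIZERS = {
    "name": normalize_text,
    "address": normalize_text,
    "phone": normalize_phone,
    "url": normalize_url,
}


def is_duplicate(a, b, min_matches=3):
    """Return True if at least `min_matches` non-empty fields agree.

    The threshold of 3 is an assumption made for this sketch.
    """
    matches = 0
    for field, normalize in FIELD_NORMALIZERS.items():
        value_a = normalize(a.get(field))
        value_b = normalize(b.get(field))
        if value_a and value_a == value_b:
            matches += 1
    return matches >= min_matches


# Hypothetical listings, invented for this example.
listing_a = {
    "name": "Smith & Sons Plumbing",
    "address": "12 High Street, Leeds",
    "phone": "+44 113 496 0000",
    "url": "http://www.smithplumbing.example",
}
listing_b = {
    "name": "SMITH & SONS PLUMBING",
    "address": "12 High St., Leeds",
    "phone": "0113 496 0000",
    "url": "smithplumbing.example/contact",
}

print(is_duplicate(listing_a, listing_b))
```

Here the name, phone number and website agree after normalisation while the address does not ("Street" versus "St."), so the pair passes a three-out-of-four threshold. A stricter or fuzzier rule would change the duplicate percentages a directory ends up with, which is worth keeping in mind when comparing the figures below with other research.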
As stated in the title, it turns out that Google is the most accurate and detailed source of business data in the UK. However, before looking into the final details, I’d like to discuss the scoring per factor.
Where everyone seemed to fail was the coverage test. Google was the only one to score over 50% (58.8%), followed by 192.com (49.1%). Both companies’ databases receive data from a large number of sources, which could explain their advantage. The worst score, by a large margin, belongs to Foursquare, with just 6.9% of the records present on its website. The reason is most probably the relative unpopularity of the social network in the UK (85.2% of Britons have never used it, and just 3.1% use it frequently). It is interesting, however, that Yelp is second worst, with only 24.6% of the businesses researched present on its website. This is significantly worse than the site’s coverage in the US market (63.2%).
The second factor researched was the percentage of duplicate listings. The “winner” is ThePhoneBook with 20.8% duplicates, followed by HotFrog with 12.3%. At the other end, Yahoo, which in the UK gets data from Infoserve, was found to have 0% duplicates, followed by Yelp (1.4%). Google is only third here with 2.5%. These percentages are generally much lower than I would expect. Based on my previous research, the average duplicate percentage should be in the low double digits.
It gets scary when looking at the data accuracy findings. The worst performer overall is once again ThePhoneBook, with a wrong phone number in 27.8% of the cases and a wrong address in 2.9% of the cases. It is followed by HotFrog (25.9% and 2.5% respectively). The best two are Qype (19.2% and 1.7%) and Bing (20% and 1.6%). Note that, according to the research, DNB does not provide any business information such as address, phone, or additional details, so most of its scores are either N/A or 0.
The survey also looks into what percentage of the records have a website URL associated with them. Google is the winner with 87.9%, followed by Yell with 79.7% and Bing with 78.7%. A number of websites do not allow business websites to be added to listings, so their overall score is significantly lowered by this factor. These include: DNB, ThePhoneBook, 118.com, 192.com, and Qype.
The sites with the highest share of incorrect website URLs on their listings are Yelp (33.6%), Yahoo (33.3%), and Scoot (32.6%). Google is the best performer (among the sites that do allow business websites to be added to listings) with just 15.8% incorrect URLs.
The research also looks into the percentage of records with opening hours, and the percentage of records with additional information (about-us information such as taglines, payment options, free quotes, certifications, and others, but excluding reviews and check-ins). The unchallenged winner in both is Google, with 28.3% and 97.9% respectively. There are again quite a number of directories that do not offer any additional information: Foursquare, DNB, ThePhoneBook, 118.com, 192.com, Yahoo.
Based on the factors discussed above (coverage score, duplicate score, phone error score, address error score, URL coverage score, URL accuracy score, hours score, and additional info score), the winner is Google Maps. Second place goes to HotFrog, and third place goes to Bing.
While I appreciate and respect the scoring system Implied Intelligence used, I believe not all of the factors are equally important. The ones I consider less important are URL coverage, URL accuracy, and business hours presence. I combined URL coverage, business hours coverage, and additional details presence into one factor, completely excluded URL accuracy, and here is how the data accuracy rankings turned out:
1. Google Maps (=)
2. Bing Local (+1)
3. Thomsonlocal (+5)
4. Qype (+3)
5/6. HotFrog (-3/-4) and Yell (-1/-2)
7/8. Yelp (-1/-2) and Yahoo Local (+2/+3)
9. Scoot (-4)
10. 118.com (-1)
11. Foursquare (=)
12. 192.com (=)
13. DNB (+1)
14. ThePhoneBook (-1)
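Neither scoring formula is spelled out in this post, so the following is only a minimal sketch of the kind of re-weighting described above. It assumes every factor is a percentage, that "error" factors are inverted (100 minus the error rate) so that higher is always better, and that factors are averaged with equal weights. The function names and the two sample directories are hypothetical; the numbers are invented for illustration and are not figures from the study.

```python
def original_score(m):
    """Equal-weight average of all eight factors (an assumed formula)."""
    factors = [
        m["coverage"],
        100 - m["duplicates"],
        100 - m["phone_errors"],
        100 - m["address_errors"],
        m["url_coverage"],
        100 - m["url_errors"],
        m["hours"],
        m["additional_info"],
    ]
    return sum(factors) / len(factors)


def combined_score(m):
    """Re-weighted variant: URL coverage, hours and additional info are
    merged into one 'richness' factor, and URL accuracy is dropped."""
    richness = (m["url_coverage"] + m["hours"] + m["additional_info"]) / 3
    factors = [
        m["coverage"],
        100 - m["duplicates"],
        100 - m["phone_errors"],
        100 - m["address_errors"],
        richness,
    ]
    return sum(factors) / len(factors)


def rank(directories, score_fn):
    """Return directory names sorted from best to worst score."""
    return sorted(directories, key=lambda name: score_fn(directories[name]), reverse=True)


# Hypothetical directories with made-up percentages (not study data).
directories = {
    "Directory A": {
        "coverage": 50, "duplicates": 0, "phone_errors": 0, "address_errors": 0,
        "url_coverage": 60, "url_errors": 20, "hours": 30, "additional_info": 90,
    },
    "Directory B": {
        "coverage": 40, "duplicates": 10, "phone_errors": 20, "address_errors": 0,
        "url_coverage": 90, "url_errors": 0, "hours": 0, "additional_info": 0,
    },
}

for name in rank(directories, combined_score):
    metrics = directories[name]
    print(name, round(original_score(metrics), 2), round(combined_score(metrics), 2))
```

Merging the three richness-type measures into a single factor means a directory can no longer gain (or lose) three separate times for what is essentially one property, which is the motivation given above for the re-ranking.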
Below is a graph that shows the difference in overall score when using Implied Intelligence’s scoring system and my scoring system:
Undoubtedly, Google is the winner in terms of complete and accurate UK business data. It is interesting that Bing performs very well even though it does not have an automated system for businesses to list themselves or edit their data (such as Business Portal in the US); it does, however, have a request form for adding or updating a listing. From the viewpoint of local SEO, HotFrog and Yell seem to be important citation sources (I have discussed this previously), as they allow a large number of additional business details to be added (check here why this is important).