HMRC is trawling the internet, including social media and other websites on which people share information, in a bid to find potential evidence of tax fraud that it can feed into its new Connect data warehouse.
Although HMRC is reluctant to go into detail, Mike Hainey, head of analytics at HMRC, has told Computing that since implementing Connect, its "big data" analytics system, the organisation has started feeding in information from the internet to help it better target tax investigations.
"[It's] information that we can obtain that is visible and available legally for HMRC to review," Hainey told Computing. While such information would include the easily identifiable accounts people run on Twitter and Facebook, it would also almost certainly include websites where traders ply their trade, such as RatedPeople.com, and where their customers leave comments.
Indeed, HMRC has always taken data feeds from a variety of sources to support its Enforcement and Compliance organisation. "It's departmental data at one end of the spectrum, commercial data, buying in information around businesses et cetera. We also get information from other government departments and other foreign FISCs [fiscal regimes] through various treaties and arrangements," says Hainey.
He adds: "Also, on occasions, we will bring in information that we may obtain from the internet and bring that into the picture."
The commercial data, he says, is typically information from Companies House about companies and directorships, or from credit reference agencies.
Of course, many organisations trawl social media for all kinds of purposes, often using automated tools. At their most innocent, such tools represent little more than an extension of the press-cuttings services that have been offered to companies and wealthy celebrities for decades.
More seriously, reputation management companies also trawl social networks for evidence of potentially libellous comments, or misuse of corporate imagery and other copyrighted material.
HMRC's Connect analytics system, which Computing recently covered in a case study feature, won the prize for Best Big Data Project at the UK IT Industry Awards in November 2012.
The new system, which HMRC is still in the process of ramping up following an 18-month trial, helps to unify the siloed data collected under different tax systems, such as National Insurance and VAT. The aim is to build up more rounded pictures of the taxpaying public based on recognisable entities, such as individuals, their families and the circles within which they do business.
Previously, putting together such holistic pictures of taxpayers required several weeks of work just to pull the data from the disparate systems.