Big Data Brokers and Open Data Sources

Almost every action we take is now tracked and recorded, from mobile phone usage and driving behavior to social networks and shopping transactions.

Because data exists in an invisible layer that most of us don’t comprehend, many of us don’t realize the ‘big business’ of our data.

Without our knowledge and comprehension, hundreds of big data brokers sell and barter our information for a wide variety of uses. These organizations not only make massive profits by selling information we created, but also have a tremendous impact on the bottom-line revenue of businesses that understand how to use big data.

As I go through this article (which has turned into a book), I’m going to give you a sense of the scope of this business and why you should pay great attention to it both personally and professionally, along with pointers to some of the available sources of data (open, governmental, local, industry, etc.).

We’ll examine some critical areas around big data:

  • Why you should use big data and open data sources regardless of how big or small your business is
  • Common areas to browse through and familiarize yourself with
  • Understanding what trends are affecting your future

Getting Started with the big question…

What is my data worth?

Answering the bigger questions around big data brokers and open data sources starts with understanding what your data is worth, and recognizing that every person, group, and organization creates data on an enormous scale.

Consider this: every person interacts with 100+ other individuals, 100+ local businesses, and 1000+ global businesses.
Each interaction creates a data point. Each data point has value. 

This federal action helps illustrate the scope of value: On May 9th President Obama issued an Executive Order directing historic steps to make government data more accessible for economic growth and innovation. The order declares that the information is a valuable resource that needs to be used as a strategic asset for the nation. President Obama called for an Open Data Policy, requiring machine-readable formats of data for government records that can drive long-term economic growth.

The order grew out of rising concern about personal privacy, regulation, industry development, and the competitive positioning of ‘who owns your data.’

If the President of the United States is issuing an Executive Order regarding government data and considers it a national asset, chances are you should really be paying attention to the types of data you create and interact with.

By the People, for the People.

In an on-going series of efforts going back a decade, our government has launched a series of sites providing what I would call an elementary view into some of the data we create. On a scale of 1 to 10 for usability to the common person or business, this data is a 1: it has questionable freshness, unidentified quality, and lackluster categorization.

The idea is there, but what the government is providing is really ‘only data’. It doesn’t come in a fancy spreadsheet or colorful visualization that makes you understand the significance of the data. This prevents the common professional and business owner from making decisions about on-going trends at local, regional, and global levels.

Examples of basic government data

We also have Open Data

Open data is publicly accessible information that has been established by public interest groups, government, corporate entities, and expert practitioners.

The general idea behind Open Data is that even big corporations need to release some ideas about the information they have access to so that innovative entrepreneurs and community minded leaders can build interesting things with that data.

Example sources of Open Data


Big Data Brokers

Take all of your data, along with governmental and open data, then buy some more!

Big Data Brokers collect all sorts of data, ranging from transaction and purchase behavior to consumer interest information and usage patterns. The basic concept is that they collect merchant details at the point of sale, social data from your relationship network, and demographic identifiers from millions of different databases, all gathered under a few hundred organizations that fall into the category of ‘data broker.’

They sell basic pieces of data such as names, contact information, and additional demographics, like age, race, occupation and education level. There are lesser known data areas that include everything from how fast you drive your car (insurance reports, traffic records) to what your entire employment history looks like.

You also have big data brokers that sell information about products (such as Amazon store data), places (Google Maps), events, vendors, manufacturers, employee names, and stocks (Nasdaq, Dow Jones), etc.

The list is truly endless.

The downside of big data brokers is that there is someone willing to track everything and anything related to you. I’ve had the pleasure of speaking about privacy and the invasive trends that affect us on a multitude of levels, but in most cases the ‘big data’ being collected about you is years ahead of legislation and regulation.


Think about everything you type into Google, Facebook, or your Mobile Phone.
Take every e-mail, text, conversation, or transaction and think about where all of the data goes.
Imagine the past five years of your data collected on a computer and offered up for sale for $1.

While the privacy issues are enormous, the beneficial areas are enormous as well.

Overlapping Points to Consider

We can think of the overlapping open, governmental, and big data segments in several different viewpoints that range across three primary categories: the individual, the business, the community.

Each of these categories has signals of data that reach far beyond the boundaries of the specific element. The visual below helps demonstrate how waves of data interact at dozens of points from the three categories.


  • For the individual big data has the risk of invasion of privacy, with the benefit of streamlined performance and increasing the quality of life. The same data about your home electric consumption can be used against you to market goods and services, or used for you to help reduce usage and be environmentally aware.
  • For the business side of data we run the risk of being the instigators of breaching personal privacy, while we have the benefits of improving income and employee/consumer lifestyle. We need to be aware of local and global differences in ethics, morality, and legality as we engage our businesses as both consumers and providers of competitive marketplaces.
  • For the community portion of big data we run a balancing act of considering privacy of the individual, the trends of our community, and the financial interactions of our community businesses. We also need to consider the asset of data as a community, as the risk and benefit for both individuals and businesses are almost entirely controlled by representation as a community (in the form of a real world community, association, employer, etc.)

What is all of this data used for?

That is a pretty big question. The most typical use is one of two things:

  • Option 1: attempting to collect enough data to reach a critical mass that is easy to sell. Most social networks (Facebook, etc) and search engines (Google, etc) need to collect billions of data points before there is enough quantity to sell the data over and over again in some form of profitable model.
  • Option 2: they are attempting to find areas of overlapping data that have a point of interaction that identifies a recurring situation or life-event trigger such as buying a new smartphone, getting married, or ordering a movie on demand.
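Option 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, people, events, and the `TRIGGER_EVENTS` table are all hypothetical, and real brokers work with far messier data.

```python
from collections import defaultdict

# Hypothetical mapping from an observed event to a life-event label.
TRIGGER_EVENTS = {
    "bought smartphone": "new phone",
    "opened gift registry": "getting married",
    "ordered movie on demand": "home entertainment",
}

def find_overlaps(sources):
    """Return people seen in two or more sources, with their trigger labels."""
    seen_in = defaultdict(set)    # person -> names of sources they appear in
    triggers = defaultdict(set)   # person -> life-event labels observed
    for source_name, records in sources.items():
        for person, event in records:
            seen_in[person].add(source_name)
            if event in TRIGGER_EVENTS:
                triggers[person].add(TRIGGER_EVENTS[event])
    return {
        person: sorted(triggers[person])
        for person in seen_in
        if len(seen_in[person]) >= 2 and triggers[person]
    }

# Made-up sample records: (person, event) pairs from three separate sources.
sources = {
    "retail": [("alice", "bought smartphone"), ("bob", "bought groceries")],
    "registry": [("carol", "opened gift registry")],
    "telecom": [("alice", "changed phone plan"), ("carol", "ordered movie on demand")],
}

overlaps = find_overlaps(sources)
```

Here `overlaps` keeps only the people who show up in more than one source and who have at least one trigger event, which is the "point of interaction" described above.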

How do I use this in business?

If we imagine both option 1 and option 2 happening at the same time, we could look at an example of several business entities sharing a common community. This could be a group of industry businesses, a few stores working together at the local mall, or perhaps international trade partners shipping into the same city.


When a new individual interacts with the community or any one of the three businesses, a data exchange is possible.

In the real world this could be thought of as a friendly referral (it can have darker tones as well.)

When the new individual is introduced, the data points they bring with them have dozens of interaction points with the three vendors.

[Chart: big data brokers for sale]

There are important things to realize about the above chart.

  • When an individual first engages it is typically with a community layer.
  • The individual has data that reaches into the core of one business (perhaps a transaction) and has only secondary data touching the other two.
  • All three businesses collect different data points on the same individual.
  • The individual may be aware of one, two, three or none of the businesses.

How should I use Big Data?

(Disclaimer: this is entirely my opinion.)

Most big data is used to sell you stuff. This data exists about a tremendous number of things and you can use it to sell all sorts of ‘stuff.’
Making money and driving a strong economy can be a great reason to use big data.

As a serial entrepreneur and innovation consultant, I prefer to give people a big idea to go with big data. 

You can sell the data…You can buy the data… Or…

Change the World with Big Data.

You can build things with data.

You can inspire people with data.

You can change lives with data.

If you want to do something big, think about using big data to help your community, people who do not have access to the data, or your family. Look at your network of friends, family, and co-workers and consider the insights that are hidden in your data.

  • Take a look at local charities and think about how your data aligns with something that makes you feel good.
  • Look at local employment rates and use the data to make jobs for people who need them.
  • Use data to reduce city spending, improve the education system, or help solve homelessness.
  • Spend some time looking through the open data and government data above.
  • Think about ways you can visualize data to help people understand.

I hope this has given you some insight into the world of Big Data.
Go and do something useful with it!


Some interesting context to specific Big Data issues

Health Data

Forbes: What the Government’s Big Data Dump Tells Us About the Hospital Market –  I believe in the power of markets to create efficiency and innovation, so I will go there. To unleash the power of entrepreneurs to create transparency and choice in the hospital services market. Fortunately this is starting to happen. Companies like CastLight, HealthInReach, and Pokitdok are good examples.

Data Value

AdAge: FTC Sting Operation Results in Warnings to 10 Data Brokers – The FTC “test-shopped” for data from the companies and determined that they may be in violation of the FCRA, according to the missives, dated last week. The firms in the crosshairs are ConsumerBase, Brokers Data, US Data Corporation,, 4Nannies, U.S. Information Search, People Search Now, Case Breakers and USA People Search. The tenth company was not named by the FTC.

Privacy Data

IT World: Web Trackers are Out of Control – The amount of data compiled about me was jarring. BlueKai and its tracking partners (DataLogix, Lotame, and others) had aggregated some 471 separate pieces of data. On the other hand, BlueKai correctly determined that I have two teenage kids, I’ve lived in the same place for more than 7 years, carry a mortgage, and like sci-fi movies, travel, and politics.

WordPress Security & Backup

This is one of those stories about WordPress you should read, because it affects you even if you don’t use (or even know about) WordPress.


On April 15th, the United States Department of Homeland Security, Computer Emergency Readiness Team issued a warning of an “ongoing campaign targeting the content management software WordPress…”

On the Department of Homeland Security site… 

CloudFlare, a web performance and security startup, has to block 60 million requests against its WordPress customers within one hour elapse time. The online requests reprise the WordPress scenario targeting administrative accounts from a botnet supported by more than 90,000 separate IP addresses. A CloudFlare spokesman asserted that if hackers successfully control WordPress servers, potential damage and service disruption could exceed common distributed denial of service (DDoS) attack defenses.

Why should we care?

According to Wikipedia,

WordPress is used by over 14.7% of Alexa Internet’s “top 1 million” websites and as of August 2011 manages 22% of all new websites. WordPress is currently the most popular blogging system in use on the Web, powering over 60 million websites worldwide.

According to, WordPress 3.5 has been downloaded 19,370,337 times (as of this article)

So what?

Imagine for a minute that 20,000,000 websites were running code for personal blogs, corporate websites, and local stores.

Imagine if four million of those were out of date and had security holes.

You don’t have to imagine.

WordPress 3.5 is currently the latest release of WordPress.

Roughly 65% of WordPress sites are out of date.
*pie chart is directly from



Two-thirds of the WordPress sites are out of date…

Creating a ‘feeding frenzy’ of criminal hackers looking for easy victims.
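To put a rough number on that 65% figure, here is a back-of-the-envelope sketch. It assumes the download count quoted above can stand in for the number of sites, which it only loosely can, so treat the result as an order of magnitude and not a measurement.

```python
def estimate_out_of_date(total_sites, percent_out_of_date):
    """Integer estimate of how many sites are behind, given a whole-number percent."""
    return total_sites * percent_out_of_date // 100

# Download count quoted in this article; using it as a site count is an assumption.
downloads = 19_370_337
estimate = estimate_out_of_date(downloads, 65)
print(f"{estimate:,}")  # prints 12,590,719
```

Even if the real number is only a fraction of that, it is still millions of unlocked doors.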

We can see by searching Google Trends that search volumes for terms such as WordPress Security and WordPress Backup are trending up with no sign of slowing down. If we look at small segments of time related to keywords in the marketplace, we can see that people search for WordPress backup and security solutions as follow-up remedies, after the situation has escalated.

In other words: WordPress users wait until a problem happens, then they complain they left the door unlocked.
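Checking whether an install is behind doesn’t have to wait for a problem. The sketch below compares version strings in plain Python; the site names and installed versions are made up for illustration, and the default of 3.5 as the latest release comes from this article, not from any live lookup.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))

def is_out_of_date(installed, latest="3.5"):
    """True when the installed version is older than the latest release."""
    a, b = parse_version(installed), parse_version(latest)
    # Pad with zeros so that '3.5' and '3.5.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b

# Hypothetical inventory of sites and the WordPress version each one runs.
installs = {"blog": "3.4.2", "store": "3.5", "corporate": "2.9.1"}
outdated = [name for name, version in installs.items() if is_out_of_date(version)]
```

With this sample inventory, `outdated` lists the blog and corporate sites, which are the ones to update first.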



How does this affect me?

Imagine the value of your personal data.

Think about the previous line item “WordPress is used by over 14.7% of Alexa Internet’s “top 1 million” websites”

  • How many times a month do you visit a WordPress site?
    (hint… you are on a WordPress site right now.)
  • How many of those sites have you joined up as a member?
  • Did you release your e-mail or privately talk with another member?
  • Does your e-mail, username, and password match one of your utility or bank logins?
  • Did you click on a message link from a friend only to realize it was spam?
  • Did you download a nasty virus on your computer off the web?
  • Did you wait a few extra seconds waiting for a site to load?

Simply said: there are hundreds of ways that a WordPress virus or hack can affect you, your business, or the people you care about. Some of them are simply nuisance items and others are a complete panic.

If you own a site that has been attacked, the attack can cause irreparable damage to your business and your community members. Repairing a site can often cost thousands of dollars in lost income and technical fees (and the trust of your users.)


If you use WordPress: UPDATE IT along with ALL of the plugins you use.

Share this article with anyone who uses WordPress.

Encourage them to update the software they use and make frequent back-ups to protect themselves and their users.

We all have a responsibility to help each other have a positive and safe experience on the web.


Big Names (outside of the Department of Homeland Security) are talking about it.

Techcrunch, ArsTechnica, Hostgator…


Global WordPress Brute Force Flood – Gator Crossing – HostGator
Apr 11, 2013 Global WordPress Brute Force Flood The main force of this attack began last week, then slightly died off, before picking back up again

Fix For Recent WordPress Brute Force Attack Is Easier Than You …
Apr 14, 2013 Over the past couple of days it has been widely reported that WordPress based sites are being targeted by a massive brute force attack, one

Brute Force Attacks Build WordPress Botnet — Krebs on Security
Apr 12, 2013 Brute Force Attacks Build WordPress Botnet. Security experts are warning that an escalating series of online attacks designed to break into

Hackers Point Large Botnet At WordPress Sites To Steal Admin …
Apr 12, 2013 For the most part, this is a bruteforce dictionary-based attack that aim to that CloudFlare saw attacks on virtually every WordPress site on its

Huge attack on WordPress sites could spawn never-before-seen …
Apr 13, 2013 According to CloudFlare’s Prince, the distributed attacks are attempting to brute force the administrative portals of WordPress servers,

Worldwide WordPress Security Admin Attacks: Secure Your Blogs Now
Apr 16, 2013 WordPress Security is always an important part of running your blog, but right now it is even more important than ever. There is a severe

Update WP Super Cache and W3TC Immediately – Remote Code …
Apr 23, 2013 Shame on us for not catching this a month ago when it was first reported, but it seems that two of the biggest caching plugins in WordPress


Competitive Research Tools & Tips

Competitive research isn’t just for major corporations

There are hundreds of competitive research tools out there that provide *FREE* insight to what you should be doing as a business.

This article is going to go through a few of my favorite free tools. If you have any questions about how to apply them, please make sure to zip me off an e-mail or leave a comment below.

As an example for this article I am going to use the Samsung Galaxy S4 product launch.


Seattle News and Social Media – Part II – Keywords

If you read part I of Seattle News and Social Media, you are probably beginning to ask some of the right questions about how news drives traffic and attention.

As we dive deeper on how Seattle News is consumed, we need to understand how keywords are related to it.

This section will cover some strategies for the keywords involved in Seattle News.

For most of the numbers I’m reviewing here I used the Google AdWords Keyword Tool and Google Traffic Estimator.

STEP ONE: Related Keyword Traffic

Whenever I look at a marketplace, I always look at some of the ‘common sense’ words that drive traffic (via Google Traffic Estimator.) In this case I’m looking at ‘Seattle News’ as a keyword and examining the top related search keywords. The first thing I do is look at the volume of traffic, as I want to understand where the eyeballs are and if there is a branding or impression scenario to consider.

STEP TWO: Related Keyword Cost

The next step (using the AdWords Keyword Tool) is to examine the most expensive keywords related to the original term. This often exposes areas where businesses may have an opportunity, as well as keywords whose price other businesses have bid up. If someone is willing to spend money on these, basic logic says I should ask whether or not there is a reasonable business opportunity there.
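Steps one and two boil down to sorting the same keyword list two different ways. Here is a minimal sketch, assuming you have exported related keywords by hand; the search volumes and cost-per-click values below are placeholders I made up, not numbers from the Google tools.

```python
from dataclasses import dataclass

@dataclass
class Keyword:
    phrase: str
    monthly_searches: int   # placeholder volume, not real data
    cost_per_click: float   # placeholder cost in dollars, not real data

keywords = [
    Keyword("seattle news", 100_000, 0.80),
    Keyword("seattle times", 250_000, 0.25),
    Keyword("seattle traffic", 60_000, 1.10),
    Keyword("seattle pi", 40_000, 0.40),
]

# Step one: where are the eyeballs?
by_volume = sorted(keywords, key=lambda k: k.monthly_searches, reverse=True)

# Step two: where are other businesses spending money?
by_cost = sorted(keywords, key=lambda k: k.cost_per_click, reverse=True)

for k in by_cost:
    print(f"{k.phrase}: ${k.cost_per_click:.2f} per click, {k.monthly_searches:,} searches")
```

A keyword that ranks low on volume but high on cost is often the interesting one, since someone has decided that small audience is worth paying for.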

STEP THREE: Where is our Audience?

You can’t just assume that everyone searching for “Seattle News” is in Seattle. The chart below details how a good chunk of the search volume is coming from Port Angeles and Tacoma, anywhere from 25 to 75 miles away.

If I had made the mistake of pursuing “Seattle News” as a local Seattle business, I may have been reaching an audience that was simply outside of my business market. You can explore this data using Google Insights.


STEP FOUR: Keyword Funnels


Any single keyword leads to another, which leads to another.

When you are using Google you can see this in action by paying attention to Google auto-complete as it suggests options while you type in your phrase.

When you are typing in the search box, pay close attention to the words and phrases that appear. This is invaluable for discovering what journey an online audience takes to discover new information.

The chart (left) details how the search audience goes from keywords like “Seattle News” to keywords such as Seattle Traffic News, Seattle Times, Seattle PI, Seattle Traffic, SeaTac Airport, etc.

One of these keyword funnels may be the exact audience you want to communicate with (for advertising or search optimization and Adwords) or simply help educate you on how your consumers are exploring relevant information.
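A keyword funnel can be treated as a simple graph and walked step by step. The sketch below uses the keywords named above, but the links between them are my own guess at how one search might lead to the next; in practice you would record them yourself from the auto-complete suggestions.

```python
from collections import deque

# Hypothetical funnel: each keyword maps to the suggestions it leads to.
funnel = {
    "seattle news": ["seattle traffic news", "seattle times", "seattle pi"],
    "seattle traffic news": ["seattle traffic"],
    "seattle traffic": ["seatac airport"],
}

def walk_funnel(graph, start):
    """Breadth-first walk that records how many steps each keyword is from the start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for suggestion in graph.get(current, []):
            if suggestion not in depth:   # skip keywords already reached
                depth[suggestion] = depth[current] + 1
                queue.append(suggestion)
    return depth

steps = walk_funnel(funnel, "seattle news")
for phrase, distance in steps.items():
    print("  " * distance + phrase)
```

The indentation in the printed output shows how far each keyword sits from the original search, which makes it easier to pick the funnel that matches your audience.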

STEP FIVE: The Local Market

I’ll be digging into this topic in part 3 of Seattle News & Social Media.

Let me know if you have any questions and make sure to check out some of the related articles below.