If Trulia and Zillow wanted to start fixing their data quality problems today…

It’s been half a year since Redfin released a study that found about 36 percent of the listings shown as active on Zillow and Trulia were no longer for sale in the local MLS. While I don’t view the MLS as the only source of listings in a market, and wonder if the markets chosen for the study weren’t cherry picked, the fact of the matter is, Zillow and Trulia have big data problems.

Z&T’s PR teams have expressed their commitment to improving their data accuracy, but their progress appears to be at a snail’s pace. This winter, Trulia announced a new framework that allows them to update direct MLS feeds every ten minutes. MLS data really isn’t the biggest of their problems, but the framework would make MLS data a little more timely. Trulia hasn’t announced a new MLS data partnership on their industry blog in more than six months, so I would assume their data issues have no more than marginally improved since the Redfin study. Zillow’s made a big push to secure direct data agreements with big brokers, but like MLSs, big brokers are not the primary source of bad data. These brokers were probably already sending their listing data through a reliable syndicator like ListHub or their MLS.

Just two months ago, ZipRealty announced a study with similar findings. The blame for Trulia and Zillow’s accuracy problems doesn’t fall solely on them, but there’s no doubt that they have a problem and it’s not getting any better. It’s almost a guarantee that someone from Zillow or Trulia shows up to comment on a post like this to tell us that they’re trying, but are they? How badly do they really want to clean up their data? Let’s find out.

Trulia and Zillow can make huge advancements in the quality of data tomorrow. Here’s how:

Step 1. Identify the top 25 sources of bad data

Since both Zillow and Trulia have direct data agreements with many MLSs, they can use that data to measure the accuracy of their own data set. They already know which companies are delivering the lowest quality data. Most of these are virtual tour companies that offer syndication, single listing web page builders, or agent marketing companies that syndicate.

Step 2. Cut them off

That was easy, wasn’t it?

Zillow and Trulia could literally do this by the end of the day tomorrow. Will they? I doubt it, but cutting the sources of bad data is the only way their data can get any better. In addition, they will be poised to ask MLSs to help them identify additional bad actors and cut them off as well. Agents will still be able to submit through their broker, the MLS, or even directly to Trulia and Zillow. Nobody gets cut off except the syndicators that are providing the stale data.
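For anyone who wants to see how little work this is, here is a rough sketch of both steps in Python. It is an illustration under assumptions, not how Zillow or Trulia actually store or process listings: the field names (`mls_id`, `source`, `status`), the function names, and the sample data are all hypothetical. It only scores listings that carry an MLS ID, since the MLS isn't the only source of listings in a market, and it treats the set of IDs currently active in a direct MLS feed as the reference.

```python
from collections import defaultdict


def stale_rate_by_source(portal_listings, mls_active_ids):
    """Step 1a: per source, count active portal listings the MLS no longer shows as active.

    Returns {source: (stale_count, total_count, stale_rate)}.
    Listings without an MLS ID are skipped because the MLS cannot verify them.
    """
    counts = defaultdict(lambda: [0, 0])  # source -> [stale, total]
    for listing in portal_listings:
        if listing["status"] != "active" or listing["mls_id"] is None:
            continue
        tally = counts[listing["source"]]
        tally[1] += 1
        if listing["mls_id"] not in mls_active_ids:
            tally[0] += 1
    return {
        source: (stale, total, stale / total)
        for source, (stale, total) in counts.items()
    }


def worst_sources(rates, top_n=25, min_listings=1):
    """Step 1b: rank sources by stale rate, highest first, and keep the top N."""
    eligible = [(src, r) for src, r in rates.items() if r[1] >= min_listings]
    eligible.sort(key=lambda item: item[1][2], reverse=True)
    return [src for src, _ in eligible[:top_n]]


def cut_off(portal_listings, blocked_sources):
    """Step 2: drop every listing that arrived through a blocked source."""
    blocked = set(blocked_sources)
    return [l for l in portal_listings if l["source"] not in blocked]


# Hypothetical sample data: two syndicated listings the MLS no longer shows
# as active, one current broker listing, and one agent-submitted listing
# with no MLS ID at all.
listings = [
    {"mls_id": "A1", "source": "tour_co", "status": "active"},
    {"mls_id": "A2", "source": "tour_co", "status": "active"},
    {"mls_id": "A3", "source": "broker_feed", "status": "active"},
    {"mls_id": None, "source": "agent_direct", "status": "active"},
]
mls_active = {"A3"}

rates = stale_rate_by_source(listings, mls_active)
blocked = worst_sources(rates, top_n=1)
cleaned = cut_off(listings, blocked)
```

Notice that the listings from the broker feed and the agent's direct submission survive the cut. Only the source with the worst stale rate loses its pipe. In practice you would want a `min_listings` threshold well above 1 so that a tiny source with one bad listing doesn't outrank a large one with thousands.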

This is the first step in sending a huge signal to all sources of listing data that they are finally ready to take the issue seriously. I double dog dare them to do it. Who’s with me?

Photo: Creative Commons license via Flickr user madprime

11 thoughts on “If Trulia and Zillow wanted to start fixing their data quality problems today…”

  1. Steve Jolly


    I agree 100%. It should not be difficult to scrub their data. However, it does not appear that Z&T view inaccurate data as a problem. They may consider inaccurate data to be valuable content.

    Steve Jolly

  2. Todd Carpenter Post author

    Steve, I can tell you that there are plenty of people who think Z&T wants to display bad data. The only way they can finally prove these people wrong is to start cutting off the worst offenders.

  3. Erica Ramus

    I think it’s a race to get the MOST listings, not the most accurate listings. Quantity over quality. I point to Trulia’s “please send me info on this sold property” button. Do you know how many “leads” we get each week from T from a buyer who thinks the house is available? (Yeah yeah yeah, all they have to do is read SOLD, but they don’t.) They press the button, we reply that the house is sold and ask if there is something else we can help them with, and they are disappointed/gone. Sometimes we can convert them, most times they just go away. I know T’s goal is to get future sellers (consumers who MIGHT want to sell and are curious about neighborhood solds) to push that button so we can then figure out who they are and how to sell them on us, but at least in my market it’s not working that way.

  4. George

    Zillow and Trulia don’t give a flying fornicate about data accuracy. The more data that they can put out there, the more traffic they generate, which means more leads that they can sell to us.
    Here’s a novel idea, why don’t the MLSs get their heads out of their butts, stop being greedy, and cut off Zillow and Trulia?

    *drops the mic*

  5. Doug Francis

    I have always noticed a time lag as to when a new listing will hit these sites. It can take 2 to 3 days.

    But most of my clients use multiple Apps with market alerts because they know some sites post updates faster. Why do they have 3 Apps? Probably because they are free, or have better school info, or it was the first one they discovered. Consumers really don’t care about this data issue.

    Consumers aren’t loyal to these companies. They get what they want and then leave.

    The bigger issue (in my opinion) is the hyped up pimping of zip codes to agents. I get phone calls that a “slot in 22182 just popped up and it is only going to cost $528 a month to be shown as a Primo Agent” to unsuspecting home buyers. Really, only $528?

    For many agents, this is an expensive last ditch effort to snag business.

  6. Robert Drummer

    “Neither Trulia nor Zillow give a shit about data quality. Nor Wall Street, consumers or anyone else, apparently. Now what?”

    I was limited to 140 characters in that response, so now I’ll add: brokers, franchisors and MLSs don’t seem to care about Trulia or Zillow either. Sure, there are a few outliers, but the core groups don’t seem to care.

    Agents that care mention quality and how it impacts the customer experience and conversion. Agents that don’t care are just happy that they received a lead.

    Listing agents tout to prospective sellers that they’ll put the home on every frickin’ site in the universe, because they think that’s what clients want to hear. Forget the fact that search engines solved that issue over 10 years ago, people still think the house has to be everywhere.

    My understanding is if an MLS suggested stopping the feeds, they think the brokers/agents would be up in arms because the listings aren’t getting visibility.

    I remain in the vocal minority that these sites only help themselves. Roughly $3B in market cap between Z & T, all created at the expense of, and with the permission of MLSs and their members.

  7. Robert Drummer

    For fun, let’s move your premise up the ladder to the MLS level.

    Step 1: Identify the top 25 sources of bad data (Zillow, Trulia, et al)

    Step 2: Cut them off.

    You’re right. That was easy.

  8. Judith Lindenau

    Let’s comment on the quality of MLS data: nobody seems to be addressing that problem. Of course, it’s the best of the evils, but MLSs know that their data is full of holes: unreported contracts, bad measurements, misleading descriptions. The real question, in my mind, is ‘how can we get really good quality data for not only consumers, but also real estate professionals?’

