Data inaccuracy is a community effort.

First off, let me just be clear about my financial relationship to my former employer: I have no vested financial interest in Trulia.

These are my opinions.

Hardly a week goes by when someone in the real estate industry doesn’t write a blog post, commission a white paper, or publish a press release about the inaccuracy of listing data on Trulia and Zillow. Just yesterday, Inman News reported on Zip Realty’s new “listing check” tool. It highlights the number of inaccurate listings on portals that accept non-MLS sourced data. It shouldn’t be news to anyone in our industry that Zillow and Trulia have a significant problem with stale listing data and listing coverage, but I’m often surprised at how many professionals don’t understand why. What’s causing the nation’s top two real estate portals to struggle with this issue? Well, it’s a community effort.

The MLS

While Trulia and Zillow would love to source more data from the MLS, not every MLS is playing ball. Each has its own interests to protect, and I’m not saying they are wrong, but when an MLS refuses to send data feeds to Zillow and Trulia, the portals use non-MLS sources.

The Syndicators

Often, an MLS will choose to use a syndication service to manage its brokers’ requests to market their listings on Zillow and Trulia. These companies push very accurate MLS sourced data to the publishers, but time is a measure of accuracy as well. Realtor.com gets its data from the MLS several times an hour. Most syndicating companies only send updates a few times a day. Some only post once a day. In a hot market, a day can be a long time for updated listing information to be published.

The Brokers

Some brokers have decided to opt out of syndication to Zillow and Trulia. I understand the logic behind doing this as well, and for them, it might be a good move. However, when a broker pulls out of syndication, you’ll often see that broker’s biggest competitor use the decision as a talking point in recruiting agents and securing listing agreements. Because of this, brokers that pull out of syndication often have a hard time forcing their agents to go along with the decision. In most cases, they allow the agent to continue to submit the listing manually, outside the control of the broker or MLS. Manually entered listings are by far the largest cause of bad data.

The broker is the legal owner of the listing and the MLS is the governing body that assures the listing data is correct. When both of these shepherds refuse to participate, it’s an agent free-for-all.

The Agents

While the MLS and brokers may have decided that it’s in their best interest not to work with Zillow and Trulia, agents are less likely to agree. They have consumers who want their listings on the two most trafficked real estate portals in the country, and they don’t care about the motivations of the broker or the MLS. When unsupported, agents turn to less reliable methods of syndication. They enter the data manually, use a virtual tour company, or use some other syndicator that processes non-MLS feeds.

Listings are generally pretty accurate when they are created, but as the status of the listing changes on the MLS, it’s not being updated as often on these other platforms. Some agents don’t remember to go back and manually update the listing in all the places they published it. Eventually the home sells, but no one tells Trulia or Zillow, so they keep reporting it as an active listing.

In addition, some agents push listings to Trulia and Zillow through more than one source. The portals have to sort through the various feeds, determine which one is most accurate, and trump the rest. Amazingly, some agents will try to trip up the portals by changing the verbiage of the address to try to generate more than one listing of the same home.
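To make the duplicate problem concrete, here is a minimal sketch of how reworded addresses could be collapsed into one listing. It is an illustration only, not how Trulia or Zillow actually do it: the abbreviation table, the `SOURCE_RANK` ordering, and the `address`, `zip`, and `source` field names are all assumptions made up for this example.

```python
import re

# Hypothetical abbreviation table; a real system would need a far fuller one.
ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "road": "rd", "drive": "dr",
    "boulevard": "blvd", "lane": "ln", "court": "ct",
    "north": "n", "south": "s", "east": "e", "west": "w",
    "apartment": "unit", "apt": "unit", "suite": "unit", "ste": "unit",
}

# Assumed trust ordering of feed types (lower is more trusted).
SOURCE_RANK = {"mls": 0, "broker": 1, "syndicator": 2, "manual": 3}


def normalize_address(address: str) -> str:
    """Reduce an address to a canonical form so reworded copies compare equal."""
    text = address.lower().replace("#", " unit ")
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # drop punctuation
    words = []
    for word in text.split():
        word = ABBREVIATIONS.get(word, word)
        # Collapse "apt #4" style doubles into a single "unit".
        if word == "unit" and words and words[-1] == "unit":
            continue
        words.append(word)
    return " ".join(words)


def group_duplicates(listings):
    """Group listings whose normalized address and ZIP code match."""
    groups = {}
    for listing in listings:
        key = (normalize_address(listing["address"]), listing["zip"])
        groups.setdefault(key, []).append(listing)
    return groups


def pick_preferred(duplicates):
    """Keep the copy from the most trusted source; unknown sources rank last."""
    return min(
        duplicates,
        key=lambda item: SOURCE_RANK.get(item["source"], len(SOURCE_RANK)),
    )
```

With this sketch, "123 North Main Street, Apt. 4" and "123 N. Main St #4" in the same ZIP code land in one group, and `pick_preferred` keeps whichever copy came from the highest-ranked feed. Real address matching is much messier than this, which is part of why the problem persists.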

Finally, a few agents purposely avoid updating a listing that’s now off the market because they are still trying to generate leads from it.

The Publishers

By accepting non-MLS sources of data, both Trulia and Zillow know that stale data is going to be a reality on their sites. They’ve invested heavily in developing the technology to try to identify bad listings on their own, but they ultimately need the cooperation of the MLS to really get a grip on what data is good and what data is bad.

So what can be done?

Many argue that Zillow and Trulia should pledge to only take data from MLS sources. If they did that, the argument is that the MLS will want to work with them. However, both Zillow and Trulia would have never made it this far if they had adhered to that principle. The MLS that doesn’t want to work with them would simply refuse to do so, and the publishers would cease to exist. The willingness to take non-MLS data is a leveraging point to counter the unwillingness of the MLS to syndicate to these portals.

The most popular solution seems to be that Trulia and Zillow should pay for the listing data. Here’s the real reason why they can’t do this: both companies built their platforms through partnerships with paying brokers that wanted a competitive advantage in their market. Going back to those partners, thanking them for their business, and letting them know the capital raised will be used to secure competing listings isn’t going to fly. Fewer listings on Zillow and Trulia means more visibility for paying brokers. Paying brokers don’t want their competitors to be able to list for free, and they definitely don’t want their competitors to get paid. Ultimately, Zillow and Trulia have to serve their paying customers.

Another option is to hold brokers and agents accountable for the inaccurate listing data they push to Zillow and Trulia. The portals aren’t going to do this, but fellow REALTORS® in their markets could certainly attempt to resolve the issue through Code of Ethics (COE) complaints. However, most REALTORS® are hesitant to file a COE complaint.

Finally, the MLS could work more closely with Zillow and Trulia by sending a direct feed, or by fact-checking the portals’ data against their own. Obviously, this is what Trulia and Zillow would like. Adoption of this kind of partnership takes trust from both sides. That’s something that’s going to take time. Many MLS boards still feel that these portals ultimately want to disintermediate the MLS (a topic for another blog post).

The Result (for now)

Trulia and Zillow will continue to have a data problem. Their competitors will keep publishing white papers and blog posts, and the peanut gallery will have something to talk about.

Consider this: in spite of their data problems, Zillow and Trulia have become the two biggest real estate portals in America. This should lead you to ask yourself what the consumer cares about most, because data accuracy doesn’t seem to be it.

Photo: Creative Commons license via Flickr user stallio.

23 thoughts on “Data inaccuracy is a community effort.”

  1. Erion Shehaj

    Todd

    Your list of reasons for data inaccuracies is almost complete. One missing reason is that at least in the case of Zillow, the wrong data is intentional and strategic. Specifically, they recently started posting properties that had received notices of default and were going into foreclosure as active listings for sale through their partnership with RealtyTrac. This was intentionally misleading to substantiate the claim that Zillow has many exclusive listings that consumers can’t find even on the MLS. It’s beside the point that those properties aren’t really on the market at all.

    1. Audrey

      I agree with you. I have buyers that call me about properties they see posted on Zillow and Trulia as homes for sale. I explain that the property is not listed, that the owners might be behind on their mortgage or taxes, and that is where the information is coming from. There is not a specific house number (on the post I just had this conversation about); however, they DO list a price. I believe it must have been the tax assessment, but the buyers believe that to be a real list price and of course want me to find the property.

  2. Aaron Dickinson

    I’d argue a lot of consumers still do not know how inaccurate Z&T are. I am still regularly explaining it when they bring me old listings, Zestimates, etc.

  3. Don Reedy

    Todd,

    Good to hear from you. I was following and feeling satisfied up until your last paragraph.

    “Consider this: in spite of their data problems, Zillow and Trulia have become the two biggest real estate portals in America. This should lead you to ask yourself what the consumer cares about most, because data accuracy doesn’t seem to be it.”

    I almost don’t know where to begin with this. But let’s try. First, I’d like to just deal with ethics and what’s right. Marketing…..for another day, okay? And that means that we have to care, even if the consumer doesn’t seem bothered. Thousands may flock to their physicians for the “blue pill”, Nexium. The doctor, however, is our conduit for sanity and accuracy. Millions have flocked to McDonald’s for their fast food fix. It’s up to our nutritionists and health care providers to provide a fix for that fix. And so it goes. Z&T get it wrong, so very wrong, so much of the time, that even the public seems jaded and accepting of mediocrity. You, however, aren’t mediocre, nor am I, or X, or Y, or lots of us. Let’s not go down the path of even suggesting that because consumers don’t seem bothered by inaccuracy that our profession shouldn’t be very bothered.

    Now, as to marketing……as I said. You keep writing. We’ll talk about that later.

    Welcome back.

  4. Todd Carpenter Post author

    Erion, I won’t disagree with you on the listing of pre-foreclosure homes on Zillow and Trulia. I don’t think it serves the average consumer that searches the site. At a minimum, I don’t think they belong in the default search.

    Don, I know for a fact that Pete Flint and Spencer Rascoff care a great deal about the accuracy of the data on their sites. They want to fix it, but it’s not up to them. Ultimately, it’s the responsibility of the broker to make sure their listings are accurately marketed. When a critical mass of brokers decide it’s time to fix this, that’s when it will be solved.

  5. Aaron Dickinson

    Todd,

    After your last comment I’ve rethought the whole data inaccuracy issue.

    It is not consumers that have embraced inaccurate data but rather the brokers and agents have. They are the ones allowing this bad data to continue because they have neither committed to keeping it current nor stopped providing the data to Trulia and Zillow. I posted some more thoughts here:

    http://www.twincitiesrealestateblog.com/2013/bad-data-on-trulia-zillow-is-the-agents-brokers-fault/

    1. Todd Carpenter Post author

      Great post Aaron. I hope this doesn’t look like I’m trying to lay all the blame on the agent. Everyone plays a part. With only a few exceptions, I think everyone involved wants the data to be correct.

  6. Kathy Sperl-Bell

    My listings are syndicated from my MLS via ListHub. Nonetheless all of my single family homes were listed in Zillow as Condos. So don’t tell me that a feed from the MLS will ensure accuracy. I do not believe that Trulia or Zillow care one bit about REALTORS or the data accuracy. They care about being first in the search engines and growing their shareholder base.

    1. Todd Carpenter Post author

      Hi Kathy,

      You raise a good point. If the data in the MLS feed is bad, then the data on the publisher site will be bad as well.

      Also, an MLS feed that’s distributed to Zillow and Trulia via ListHub, or any other syndicator, is not as helpful to Zillow and Trulia as a direct feed from the MLS. Direct feeds not only allow the listing to be displayed, but also enable the publisher to fact-check all of their data against the MLS.

    2. Sara Bonert

      Kathy – We researched this issue for you and found that in the data feed we are being sent Unit Number = 0 for every listing in this MLS that doesn’t actually have a unit number (versus sending nothing in that field).
      We have more feeds sending us SFH (single family home) for everything in their feed than appropriately sending “condo” as property type when applicable. So to combat this, we wrote code that says: whenever we get a unit number for a listing, we assume it is a condo versus a SFH, unless the public record we have for the home tells us otherwise (we have about 110M public records).
      In your case many of your listings appear to be new or newer construction, thus no public record to double check against.

      We are working now with ListHub to see if we can get the #0 unit data cleaned up in the feed we receive.

      Lastly, please feel free to always email things like this to listingsupport@zillow.com.
      Sara from Zillow

  7. Sara Bonert

    Todd-
    I like the sentiment of this post, it does take the community.

    It is the responsibility of the publisher to update in a timely manner and try to prevent scams, scraping, resyndication of data, and general abuse of the respective sites. For example, Trulia’s Direct Reference Program.

    It is the responsibility of the syndicators to provide clean data to the publishers and information to their clients. For example, ListHub providing “publisher scorecards” to help their clients keep up with the online landscape and increasing the number of times a day they update their data feeds to publishers.

    It is the responsibility of the real estate professional community, should they decide to syndicate, to monitor the inventory they are distributing. Direct feeds, which many are already using, help ensure accuracy immensely. For example, MRED recently partnering with Trulia, Century 21 and NRT recently launching online advertising partnerships with national websites, or Zillow’s Pro For Brokers Program.

    When everyone works together, the data can be better, which obviously benefits all. It benefits the publishers with a more accurate site. It benefits the agents when more people visit these sites and request information or want to see their listings. And ultimately it all benefits the home buyer and seller with enhanced exposure and information to help them realize their home owning dreams.

    Sara B from Zillow

  8. Matt Stigliano

    Maybe we all need to sit down at one big table and negotiate our way out of this mess? Yeah, I don’t see that happening either, since there is both money and politics involved.

    One thing I have to disagree with you about is listing accuracy on the agent side. The MLS itself is a cesspool of bad data. Not every agent is bad at data entry, but I see more inaccuracies there than I should. Locally, we have problems with agents copying data from a prior listing (which was incorrect to begin with) in order to avoid going through the house and making notes of all the little bits of data that are important to home buyers. While this does nothing to correct prices and listing status inaccuracies, it’s at the core of the problem. People who don’t care or are lazy and don’t want to take the time and effort to fill out a simple data sheet on the home. It frustrates me to no end – particularly when I need to find a property that matches very specific criteria for a client, but I may be missing homes because an agent didn’t want to fill in a field or because they filled it out wrong.

    We also have issues with listing status changes and our Board does work to address those, but often the staff of those boards are small and don’t quite understand the problems – they’re office people, but they don’t have the tech or real estate background to understand or care about things like MLS inaccuracies (I’ve learned this from personal experience with a mapping issue on a property and trying to get it corrected through the local Board).

    Maybe we do all need to sit at the same table. Where can we find one big enough?

  9. Patrick Healy

    Great post Todd. I don’t think this could have been said better. The reality of the situation is that if MLSs, brokers and agents wanted the data to be accurate, it would be. Zillow and Trulia have some of the best data minds in the industry at their disposal. What they are being asked to do is, frankly, unreasonable. Having extensive experience with data, I can say with 100% confidence that data is inherently dirty. There always must be some level of work done on it. When the data is boycotted, neglected or sabotaged at the source, then you get this situation. These are not behaviors that can be rectified by Zillow or Trulia.

    The real issue here is that those that provide or don’t provide the data simply are not, at this point, looking after the best interests of their clients – buyers and sellers. Clearly the public has spoken as to where they want to go to browse listings. They want a beautiful experience instead of a cookie cutter POS templated site with an IDX solution in place. If I were a seller and my listing was not accurate up on Zillow or Trulia, I would consider this grounds for breach of contract and be having the hard conversation of why my listing agent isn’t making this happen for the 6% of my home’s value I am paying them to sell it. Your ads are never wrong when they’re in the newspaper, and the listing data is never wrong on their own web sites.

    What about the customer? Why is the focus continually on the industry and their needs and never the customer?

    1. Todd Carpenter Post author

      Patrick,

      I think that in most cases, the best strategy for selling a house for the highest value is to expose that listing to as many potential buyers as possible. When that strategy is employed, I think the listings should be on Z&T, and they need to be correct. However, I think there are situations where it’s not in the best interest of everyone involved to list a property publicly. I intend to write about this topic soon.

    2. Don Reedy

      Teresa has it right. All the other comments seem to indicate that as Realtors we should actually want Z&T to have data at all, either good or bad.

      Actually, the consumer wants good data, and the real estate industry decided it was too lazy to do it right, and so Z&T usurped our clean and most current data. Now they have the consumers’ eyeballs, but we still have the real estate industry’s best data (and yes, I admit there’s always going to be some crap in even our own data).

      How about this. Zillow and Trulia go away (by the way, I have no financial interest in either company). Now, we real estate pros create our own portals, with our own data, and disseminate that data to the public.

      Okay, I go on record as not caring about Z&T’s data as well. I don’t syndicate, because my data, about my listings, is top of the line, and like other excellent brokers who also chose not to syndicate, I find that my clients are better served.

      1. Todd Carpenter Post author

        Don,

        I certainly respect your choice to not syndicate your listings on Z&T. I think there’s plenty of good reasons why a broker may choose to do this.

        However, Z&T aren’t going away until someone finds a way to replace them the way they replaced newspaper ads, magazine ads, and direct mail.

  10. Teresa Boardman

    Don- real estate agents supply about 75% of the money that Zillow and Trulia take in. Can you imagine what we could do with all of that money? What we could build? WOW!

  11. Ken Brand

    If the MLS data in our market is around 95% accurate (this is a guess) and real time, and Trulia or Zillow is around 70% accurate (according to white papers, etc.), how is a crappy 70% accuracy rating an agent or broker responsibility when their input is 95% accurate? What can an agent do? What can a broker do?

    Yes, millions of page-viewers flock to those sites. Of their millions of “page viewers,” how many are actual buyers? My point is that if millions of people are dreaming and window shopping, they of course don’t know and couldn’t care less if the data is accurate. How would they know what they’re looking at is shamefully inaccurate? They wouldn’t experience the frustration of getting excited about some specific properties and inquiring, only to find out they were sold weeks or months ago. Accuracy doesn’t matter to a window shopper, so why would it matter to a syndicator?

    For real buyers, it does matter. It does piss them off and agents in the field have to listen and deal with the complaints.

    I understand that it’s difficult to understand the negative impact if someone hasn’t actually had to deal with it in the field with real clients…over and over again. Same thing with Zestimates.

    Imo, their actions don’t match their stated concern for accurate data. As shared in an earlier comment, it’s in their best interest to display as many properties as possible because that’s what attracts millions of window shoppers, which propels their advertise-with-us mission. If suddenly their traffic cratered due to inaccurate data (which it won’t because window shoppers don’t care) they would scramble to fix the problem pronto. The desire for page views trumps the need for accuracy.

    In closing, it appears it’s all here to stay. Because I can’t change the circumstance, I’ll have to change my perspective from a sneer to tolerance, to how I can use them to my advantage. Sigh. Shrug.

    Thanks for sharing.

    1. Todd Carpenter Post author

      First off Ken, those white papers mostly measured markets where Trulia and Zillow have the lowest coverage. You might be able to prove me wrong, but I doubt Zillow and Trulia’s accuracy is anywhere near as bad as 70% nationwide. In a market like Houston, they are better. But, even in a market where the MLS is very cooperative (like HAR), there are still brokers who opt out, and agents are left to do things manually.

      Still, I plainly state that Zillow and Trulia’s willingness to take non-MLS sourced data is part of the reason they have a data problem, and I should have elaborated here. Nationwide, the MLS isn’t always the trumping data source. The portals have direct syndication agreements with many large brokers. Some feeds are more accurate than others. There are a ton of reasons why a broker would choose to send their own feed instead of using their MLS, and I think I’ll add that to my list of things to write about in the future.

  12. Jason Berman

    I think the MLS’ fear of disintermediation is a valid concern. To me it seems to be the heart of the issue. How can they work together on data integrity when one side distrusts the true motivations of the other?

