But you never know whether anything will happen…
  • Star Trek, Berners-Lee, and DZone’s Ocean of Data

    Posted on May 24th, 2009 rick 6 comments

    I’m a product of my upbringing, and (using the term loosely) I grew up in the 60’s and 70’s. It was the time of Star Trek and a real-world space program, both of which had tremendous influence in shaping my belief that the pursuit of scientific knowledge leads to good things.

    The voyagers of the starship Enterprise had an excellent situation. They simply had to do whatever it was they were good at doing, and somehow resources were available to keep doing it continuously and with little regard for cost. It’s a model that appeals to the closet utopian in me, but it’s pretty far from the day-to-day economic reality most of us live in.

    Today, I listened to Tim Berners-Lee’s TED talk where he urges us all to open up our data, all data, and make it available for linked use. I love the idea of “raw data now”, but it scares me. It happens that DZone is floating on an ocean of data. In our three years online we have tracked how millions and millions of developers have used hundreds of thousands of links from tens of thousands of domains. I imagine that intriguing insights about developer trends could be drawn from this data. It might be even more intriguing if it could be correlated to open source project activity and commit rates or some similar data pool that someone else possesses. Sir Tim’s idea of exposing “raw data now” challenges us to engage in a broad experiment and find out what happens.

    I’m close to taking up Sir Tim’s challenge, really I am. My desire to see what we might learn confronts my business training, which suggests that possessing information exclusively is my competitive advantage. My instincts, however, tell me not to sweat it and that things will be alright.

    Your input matters a lot, and I’d like to hear your ideas about how you would want to leverage DZone’s data if we opened it up (of course, no personal data would be shared!). What new and interesting possibilities would this create? Are there steps we could take in this direction without throwing the doors wide open and inviting the world into our databases? What would you do?

    I’m going to give this serious thought, and I would genuinely like to hear from you. Thanks!

     

    6 responses to “Star Trek, Berners-Lee, and DZone’s Ocean of Data”

    • Go for it. While the data itself may be worth money somehow, it’s much more likely that this will enable some awesome dzone extensions. Instead of trying to build everything yourself, you can provide the API and let your users build stuff you might not have thought of. You could use rate limits like Twitter to mitigate risk.

    • [...] This post was Twitted by sclopit - Real-url.org [...]

    • I would love to run some number crunching on it to track trends over the last 3 years. It would be rather cool to see the rise and fall of projects/languages/frameworks.

      Another thing I would like to do is map each article’s similarity to the others, so you can get groups of similar articles.

      Getting chunks of good data is really difficult these days, and I would love to see more companies and websites release it just so people can tinker with it. So long as it’s not personally identifying or company secrets, opening it will help more than hinder, as well as foster trust with the audience.

      Besides, without the users it’s nothing. Why not give back what they create?

    • Rick,

      I have been a long-time member (primarily a reader) of Javalobby, and now DZone, since 2002. There could be some monetary benefit in determining the ways in which people use the data and then finding some way to attach ads to it.

      From a purely informational standpoint, it could be one other point of analytics for a website or blog so that they can determine the best keywords to include in their title or description so that they can be found on dzone search.

      I don’t have any ideas of how I would use the data right now, but I may be inspired by some enterprising uses after you have made the API available.

      DZone is fantastic and addictive. Keep up the good work and thanks for the resource.

    • I think that the DZone data are already very open: you have RSS based on tags and RSS based on search. RSS is a cool XML standard; you might consider making an API page with RSS, Atom, and JSON, with more than 20 results per page.

      I think Tim Berners-Lee’s idea is great, but one thing is missing: a universal data or “database” format. In what format should we share “raw data”? Right now that standard is HTML; the idea behind HTML was a universal language that would unite all documents. But today HTML is used for doing everything. We need a new “data-based” standard for uniting all “raw data”.

    • Rick,
      it is the well-known chicken-and-egg problem: you are not willing to give out your data until somebody explains to you the advantages you can get, and the community (in this case the linked data community) cannot fully explain the advantages to you until they can experiment with your data.

      As Tim said in his talk, please, “stop hugging your data” and give the linked data community the chance to prove their claim. All the scenarios you and your commenters envisioned can be enabled by a linked data approach.

      See for example what’s happening in the LarKC project (http://www.larkc.eu) in the urban scenario and in the healthcare/lifescience field.
