William Hertling's Thoughtstream: An Open Social Stream

I've been thinking about the web and the role and effect of social networks. While I'm a user of Facebook, and like certain parts of it, there are other aspects of it that concern me, both for the impact it's having now and for the future. As an idea person, I ponder how we can get the benefits of social networking without the costs, while regaining the open web we used to have.

If you haven't done so, go read Anil Dash's The Web We Lost. I'll wait.

I'm going to cover three topics in this post:

The shortcomings of social networks as they exist today.
The benefits of social networking. I don't want to throw away the good parts.
A description of what a truly open social network would look like.

The Problems of Today's Social Networks

These are the main problems I see. I'm not trying to represent all people's needs or concerns, just capture a few of the high-level problems.

Transitory Nature

My first problem with Facebook and Twitter is the transitory nature of the information. I'm used to the world of books, magazines, and blogs, where information is created and then accessible over the long term. Years later I can find Rebecca Blood's series of articles on eating organic on a food stamp budget, my review of our Miele dishwasher written in 2006, or my project in 2007 to build the SUV of baby strollers. These are events that stand out in my mind.

Yet if I want to find an old Twitter or Facebook post, it's nearly impossible, even if it happened just a few months ago. There was a post on Facebook where I asked for people who wanted to review my next novel and twenty-five people volunteered. Now it's a few months later, and I want to find that post again. I can't. (Having grown used to this problem, I took a screenshot of it, but that's an awful solution.)

The point is that properly indexed and searchable historical information is valuable to us, our friends, and possibly our descendants. However, it's not valuable to Facebook and Twitter, whose focus is on streaming in real-time.

Ownership and Control Over Our Data

It should be unambiguous that we own our own data: our posts, our social network, our photos, and that we should have control over that information. As a blogger and author, I would choose to make much of that public, but it should be my choice. Similarly, it should be possible to have it be private. My data shouldn't be used for commercial purposes without my explicit opt-in, and I should have control over who gets it and how they use it.

Personally, I'd like my content to be creative commons licensed: It's mine, but you can use it for non-commercial purposes if you give me attribution.

Yet this is not the case today. We have problems, again and again, with Facebook, Google, Instagram, and other services claiming the right to use our material for advertising, using it commercially, reselling it, and so on.

Advertising

We should have the right to be free from advertising if we wish, and certainly to have our children not exposed to advertising. But the way social networks exist today, the advertising is forced on us, whether we want it or not. And while I can ignore it (although I still hate the visual distraction), it's harder for my kids to do so.

We've unfortunately ended up in a situation where the only revenue model for these businesses seems to be advertising-based, even though there are alternatives.

Siloing of Networks and Identity

I have a blog, a couple of other websites, accounts on Twitter, Facebook, LinkedIn, Google Plus, FourSquare, YouTube, and Flickr. But, for all intents and purposes, there's just one me. We try to glue these pieces together: sharing Instagram photos on Facebook, or using TweetDeck to see Facebook and Twitter posts in one place, sharing checkins, but this is a terrible approach because our friends and readers either see the same information in multiple places (if we share and cross-link) or miss it entirely (if we don't). Because the networks are fighting over control points, they're disallowing the natural openness that should be possible.

Privacy

For some people, privacy is a big concern. This isn't a big one for me because I subscribe to the basic theory of Tim O'Reilly that obscurity is a bigger concern. (He was talking about authors and piracy, but I think the theory applies to most people, whether they're furthering their career, starting a business, selling a product, etc.) I'm concerned about the use and misuse of my data by commercial interests, but I think that can be handled through mechanisms other than privacy. If I'm wrong, then yes, privacy becomes a bigger issue.

The Benefits of Social Networking as it Exists Today

Yet for all these complaints, there are pieces that are working.

I have a niece and her husband that I don't get to see often, but they're active on Facebook, and I feel much more of a connection to them as compared to family not on Facebook. I'm glad to share what's happening with my kids with my mom. I have far more interactions with fans on Facebook than I ever had comments on my blog.

The attempts to surface the content that matters to me are imperfect (to be honest, often awful), but exist in some form:

I don't see a hundredth of the tweets of the people I follow, but using TweetDeck and searches on hashtags and particular people, I'm able to find many I am interested in.
Google Circles are far too much work to maintain, but for a few small groups of people, it helps me find the content about them.
Facebook's automated algorithms are awful, showing me the same few stories over and over and over, but it's an attempt in the right direction: trying to glean from some mix of people plus likes plus comments what to show to me.

The Solution

I think there is a solution that combines the best of social networks and the best of the old, open web. I think it's also possible to get there with what we have today, and iterate over time to make it better.

What I'm going to describe is a federated social network.

Others have discussed distributed social networks. You can read a good overview at the EFF: An Introduction to the Federated Social Network. If you look at the list of projects attempting distributed social networking, you'll notice that they all list features that they'll support, like microblogging, calendars, wikis, images. You'd host the social network on your own server or on a service provider.

Despite distributed social network and federated social network being used somewhat interchangeably in the EFF article, I want to argue that there are critical differences.

The fully distributed social network described in the EFF article and in the list of projects feels like mesh networking: theoretically superior, totally open and fault tolerant, but in practice, very hard to create on any scale.

I prefer to use the term federated social network to describe a social network in which the core infrastructure is centrally managed, but all of the content and services are provided by third parties. The network is singular and centralized; the endpoints are many and federated. To continue the analogy to computer networking, it's a bit like the Internet: we have some backbones tying everything under the control of big companies, but we all get to plug into a neutral infrastructure.

(I'll acknowledge that in recent years we've seen the weakness of this approach: we end up with a few big companies with too much control. But it's still probably better that we have an imperfect Internet than a nonexistent mesh network.)

Here's my vision.

SocialX Level One:

Let's start by imagining a website called SocialX. I have an identity on SocialX, and I tie in multiple endpoints into my account: Twitter, my blog, and Flickr.

Behind the scenes, SocialX will use the Twitter API to pull in tweets, RSS to pull in blog posts, and the Flickr API to pull in photos.

Visitors to my profile on SocialX will see an interwoven, chronological stream of my content, including tweets, blog posts, and photos, similar to the stream on Facebook or Google Plus.
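
Here's a rough sketch of that interweaving step, assuming each endpoint has already been fetched into a list sorted newest-first. The Item type, the interweave function, and the sample entries are all made up for illustration; they aren't part of any real SocialX code.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    source: str          # e.g. "twitter", "blog", "flickr"
    posted_at: datetime
    text: str


def interweave(*feeds):
    """Merge feeds that are each already sorted newest-first."""
    return list(heapq.merge(*feeds, key=lambda item: item.posted_at, reverse=True))


# Illustrative sample data only.
tweets = [
    Item("twitter", datetime(2013, 1, 3), "Heading to the bookstore"),
    Item("twitter", datetime(2013, 1, 1), "Happy new year"),
]
blog_posts = [Item("blog", datetime(2013, 1, 2), "An Open Social Stream")]
photos = [Item("flickr", datetime(2013, 1, 4), "Stroller build, day one")]

stream = interweave(tweets, blog_posts, photos)
print([item.source for item in stream])
```

Merging pre-sorted feeds this way avoids re-sorting everything each time one endpoint delivers new items.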

SocialX will be smart enough to eliminate or combine duplicate content. If a tweet points to my own blog post, it can surmise that these should be displayed together (or the tweet suppressed), knowing the tweet is my own glue between Twitter and my blog: the tweet is an introduction to the blog post.

Similarly, if a blog post includes a Flickr photo, then the photo doesn't need to be separately shown in my stream.
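
The two rules above could be sketched like this. I'm assuming all items belong to the same author and that links have already been expanded to canonical URLs (shortened links would need resolving first). StreamItem and collapse_duplicates are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class StreamItem:
    kind: str                                    # "tweet", "blog", or "photo"
    url: str                                     # canonical URL of this item
    links: list = field(default_factory=list)    # URLs this item points to or embeds
    intros: list = field(default_factory=list)   # tweets folded into this item


def collapse_duplicates(items):
    """Fold tweets into the blog post they introduce; hide embedded photos."""
    posts_by_url = {item.url: item for item in items if item.kind == "blog"}
    embedded = {link for post in posts_by_url.values() for link in post.links}
    result = []
    for item in items:
        if item.kind == "tweet":
            target = next(
                (posts_by_url[link] for link in item.links if link in posts_by_url),
                None,
            )
            if target is not None:
                target.intros.append(item)   # show the tweet with the post
                continue
        if item.kind == "photo" and item.url in embedded:
            continue                         # already visible inside the post
        result.append(item)
    return result
```

A tweet that links nowhere, or to someone else's site, passes through untouched.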

Of course, SocialX will feature commenting, like all other social networks. Let's talk about comments on blogs first. Let's assume I'm using a comment service like DISQUS. By properly identifying the blog post in question, SocialX can display the DISQUS comment stream exactly as it would appear on the blog: in other words, both SocialX and the original blog post share the same comment stream. Comment on my blog, and your comment will show up in the SocialX stream associated with the post. Comment on SocialX, and the comment will show up on the blog.

Twitter replies can be treated as comments. In fact, Twitter's current approach to related messages hides them behind the "view conversations" button. On SocialX, Twitter replies will look like associated comments. And if you reply on SocialX, your comment gets posted back to Twitter as a reply. So both Twitter and SocialX will share the same sequence of shared content; it will just be represented as comments on SocialX and as replies/conversation on Twitter.

In other words, the user interface of SocialX might look a lot like Facebook or Google Plus, but behind the scenes, we have two-way synchronization of comments.
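
To make the shared comment stream concrete, here's a toy model: one thread per post, keyed by a normalized post URL, no matter where the comment was written. It deliberately leaves out the actual calls to a comment service or to Twitter. The names and the normalization rule (drop the query string and fragment) are my assumptions, and that rule would be wrong for blogs that identify posts by query string.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical_post_id(url):
    """Reduce a post URL to a stable key (drops query string and fragment)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))


class SharedThreads:
    """One comment thread per post, wherever the comment was made."""

    def __init__(self):
        self._threads = defaultdict(list)

    def add_comment(self, post_url, author, body, via):
        # "via" records the originating surface, e.g. "blog" or "socialx".
        self._threads[canonical_post_id(post_url)].append(
            {"author": author, "body": body, "via": via}
        )

    def thread_for(self, post_url):
        return list(self._threads.get(canonical_post_id(post_url), []))


threads = SharedThreads()
threads.add_comment("http://Example.com/2013/01/post/", "tom", "Great idea", via="blog")
threads.add_comment(
    "http://example.com/2013/01/post?utm_source=socialx#comments",
    "sally", "Agreed", via="socialx",
)
print(threads.thread_for("http://example.com/2013/01/post"))
```

Both comments land in the same thread because the two URLs reduce to the same key.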

SocialX can handle the concepts of liking/+1/resharing in a similar manner. The two high level concepts are "show your interest in something", and "promote something". Each can be mapped back to an underlying action that makes sense for the associated service. For Twitter, "show interest" can be mapped to favoriting a tweet, and "promote" can be mapped to retweet.
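
As a sketch, the mapping could be a simple lookup table. Only the Twitter row comes from the description above; the enum, the table layout, and the function name are hypothetical, and other services would need their own rows.

```python
from enum import Enum


class Action(Enum):
    SHOW_INTEREST = "show_interest"
    PROMOTE = "promote"


# Per-service names for the two generic actions.
SERVICE_ACTIONS = {
    "twitter": {Action.SHOW_INTEREST: "favorite", Action.PROMOTE: "retweet"},
}


def native_action(service, action):
    """Translate a generic action into the service's own action name."""
    try:
        return SERVICE_ACTIONS[service][action]
    except KeyError:
        raise ValueError(f"no mapping for {action.name} on {service!r}") from None


print(native_action("twitter", Action.PROMOTE))
```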

So far, we've discussed only how a single user's stream of content looks. In other words, we've looked at it from the content provider's point of view.

If a user named Tom comes to SocialX to view content, he can, of course, view a single user's content stream. But Tom likely has multiple friends, and of course this is social networking, not just the web, so we've got to use social graphs to determine who Tom is interested in.

SocialX will use any available social graphs that it's connected to, and will display the sum total of them. So if Tom connects with Twitter, he'll see the streams of everyone he follows on Twitter. If Tom connects with Twitter and LinkedIn, Tom will see interwoven streams of both. (Although SocialX will try to remove redundant entries across services by scanning posts to see if the content is the same.)
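
Here's one way that could look: take the union of the follow lists, then drop cross-posts whose text matches after normalization. I'm assuming each post already carries a resolved person identifier (so "sally" is the same Sally on every service), and the fingerprint rule is a crude stand-in for real content matching. All names here are hypothetical.

```python
import re


def followed_accounts(graphs):
    """graphs maps service name -> set of handles; returns (service, handle) pairs."""
    return {(service, handle) for service, handles in graphs.items() for handle in handles}


def fingerprint(text):
    """Lowercase, strip URLs and punctuation, collapse whitespace."""
    text = re.sub(r"https?://\S+", "", text.lower())
    return " ".join(re.findall(r"[a-z0-9]+", text))


def drop_cross_posts(posts):
    """posts are (service, author, text) tuples; keep the first copy of each."""
    seen = set()
    unique = []
    for service, author, text in posts:
        key = (author, fingerprint(text))
        if key in seen:
            continue
        seen.add(key)
        unique.append((service, author, text))
    return unique


posts = [
    ("twitter", "sally", "New post: Hello, world! http://t.co/x"),
    ("linkedin", "sally", "new post: hello world"),
    ("twitter", "tom", "hello world"),
]
print(drop_cross_posts(posts))
```

Exact matching after normalization will miss posts that were reworded for each service, so a real version would likely need fuzzier comparison.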

This is today. We can make this work with a handful of existing services by plugging them into a centralized network and doing the work on the central network to get these existing providers connected. It's about bootstrapping.

Notice that we don't need everyone to use SocialX for it to start being valuable. If Tom visits the site, and follows another Twitter user named Sally, we can display Sally's Twitter stream for Tom, and probably auto-discover her blog feed, making the service useful for Tom before Sally ever starts to use it. In essence, at this point we have a very nice social reader.

SocialX Level Two:

The next step beyond this is an API for the platform. Rather than force the platform to do work to integrate each new endpoint, we provide an open API so that other services can integrate into the network. When the next post-hipster-photo service comes out, they can integrate with SocialX just as Instagram once did with Facebook and Twitter APIs.

The API will require services to support a common set of actions for posting, commenting, liking, and promoting. Services will be required to provide posts in two formats: a ready-to-render HTML format, as well as a semantic form that allows other services to create viewers. (Semantic HTML would work as well.)
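
A minimal sketch of what such a contract might look like, assuming a Python-style interface. The class and method names are hypothetical (no such API exists), and the in-memory service is a toy that only shows a post carrying both a ready-to-render form and a semantic form.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from html import escape


@dataclass
class Post:
    post_id: str
    author: str
    html: str        # ready-to-render form
    semantic: dict   # structured form for third-party viewers


class ContentService(ABC):
    """Actions every endpoint would have to support (hypothetical names)."""

    @abstractmethod
    def post(self, author, body): ...

    @abstractmethod
    def comment(self, author, post_id, body): ...

    @abstractmethod
    def like(self, author, post_id): ...

    @abstractmethod
    def promote(self, author, post_id): ...


class InMemoryService(ContentService):
    """Toy endpoint used only to exercise the interface."""

    def __init__(self):
        self.posts, self.comments, self.likes, self.promotions = {}, [], [], []

    def post(self, author, body):
        post_id = f"post-{len(self.posts) + 1}"
        self.posts[post_id] = Post(
            post_id,
            author,
            html=f"<p>{escape(body)}</p>",
            semantic={"type": "note", "author": author, "text": body},
        )
        return self.posts[post_id]

    def comment(self, author, post_id, body):
        self.comments.append((post_id, author, body))

    def like(self, author, post_id):
        self.likes.append((post_id, author))

    def promote(self, author, post_id):
        self.promotions.append((post_id, author))
```

A service that skips any of the four actions can't be instantiated, which is the point of making the set mandatory.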

We require the semantic form because SocialX can't be the only one in the business of rendering these streams. So SocialX will also provide an API for other services to provide a reader/viewer or whatever you'd like to call it. This enables the equivalent of TweetDecks and Hootsuites in our environment. If someone can provide a superior user experience, they're welcome to do so.

We also need to take a stab at figuring out what content to display. Should SocialX display everything, like Twitter? Use circles, like Google Plus? Heuristics like Facebook? Have a great search ability?

Let's open it up to third parties to figure it out. A third party can consume all the streams I'm subscribed to, and then take their best attempt to figure out what I'm interested in. And if we set up this API in a smart way, it'll function like a pipeline, so that we could have a circle-defining service divide up streams into circle-specific streams, and an interest-heuristic take each circle and figure out the most interesting content within that circle, etc.
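
The pipeline idea might be sketched as function composition: each stage takes a dictionary of named streams and returns a new one. The stage names are hypothetical, and using like counts as the interest heuristic is just a placeholder assumption.

```python
from functools import reduce


def pipeline(*stages):
    """Chain stages; each maps {stream name: posts} to a new mapping."""
    def run(streams):
        return reduce(lambda acc, stage: stage(acc), stages, streams)
    return run


def by_circle(circles):
    """Stage: split the "all" stream into one stream per circle."""
    def stage(streams):
        return {
            name: [post for post in streams["all"] if post["author"] in members]
            for name, members in circles.items()
        }
    return stage


def top_n(n):
    """Stage: keep the n most-liked posts in every stream."""
    def stage(streams):
        return {
            name: sorted(posts, key=lambda post: post["likes"], reverse=True)[:n]
            for name, posts in streams.items()
        }
    return stage


posts = [
    {"id": 1, "author": "mom", "likes": 2},
    {"id": 2, "author": "niece", "likes": 9},
    {"id": 3, "author": "anil", "likes": 4},
]
run = pipeline(by_circle({"family": {"mom", "niece"}, "writers": {"anil"}}), top_n(1))
result = run({"all": posts})
print(result)
```

Because every stage has the same shape, a third party can swap in a different circle-definer or heuristic without touching the rest of the chain.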

Services like news.me are a perfect example of existing stream filtering; they just do it out-of-band.

Newsle is another good example of a content service we'd want to plug in because these news stories are associated with people we follow, even if they originate outside someone's own content stream.

So far we have a content API on one side of the service that allows us to pull in content from and about people. On the other side of the service, we have a filter API that can remix, organize, and filter what stories appear in the stream. And a reader API to consume the final stream and render it.

SocialX will continue to provide default, base level filtering and reader native to the service, but all content originates from somewhere else.

Now we have a rich ecosystem that invites new players to create content, filter it, and display it.

In contrast to distributed social networking systems that spread out the network, but build in the features, SocialX would distribute the features, but have a singular central network.

SocialX Level Three:

Technology businesses need to make money. I respect that. As a technology guy, I'm often on that side of the fence. Content providers want to make money. I respect that, too. As an author and a blogger, I'd like to earn something from my writing.

But I also want to be free from advertising.

How can we resolve this dilemma?

Advertising is just one way of making money, but I'd like to suggest two other ways.

Patronage

Let's think about Twitter for the moment. Their need to make money from advertising has led to all sorts of decisions that their users hate. They want to insert ads into the tweetstream. They want control over all Twitter clients, to ensure their ads are shown. They're restricting what can be done with the Twitter API.

It can't be good when a company makes its users hate it.

Here's a different idea. The more followers one has on Twitter, the more valuable Twitter is. At the very top of the ecosystem, there are users with millions of followers, whose tweets are worth thousands of dollars each. Even at the lower end of the system, a user who has 10k or 50k followers on twitter is likely gaining a tremendous value from that network.

What if Twitter charged the top 1% of most-followed users a fee? Twitter would be free to use under 2.5k followers, but followers are capped unless you pay. A fee starting at $20/year, or roughly 1 cent per follower, would raise about $200M a year -- in the same ballpark as their current ad-based revenue.

Ad-Free, Premium Subscriptions

The second opportunity is to charge for an ad-free, premium experience. If 10% of Twitter users paid $10 annually for an ad-free experience, that's $500M in revenue. Personally, I'd be delighted to pay for an ad-free experience. Part of the reason this doesn't work well today is that my time reading is split between Twitter, Blogger, Wordpress, individually hosted blogs, news sites, Facebook, Google Plus, and so on.

It's simply not feasible to pay them all individually.

However, if I'm getting the content for all these services through one central network, and can pay once for an ad-free experience, suddenly it starts to make sense.

SocialX knows who the user is, what they've viewed, which services helped to display the content.

Now we start to see a revenue model that can work across this ecosystem. Revenue could come from a mix of patronage, paid ad-free users, and advertisements. We'll keep ads in the system to support free users, but now that we have multiple revenue streams, there's less pressure to orient the entire experience around serving ads and invading people's privacy.

Example 1: Ben is a paid subscriber of the system. Ben's $5/month fee is proportioned out based on what he interacts with, by liking an item, sharing it, bookmarking it, or clicking "more" to keep reading beyond the fold. He's going to pay $5/month no matter what, so there's no incentive for him to behave oddly. He'll just do whatever he wants.

If Ben interacts with 300 pieces of content in a month, each gets allocated $5/300, or about 1.7 cents.

That amount is shared among the ecosystem partners, something like this:

network infrastructure: 15% (SocialX)
stream optimization: 15% (news.me, tbd)
reader: 15% (the feedlys, tweetdecks, hootsuites of the world)
content service: 15% (the twitter, flickr, blogger, wordpresses of the world)
content creator: 30% (you, me, joe blogger, etc.)
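
Ben's example works out as follows. The sketch uses the shares exactly as listed; they add up to 90%, so it leaves the remaining 10% unassigned rather than guessing where it goes. The function names are hypothetical.

```python
from decimal import Decimal

# Shares as listed above (they total 90%).
SHARES = {
    "network infrastructure": Decimal("0.15"),
    "stream optimization": Decimal("0.15"),
    "reader": Decimal("0.15"),
    "content service": Decimal("0.15"),
    "content creator": Decimal("0.30"),
}


def per_item_cents(monthly_fee_dollars, interactions):
    """Cents allocated to each item the subscriber interacted with."""
    return Decimal(monthly_fee_dollars) * 100 / Decimal(interactions)


def split(amount_cents, shares=SHARES):
    """Divide one item's allocation among the ecosystem partners."""
    return {role: amount_cents * share for role, share in shares.items()}


item_cents = per_item_cents(5, 300)
print(round(item_cents, 2))            # cents per item
for role, cents in split(item_cents).items():
    print(f"{role}: {cents:.3f} cents")
```

At these rates the content creator gets half a cent per interaction from Ben, so the amounts only become meaningful across many subscribers.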

Example 2: Amanda is a free user of the system. She sees ads when using SocialX. Amanda will be assigned an ad provider at random, or she can choose a specific one. (Because the ads, too, will be an open part of the system.) Ad providers will be able to access user data for profiling, unless the user opts out.

If Amanda clicks on 5 ads during the month, that will generate some amount of ad revenue. The ad provider keeps 20% of the revenue, and the rest flows through the system as above. The revenue is allocated to whatever content Amanda was viewing at the time.
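
A rough sketch of that flow: the ad provider's 20% comes off each click, and the rest is credited to whatever content was on screen at the time. The per-click revenue figures below are invented for the example, as are the function and content names.

```python
from collections import defaultdict
from decimal import Decimal

AD_PROVIDER_RATE = Decimal("0.20")


def allocate_ad_clicks(clicks):
    """clicks are (content_id, revenue_cents) pairs, one per ad click."""
    provider_total = Decimal(0)
    per_content = defaultdict(Decimal)
    for content_id, revenue_cents in clicks:
        cut = revenue_cents * AD_PROVIDER_RATE
        provider_total += cut
        per_content[content_id] += revenue_cents - cut
    return provider_total, dict(per_content)


# Invented figures: three clicks while viewing two different posts.
clicks = [
    ("post-1", Decimal(50)),
    ("post-2", Decimal(30)),
    ("post-1", Decimal(20)),
]
provider_total, per_content = allocate_ad_clicks(clicks)
print(provider_total, per_content)
```

Each content item's pool would then be divided among the partners in the same way as a paid subscriber's allocation.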

Ad providers are induced not to be evil, because users have a choice, and can switch to a different provider.

Example 3: George is a famous actor from a famous science fiction show. He has four million followers. Only the first 2,500 of George's followers on SocialX will be able to view his stream, unless George pays a Patronage fee. He does, which for his level of usage is $4,000 a year. However, George is also a content provider, so if his content is interacted with (liked, reshared, etc.), he'll also earn money. Since George is frequently resharing other people's content, the original content creator will get the bulk of the revenue (25% instead of 30%), but we'll give George 5% for sharing.

Conclusion

Let me come back to some of my problems with existing social networks, and see if we've improved on any of them:

Advertising: We've made a good dent in advertising. By having a central network and monetization process that relies on a combination of paid ad-free experiences, patronage, and advertising, we've taken some of the pressure off ads as the only revenue model, and hence the primary force behind the user experience. We're allowing people to select their ad provider, so they can choose if they want targeted ads or random ads, or organic product ads, or whatever they want. Ad providers can't be evil, or customers will switch providers.
Ownership and Control Over Our Data: SocialX owns very little data. It resides in the third party services. When users have choice over one blogging platform or another, or hosting one themselves, then they will regain control over their data by being free to choose the best available terms or by hosting it themselves.
Privacy:

My primary privacy concern is over the commercial use of my data, and in this regard, I have much more control. I can choose to use a stream filtering service which profiles me and my interests and receive a more personalized stream, or I can choose not to. Either way, the data is only used to benefit me. I can pay to opt-out of advertising totally, or opt-out of targeted advertising at no cost.
I haven't really thought through the scenario of "I don't want anyone but a select group of people to see this content," the other type of privacy concern. My guess is that we could solve this architecturally by having selectable privacy providers that live upstream from the filters and readers. These privacy providers would tag content with visibility attributes as it is onboarded.

Transitory Nature: My concern here was the case of being able to find a given Facebook post where I had solicited beta-readers. In the SocialX case, I see a few fixes:

Some of the "stream filter providers" could be search engines.
I could have chosen to originate my post as a blog entry.
The platform could support better bookmarking of posts.

Siloing of Networks and Identity: By its very nature, this is the anti-silo of networks and identity.

The main problem we're left with is that we need a benevolent organization to host SocialX. Because it is a centralized social network, someone must host it, and we have to trust that someone to keep it open.

A few years ago, I was sure this was going to be Google's social strategy. It seemed to fit their mission of making the world's information accessible. It seemed to be a platform play akin to Android. Alas, it hasn't turned out to be, and I no longer trust them to be the neutral player.

It could be built as a distributed social network, but then we're back to the current situation. Lots of distributed social networks, but no one has the momentum to get off the ground.

If you've made it this far -- thanks for reading! This is my longest post by far, and I appreciate you making it all the way through my thought experiment. Would this work? What are the shortcomings? How could this become a reality? I'd love feedback and discussion.
