
The current gatekeepers of the web and computing industry include publications such as TechCrunch and TechMeme.
I barely ever read these, and here's why. They only tell me things about the big boys, mostly when money is involved. It hardly matters to me who VCs are investing in, what advertising strategy Facebook is pursuing, or the fact yet another social network for cats has been launched.
That's not what I'm in technology for: I want to hear about genuine advance, discovery, code I can read, services I can use, new applications of research. And I want to share with and learn from others in the same ecosphere.
Unfortunately, the gatekeepers can have a stifling, negative effect on the industry and community. Our thinking has become dull, and our attitude one of sniping. (I have a deep urge to rant about various small-minded inaccurate stories I've seen of late. But if you're getting my point, I needn't bother. And if you're not, well, it won't help)
The competition for cash — directly connected to TechCrunch exposure — is odious. I'm not prepared to even start doing the self-prostitution it takes to get into that echo chamber of A-list tech people. By far and away the most interesting and inspiring people I've encountered on the web recently haven't registered at all on the valley meme-o-meter.
These things do come in cycles, of course. TC and TechMeme are themselves usurpers of a previous generation of media gatekeepers, and they in turn will be overtaken.
In the space between the installation of gatekeepers it's a great time for innovation, rich discussions, and changing people's minds. When I read tech news, I want to be inspired to build, create and cheer about it.
There was a children's TV programme while I was growing up called "Why don't you?", entitled in full "Why don't you just switch off your television set and go out and do something less boring instead?" I want to read stuff like that.
Some of the places I've been finding worthwhile news recently include:
FriendFeed — essentially a "lifestream aggregator", it's the commenting feature in FriendFeed that has allowed it to become a useful means of gathering news and information. It systematizes the way I've discovered tech news for the last ten years, through a network of individuals whom you are interested in. Additionally FriendFeed presents the opportunity for engaging debate that feels a lot more alive than blog comments (I wonder if this isn't due in part to the neutrality of the venue.)
TechJunk — a new tech news aggregator created by Dave Winer, with the intent of enabling discovery of smaller interesting technical news items, not just what the behemoths and well-connected are doing.
@timoreilly on Twitter — Tim's always been a discoverer and amplifier of important and interesting trends, and what he does on Twitter is a microcosm of what he does for his day job.
One of the things all those sources I just mentioned have in common is people. The kind of people who — whether you agree with them or not — don't get bound up by gatekeepers.
I've always believed that the best publications are those with the best editors. I've never cared for the "daily me" style of personalized news, because I want to learn things outside of my own scope, and neither for the Digg style of populism, because all too often it's folly, not wisdom, one finds in crowds.
In the spirit of this, I'd love to hear where others go for incisive, non-mainstream, news. Let me know in the comments.
Over the last 24 hours, hordes of Twitter-refugees have been signing up with the microblogging service Identi.ca. Fed up with the restricted and unpredictable service from Twitter, over the last week or two people have been jumping this way and that: Pownce, Plurk, FriendFeed, and now Identi.ca. Here's my Identi.ca stream.
Let's get something straight up-front. Identi.ca's in its early stages. At version 0.4.1 it's not yet added all the features that Twitter has now broken.
A lot of people are jumping on Identi.ca and going away again, muttering "it's not got X, I'm off back to Twitter." True, but it's not as a Twitter-replacement today that Identi.ca is important.
Here's why I think Identi.ca counts for more than just being a Twitter clone.
Anybody can help fix it. Anybody can set up their own Laconica (the name of the underlying software.) I've seen a clutch of posts from developers all offering their advice on fixing Twitter. With Identi.ca they can just get on and help.
When you sign up for Identi.ca you agree to license your contributions to it under the Creative Commons 3.0 Attribution license. You agree to let others share and remix your output, in return for giving attribution.
The open data ethos is baked into the codebase already. All output is available in RSS, and you can take your friends with you thanks to the FOAF exports available.
Twitter has millions in VC funding. Those folks will want a return. What does Twitter have to make money from? You and your content. Identi.ca gives you control in that situation.
Federation is one of the most enigmatic and exciting things about Identi.ca. I can set up my own server running the Laconica software, and still subscribe to people with accounts on Identi.ca's server.
This is how the XMPP instant messaging protocol works, and it's no surprise to find that XMPP is fundamental to Identi.ca's operation. Identi.ca is to Twitter as XMPP is to AIM. This may finally be the way XMPP breaks out into popular use for developers.
Because of the commitment to open data, you've nothing to lose by giving Identi.ca a little spin. So, head over and make your own account. I added my RSS output to my FriendFeed page, along with my other output. Chris Blizzard has already added Identi.ca into whoisi. The little ripples you make in Identi.ca today can still waft out into your personal publishing pool.
And please, don't waste time complaining about "it hasn't got X". In 12 hours I've already seen Evan Prodromou, its developer, add features and fix bugs. He's got an open bug tracker, and listens to feedback.
Right now Evan seems to have his hands full dealing with the thousands of new arrivals, and to his credit, Identi.ca's still working fine.
Follow me here: identi.ca/edd
We've switched on personal schedule sharing on the OSCON web site.
When you've put together your desired schedule by starring sessions of interest, just hand out the "public view" link to let others know what you want to see.
Here's my personal schedule. In it you'll find all the plenary sessions (as co-chair I simply cannot miss these, and neither should you, however late the party!)
Also there's a fair smattering of my pet topics such as open web technologies, virtualization and dynamic languages, and a bunch of things I want to hear more about: Prophet, female participation in open source, Clutter, and of course Erlang.
I'm fascinated to find out what other people have got planned, so please publish your schedules too and let's compare notes.
In just under a month, the tenth O'Reilly Open Source Convention will get underway in Portland.
Over ten years OSCON has developed, along with the world of open source, into an intense, exciting, informative, diverse and exhausting event. This year I have the privilege of being co-chair, along with Allison Randal. We've packed so much into the show that it's a difficult job even to comprehend it as a whole!
Fortunately, there's a way to start making sense of things before you arrive there, thanks to the personal scheduler. Just mark the sessions you want to go to with a star, and you'll be able to plan out your time in advance.
I wanted to list a few sessions from my own personal schedule that particularly piqued my interest. Then at the bottom of this post I'll share a discount code which can give readers of this blog 15% off OSCON registration. There's bribery for you.
Largely thanks to XMPP enthusiasts and ejabberd, I've been hearing increasing amounts about Erlang, and I'd like to know enough about it to be dangerous. This three hour tutorial looks just the ticket.
Open Source Virtualization Hacks
This is one of several sessions we have on virtualization, something I'm particularly pleased about. Virtualization may be "done" at the kernel level, but I think we're only just starting out on its application. This session is by my friend and sometime co-author, Niel Bornstein, who works for Novell on just this sort of thing.
Using Puppet: Real World Configuration Management
Puppet is the piece of open source software that is most exciting to me at the moment. As a developer, it enables me to manage my machines like I'd manage my code libraries. A must-see if you've not used Puppet yet.
These are just 3 out of the 300 or so confirmed sessions. Don't forget there's a large number of events and parties happening around OSCON too.
And finally, the discount code. Use the code os08pgm when you're registering, and you'll get 15% off the ticket price.
See you in Portland!
The BBC have recently opted to remove hCalendar microformats from their Programmes site, due to problems with the use of the abbr tag clashing with accessibility tools. One of the potential alternative solutions they're discussing is RDFa.
The excellent John Resig, brain behind jQuery and a million other wonderful JavaScript projects, comments on this development in his blog. Take the time to read his post now; it's short.
Resig is someone whom I admire greatly. In particular the quality of his work and thinking, and his dedication to tidying up hairy technologies like JavaScript and Mozilla APIs into developer-accessible frameworks (jQuery, FUEL).
So I was a bit disappointed, and frankly weary, to pick up on the continuation of the bogus microformats vs RDF holy war in his post. I wrote the substance of this post in the comments on his blog, but will repost here for completeness.
The BBC criticism of microformats' use of the abbr tag is a valid one. The microformats community doesn't need to "step up and prevent attrition" as Resig writes, as if the enemy were advancing over the front; it needs to fix a bug.
Resig reads the RDFa primer and comments that it is
"... obvious that RDFa still has a long ways to go before any sort of practical adoption by developers and designers. Riddled with advanced, or just plain confusing, terminology (XML namespaces, Dublin Core, semantic web, and not to mention the addition of many new attributes - like typeof, about, and property) it appears to be solidly entrenched in the ways that Microformats were able to shake themselves free of, allowing them to achieve widespread adoption."
Resig moves too quickly to dismiss RDFa. In a similar way I know many people who on encountering the HTML5 specs strongly espoused by Mozilla have the same impression of confusion and complexity. It doesn't necessarily make the work less valid, it's just a reflection on the document.
One of the wonderful things Resig has done with JavaScript is take time to love it and figure out its corners. Take some of the "confusing" and "advanced" things away and you're not able to achieve the same things. What he's done in jQuery is add a layer of elegance, predictability and accessibility.
I for one would love to see what Resig would do with semantic markup. jQuery really encourages and enables good markup practices, so there's a lot of synergy with his current style.
I'll happily concede that RDF people rarely do themselves any favours in the departments of over-engineering or academic self-satisfaction. I also think microformats have natural limitations. There's a place in between, and it's where people like John Resig do their best work.
Tom Morris mentioned the gitjour project on the semantic web IRC channel today and it set me thinking.
Gitjour enables collaboration on a local network by tying together Bonjour (aka Zeroconf) and the git distributed source control system. It lets a developer publish the source direct from their own machine, without having to set up a public mirror. The advantages are great for camp-style hackathons.
However, I rarely get to such hurrahs and if I do, prefer to spend time talking to others and mining their brains. What I do have though is an extended, continuous "camp" that exists among a subset of my IM contacts, Twitter friends, and so on.
What I want is a version of Bonjour that works over a virtual network established from an ad-hoc list of friends and groups selectable from a social networking tool.
I know, this is starting to sound like Groove or any other number of peer-based collaborative tools. The point is I don't want to join any walled garden and get "monetized", I want to use existing Bonjour-aware tools, just among an ad-hoc group of people.
The hard bit as ever is firewall traversal, but this has been solved more than a few times now. It seems we've got the tools, we just need some enterprising developer to glue it all together.
Ever the sucker for punishment, I decided to pick three difficult things and stick them all together: LDAP, SSL and replication. Here's how to make it go on Debian and Ubuntu.
You want LDAP replication to happen over the internet, and you want it to happen securely.
I'm not going to tell you how to set up your LDAP from scratch here: I'm assuming you've reached a solution you're happy with and want to replicate it.
We're going to set up a replicating slave LDAP server, which communicates with the master over the internet via an SSL-protected connection.
First up, the master LDAP server needs to be configured to permit replication.
The key lines to add to your slapd.conf include:
moduleload syncprov
index entryCSN,entryUUID eq
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 200
These load up the synchronization module, add indices which make sync go faster, and enable sync. For more detail see the OpenLDAP site.
Next you need to add a replicator user to your LDAP database, and give that user access to passwords as well as general read access. To create the replicator user, I made this simple LDIF file and fed it to ldapadd.
dn: cn=replicator,dc=mydomain,dc=com
objectClass: simpleSecurityObject
objectClass: organizationalRole
cn: replicator
description: LDAP replicator
userPassword: TOPSEKRIT
Once this user is in your LDAP database, you should give it read access to passwords (I assume you've already given read access to authenticated users.) I have this in my slapd.conf:
access to attrs=userPassword,sambaNTPassword,sambaLMPassword
  ...
  by dn="cn=replicator,dc=mydomain,dc=com" read
To check that this works, try using ldapsearch to check that the passwords are returned:
ldapsearch -x -D cn=replicator,dc=mydomain,dc=com \
  -W | grep -i password
Enter the replicator password when prompted, and you should see the encrypted passwords from your LDAP database.
Now you've got replication enabled on the master, you will want to ensure it is available on the internet only via TLS or SSL. Here's what I added to slapd.conf to enable this:
TLSCertificateFile /etc/ssl/certs/ldapserver_crt.pem
TLSCertificateKeyFile /etc/ssl/private/ldapserver_key.pem
TLSCACertificateFile /etc/ssl/certs/myCA.pem
TLSVerifyClient demand
As you will guess from the configuration, the first two lines set the SSL certificate and key the master uses (see "A little twist" below for an important note on key permissions.) The third line tells slapd where to find my site-local certificate authority (CA), and the fourth line says slapd must require any connecting client to have a valid SSL certificate signed by the site-local CA. This is important, as it provides a second layer of access control: a replicating client must connect using a certificate you signed, plus the replicator password.
Before this enables TLS access, we must tell slapd which network interfaces to listen on. To do this, edit the SLAPD_SERVICES variable in /etc/default/slapd. Here's my configuration:
SLAPD_SERVICES="ldap://127.0.0.1/ ldap://192.168.0.1/ ldaps:///"
This enables regular LDAP on the loopback and intranet network interfaces, and LDAP/SSL on all interfaces, including the public internet.
So, with slapd restarted we are at this situation: connections are now possible from the internet, as long as they are made over SSL with a certificate signed by our site-local CA.
(In fact, you can make much finer-grained access restrictions in your configuration than I have done. Using LDAPS rather than TLS over regular LDAP is a rather broad precaution. As explained on the OpenLDAP site, the ssf= parameter can be used to require a certain level of secure connectivity on a per-user or client basis.)
Your slave server should have the same configuration as the master, except you can leave out the bits enabling replication.
Firstly, you'll need to add the replication configuration to slapd.conf (the lines after the first are indented because slapd.conf treats a line starting with whitespace as a continuation of the previous one):
syncrepl rid=123
  provider=ldaps://ldapmaster.mydomain.com/
  type=refreshAndPersist
  searchbase="dc=mydomain,dc=com"
  filter="(objectClass=*)"
  scope=sub
  attrs="*"
  schemachecking=off
  bindmethod=simple
  binddn="cn=replicator,dc=mydomain,dc=com"
  credentials=TOPSEKRIT
Most of this I took as boilerplate from the OpenLDAP documentation. Items to note include:
And here's the /etc/default/slapd configuration:
SLAPD_SERVICES="ldap://127.0.0.1/"
The slave slapd exists only in this case to serve the local machine.
Finally, there's the tricky bit! You need to configure slapd to connect to the master server using a certificate. I'll assume you've created and signed a key and certificate pair for your slave server (see my post Low-tech SSL certificate maintenance for more on this.)
Awkwardly, the TLS configuration in slapd.conf is for the server only. Replication works as a client, and thus needs separate configuration. Furthermore, you cannot configure this globally on your machine, as the SSL certificate is a per-user only parameter (see man ldap.conf for more information on this.)
Instead, we must set it in slapd's environment. Add these two lines to the end of /etc/default/slapd:
export LDAPTLS_CERT=/etc/ssl/certs/slapd.crt
export LDAPTLS_KEY=/etc/ssl/private/slapd.key
This file is sourced as a shell script by slapd's init script. Amend the path to your certificate and keys as required. Use /etc/init.d/slapd restart and you should be good to go.
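One way to check that replication is really happening is to compare the contextCSN value each server reports for the suffix: once the slave has caught up, the two should match. The Python sketch below shells out to ldapsearch to do this. Treat it as an illustration under assumptions rather than a finished tool: the function names are invented for this example, it binds anonymously (your ACLs may well require the replicator credentials instead), and any client certificate needed to reach the master is passed in through the same LDAPTLS_* environment variables described above.

```python
import os
import subprocess


def ldapsearch_command(uri, base):
    """Build an ldapsearch call that reads contextCSN from the suffix entry only."""
    return ["ldapsearch", "-LLL", "-x", "-H", uri,
            "-s", "base", "-b", base, "contextCSN"]


def parse_context_csn(ldif_text):
    """Pull every contextCSN value out of ldapsearch's LDIF output."""
    values = set()
    for line in ldif_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "contextcsn":
            values.add(value.strip())
    return values


def fetch_context_csn(uri, base, tls_env=None):
    """Run ldapsearch against one server and return its contextCSN values.

    tls_env is an optional dict of extra environment variables, for
    example LDAPTLS_CERT and LDAPTLS_KEY when talking to the master.
    """
    env = dict(os.environ)
    env.update(tls_env or {})
    result = subprocess.run(ldapsearch_command(uri, base), env=env,
                            capture_output=True, text=True, check=True)
    return parse_context_csn(result.stdout)
```

Calling fetch_context_csn once for ldaps://ldapmaster.mydomain.com/ and once for ldap://127.0.0.1/ on the slave, both with the base dc=mydomain,dc=com, and comparing the two sets gives a rough "in sync or not" answer.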
Finally, we want the slave server to be certain it's talking to the real master. So we also configure client connections to verify the SSL certificate of the peer, in ldap.conf again:
TLS_CACERT /etc/ssl/certs/myCA.crt
TLS_REQCERT demand
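If you want to see the same peer verification at work outside slapd, a small Python sketch using only the standard library can attempt a TLS handshake with the master while trusting nothing but the site-local CA. The function names here are made up for illustration, and the sketch assumes the master's certificate names the host you connect to. It also only proves that the server's certificate verifies: whether the master accepts your client certificate may not show up until data is exchanged, so do not read a successful handshake as proof of that.

```python
import socket
import ssl


def strict_client_context(ca_file=None, cert_file=None, key_file=None):
    """Client-side TLS context that insists on a verified server certificate."""
    # PROTOCOL_TLS_CLIENT switches on certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        # Trust only this CA file; with no ca_file nothing is trusted at all.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # Present our own certificate, as TLSVerifyClient demand expects.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def master_presents_trusted_cert(host, ctx, port=636, timeout=10):
    """Return True if the handshake completes, False on a TLS-level failure.

    Plain network errors (refused connection, timeout) are left to raise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLError:
        return False
```

With the paths from this article, the context would be built from /etc/ssl/certs/myCA.crt plus the slave's certificate and key, and the check run against ldapmaster.mydomain.com.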
One gotcha to notice with both client and server is that slapd runs as the openldap user by default on Debian. Also by default SSL keys are readable only by the ssl-cert group. You'll need to add the openldap user to this group, otherwise it won't be able to access /etc/ssl/private.
I maintain a bunch of SSL certificates, mostly signed by my own site authority. Too many not to automate, but not enough to warrant heavy machinery. Here's how I do it.
Each certificate needs a config to describe what's in it. I create each of these and name it with a .cnf suffix. Here's an example:
[ req ]
prompt = no
distinguished_name = server_distinguished_name

[ server_distinguished_name ]
commonName = server.usefulinc.com
stateOrProvinceName = England
countryName = GB
emailAddress = edd@usefulinc.com
organizationName = Useful Information Company
organizationalUnitName = Hosting

[ req_extensions ]
subjectAltName=edd@usefulinc.com
issuerAltName=issuer:copy
nsCertType = server

[ x509_extensions ]
subjectAltName=edd@usefulinc.com
issuerAltName=issuer:copy
nsCertType = server
Let's say this config is server.cnf. I then just type make server.pem to generate the corresponding certificate and key, signed by my local certificate authority. As I don't want to attend the startup of every service, I ensure the key is password-less.
Here are the makefile steps I use to generate and sign keys.
.SUFFIXES: .pem .cnf
.cnf.pem:
	OPENSSL_CONF=$< openssl req -newkey rsa:1024 -keyout tempkey.pem -keyform PEM -out tempreq.pem -outform PEM
	openssl rsa <tempkey.pem > `basename $< .cnf`_key.pem
	chmod 400 `basename $< .cnf`_key.pem
	OPENSSL_CONF=./usefulCA/openssl.cnf openssl ca -in tempreq.pem -out `basename $< .cnf`_crt.pem
	rm -f tempkey.pem tempreq.pem
	cat `basename $< .cnf`_key.pem `basename $< .cnf`_crt.pem > $@
	chmod 400 $@
	ln -sf $@ `openssl x509 -noout -hash < $@`.0
The resultant files are:
Some notes on these steps: my site-local certificate authority is in the directory usefulCA, along with an OpenSSL config which describes my preferences. This config was created by copying and making appropriate adjustments to the default /etc/ssl/openssl.cnf which ships with Debian.
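To make the naming convention in that rule concrete, here is a small Python sketch that works out which files a single make run should leave behind for a given config. It is derived purely from reading the makefile above; the function name and the dictionary keys are my own labels, not anything OpenSSL or make defines. The rule also creates a symlink named after the certificate's hash with a .0 suffix, which is left out here because the name depends on the certificate contents.

```python
from pathlib import Path


def expected_outputs(cnf_path):
    """Name the files that `make <name>.pem` should produce for <name>.cnf."""
    # Path.stem plays the same role as `basename $< .cnf` in the rule.
    stem = Path(cnf_path).stem
    return {
        "key": f"{stem}_key.pem",          # passphrase-less private key, mode 400
        "certificate": f"{stem}_crt.pem",  # certificate signed by the local CA
        "bundle": f"{stem}.pem",           # key followed by certificate, mode 400
    }
```

For server.cnf this gives server_key.pem, server_crt.pem and server.pem, which is handy as a quick checklist when a run fails halfway through.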
For generating certificate signing requests to ship to a commercial certificate authority, it's a bit simpler. I save the config files with a .reqcnf suffix instead, and use this rule:
.SUFFIXES: .pem .cnf .reqcnf .csr
.reqcnf.csr:
	OPENSSL_CONF=$< openssl req -newkey rsa:1024 -keyout `basename $< .reqcnf`.key -keyform PEM -out `basename $< .reqcnf`.csr -outform PEM
And finally, a rule I use to sign incoming certificate requests from other systems:
.csr.pem:
	OPENSSL_CONF=./usefulCA/openssl.cnf openssl ca -in $< -out `basename $< .csr`_crt.pem
I offer these without warranty in the hope they might be useful to somebody. They're not much more than a transcription of a how-to into a makefile, but it's just enough technology to ensure creating certificates isn't a big nuisance.
Why do I bother with a site-local CA, rather than just self-sign? It lets me bypass the annoyance of SSL warnings on clients once I've installed my own CA certificate, and gives me a coarse grained level of access control: for instance, only clients with certificates signed by my CA are allowed to access the site's LDAP server.
My personal next step with this is to integrate the certificate production process with my emerging Puppet recipes for managing local infrastructure.
I'd not seen this definition of the semantic web before, but I like it.
The Semantic Web is an attempt, largely, to map large quantities of existing data onto a common language so that the data can be analyzed in ways never dreamed of by its creators
— Tim Berners-Lee, Principles of Design
Now why can't we all get along?
Ten years ago, most of us wouldn't have dreamt we'd be managing terabytes of storage, tens of megabits of bandwidth, and arrays of network-distributed services. The height of a programmer's worry would likely be choice of UI toolkit or finding the right way to indent code, and the height of consumer concern deciding which room to put the new computer in.
Now the problems associated with managing large networks are becoming real for everyone, right down to the consumer level. Stupendously large amounts of computing resource are available at an instant.
Your household probably has more than a terabyte of storage already. Issues such as single sign-on are going to hit home over the next year, as networked computing and entertainment devices proliferate. Features such as Apple's Time Machine, software that makes traditionally gnarly sysadmin tasks consumer-friendly, will be increasingly vital. The rebranding of .Mac into "Mobile Me" is also a step in this direction.
As software developers, we also have to cope with the effects of this resource-richness. For small sums of money we can get access to large computing clusters and geographically redundant hosting services. Our programs have left the desktop and found their new home on the web. System administration issues loom large upon us, and security concerns lurk ominously in the corners of our minds.
Although the cost of infrastructure has dropped radically, other costs remain high and are going to stay that way. System administrators are not only grumpy, they demand high wages. Commercial software license fees spiral out of control: traditional per-CPU licensing models make little sense when you can quickly bring up tens of machines. The cost in power is already troubling large companies, and there's no reason to suspect the problems won't ripple down.
Help is at hand from a variety of technologies. If they don't yet make massive resource management trivial, they at least make it possible. Some of these also inhabit the weird territory of being both the source of a problem and a solution at the same time: virtualization, for example.
Distributed revision control is a technology whose time has finally come in popular circles, thanks in part to Linus Torvalds's Git system. DRCS has several important impacts on today's developer:
All these trends lower the barrier to entry and increase collaboration and agility of development. You can see the value of this as more software tools become free. Selling such tools is rapidly becoming a thing of the past: the advantages of sharing enable the developers at the sharp end to get their jobs done quicker.
However, such increased agility and, well, messiness leave other problems to solve, which the next two technologies address.
Hardware-as-a-service, infrastructure-as-a-service, call it what you will. The ability to create what we used to call entire machines, pick them up and move them around the network is revolutionary, and it's something that will have a real impact on regular developers. The benefits are at several levels.
Computing is a zero-sum game, and despite our increased ability to create and distribute software, problems still exist. We just pushed them to the next level.
In good part, this next level is the problem of configuration management. We now have networks and clusters of (virtual) machines, software so agile we need six decimal places to describe its revision levels, and network and authentication paths that are starting to tangle. How do we manage that?
One thing developers crave is repeatability. That's why we love our makefiles, autoconf, Ant, rake and so on. It's the one time even the most imperative-minded programmer writes declarative code. We like to say "let the world be like this."
Our new sprawling world lacks this feature, and the best of our old toolkits — .debs, RPMs — address things only at the level of packages in a single environment.
So developers must look to the world of operations, a territory we probably thought we needn't enter. In this world the new "make" is called Puppet. You write recipes to describe how things ought to be, and Puppet will make it so.
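The "describe how things ought to be" idea can be illustrated without Puppet at all. The Python sketch below is not Puppet's API or its recipe language, just a hypothetical helper showing the underlying pattern: you state the desired end state, the code checks whether the world already matches, and it only acts when it does not. Running it twice is therefore harmless.

```python
from pathlib import Path


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was
    already in the desired state.
    """
    target = Path(path)
    existing = target.read_text().splitlines() if target.exists() else []
    if line in existing:
        return False  # nothing to do: the world already looks like this
    existing.append(line)
    target.write_text("\n".join(existing) + "\n")
    return True
```

The return value is the useful design choice here: it lets a caller react only when something actually changed, for instance by restarting a service after its configuration file was touched.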
I've been spending some time digging into Puppet, and feel excited by the confidence it's giving me. Now my applications exceed single source trees, and single machines, it gives me the means to tie the whole together. This article was going to be solely about Puppet, but that will have to wait now for another time.
It's likely you'll have played with virtual machines and distributed revision control, but have you tried Puppet yet? Give it a spin, and let your mind wander over the benefits for your organization and development approaches.
For developers and users alike, our world is changing. Hardware, connectivity and, increasingly, software are becoming cheap or free. The solidity of the old things we put value on (real things you can touch, like disks) is eroding.
What really matters is our data, our creations, and their communication. If they don't quite exist in a universal "cloud" yet, they're certainly getting frisky.
As vendors provide solutions for consumers to manage their new domestic infrastructure, developers must look to network-aware toolkits and operations techniques to manage and get the best from their emergent infrastructures.