What would it look like if you asked 600+ science bloggers to list up to three science blogs, other than their own, that they read on a regular basis, and then visually mapped the resulting data? Like this:
Update: You can now play with this data via an interactive Gephi graphic here: bit.ly/MySciBlogREAD
In my #MySciBlog survey of more than 600 science bloggers, I asked participants to list up to three science blogs, other than their own, that they read on a regular basis. With this data, I'm exploring potential communities of practice and relationships between science bloggers that may lead to shared content decision rules or blogging approaches. For example, do communities of bloggers who regularly read each other's blogs begin to share rules of format, topic choice, tone, etc.?
After pulling the data into Excel and rather tediously cleaning it up (looking for blogs listed under alternative or incorrect names, etc.), I mapped the resulting dataset in Gephi, a free, open-source network mapping program. I then laid out the network (survey participants' blog nodes, each connected by up to three edges, or lines, to the target 'regularly read' science blogs) using the ForceAtlas 2 algorithm. The full resolution map is available at Figshare.
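For anyone curious what that cleanup step could look like outside of Excel, here is a minimal Python sketch. It is not what I actually ran: the alias table, the function names and the sample response are all hypothetical, and it simply assumes that an edge list with Source and Target columns is a convenient shape to import into Gephi.

```python
import csv
import io

# Hypothetical alias table: variants seen in responses -> one canonical name.
ALIASES = {
    "ners": "Not Exactly Rocket Science",
    "not exactly rocket science": "Not Exactly Rocket Science",
    "from the lab bench": "From The Lab Bench",
}

def canonical(name):
    """Collapse extra whitespace, ignore case, and look up a canonical name."""
    cleaned = " ".join(name.split())
    return ALIASES.get(cleaned.lower(), cleaned)

def build_edges(responses):
    """responses maps a participant's blog to the blogs that participant listed."""
    edges = []
    for source, targets in responses.items():
        src = canonical(source)
        seen = set()
        for target in targets[:3]:  # the survey allowed at most three
            tgt = canonical(target)
            # Skip blank answers, self-listings and repeated answers.
            if tgt and tgt != src and tgt not in seen:
                seen.add(tgt)
                edges.append((src, tgt))
    return edges

def to_edge_csv(edges):
    """Return the edge list as CSV text with Source/Target headers."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buf.getvalue()

# Made-up response: two spellings of the same blog plus one blank answer.
responses = {"From the Lab Bench": ["NERS", "not exactly  rocket science", ""]}
edges = build_edges(responses)
print(to_edge_csv(edges))
```

In this toy example the two spellings collapse into a single edge from 'From The Lab Bench' to 'Not Exactly Rocket Science', which is exactly the kind of duplicate I was hunting for by hand.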
The ForceAtlas 2 layout is relatively straightforward. It treats each node - each blog in my mapped network - as a charged particle that repels every other particle. Ties between nodes (created each time a survey participant lists a target blog as one he or she regularly reads) act like springs, pulling linked nodes together, so blogs that share no ties drift apart while linked blogs settle near each other. For example, if I listed Ed Yong's 'Not Exactly Rocket Science' as a blog that I regularly read, the 'From The Lab Bench' node would sit relatively close to the 'Not Exactly Rocket Science' node, and a visible line or 'edge' would link the two nodes together.
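To make the charged-particles-and-springs idea concrete, here is a toy force-directed layout in Python. It is only loosely modeled on ForceAtlas 2 and is not Gephi's implementation: it assumes repulsion that weakens with distance and spring attraction that grows with distance, and it leaves out extras such as gravity and degree weighting. The function names, parameter values and the five-node example are all made up for illustration.

```python
import math
import random

def force_layout(nodes, edges, iterations=500, step=0.05, repulsion=1.0, seed=42):
    """Toy force-directed layout: all pairs repel, linked pairs attract."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair of nodes pushes apart, weaker with distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = repulsion / d
                force[u][0] += f * dx / d
                force[u][1] += f * dy / d
                force[v][0] -= f * dx / d
                force[v][1] -= f * dy / d
        # Attraction: each edge pulls its two endpoints together like a spring.
        for u, v in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            force[u][0] += dx
            force[u][1] += dy
            force[v][0] -= dx
            force[v][1] -= dy
        # Move every node a small step along its net force.
        for n in nodes:
            pos[n][0] += step * force[n][0]
            pos[n][1] += step * force[n][1]
    return pos

def distance(pos, u, v):
    return math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])

# Made-up network: a-b-c are linked in a chain, d-e are linked to each other only.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
pos = force_layout(nodes, edges)
print(distance(pos, "a", "b"), distance(pos, "a", "d"))
```

The linked pair a and b should end up much closer together than a and d, which share no path at all. Without any gravity term the two unconnected groups just keep drifting apart, which is one reason real layouts add one.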
"ForceAtlas2 is a force directed layout: it simulates a physical system in order to spatialize a network. Nodes repulse each other like charged particles, while edges attract their nodes, like springs. These forces create a movement that converges to a balanced state. This final configuration is expected to help the interpretation of the data." - Plos One
Each node in the network represents a science blog - either a survey participant's blog or a blog listed by a participant. Each node has at most three outgoing edges, one for each blog its author listed, but it can receive any number of incoming edges. Nodes with no outgoing edges represent either blogs whose authors did not take my survey, or blogs whose authors didn't list any other blogs as ones they read on a regular basis. Nodes and node labels are sized according to in-degree, or how many times the blog (node) was listed by other bloggers as regularly read.
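In-degree and out-degree are easy to compute by hand from an edge list. The short Python sketch below uses made-up blog names; Gephi does this counting for you, so it is here only to show what the numbers mean.

```python
from collections import Counter

# Made-up edge list: (blog whose author answered, blog that author reads).
edges = [
    ("Blog A", "Blog C"),
    ("Blog B", "Blog C"),
    ("Blog B", "Blog A"),
]

def degrees(edges):
    """Return (in_degree, out_degree) dictionaries covering every node."""
    incoming = Counter(target for _, target in edges)
    outgoing = Counter(source for source, _ in edges)
    nodes = {node for edge in edges for node in edge}
    in_degree = {n: incoming.get(n, 0) for n in nodes}
    out_degree = {n: outgoing.get(n, 0) for n in nodes}
    return in_degree, out_degree

in_degree, out_degree = degrees(edges)
for name in sorted(in_degree):
    print(name, "in:", in_degree[name], "out:", out_degree[name])
```

Here 'Blog C' was listed twice, so it would be drawn as the largest node, while its out-degree of zero means its author either didn't take the survey or didn't list any blogs.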
Then comes the fun part - the colors! Communities (represented by color-coded nodes in my network map) were detected automatically through Gephi's modularity class function (resolution = 3.0). Modularity measures the strength of division of a network into clusters, or communities. Networks with high modularity (the maximum possible score is 1) have dense connections or edges between the nodes within communities but sparse connections or edges between nodes in different communities. My network of 'read science blogs by science bloggers' has a modularity score of 0.702, indicating strong community structure. This structure is often visually apparent, as in the climate blogs and 'geo' blogs visible as distinct clusters in purple at the bottom of my network map above. I've isolated this community in the image below.
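If you want a feel for what a modularity score measures, the Python sketch below scores a given split of a tiny made-up network. It assumes the standard modularity formula for an undirected network with the default resolution, so it won't reproduce Gephi's exact number for my map, and it only scores a split that you hand it rather than detecting communities itself.

```python
from collections import defaultdict

def modularity(edges, community):
    """Score a partition: fraction of edges inside communities minus the
    fraction expected if edges were placed at random (undirected)."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both ends in the same community
    degree = defaultdict(int)    # total degree of the nodes in each community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Made-up network: two triangles joined by a single bridge edge (c-d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(edges, two_groups), 3))  # 5/14, about 0.357
print(modularity(edges, one_group))             # 0.0
```

Splitting the toy network into its two triangles scores well above zero, while lumping everything into one community scores exactly zero. The closer the score gets to 1, the more the edges stay inside communities.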
If you took my survey and you can find your blog on this network map, which community or cluster does your blog belong to? Do you think this colors your blogging?