<?xml version='1.0' encoding='UTF-8'?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-14832160</id><updated>2008-09-02T23:29:25.088+01:00</updated><title type='text'>Flags and Lollipops - Bioinformatics Blog</title><subtitle type='html'>Blog of bioinformatics papers, links and stories that I thought were interesting (so your mileage may vary).</subtitle><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default?start-index=26&amp;max-results=25'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default'/><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>168</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-14832160.post-3844133151062123693</id><published>2008-05-24T22:46:00.002+01:00</published><updated>2008-05-24T22:50:30.685+01:00</updated><title type='text'>Disappointed with Popfly</title><content type='html'>&lt;a href='http://www.popfly.com/'&gt;Popfly&lt;/a&gt; is the mashup editor that Microsoft released last year. The idea is good. The 3D graphics are good. 
Silverlight is a bit buggy in Firefox (sidebars don't always redraw properly) but that's OK.&lt;br /&gt;&lt;br /&gt;If you're going to create a web 2.0 mashup builder, though, don't you think it'd be a good idea to &lt;a href='http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1654015&amp;SiteID=1'&gt;provide some Atom support&lt;/a&gt;?</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/05/disappointed-with-popfly.html' title='Disappointed with Popfly'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=3844133151062123693' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3844133151062123693'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3844133151062123693'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2058507409502687178</id><published>2008-05-19T17:08:00.006+01:00</published><updated>2008-05-19T17:15:29.759+01:00</updated><title type='text'>Meta-analysis</title><content type='html'>The journal platform team here at NPG just rolled out machine-readable metadata for the papers we publish in Dublin Core, &lt;a href='http://www.prismstandard.org/'&gt;PRISM&lt;/a&gt; (good PRISM, not to be confused with &lt;a href='http://www.researchinformation.info/news/news_story.php?news_id=120'&gt;evil PRISM&lt;/a&gt;) and Google metadata formats.&lt;br /&gt;&lt;br /&gt;No more scraping to automatically get the citation for a paper, it's all in the HEAD:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_journal_title&amp;quot; content=&amp;quot;Nature&amp;quot; 
/&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_publisher&amp;quot; content=&amp;quot;Nature Publishing Group&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_authors&amp;quot; content=&amp;quot;Paul Schenk, Isamu Matsuyama, Francis Nimmo&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_title&amp;quot; content=&amp;quot;True polar wander on Europa from global-scale small-circle depressions&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_volume&amp;quot; content=&amp;quot;453&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_issue&amp;quot; content=&amp;quot;7193&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_firstpage&amp;quot; content=&amp;quot;368&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_doi&amp;quot; content=&amp;quot;doi:10.1038/nature06911&amp;quot; /&amp;gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Useful for apps like Zotero and Connotea (which before now downloaded two files each time you bookmarked a Nature paper: the page itself and then the linked EndNote file to parse).&lt;br /&gt;&lt;br /&gt;The metadata will be there for all papers going forward and back through some of the archives.&lt;br /&gt;&lt;br /&gt;For fulltext indexing of papers behind the paywall you can use the linked &lt;a href='http://opentextmining.org/wiki/Main_Page'&gt;OTMI&lt;/a&gt; file (I only just saw &lt;a href='http://otmi.twease.org/otmi/app'&gt;Twease&lt;/a&gt;, which does just that) although there's only OTMI for Nature papers at the moment, I think.&lt;br /&gt;&lt;br /&gt;Lastly, at some point in the future we're aiming to put &lt;a href='http://www.adobe.com/products/xmp/'&gt;XMP&lt;/a&gt; metadata in our PDFs, which should make it much easier for scripts and applications (like &lt;a href='http://mekentosj.com/papers/'&gt;Papers&lt;/a&gt;) to look at PDF files on your filesystem and work out what they represent.</content><link 
rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/05/meta-analysis.html' title='Meta-analysis'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=2058507409502687178' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2058507409502687178'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2058507409502687178'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6060331122924735147</id><published>2008-04-18T11:41:00.004+01:00</published><updated>2008-04-18T12:01:26.234+01:00</updated><title type='text'>Nice work Pedro!</title><content type='html'>Noticed while leafing through today's Nature that Pedro has a paper out (&lt;a href='http://www.nature.com/nature/journal/v452/n7189/full/nature06847.html'&gt;Isalan et al.&lt;/a&gt;, Evolvability and hierarchy in rewired bacterial gene networks).&lt;br /&gt;&lt;br /&gt;There's more on this over at &lt;a href='http://pbeltrao.blogspot.com/2008/04/shuffle-project.html'&gt;Public Rambling&lt;/a&gt;.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/nice-work-pedro.html' title='Nice work Pedro!'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6060331122924735147' title='1 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6060331122924735147'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14832160/posts/default/6060331122924735147'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1912616361265309197</id><published>2008-04-17T16:34:00.001+01:00</published><updated>2008-04-17T16:34:43.609+01:00</updated><title type='text'>Ian owes me a pint</title><content type='html'>&lt;i&gt;(update: Gavin Bell at Nature gave up one of his app spots so that I could put this live, which I did: only to discover that Google App Engine is even more unforgiving of timeouts than Facebook. Currently trying to work out how to make the bookmarking process, for now it doesn't work very well. Also the search is broken, though that's Google's fault and not mine.)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I bet &lt;a href="http://network.nature.com/profile/U3DF456C6"&gt;Ian&lt;/a&gt; earlier that I could rewrite Connotea on &lt;a href="http://code.google.com/appengine/"&gt;App Engine&lt;/a&gt; in six hours. I can't remember why. Probably ego (mine, I mean). He didn't actually bet me a pint but he should have done...&lt;br /&gt;&lt;br /&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://www.ghastlyfop.com/blog/uploaded_images/pycite-718416.png" alt="" border="0" /&gt;&lt;br /&gt;&lt;br /&gt;... because the original estimate was a tad optimistic (ahem). After twelve hours I've produced &lt;a href="http://code.google.com/p/pycite/"&gt;pycite&lt;/a&gt;, though, which is pretty good going I think. I'll admit it: Python is actually very cool.&lt;br /&gt;&lt;br /&gt;pycite is three hundred lines of logic and a set of html templates that implements a (very simple) social bookmarking service. 
Sadly I don't actually have an App Engine account so it's not live on the web anywhere (I'll buy whoever &lt;i&gt;does&lt;/i&gt; have an account and puts it up first a pint - let's spread the love), you'll have to download it and run it locally to see it in action.&lt;br /&gt;&lt;br /&gt;What you can do with it:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;run it without owning a server of your own&lt;/li&gt;&lt;br /&gt;&lt;li&gt;log in with your Google account&lt;/li&gt;&lt;br /&gt;&lt;li&gt;add new bookmarks (the citation will be collected automagically)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;view everybody's bookmarks&lt;/li&gt;&lt;br /&gt;&lt;li&gt;filter bookmarks by user:&lt;pre&gt;http://path.to.pycite/users/bob.smith&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;and by tag:&lt;pre&gt;http://path.to.pycite/tags/diabetes&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;and by user and tag:&lt;pre&gt;http://path.to.pycite/users/bob.smith/tags/diabetes&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;and by keyword (the full text of each bookmarked page is searchable):&lt;pre&gt;http://path.to.pycite/users/bob.smith?q=t2d&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;get atom feeds for all of the above&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;What you &lt;b&gt;can't&lt;/b&gt; do with it (yet):&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;edit or delete bookmarks&lt;/li&gt;&lt;br /&gt;&lt;li&gt;anything else&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;I've put it all up on &lt;a href="http://code.google.com/p/pycite/"&gt;Google Code&lt;/a&gt;. It's fairly straightforward stuff so if you've got any brilliant social bookmarking ideas then go for it. 
Send me an email and I'll give you write access to the subversion repository.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/ian-owes-me-pint_17.html' title='Ian owes me a pint'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=1912616361265309197' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1912616361265309197'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1912616361265309197'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2282167223440949006</id><published>2008-04-07T14:26:00.003+01:00</published><updated>2008-04-07T14:39:40.696+01:00</updated><title type='text'>Gaggle</title><content type='html'>&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://www.ghastlyfop.com/blog/uploaded_images/canada-goose-729490.jpg" border="0" alt="" /&gt; I hadn't heard of &lt;a href='http://www.systemsbiology.org/Technology/Data_Management/Gaggle'&gt;Gaggle&lt;/a&gt; before but both &lt;a href='http://mndoci.com/blog/'&gt;Deepak&lt;/a&gt; and Sutee Dee (who needs a homepage.. ;)) from the ISB mentioned it last week so I figured it was worth a look. It's a system built by Paul Shannon at the ISB in Seattle to share data between different bioinformatics applications on the fly. 
It has been around for a while, I think - there was a &lt;a href='http://www.biomedcentral.com/1471-2105/7/176'&gt;BMC Bioinformatics paper&lt;/a&gt; describing the system in March 2006.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;A small server program (the ´Gaggle Boss´) provides communication among analysis and display programs (the ´geese´) which are modest and minimal adaptations of existing (or novel) bioinformatics and computational biology programs, and web resources. The Boss and the geese all run as separate programs on the user´s desktop computer, communicating with each other, at the user´s behest, by passing simple messages.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;(from &lt;a href='http://www.systemsbiology.org/Technology/Data_Management/Gaggle'&gt;the ISB's 'about Gaggle' page&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;I ran through &lt;a href='http://gaggle.systemsbiology.org/docs/2007-04/demo/hpylori/'&gt;a tutorial&lt;/a&gt; showing data sharing between (modified versions of) &lt;a href='http://www.cytoscape.org/'&gt;Cytoscape&lt;/a&gt; (also developed by ISB), R and a data matrix viewer no problem. Quite cool.&lt;br /&gt;&lt;br /&gt;You can't share data from an arbitrary application (I don't think?), they need to be modified to send messages to the Boss goose. Having said that there's a Firefox extension called Firegoose which lets you pass messages to and from web apps, Entrez etc. I couldn't get it working properly but suspect that's something to do with my install rather than the extension itself.&lt;br /&gt;&lt;br /&gt;Anyway, it's good to see stuff like this. Truth be told it's not the slickest thing ever, but it's still pretty cool - and it works. 
I wonder if you could turn it into a simple lab notebook - could you write a brief description of what you're going to try and do for the Boss app every time you send data to another app or something?</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/gaggle.html' title='Gaggle'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=2282167223440949006' title='3 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2282167223440949006'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2282167223440949006'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6439854942449577655</id><published>2008-04-04T00:22:00.006+01:00</published><updated>2008-04-04T06:49:54.883+01:00</updated><title type='text'>Why you should try online dating</title><content type='html'>(you can jump to the short answer &lt;a href='#whyonline'&gt;here&lt;/a&gt;, if you're feeling impatient)&lt;br /&gt;&lt;br /&gt;Onto the psychology &lt;i&gt;of&lt;/i&gt; social media. &lt;a href='http://students.washington.edu/stech/'&gt;Kristin Stecher&lt;/a&gt; of the University of Washington and Dave Evans of Psychster LLC both gave interesting talks about profile pages.&lt;br /&gt;&lt;br /&gt;&lt;a href='http://www.psychster.com/'&gt;Psychster&lt;/a&gt; is a consulting company dedicated to "the social science of social networking". Recently they've been looking at interpersonal perception (how does person A perceive person B? How close is that to B's self perception?). Most research into this uses 'fake' people - i.e. 
A is given a detailed written description of B and works off of that, rather than meeting anybody face to face.&lt;br /&gt;&lt;br /&gt;To try and get a large 'real people' dataset Psychster created a Facebook application (and later &lt;a href='http://youjustgetme.com/'&gt;a website&lt;/a&gt;) where users could fill out a questionnaire that rated their personality on a variant of the &lt;a href='http://en.wikipedia.org/wiki/Big_Five_personality_traits'&gt;big five&lt;/a&gt; personality inventory (the big five being openness, conscientiousness, extraversion, agreeableness, and neuroticism). They then had the option of rating the personalities of other people (not just their friends), the idea being to collect how users saw themselves, how others saw them and the correlation between the two.&lt;br /&gt;&lt;br /&gt;On the standalone website users created profiles to reflect their personalities. Profiles could contain any number of elements (name, location, gender, favourite movie, most embarrassing moment...) 
chosen from a large list.&lt;br /&gt;&lt;br /&gt;The results in general:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; people do 'get' each other (where to 'get' a person means to guess a personality close to their actual, self-rated personality).&lt;br /&gt;&lt;li&gt; people on Facebook get each other better (this kind of figures - you'd want to go rate your real-life friends).&lt;br /&gt;&lt;li&gt; women are better guessers than men - but only when guessing random strangers.&lt;br /&gt;&lt;li&gt; women are easier to get.&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Psychster looked at different profile elements on the standalone website to see if the presence of any in particular was correlated with higher rates of accuracy.&lt;br /&gt;&lt;br /&gt;Profile elements that make somebody easier to get:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; A link to a funny video (the number one predictor of personality)&lt;br /&gt;&lt;li&gt; What makes me glad to be alive?&lt;br /&gt;&lt;li&gt; Most embarrassing thing I ever did:&lt;br /&gt;&lt;li&gt; Proudest thing I ever did:&lt;br /&gt;&lt;li&gt; My spirituality:&lt;br /&gt;&lt;li&gt; A great person:&lt;br /&gt;&lt;li&gt; I believe this:&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Profile elements that make you harder to get:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Profile picture (but only if it is of a non-person)&lt;br /&gt;&lt;li&gt; An awful website:&lt;br /&gt;&lt;li&gt; An awful person:&lt;br /&gt;&lt;li&gt; A great book:&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;That last one (naming a great book making it harder to guess your personality) is pretty interesting. Dave did say that he hadn't yet done any proper analysis of why it might be. I wonder if there's any research into how much (or little) reading habits have to do with your personality? 
&lt;a href='http://www.intergalacticmedicineshow.com/cgi-bin/mag.cgi?article=012&amp;do=columns&amp;vol=carol_pinchefsky'&gt;Here's a tangent&lt;/a&gt; (why do some people get interested in science fiction?) if you're interested. &lt;a href='http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WM0-4H3Y9GN-3&amp;_user=10&amp;_rdoc=1&amp;_fmt=&amp;_orig=search&amp;_sort=d&amp;view=c&amp;_acct=C000050221&amp;_version=1&amp;_urlVersion=0&amp;_userid=10&amp;md5=603356cc387f95194a5f4aac8a7fe31c'&gt;Here's another&lt;/a&gt; (people who read lots of fiction aren't socially awkward, in fact the tendency to get absorbed in a story correlates with empathy scores).&lt;br /&gt;&lt;br /&gt;OK, anyway...&lt;br /&gt;&lt;br /&gt;Why were women easier to read? Because they tended to fill out the profile elements that were good predictors ("my most embarrassing moment").&lt;br /&gt;&lt;br /&gt;At this point you might be wondering (well, I wondered) who cares how well an online profile reflects your true personality. One answer is the online dating industry, which has a vested interest in not setting you up with anybody plainly unsuitable. If profiles were set up the right way then maybe you could tell in advance if the guy or girl messaging you is worth seeing in the real world.&lt;br /&gt;&lt;br /&gt;Sticking with the online dating theme, &lt;a name='whyonline'&gt;&amp;nbsp;&lt;/a&gt;it turns out that the levels of agreement (between actual and guessed personalities) you get by looking at Facebook profiles approach those you see in long-term acquaintances. They're certainly better than what you get after a short face-to-face meeting (like a date). In fact, short f2f meetings are particularly bad at helping you gauge levels of agreeableness and neuroticism - not good. 
I think this means that stalking potential partners online actually makes good, practical sense and should be encouraged.&lt;br /&gt;&lt;br /&gt;In case you needed any reassurance.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/why-you-should-try-online-dating.html' title='Why you should try online dating'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6439854942449577655' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6439854942449577655'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6439854942449577655'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6361116208557056094</id><published>2008-04-03T19:32:00.004+01:00</published><updated>2008-04-04T00:13:33.778+01:00</updated><title type='text'>Do you use language differently when you're depressed?</title><content type='html'>Can you tell if somebody is clinically depressed by analyzing their use of language? I'm not a psychologist, so take the background info below with a pinch of salt but the topic came up at ICWSM (more on how later) and I thought it was fascinating.&lt;br /&gt;&lt;br /&gt;In 2001 &lt;a href='http://www.psychosomaticmedicine.org/cgi/content/full/63/4/517'&gt;Stirman et al&lt;/a&gt; compared the collected works of nine poets who eventually committed suicide and nine poets who didn't (as a control set). 
Their theory was that the depressed (and eventually suicidal) poets would use more first-person singular pronouns (&lt;i&gt;I, me, my&lt;/i&gt;) and words related to hopelessness and desperation (&lt;i&gt;hate, worthless, death, grave&lt;/i&gt;) and that was supported by the data.&lt;br /&gt;&lt;br /&gt;&lt;a href='http://www.ingentaconnect.com/content/psych/pcem/2004/00000018/00000008/art00006'&gt;Rude et al&lt;/a&gt; later found something similar when they compared essays (on a common topic - "coming to college") written by college students. Depressed students used "I" and negative words significantly more often than controls.&lt;br /&gt;&lt;br /&gt;Interestingly &lt;a href='http://ajp.psychiatryonline.org/cgi/content/abstract/145/4/464?ijkey=e783c5358ba68086685be026e20d59dfa026b950&amp;keytype2=tf_ipsecsha'&gt;Oxman et al&lt;/a&gt; found that spoken language patterns can be a good discriminator for classifying patients as depressed or not, so it's not just written language use that may be different.&lt;br /&gt;&lt;br /&gt;Anyway, at ICWSM &lt;a href='http://homepage.psy.utexas.edu/HomePage/Faculty/Gosling/People.htm#Nairan%20Ramirez'&gt;Nairán Ramírez-Esparza&lt;/a&gt; from the University of Texas presented a language analysis of some depression discussion boards on About.com. She ran a two-part study: the first to confirm Stirman and Rude's findings and the second making use of the fact that the About.com boards are bilingual (there's a Spanish section too) to see how different cultures talk about depression.&lt;br /&gt;&lt;br /&gt;Her approach was pretty simple - she collected ~ 400 posts from the depression forum and 400 posts from a breast cancer forum as a control, broke each post down into single words and then used off-the-shelf software to classify them (as verb, adjective, pronoun, positive emotion, negative emotion, etc.). 
She did this for both English and Spanish sections of the site.&lt;br /&gt;&lt;br /&gt;Her results seemed to confirm the earlier studies: first person pronouns were found three times more frequently in the depression forum posts than in the controls and words relating to negative emotions occurred four times as frequently. This was true for both English and Spanish datasets.&lt;br /&gt;&lt;br /&gt;The second part of her study was to see if English and Spanish speakers approach depression differently; what do they talk about? She studied this by using normalized word frequency counts then grouping different words into themes.&lt;br /&gt;&lt;br /&gt;The top five themes discussed in the English dataset:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Treatment (medicine, doctor, therapist...)&lt;br /&gt;Disclosure (tell, discuss, talk...)&lt;br /&gt;Family (mom, dad, brother, sister...)&lt;br /&gt;Symptoms ...&lt;br /&gt;School &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And the top five themes from the Spanish dataset:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Family&lt;br /&gt;Relationship history  &lt;br /&gt;Hopelessness&lt;br /&gt;School&lt;br /&gt;Treatment&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I'm a bit suspicious of results that are so intuitively appealing (family and romance are more important to Spanish people?). One thing that I did wonder was how much the results are skewed by different community expectations: if you visit a discussion forum where people are sharing stories about their depression and everybody else mentions their family maybe you feel compelled to mention your family too. Maybe the English language forums are dominated by a younger age group and so older visitors shy away, or v.v.&lt;br /&gt;&lt;br /&gt;Anyway, it was interesting stuff. 
Somebody in the audience wondered aloud if this means that you could build a system to identify people at risk of depression (or perhaps more to the point suicide) by analyzing their language online. Maybe this could be built into the next version of the anti-plagiarism software used in high schools and colleges (I'm not advocating that, just saying)...</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/do-you-use-language-differently-when.html' title='Do you use language differently when you&apos;re depressed?'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6361116208557056094' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6361116208557056094'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6361116208557056094'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-4920127895857641509</id><published>2008-04-02T02:26:00.011+01:00</published><updated>2008-04-02T02:56:11.267+01:00</updated><title type='text'>Analyzing MySpace profiles</title><content type='html'>&lt;p&gt;This morning &lt;a href="http://faculty.cs.tamu.edu/caverlee/index.html"&gt;James Caverlee&lt;/a&gt; presented his study of almost two million (well, two sets of ~ one million - one set of profiles picked at random and one gathered by traversing the social graph) MySpace profiles. It was interesting stuff. 
Some bits and pieces below.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;MySpace users live up to gender stereotypes, rather disappointingly:&lt;br /&gt;&lt;br /&gt;&lt;style type="text/css"&gt;.nobrtable br { display: none }&lt;/style&gt;&lt;br /&gt;&lt;div class="nobrtable"&gt;&lt;br /&gt;&lt;b&gt;Words most frequently appearing in MySpace profiles&lt;/b&gt;&lt;br/&gt;&lt;br /&gt;&lt;table style="padding: 0px; margin-top: 5px;" border="1" cellpadding="4" cellspacing="0" width="600"&gt;&lt;br /&gt;&lt;tbody&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;Women&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;Men&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;love, people, dancing, life, shopping, can, girl, family, hearts, being, have, notebook, are, dance, favourite, things&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;dating, sport, networking, metal, serious, football, relationship, sh*t, single, wars,&lt;br /&gt;straight, band, video, f*ck, guitar, gay&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;And geographic ones (didn't manage to write all of these down in time):&lt;br /&gt;&lt;br /&gt;&lt;div class="nobrtable"&gt;&lt;br /&gt;&lt;table border="1" cellpadding="4" cellspacing="0" width="600"&gt;&lt;br /&gt;&lt;tbody&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;users in Oregon&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;users in Alabama&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;camping, hiking, pixies, snowboarding, wine, vegans&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;football, jesus, gospel, nascar&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Demographics wise ~ 50% of the profiles that they picked at random had one or no friends (i.e. 
weren't active). Age wise the peak is at 24, with smaller peaks at 69 and 100. The 69 peak is a secret MySpace code, apparently - it means that you're interested in, uh, one-handed typing (this wasn't made clear, but I'm guessing). By having a common age - 69 - you can use MySpace's advanced search to find others looking for the same thing. 69-year-olds on MySpace are most similar (in their use of language) to people in their mid thirties.&lt;br /&gt;&lt;br /&gt;Younger users are overwhelmingly female. There is a 2:1 ratio of girls to boys at age 14. This difference decreases as age increases. The flip-over point is at 20 - after that you start seeing more men than women.&lt;br /&gt;&lt;br /&gt;About 20% of the profiles in the connected dataset were marked as 'private'. Over time this percentage is rising. Having privacy preferences set is negatively correlated with age.&lt;br /&gt;&lt;br /&gt;He had a fantastic slide showing top terms wrt age... will post it and a link to the slideshow when it's online.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/analyzing-myspace-profiles.html' title='Analyzing MySpace profiles'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=4920127895857641509' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4920127895857641509'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14832160/posts/default/4920127895857641509'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-384010838642059061</id><published>2008-04-01T06:43:00.003+01:00</published><updated>2008-04-01T07:00:24.343+01:00</updated><title type='text'>Tossed Salad and Scrambled Eggs</title><content type='html'>I'm in Seattle for the ICWSM. The first day just finished and I'm going to blog about the more interesting talks tomorrow when I'm more awake. In the meantime:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Crowdvine for conferences is actually pretty useful&lt;br /&gt;&lt;li&gt; &lt;a href='http://www.imdb.com/title/tt0891527/'&gt;Lions for Lambs&lt;/a&gt; is terrible&lt;br /&gt;&lt;li&gt; &lt;a href='http://images.google.co.uk/images?hl=en&amp;client=firefox-a&amp;channel=s&amp;rls=org.mozilla:en-US:official&amp;hs=VtY&amp;resnum=0&amp;q=st+trinians&amp;um=1&amp;ie=UTF-8&amp;sa=N&amp;tab=wi'&gt;St Trinians&lt;/a&gt; is actually quite good&lt;br /&gt;&lt;li&gt; &lt;a href='http://en.wikipedia.org/wiki/The_Century_of_the_Self'&gt;The Century of the Self&lt;/a&gt; is brilliant&lt;br /&gt;&lt;li&gt; Seattle looks really nice from the air&lt;br /&gt;&lt;li&gt; Note to self: US Milky Way bars = UK Mars bars, tricksy bastards&lt;br /&gt;&lt;li&gt; Steak + beer + bay views = awesome (thanks Deepak!)&lt;br /&gt;&lt;li&gt; More Starbuckses than normal&lt;br /&gt;&lt;li&gt; Everybody is disconcertingly friendly. People keep offering to take me skiing. And to see waterfalls. People here big on waterfalls&lt;br /&gt;&lt;li&gt; &lt;a href='http://labs.live.com/'&gt;MS Live Labs&lt;/a&gt; are hiring&lt;br /&gt;&lt;li&gt; &lt;a href='http://en.wikipedia.org/wiki/Brad_Fitzpatrick'&gt;Brad Fitzpatrick&lt;/a&gt; is a great speaker but I found his talk disappointing - too much hand waving about OpenID / OAuth / XMPP / XRDS. 
Dude, it's a room full of social network developers, you're preaching to the converted&lt;br /&gt;&lt;li&gt; Sadly &lt;a href='http://www.hpl.hp.com/research/idl/people/huberman/'&gt;Bernardo Huberman&lt;/a&gt; has cancelled. Marc "most unGoogleable name ever" Smith is talking instead. Marc is either the founder of Poetry Slam (cool), a Happy Hardcore DJ (not cool) or a senior research sociologist at Microsoft Research (as yet undecided)&lt;br /&gt;&lt;/ul&gt;</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/tossed-salad-and-scrambled-eggs.html' title='Tossed Salad and Scrambled Eggs'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=384010838642059061' title='3 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/384010838642059061'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/384010838642059061'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1407953762224469141</id><published>2008-03-19T11:31:00.008Z</published><updated>2008-03-19T12:44:49.476Z</updated><title type='text'>Dawkins officially bigger than Jesus - datamining Scienceblogs.com</title><content type='html'>I've run all of the posts from Scienceblogs.com in 2007 through the &lt;a href='http://www.programmableweb.com/api/clearforest-semantic-web-services1/mashups'&gt;ClearForest API&lt;/a&gt;. ClearForest extracts entities - people, places, organizations - from plain text.&lt;br /&gt;&lt;br /&gt;I'm in the process of pulling things together for a visualization, but here's a quick answer to the 'who are Sciblings talking about?' question. 
The 'count' is the number of times that each entity was seen (could be multiple times in the same post) across 2007.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+-----------------------------------------------+-------+&lt;br /&gt;| term                                          | count |&lt;br /&gt;+-----------------------------------------------+-------+&lt;br /&gt;| Michael Egnor                                 |  1855 | &lt;br /&gt;| Richard Dawkins                               |  1737 | &lt;br /&gt;| Bush                                          |  1669 | &lt;br /&gt;| Congress                                      |  1430 | &lt;br /&gt;| Charles Darwin                                |  1226 | &lt;br /&gt;| Michael Behe                                  |  1031 | &lt;br /&gt;| Chris Mooney                                  |   927 | &lt;br /&gt;| FDA                                           |   920 | &lt;br /&gt;| DCA                                           |   765 | &lt;br /&gt;| National Aeronautics and Space Administration |   745 | &lt;br /&gt;| National Institute of Health                  |   741 | &lt;br /&gt;| Bush administration                           |   721 | &lt;br /&gt;| Google                                        |   700 | &lt;br /&gt;| Guillermo Gonzalez                            |   691 | &lt;br /&gt;| White House                                   |   658 | &lt;br /&gt;| Supreme Court                                 |   655 | &lt;br /&gt;| Thomas Jefferson                              |   632 | &lt;br /&gt;| John Edwards                                  |   614 | &lt;br /&gt;| Casey Luskin                                  |   605 | &lt;br /&gt;| George W. 
Bush                                |   603 | &lt;br /&gt;| Jesus Christ                                  |   601 | &lt;br /&gt;| Discovery Institute                           |   596 | &lt;br /&gt;| the New York Times                            |   587 | &lt;br /&gt;| Larry Moran                                   |   576 | &lt;br /&gt;| World Health Organization                     |   543 | &lt;br /&gt;| Hillary Clinton                               |   517 | &lt;br /&gt;+-----------------------------------------------+-------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Bear in mind that ClearForest extracts &lt;i&gt;entities&lt;/i&gt;, not key terms. It can't tell us how often blog posts are talking about mammoth DNA, supernovae or dicyemid mesozoa. That's a different dataset entirely...&lt;br /&gt;&lt;br /&gt;.... this one, in fact, generated using the Yahoo! term extraction API which pulls out important concepts (terms) from text. The dataset is about half the size of the above as I'm only including ScienceBlogs indexed in &lt;a href='http://www.postgenomic.com'&gt;Postgenomic&lt;/a&gt;. 
Here 'count' is the number of distinct posts containing a term:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+---------------------+-------+&lt;br /&gt;| term                | count |&lt;br /&gt;+---------------------+-------+&lt;br /&gt;| evolution           |   963 | &lt;br /&gt;| carnival            |   923 | &lt;br /&gt;| global warming      |   640 | &lt;br /&gt;| intelligent design  |   543 | &lt;br /&gt;| new york times      |   542 | &lt;br /&gt;| blogosphere         |   468 | &lt;br /&gt;| religion            |   460 | &lt;br /&gt;| brain               |   437 | &lt;br /&gt;| climate change      |   432 | &lt;br /&gt;| creationist         |   420 | &lt;br /&gt;| birds               |   415 | &lt;br /&gt;| creationism         |   409 | &lt;br /&gt;| creationists        |   398 | &lt;br /&gt;| pz                  |   378 | &lt;br /&gt;| darwin              |   367 | &lt;br /&gt;| discovery institute |   354 | &lt;br /&gt;| atheists            |   351 | &lt;br /&gt;| atheist             |   333 | &lt;br /&gt;| biology             |   314 | &lt;br /&gt;| richard dawkins     |   301 | &lt;br /&gt;| skeptics            |   290 | &lt;br /&gt;| love                |   289 | &lt;br /&gt;| genes               |   288 | &lt;br /&gt;| job                 |   286 | &lt;br /&gt;| money               |   283 | &lt;br /&gt;| orac                |   281 | &lt;br /&gt;| god                 |   276 | &lt;br /&gt;| atheism             |   266 | &lt;br /&gt;| animals             |   261 | &lt;br /&gt;| bush                |   258 | &lt;br /&gt;| google              |   258 | &lt;br /&gt;+---------------------+-------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In light of this data it's tempting to revisit that Bayblab post suggesting that &lt;a href='http://bayblab.blogspot.com/2008/02/state-of-science-blogging.html'&gt;Sciblings spend too much time discussing ID&lt;/a&gt;. That'd be a mistake, though: the numbers above are absolutes. 
963 posts had 'evolution' as a key term but that's only 2.4% of all posts that year (my 2c: I think that Sciblings &lt;i&gt;do&lt;/i&gt; talk about Egnor, ID and creationism too much, but hey, it's their blogs - I just skip over those posts).&lt;br /&gt;&lt;br /&gt;I also had a look at linking patterns - who do ScienceBloggers link to the most? Here 'count' is the number of unique posts that have a link to a particular domain.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+-------------------------+-------+&lt;br /&gt;| domain                  | count |&lt;br /&gt;+-------------------------+-------+&lt;br /&gt;| www.scienceblogs.com    | 15966 | &lt;br /&gt;| en.wikipedia.org        |  2016 | &lt;br /&gt;| www.technorati.com      |  1797 | &lt;br /&gt;| www.nytimes.com         |  1388 | &lt;br /&gt;| www.amazon.com          |  1078 | &lt;br /&gt;| www.sciencedaily.com    |   661 | &lt;br /&gt;| www.washingtonpost.com  |   478 | &lt;br /&gt;| feeds.feedburner.com    |   467 | &lt;br /&gt;| www.nature.com          |   453 | &lt;br /&gt;| news.yahoo.com          |   401 | &lt;br /&gt;| news.bbc.co.uk          |   333 | &lt;br /&gt;| www.youtube.com         |   305 | &lt;br /&gt;| www.del.icio.us         |   297 | &lt;br /&gt;| www.cnn.com             |   260 | &lt;br /&gt;| www.eurekalert.org      |   260 | &lt;br /&gt;| farm3.static.flickr.com |   259 | &lt;br /&gt;| www.sciencemag.org      |   231 | &lt;br /&gt;| www.ncbi.nlm.nih.gov    |   225 | &lt;br /&gt;| www.pandasthumb.org     |   224 | &lt;br /&gt;| www.google.com          |   219 | &lt;br /&gt;| www.latimes.com         |   213 | &lt;br /&gt;| www.gnxp.com            |   208 | &lt;br /&gt;| sandwalk.blogspot.com   |   197 | &lt;br /&gt;| www.dailykos.com        |   196 | &lt;br /&gt;| www.donorschoose.org    |   194 | &lt;br /&gt;+-------------------------+-------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Presumably the technorati links are from tags. 
Sciencebloggers link to scienceblogs.com far more than anywhere else - but I'd guess that this is simply because there are a lot of good science blogs on one domain there.&lt;br /&gt;&lt;br /&gt;Wikipedia's reliability might be &lt;a href='http://news.bbc.co.uk/1/hi/technology/4530930.stm'&gt;in question&lt;/a&gt; but it's interesting that almost everybody uses it to define terms.&lt;br /&gt;&lt;br /&gt;Drilling down, where do ScienceBloggers link to papers?&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+--------------------------------+-------+&lt;br /&gt;| domain                         | count |&lt;br /&gt;+--------------------------------+-------+&lt;br /&gt;| www.nature.com                 |   241 | &lt;br /&gt;| www.sciencemag.org             |   194 | &lt;br /&gt;| www.dx.doi.org                 |   177 | &lt;br /&gt;| www.ncbi.nlm.nih.gov           |   111 | &lt;br /&gt;| www.pnas.org                   |   104 | &lt;br /&gt;| www.plosone.org                |    89 | &lt;br /&gt;| biology.plosjournals.org       |    76 | &lt;br /&gt;| content.nejm.org               |    67 | &lt;br /&gt;| medicine.plosjournals.org      |    65 | &lt;br /&gt;| www.sciencedirect.com          |    43 | &lt;br /&gt;| www.arxiv.org                  |    33 | &lt;br /&gt;| genetics.plosjournals.org      |    22 | &lt;br /&gt;| www.jneurosci.org              |    15 | &lt;br /&gt;| www.cell.com                   |    14 | &lt;br /&gt;| compbiol.plosjournals.org      |    10 | &lt;br /&gt;| pediatrics.aappublications.org |    10 | &lt;br /&gt;| www.jcb.org                    |    10 | &lt;br /&gt;| mbe.oxfordjournals.org         |     9 | &lt;br /&gt;| www.ajp.psychiatryonline.org   |     8 | &lt;br /&gt;| www.current-biology.com        |     8 | &lt;br /&gt;| www.journals.uchicago.edu      |     8 | &lt;br /&gt;| www.plosntds.org               |     8 | &lt;br /&gt;| www.blackwell-synergy.com      |     7 | &lt;br /&gt;+--------------------------------+-------+&lt;br 
/&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Nature and Science are at the top, perhaps unsurprisingly - but if you add up the counts from the six PLoS journals listed (270 in total) PLoS would top the table.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/dawkins-officially-bigger-than-jesus.html' title='Dawkins officially bigger than Jesus - datamining Scienceblogs.com'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=1407953762224469141' title='5 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1407953762224469141'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1407953762224469141'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-5699348932312761650</id><published>2008-03-19T00:39:00.003Z</published><updated>2008-03-19T11:10:34.635Z</updated><title type='text'>Science streaming</title><content type='html'>&lt;a href='http://www.bioinformaticszen.com/2008/03/passive-research-streaming-using-twitter-flickr-and-citeulike/'&gt;Michael Barton&lt;/a&gt; has a nice post up:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;I currently use Subversion to back up my project files, and I noticed Twitter status updates are very similar in length to subversion log messages. 
I created a short script so that every time I do a subversion repository check in, the message is also sent to Twitter.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I'd like to see activity aggregators accept arbitrary updates - sort of like Facebook's Beacon updating people's News Feed, but done properly.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/science-streaming.html' title='Science streaming'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=5699348932312761650' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5699348932312761650'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5699348932312761650'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-8827439579897692707</id><published>2008-03-18T03:08:00.006Z</published><updated>2008-03-18T03:41:52.118Z</updated><title type='text'>Nature archive visualized - draft</title><content type='html'>I'm using up my annual carry-over vacation days by taking some time off work this week. Normal people probably use this valuable breathing space to bond with their loved ones, play badminton and learn exciting new hobbies. So far I've sat alone in my flat for thirty-six hours straight writing &lt;a href='http://en.wikipedia.org/wiki/Processing_(programming_language)'&gt;Processing&lt;/a&gt; sketches &lt;a href='#nb-natviz'&gt;*&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;So... 
&lt;a href='http://www.ghastlyfop.com/terms/movie_dropdown.mp4'&gt;here's a draft visualization&lt;/a&gt; (14MB MP4, should play in your browser with QuickTime) of the key words and phrases found in Nature journal over nearly forty years.&lt;br /&gt;&lt;br /&gt;The video starts with the phrases from 1970 and continues until 2007.&lt;br /&gt;&lt;br /&gt;Phrases appear on the right in the year that they were first seen, then travel leftwards, disappearing in the year they were last seen.&lt;br /&gt;&lt;br /&gt;The size of each phrase is related to how often it was seen relative to all the other phrases.&lt;br /&gt;&lt;br /&gt;The hue of each phrase is related to how many distinct journal issues it appeared in - green / yellow phrases are relatively transient while red / brown phrases are stable, appearing in many different contexts.&lt;br /&gt;&lt;br /&gt;The data is incomplete (it's a bit sparse after '88) and I took lots of shortcuts to see how things might look, so don't read too much into which phrases appear and when for now... a better version will follow - this is just a 'release early, release often' draft.&lt;br /&gt;&lt;br /&gt;Eventually I'd like to have a sort of &lt;a href='http://en.wikipedia.org/wiki/Pop-Up_Video'&gt;Pop-up Video&lt;/a&gt; timeline of science from the 50s till today, with major events (and relevant terms) flashing up on screen.&lt;br /&gt;&lt;br /&gt;If you're particularly impatient here's a version from Vimeo. The quality is rubbish, mainly because I munged the file with iMovie (which is crap) to add some rockin' beats. 
I still suggest you get the &lt;a href='http://www.ghastlyfop.com/terms/movie_dropdown.mp4'&gt;mp4 instead&lt;/a&gt;, though.&lt;br /&gt;&lt;br /&gt;Tomorrow I'm going to the park.&lt;br /&gt;&lt;br /&gt;&lt;object type="application/x-shockwave-flash" width="600" height="483" data="http://www.vimeo.com/moogaloop.swf?clip_id=796830&amp;amp;server=www.vimeo.com&amp;amp;fullscreen=1&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=01AAEA"&gt; &lt;param name="quality" value="best" /&gt; &lt;param name="allowfullscreen" value="true" /&gt; &lt;param name="scale" value="showAll" /&gt; &lt;param name="movie" value="http://www.vimeo.com/moogaloop.swf?clip_id=796830&amp;amp;server=www.vimeo.com&amp;amp;fullscreen=1&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=01AAEA" /&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='nb-natviz'&gt;*&lt;/a&gt; I watched &lt;a href='http://www.imdb.com/title/tt0804522/'&gt;Rendition&lt;/a&gt;, too, it was quite good.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/nature-archive-visualized.html' title='Nature archive visualized - draft'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=8827439579897692707' title='1 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/8827439579897692707'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14832160/posts/default/8827439579897692707'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2549078339726866397</id><published>2008-03-11T10:44:00.005Z</published><updated>2008-03-11T10:49:24.227Z</updated><title type='text'>Seattle</title><content type='html'>I'm going to be in Seattle the first week of April for the &lt;a href='http://icwsm.org/2008/index.shtml'&gt;ICWSM&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There are a whole bunch of awesome looking talks and one or two wild cards. Wild cards like:&lt;br /&gt;&lt;br /&gt;Spontaneous Inference of Personality Traits from Online Profiles&lt;br /&gt;Kristin Stecher, Scott Counts&lt;br /&gt;&lt;br /&gt;Which sounds interesting, anyway.&lt;br /&gt;&lt;br /&gt;Let me know if you're in the area and fancy &lt;a href='mailto:e.adie@nature.com'&gt;meeting up&lt;/a&gt; for lunch or a drink. 
I'm in town from the 29th of March to the 5th April.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/seattle.html' title='Seattle'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=2549078339726866397' title='2 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2549078339726866397'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2549078339726866397'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6956589572679735448</id><published>2008-03-10T23:11:00.003Z</published><updated>2008-03-10T23:50:56.119Z</updated><title type='text'>New JoVE blog &amp; commenting on papers</title><content type='html'>Anna Kushnir's &lt;a href='http://jove-blog.blogspot.com/'&gt;new blog for JoVE&lt;/a&gt; is up and running (actually it has been up and running for a while, I'm a bit behind with blogging. Those January Open Science posts are coming at some point, too). It's a nice mix of content.&lt;br /&gt;&lt;br /&gt;Of particular interest are a &lt;a href='http://jove-blog.blogspot.com/2008/02/science-participation.html'&gt;couple&lt;/a&gt; of &lt;a href='http://jove-blog.blogspot.com/2008/03/going-incognito.html'&gt;interesting entries&lt;/a&gt; talking about the online participation - or lack thereof - of scientists. 
See also Noah Gray's &lt;a href='http://blogs.nature.com/nn/actionpotential/2008/03/ng_neuroscience_and_web.html'&gt;take on neuroscientists and web 2.0&lt;/a&gt; and David Crotty's &lt;a href='http://www.cshblogs.org/cshprotocols/2008/02/14/why-web-20-is-failing-in-biology/'&gt;'why web 2.0 is failing in biology'&lt;/a&gt; post.&lt;br /&gt;&lt;br /&gt;Did you skip over all those links? You shouldn't, really. At least read &lt;a href='http://www.cshblogs.org/cshprotocols/2008/02/14/why-web-20-is-failing-in-biology/'&gt;David Crotty's&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;So, yeah, anyway, why scientists don't comment on papers - my take is that being too busy and being afraid of the consequences don't come into it. &lt;br /&gt;&lt;br /&gt;Sure, they're valid concerns - but &lt;i&gt;everybody&lt;/i&gt; is busy at work and everybody realizes that what you say on the internet is recorded forever by Googlebot. People still write ranty forum posts and blog comments.&lt;br /&gt;&lt;br /&gt;IMHO the main reasons scientists don't leave comments are:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;There's no point&lt;/b&gt; - who's going to read it? Will you get any feedback? Will you get any credit for it?&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;&lt;b&gt;It's too much work&lt;/b&gt; - writing a comment should be a one click operation. 
Well, two clicks, one to get the focus in the textbox and the other to press 'submit'.&lt;br /&gt;&lt;br /&gt;Science publishers can address both of these issues, but we've been failing to do so.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/new-jove-blog-commenting-on-papers.html' title='New JoVE blog &amp; commenting on papers'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6956589572679735448' title='8 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6956589572679735448'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6956589572679735448'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6967540023489770436</id><published>2008-02-21T01:09:00.002Z</published><updated>2008-02-21T01:24:58.475Z</updated><title type='text'>Hot diggity</title><content type='html'>The &lt;a href='http://cifoo.crowdvine.com/'&gt;Collective Intelligence Foo Camp&lt;/a&gt; looks awesome, especially if you're into blogs and recommender systems (and who isn't, right? Right...? Everybody?). 
&lt;br /&gt;&lt;br /&gt;Some particularly cool attendees:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; &lt;a href='http://cifoo.crowdvine.com/profiles/8482'&gt;Greg Linden&lt;/a&gt; (worked on Amazon's recommender system, &lt;a href='http://glinden.blogspot.com/'&gt;great blog&lt;/a&gt;)&lt;br /&gt;&lt;li&gt; &lt;a href='http://cifoo.crowdvine.com/profiles/6898'&gt;Matt Hurst&lt;/a&gt; (ex of Nielsen BlogPulse, now at Microsoft, &lt;a href='http://datamining.typepad.com/data_mining'&gt;great blog&lt;/a&gt;)&lt;br /&gt;&lt;li&gt; &lt;a href='http://cifoo.crowdvine.com/profiles/1193'&gt;Niall Kennedy&lt;/a&gt; (ex of Technorati, Microsoft, bunch of other places, &lt;a href='http://www.niallkennedy.com/blog/'&gt;great blog&lt;/a&gt;)&lt;br /&gt;&lt;li&gt; &lt;a href='http://cifoo.crowdvine.com/profiles/8476'&gt;Bernardo Huberman&lt;/a&gt; (also talking at &lt;a href='http://www.icwsm.org/2008/index.shtml'&gt;ICWSM '08&lt;/a&gt; next month, sad lack of blog)&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;(via &lt;a href='http://glinden.blogspot.com/'&gt;Greg Linden&lt;/a&gt;)</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/02/hot-diggity.html' title='Hot diggity'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6967540023489770436' title='2 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6967540023489770436'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14832160/posts/default/6967540023489770436'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-5215510218615781420</id><published>2008-02-19T14:18:00.005Z</published><updated>2008-02-21T00:03:11.076Z</updated><title type='text'>Uh, lazyweb...?</title><content type='html'>&lt;i&gt;&lt;b&gt;Update&lt;/b&gt;: Thanks, universally helpful lazyweb!&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I know it was only a month ago that you were invoked here last. But... Freebase. It's killing me (I was inspired to finally build something using it by &lt;a href='http://plindenbaum.blogspot.com/'&gt;Pierre&lt;/a&gt;, whose examples I have sadly failed to adapt :( ).&lt;br /&gt;&lt;br /&gt;What &lt;a href='http://www.freebase.com/view/queryeditor/'&gt;MQL&lt;/a&gt; can I use to get *all* of the information about *all* people with profession of 'author'?&lt;br /&gt;&lt;br /&gt;I have&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;br /&gt;{&lt;br /&gt;  "query" : [&lt;br /&gt;  {&lt;br /&gt;    "*" : [&lt;br /&gt;       {}&lt;br /&gt;    ],   &lt;br /&gt;    "guid" : null,&lt;br /&gt;    "limit" : 5,&lt;br /&gt;    "name" : null,&lt;br /&gt;    "profession" : "author",&lt;br /&gt;    "type" : "/people/person"&lt;br /&gt;  }&lt;br /&gt;  ]&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;i&gt;(thanks to Pierre for pointing out a syntax error here originally, oops)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Which should list all of the properties for 5 authors, right? 
But I only see properties from the &lt;a href='http://www.freebase.com/view/people/person'&gt;/people/person&lt;/a&gt; schema.&lt;br /&gt;&lt;br /&gt;How do I get all the &lt;a href='http://www.freebase.com/view/book/author'&gt;author&lt;/a&gt; properties too?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;I ended up using the MQL in Pierre's second comment to get a list of GUIDs for all authors (having changed the limit to 10k) then iterated over them getting all of the properties from /people/person (D.O.B, nationality...), /book/author (list of books) and /common/type (image, article) in three different calls (oy). It works, though - again, as Pierre suggests. ;)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;There is a way to get everything in one MQL call, though, as Alf says:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{&lt;br /&gt;  "query" : [&lt;br /&gt;    {&lt;br /&gt;      "*" : null,&lt;br /&gt;      "/book/author/books" : [&lt;br /&gt;        {&lt;br /&gt;          "*" : null&lt;br /&gt;        }&lt;br /&gt;      ],&lt;br /&gt;      "/common/topic/article" : [&lt;br /&gt;        {&lt;br /&gt;          "*" : null&lt;br /&gt;        }&lt;br /&gt;      ],&lt;br /&gt;      "/common/topic/image" : [&lt;br /&gt;        {&lt;br /&gt;          "*" : null&lt;br /&gt;        }&lt;br /&gt;      ],&lt;br /&gt;      "limit" : 5,&lt;br /&gt;      "profession" : "author",&lt;br /&gt;      "type" : "/people/person"&lt;br /&gt;    }&lt;br /&gt;  ]&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The disadvantage to this is that it's a lot of data to get in one go (considering there are thousands of authors each with lots of books)... I guess that's where paging through the results would come in (as Brendan correctly predicts).&lt;br /&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;And finally, for future reference.... 
skud points out that &lt;a href='http://lists.freebase.com/mailman/listinfo/developers'&gt;the mailing list&lt;/a&gt; is here.&lt;/i&gt;</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/02/uh-lazyweb.html' title='Uh, lazyweb...?'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=5215510218615781420' title='9 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5215510218615781420'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5215510218615781420'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-3675536959735096705</id><published>2008-02-17T15:22:00.008Z</published><updated>2008-02-18T10:44:40.405Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='evil'/><category scheme='http://www.blogger.com/atom/ns#' term='publishing'/><title type='text'>Publishing house scale of web server evil</title><content type='html'>Kevin Burton recently checked to see &lt;a href='http://feedblog.org/2008/02/05/obama-runs-linux-clinton-windows-ron-paul-linux/'&gt;which operating system&lt;/a&gt; the websites of different US presidential candidates are built on. The executive summary: Democrats use lots of Linux while Republicans (Ron Paul excepted) mainly use Windows.&lt;br /&gt;&lt;br /&gt;Some have suggested a correlation between Windows web server usage and being evil*. This makes sense as only somebody with no soul could love &lt;a href='http://en.wikipedia.org/wiki/Active_Server_Pages'&gt;ASP&lt;/a&gt;. 
&lt;br /&gt;&lt;br /&gt;Does the theory hold true in the publishing world?&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; PLoS run Linux&lt;br /&gt;&lt;li&gt; Nature run Linux&lt;br /&gt;&lt;li&gt; Science run Linux&lt;br /&gt;&lt;li&gt; Wiley run Solaris&lt;br /&gt;&lt;li&gt; Elsevier run Windows 2000 &lt;br /&gt;&lt;li&gt; Springer run Windows 2003&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;So from a purely 'progressive science on the web' point of view.... yeah, sort of.&lt;br /&gt;&lt;br /&gt;Springer, Elsevier and Wiley are pretty big companies and have lots of different sites, so maybe it's doing them a disservice to assume that whatever serves their root domain is their primary choice of OS. For example, I couldn't tell what Elsevier's ScienceDirect site runs because NetCraft returns 'unknown', so maybe it runs Linux.... or maybe NetCraft just doesn't have an entry for CRUSHED UP PUPPIES AND THE SWEAT OF THE OPPRESSED.&lt;br /&gt;&lt;br /&gt;Trade publishing, for completeness:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Canongate (arty Edinburgh-based independent) run Linux&lt;br /&gt;&lt;li&gt; Penguin (ironically) run Solaris 8 &lt;br /&gt;&lt;li&gt; Macmillan (publish Jeffrey Archer, employ me) run Windows 2000&lt;br /&gt;&lt;li&gt; Simon &amp; Schuster (publishing Lindsay Lohan's autobiography) run Windows 2003&lt;br /&gt;&lt;li&gt; HarperCollins (owned by News International) run Windows 2003&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* Don't read this the wrong way. Microsoft do lots of cool stuff nowadays too. 
But IIS is cold and heartless.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/02/publishing-house-scale-of-web-server.html' title='Publishing house scale of web server evil'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=3675536959735096705' title='1 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3675536959735096705'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3675536959735096705'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-4690701407208848824</id><published>2008-02-12T21:24:00.000Z</published><updated>2008-02-12T22:07:40.528Z</updated><title type='text'>Nature on Facebook</title><content type='html'>&lt;a href='http://ed.facebook.com/group.php?gid=8274710679'&gt;It's in my Nature.com&lt;/a&gt; is Nature's splendid new Facebook group (there's a &lt;a href='http://www.facebook.com/pages/Nature/6115848166'&gt;fan page&lt;/a&gt;, too, with a selection of fresh Nature.com content on it in case you need a quick fix of science news without leaving the 'book). &lt;br /&gt;&lt;br /&gt;Anyway, it's splendid because it's a chance to interact with NPGers on an informal level - or rather a chance for us to interact with students &amp; scientists on an informal level. 
The group started out as a mini-project run by a couple of Facebook early adopters, and while it has picked up a lot of advice and support from within the company since then, it's still a friendly place where everybody knows your name*.&lt;br /&gt;&lt;br /&gt;Some of that company support has come in the form of free Nature-header-red iPods, so sign up and keep an eye out for giveaways in the near future.&lt;br /&gt;&lt;br /&gt;(If you haven't seen it already, you should also check out &lt;a href='http://ed.facebook.com/group.php?gid=2401713690'&gt;the PLoS group&lt;/a&gt;. &lt;i&gt;After&lt;/i&gt; you've joined the Nature one, of course).&lt;br /&gt;&lt;br /&gt;* though it's not the right place for &lt;a href='http://network.nature.com/forums/askthenatureeditor/540'&gt;ask the editor&lt;/a&gt; type questions</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/02/nature-on-facebook.html' title='Nature on Facebook'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=4690701407208848824' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4690701407208848824'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4690701407208848824'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6788374041198783030</id><published>2008-02-10T14:36:00.001Z</published><updated>2008-02-10T15:37:12.594Z</updated><title type='text'>Where I lazily recycle news from OpenHelix</title><content type='html'>Geoff Bilder over at CrossRef &lt;a href='http://www.crossref.org/CrossTech/2008/02/crossref_citation_plugin_for_w.html'&gt;has 
announced&lt;/a&gt; that their citation plugin for Movable Type and WordPress is now available for download. &lt;br /&gt;&lt;br /&gt;&lt;a href='http://en.wikipedia.org/wiki/CrossRef'&gt;CrossRef&lt;/a&gt; is the &lt;a href='http://youtube.com/watch?v=W0sNFb8wLC0'&gt;shadowy cabal&lt;/a&gt; that runs the DOI system for journals. Shadowy because, despite the fact that DOIs underpin academic publishing, who outside of publishing tech circles has ever heard of them? Anyway, they've been doing a lot of cool science 2.0 stuff recently, probably because of Geoff, for whom I have a lot of respect. He does need to update &lt;a href='http://www.gbilder.com/blog/'&gt;his blog&lt;/a&gt; more often, though. ;)&lt;br /&gt;&lt;br /&gt;(via &lt;a href='http://www.openhelix.com/blog/?p=117'&gt;OpenHelix&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;Also at OpenHelix is a writeup of &lt;a href='http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001440'&gt;Gene Characterization Index: Assessing the Depth of Gene Annotation&lt;/a&gt;, published in PLoS One at the end of last year. It's a nice project.&lt;br /&gt;&lt;br /&gt;One (very) minor niggle: the &lt;a href='http://burgundy.cmmt.ubc.ca/gci/data/data2007/Gene_table_2007.txt'&gt;whole genome dataset&lt;/a&gt; they provide looks like this:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Gene ID GCI&lt;br /&gt;2 10.0&lt;br /&gt;19 10.0&lt;br /&gt;24 10.0&lt;br /&gt;25 10.0&lt;br /&gt;...&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;WTF kind of gene ID?&lt;br /&gt;&lt;br /&gt;It is blatantly obvious once you've read the actual paper (they're from Entrez), of course, but still. 
Dudes, some better column names or /* descriptive comments */ wouldn't go amiss.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/02/where-i-lazily-recycle-news-from.html' title='Where I lazily recycle news from OpenHelix'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6788374041198783030' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6788374041198783030'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6788374041198783030'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-3613239218694692290</id><published>2008-01-15T01:38:00.000Z</published><updated>2008-01-15T19:34:04.280Z</updated><title type='text'>Viral spread visualized, update</title><content type='html'>&lt;object type="application/x-shockwave-flash" width="640" height="360" data="http://www.vimeo.com/moogaloop.swf?clip_id=610925&amp;amp;server=www.vimeo.com&amp;amp;fullscreen=1&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=01AAEA"&gt; &lt;param name="quality" value="best" /&gt; &lt;param name="allowfullscreen" value="true" /&gt; &lt;param name="scale" value="showAll" /&gt; &lt;param name="movie" value="http://www.vimeo.com/moogaloop.swf?clip_id=610925&amp;amp;server=www.vimeo.com&amp;amp;fullscreen=1&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=01AAEA" /&gt;&lt;/object&gt;&lt;br /&gt;&lt;a href="http://www.vimeo.com/610925/l:embed_610925"&gt;Viral spread of Facebook app&lt;/a&gt; from &lt;a 
href="http://www.vimeo.com/user343605/l:embed_610925"&gt;Stew&lt;/a&gt; on &lt;a href="http://vimeo.com/l:embed_610925"&gt;Vimeo&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The movie above is a (blurry when streamed, try the &lt;a href='http://www.vimeo.com/download/video:26127644'&gt;3.6mb MPEG download&lt;/a&gt;) visualization of the signup and invite data from &lt;a href='http://www.facebook.com/apps/application.php?id=2395952879'&gt;Bookshare&lt;/a&gt; from July 18th of last year to now. As users sign up they are geolocated and plotted on the map. More users signing up in a particular location mean bigger, brighter spots. Invites from one user to another are drawn as snaking lines that travel from the city of the inviter to the city of the invitee.&lt;br /&gt;&lt;br /&gt;It's written in &lt;a href='http://processing.org/'&gt;Processing&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I'd love to hear from anybody who owns a large Facebook application and has kept id / install date / invitee data (bearing in mind Facebook's data privacy rules...). 
See also &lt;a href='http://blogs.nature.com/wp/nascent/2007/07/appidemiology.html'&gt;this&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;First draft available &lt;a href='http://www.vimeo.com/download/video:25960348'&gt;here&lt;/a&gt; (1.6mb MPEG)&lt;/i&gt;</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/01/viral-spread-visualized.html' title='Viral spread visualized, update'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=3613239218694692290' title='2 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3613239218694692290'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3613239218694692290'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1617079863415893816</id><published>2008-01-13T03:14:00.000Z</published><updated>2008-01-13T03:18:57.984Z</updated><title type='text'>Lazyweb, I invoke thee</title><content type='html'>Dear maths geeks,&lt;br /&gt;&lt;br /&gt;On &lt;a href='http://www.facebook.com/apps/application.php?id=2395952879'&gt;Bookshare&lt;/a&gt; each user owns a set of books. 
The same book can be owned by multiple users.&lt;br /&gt;&lt;br /&gt;What algorithm can I use to find the &lt;i&gt;smallest possible&lt;/i&gt; set of books that contains at least one book from each user?</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/01/lazyweb-i-invoke-thee.html' title='Lazyweb, I invoke thee'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=1617079863415893816' title='10 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1617079863415893816'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1617079863415893816'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-3831641260204509625</id><published>2008-01-11T21:47:00.000Z</published><updated>2008-01-12T02:28:34.029Z</updated><title type='text'>LAMP performance for dummies (e.g. me)</title><content type='html'>I've spent the last six months combing Google for 'slow mysql' results. It was enlightening. Now in the spirit of giving back to the community I give you:&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Euan's Top Tips For Medium Sized LAMP Powered Web Apps&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Where 'medium sized' means you have four or five concurrent users on ten tables with around half a million rows each and 'top' means that you've already done all the basic stuff - picking table types, adding indexes, designing the database properly in the first place etc.&lt;br /&gt;&lt;br /&gt;Some are common sense. Some are not appropriate for all situations. 
Use at your own risk.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; If most operations are reads, turn the MySQL &lt;a href="http://dev.mysql.com/doc/refman/5.0/en/query-cache.html"&gt;query cache&lt;/a&gt; on.&lt;/li&gt;&lt;br /&gt;&lt;li&gt; Don't use mysqldump for backups. Users don't like five minutes of being unable to write anything to the database because the tables are locked. Use &lt;a href="http://dev.mysql.com/doc/refman/5.0/en/mysqlhotcopy.html"&gt;mysqlhotcopy&lt;/a&gt; or an &lt;a href="http://lenz.homelinux.org/mylvmbackup/"&gt;LVM&lt;/a&gt; based system if you're feeling adventurous.&lt;/li&gt;&lt;br /&gt;&lt;li&gt; If you're retrieving large amounts of data with PHP but you don't need the entire rowset in memory at once (or mysql_num_rows()) then use mysql_unbuffered_query instead of mysql_query. Stops you running out of memory so fast.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;a href='http://xdebug.org/docs/profiler'&gt;xdebug&lt;/a&gt; is a PHP profiler. With &lt;a href='http://kcachegrind.sourceforge.net/cgi-bin/show.cgi'&gt;kcachegrind&lt;/a&gt; it's like hot coder porn. It's incredibly useful to see how long your code takes to run and why.&lt;/li&gt;&lt;br /&gt;&lt;li&gt; in_array()s are innocent looking... and very slow. If the values are unique then use associative arrays instead - i.e. instead of&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$data = array("a", "b" .. large amount of data .. "zzz");&lt;br /&gt;&lt;br /&gt;for ($i=0; $i &lt; 1000; $i++) {&lt;br /&gt;   print "is c in array? ".in_array("c", $data);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;use&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$data = array("a", "b" .. large amount of data .. "zzz");&lt;br /&gt;$a_data = array();&lt;br /&gt;foreach ($data as $point) {$a_data[$point] = true;}&lt;br /&gt;&lt;br /&gt;for ($i=0; $i &lt; 1000; $i++) {&lt;br /&gt;   print "is c in array? 
".isset($a_data["c"]);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;key_buffer_size&lt;/b&gt; is the amount of memory that MySQL puts aside to keep table indexes in memory, which is a very good thing. As a rule of thumb it should be 25%-50% of your database server's total RAM (reduce this if you're also using it for other things, natch). The default is 8mb, so once you've got more than 8mb worth of indexes...&lt;br /&gt;&lt;br /&gt;How can you tell if you need a bigger key buffer size?&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;mysql&gt; SHOW STATUS LIKE '%key_read%';&lt;br /&gt;+-------------------+---------+&lt;br /&gt;| Variable_name     | Value   |&lt;br /&gt;+-------------------+---------+&lt;br /&gt;| Key_read_requests | 6375479 |&lt;br /&gt;| Key_reads         | 130562  |&lt;br /&gt;+-------------------+---------+&lt;br /&gt;2 rows in set (0.00 sec)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;There should be at least 100 Key_read_requests (from memory) to every Key_reads (from disk). The output above works out at roughly 49 to 1, so that server could do with a bigger key buffer.&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt; Do a lot of sorting? Increase &lt;b&gt;sort_buffer_size&lt;/b&gt; but bear in mind that unlike key_buffer_size it is allocated per connection - i.e. increase it to 32mb and MySQL will gobble 32mb x 4 = 128mb the next time you have four connections open at once (also remember that your code will - probably - sometimes fail or forget to shut down the connection).&lt;/li&gt;&lt;br /&gt;&lt;li&gt; Do a lot of ORDER BYs? Increase &lt;b&gt;read_rnd_buffer_size&lt;/b&gt;. Allocated per connection, as above.&lt;/li&gt;&lt;br /&gt;&lt;li&gt; Fetching frequently used data from disk can &lt;i&gt;sometimes&lt;/i&gt; be faster than using &lt;a href='http://www.danga.com/memcached/'&gt;memcached&lt;/a&gt; or similar (no, really. Far less overhead + the file is cached by the OS and so is in memory anyway).&lt;/li&gt;&lt;br /&gt;&lt;li&gt; serialize() and unserialize() are slow. 
Use JSON where possible instead (nowhere near as flexible, sadly).&lt;/li&gt;&lt;br /&gt;&lt;li&gt; To &lt;a href='http://www.mysqlperformanceblog.com/'&gt;paraphrase&lt;/a&gt;: the best optimization is avoiding doing the work in the first place. Do you really need to return ALL of the rows, or just the first few hundred? Do you really need to sort them? Do they need to be distinct? Is it faster to get more data than you need and process it in code?&lt;/li&gt;&lt;br /&gt;&lt;li&gt; Avoid ORDER BY RAND(). Do something clever in code (pick some randomly selected ids?)&lt;/li&gt;&lt;br /&gt;&lt;li&gt; SQL_CALC_FOUND_ROWS can be faster than rerunning a query with COUNT() - but not always.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/01/lamp-performance-for-dummies.html' title='LAMP performance for dummies (e.g. me)'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=3831641260204509625' title='14 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3831641260204509625'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3831641260204509625'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-5245862178159407857</id><published>2008-01-10T18:47:00.000Z</published><updated>2008-01-11T12:08:35.542Z</updated><title type='text'>Facebook code on Bebo</title><content type='html'>Yes, nothing to do with bioinformatics, but social networking:&lt;br /&gt;&lt;br /&gt;Bebo announced that it'd support Facebook style apps as well as OpenSocial late last year. 
They've &lt;a href='http://developer.bebo.com'&gt;launched properly&lt;/a&gt; now.&lt;br /&gt;&lt;br /&gt;The &lt;a href='http://www.bebo.com/docs/snml/index.jsp'&gt;documentation&lt;/a&gt; has been up for a while already and there aren't any real surprises, though there are a couple of new Bebo-specific tags (which seems like a bad idea to me; I mean, the whole point is code compatibility between multiple systems).&lt;br /&gt;&lt;br /&gt;One major omission is JavaScript support; there's no equivalent to FBJS (Facebook's sandboxed JavaScript version). This means that a lot of the bigger apps will hold off moving over, I'd have thought - though it's not a problem for simpler quizzes and Flash based stuff like Scrabulous.&lt;br /&gt;&lt;br /&gt;There are a couple of OpenSocial containers now but I haven't heard of any Facebook server components being made available (just a 'top level overview' document on their wiki). We've just finished building an API for Nature Network (private, for now). I wonder how much work it'd be to start supporting apps?</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2007/10/facebook-code-on-bebo.html' title='Facebook code on Bebo'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=5245862178159407857' title='1 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5245862178159407857'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14832160/posts/default/5245862178159407857'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-7560379596399961445</id><published>2008-01-05T16:33:00.000Z</published><updated>2008-01-05T17:03:06.776Z</updated><title type='text'>Diversions, excuses, etc.</title><content type='html'>OK, so it's going to be a week of open science &lt;i&gt;spread out over the month&lt;/i&gt;. Probably. I've been distracted by:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; &lt;a href='http://silverlight.net/'&gt;Silverlight&lt;/a&gt;: actually it's quite cool - writing simple multimedia apps is a cinch. Shame that it's probably not going to catch on... it's not &lt;i&gt;that&lt;/i&gt; much better than Flash.&lt;br /&gt;&lt;li&gt; All that &lt;a href='http://www.zedshaw.com/rants/rails_is_a_ghetto.html'&gt;Zed Shaw ragging on the Rails community&lt;/a&gt; stuff - mainly because I think Rails is overhyped (though it definitely has its place). I also liked &lt;a href='http://ajaxian.com/archives/zed-shaw-interview-on-rails-community-enterprise-ajax-patents-and-a-whole-lot-more'&gt;his quote&lt;/a&gt; about implementing the semantic web over existing web infrastructure: "Einsteins brain on a crack whores body isn’t going to happen". I disagree, but it made me laugh...&lt;br /&gt;&lt;li&gt; &lt;a href='http://precedings.nature.com/documents/1486/version/1'&gt;BioMoby&lt;/a&gt; (via &lt;a href='http://del.icio.us/egonw'&gt;Egon's delicious bookmarks&lt;/a&gt;)&lt;br /&gt;&lt;li&gt; Hangovers (there is &lt;a href='http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1322250'&gt;no cure&lt;/a&gt;).&lt;br /&gt;&lt;li&gt; &lt;a href='http://masseffect.bioware.com/'&gt;Mass Effect&lt;/a&gt; - incredibly cheesy sci-fi plot. Naked alien ladies. Voice acting from Seth Green and Lance Henriksen. Had to buy it. 
Is OK.&lt;br /&gt;&lt;li&gt; Annual &lt;a href='http://www.google.co.uk/search?hl=en&amp;q=i+rock+give+me+a+pay+rise&amp;btnG=Google+Search&amp;meta='&gt;appraisal time&lt;/a&gt; at work.&lt;br /&gt;&lt;/ul&gt;</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/01/diversions-excuses-etc.html' title='Diversions, excuses, etc.'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=7560379596399961445' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/7560379596399961445'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/7560379596399961445'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6436576414509539048</id><published>2007-12-31T16:41:00.001Z</published><updated>2007-12-31T17:15:20.165Z</updated><title type='text'>Open notebook - what's a disease again?</title><content type='html'>Is there a super-semantic-web-enabled phenotype database out there*? I want to ask a question like 'give me a list of monogenic disorders whose locus has been confirmed by at least two labs, broken down by type of causative mutation' and get an answer.&lt;br /&gt;&lt;br /&gt;(* on a tangent: is &lt;a href='https://www.23andme.com/'&gt;23andMe&lt;/a&gt;'s gene book thing freely available?)&lt;br /&gt;&lt;br /&gt;&lt;a href='http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim'&gt;OMIM&lt;/a&gt; falls quite a long way short of this... it never set out to be a resource for programmatic access so you can't really blame them. 
The &lt;a href='ftp://ftp.ncbi.nih.gov/repository/OMIM/morbidmap'&gt;morbid map&lt;/a&gt; is available for download and contains all of the gene -&gt; disorder mappings in their database.&lt;br /&gt;&lt;br /&gt;A couple of issues:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;OMIM's weird &lt;a href='http://www.ncbi.nlm.nih.gov/Omim/mimstats.html'&gt;entry categorization system&lt;/a&gt; (#*%+...) is very confusing. There are 2229 'phenotypes' (note: not 'Mendelian phenotypes') with a known molecular basis in the database, apparently, but only 386 genes with a phenotype associated with them? Some of those phenotypes are going to be caused by gross insertions / deletions / whatever and not small mutations in single genes, multiple phenotypes might arise from different mutations in the same genes but even so... what's with the disparity?  &lt;br /&gt;&lt;li&gt;It contains polygenic disorders (diabetes, schizophrenia) as well as monogenic ones&lt;br /&gt;&lt;li&gt;You can't tell which is which - you could count the number of genes associated with the disorder but a 'monogenic' disorder might be a complex one whose OMIM entry hasn't been updated yet&lt;br /&gt;&lt;li&gt;It's not a disease database - it has other phenotypes in it too. &lt;a href='http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=152430'&gt;Longevity&lt;/a&gt;? Wet or dry ear wax? &lt;a href='http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=601696'&gt;Novelty seeking personality&lt;/a&gt;?&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The last point is interesting, really. When is a phenotype a disease? If you have a novelty seeking personality and so are relatively impulsive and prone to climbing mountains, swimming with sharks, cycling without a helmet etc. then are you ill? &lt;br /&gt;&lt;br /&gt;Well, no, is the obvious answer. But where do you draw the line? 
Is &lt;a href='http://www.guardian.co.uk/society/2007/aug/07/health.medicineandhealth'&gt;autism&lt;/a&gt; a disease?&lt;br /&gt;&lt;br /&gt;Neh. Beyond our remit. For us a monogenic disease = a clinically recognized disorder with a single, genetic cause.</content><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2007/12/open-notebook-whats-disease-again.html' title='Open notebook - what&apos;s a disease again?'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14832160&amp;postID=6436576414509539048' title='3 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6436576414509539048'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6436576414509539048'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author></entry></feed>