Three avenues to support open approaches to science - the cases of funding, data acquisition and knowledge curation
We'd like to ask you to think about two to three emerging opportunities for--or threats to--open society institutions and values that you are aware of which are not receiving sufficient attention and where a funder like OSI could usefully intervene. We encourage you to suggest issues that are still very much on the horizon; there need not be an obvious solution to the points you raise.
I know that the OSI has had, and still has, many interesting projects running (including in regions and cultures normally off the radar, some of them dear to me), but I have often (and not just jokingly) taken its abbreviation to stand for "Open Science Institute". So I will take the liberty here of narrowing the space of possible replies by concentrating on openness in science, which is in any case the most prominent topic on this blog.
My intuitive response would be that several inefficiencies in our current systems of knowledge creation and curation cry out for a test run of open approaches. I am not sure whether I can distill this down to three issues, but let me start by listing some of the ideas, in the hope that you can then help me structure and adapt them appropriately. To facilitate the discussion, I will resort to Cameron's depiction of the research cycle:
Normally, none of these steps is performed in the open, not even publishing (most scientific journals are still behind subscription barriers, though big changes are ahead). So let's start at the idea stage: any idea that is neither obvious nor obviously stupid will need time and resources (particularly for experimental studies) to be developed and tested. Where do the resources come from? Ideally, from a mixture of:
1. Some sort of no-strings-attached baseline grants for established researchers (with no peer review before the money is awarded, but with public peer review from then onwards) or for winners of scientific competitions (which necessitates some form of review, but one that can best be done fully in the open), to try out ideas at their very early stages and to collect preliminary data
2. The classical "calls for proposals" schemes, in which funders define the rules and scientists bend and squeeze their research proposals to fit (ideally in public, so as to avoid multiple reinventions of the wheel), to develop ideas through to their realization
3. Some not-so-classical "calls for funders" schemes, in which scientists lay out their best ideas (obviously in public), and funders (possibly including scientists with baseline grants) choose which ones to fund, and to what extent.
Non-open (and lengthy) variants of schemes 1 and 2 are currently the norm, and open approaches would no doubt render them considerably more efficient (by avoiding multiple reinventions of the wheel, by quicker error correction, and by enhanced interaction amongst participants). Still, I think the highest impact can currently be achieved by actively supporting developments towards scheme 3, which is only beginning to be explored, though public peer review of manuscripts (at the opposite side of our research cycle) is gaining ground and may serve as a model. (To avoid confusion: in most contexts of the research cycle, "open" means "in public", while in a peer review context, "public" is used instead, because "open" has kept its pre-web meaning of "revealing the reviewer's identity", which is not necessary for the web-public system to function.) In summary, my first recommendation would be to specifically support collaborative open research funding environments, in which research proposals and funding decisions, along with the discussions around them, take place in public. If that is too far-fetched, then a scientifically rigorous test of the efficiency of the current peer review system (especially for grants, but also for manuscripts) would already be an important step forward. A proposal in this direction was sketched out here a year ago, ready to be fleshed out for the upcoming deadline of April 30. Anyone interested?
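To make scheme 3 a bit more tangible, here is a toy sketch in Python; the `Proposal` class, its field names, and the funder names are purely hypothetical illustrations of the idea (public proposals accumulating public pledges), not the API of any existing platform:

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """A publicly posted research idea under a 'calls for funders' scheme."""
    title: str
    requested: float                                  # total funds sought
    pledges: dict = field(default_factory=dict)       # funder name -> amount

    def pledge(self, funder: str, amount: float) -> None:
        # Funders (possibly scientists with baseline grants) commit in public,
        # each choosing which ideas to support and to what extent.
        self.pledges[funder] = self.pledges.get(funder, 0.0) + amount

    def funded_fraction(self) -> float:
        # How close the idea is to being fully funded (capped at 100%).
        return min(1.0, sum(self.pledges.values()) / self.requested)


p = Proposal("Public peer review efficiency study", requested=10000.0)
p.pledge("Funder A", 4000.0)
p.pledge("Funder B", 3000.0)
print(p.funded_fraction())  # 0.7
```

Because proposals and pledges are public records, anyone could inspect which ideas attract support and why, which is precisely the transparency the scheme is meant to provide.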
OK, suppose we now have both a developed idea and sufficient funds to put it into practice. The next step in which open approaches can make a difference is that of recording data and making them available. Coincidentally, a one-day symposium dedicated to precisely that topic took place on Saturday, attended by about 60 people in person and about 400 via live streaming (another mode of open sharing). One of the presentations there focused on the use of free (though not always open) tools for recording, visualizing, analyzing and archiving data in public, and is embedded below:
I have recently started to move some parts of my research notes online, using yet another tool, OpenWetWare, which is free and based on open software, though not all of its customizations appear to have been made public. Via its Recent Changes page, I got to know another novice there, and we are currently exploring to what extent we could join forces where our research overlaps. Where Jean-Claude bundled together many different tools to handle his data, OpenWetWare is closer to a one-stop shop that can be integrated with some typical research workflows in the biomedical sciences, yet it is not quite there yet. Along these lines, a research proposal was drafted in public and submitted to the DFG nine months ago, so notification about the outcome is hopefully not too far off. If such data-centric aspects of research could be linked with people-centric aspects by means of social networks, this would allow us to move towards more collaborative modes of research and away from ill-suited journal-level metrics for the evaluation of individual scientists, departments or institutions (and to bypass article-level metrics, which are an important but not in themselves sufficient step in this direction). So far, though, few of the so-called "Facebooks for scientists" (including Mendeley) provide integration with scientific workflows, few of them (including ways.org, which hosts this blog) are entirely open source, none meet both criteria, and none are widely used to discuss research as it happens.
My second recommendation is thus to support collaborative open research environments in which data (however defined, even equations) are recorded in public, in a quotable way and under a Panton-compatible license, as soon as possible after they have been gathered. This would reduce the timespan between data acquisition and formal publication, and drastically increase the amount of data available for scholarly communication, and with it reproducibility. Ideally, such an environment would integrate standardized public tools for the processing of public data and allow for social filtering of information on an open platform (open also in the sense that non-scientists are welcome).
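As a minimal sketch of what "quotable, timestamped, openly licensed" could mean in practice: the `DataRecord` fields and the `publish` helper below are my own illustration (not part of OpenWetWare or any other existing tool), assuming CC0 as one example of a Panton-compatible license:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataRecord:
    """A quotable unit of openly shared research data (illustrative model)."""
    record_id: str    # stable identifier, so that others can cite the record
    recorded_at: str  # ISO 8601 timestamp, set at acquisition time
    license: str      # Panton-compatible license, e.g. "CC0-1.0"
    payload: str      # the data themselves, however defined


def publish(record_id: str, payload: str) -> DataRecord:
    # Record data in public as soon as possible after they were gathered,
    # stamping them with the acquisition time and an open license.
    return DataRecord(
        record_id=record_id,
        recorded_at=datetime.now(timezone.utc).isoformat(),
        license="CC0-1.0",
        payload=payload,
    )


rec = publish("lab-notebook/2010-04-12/assay-3", "OD600 = 0.42")
print(rec.license)  # CC0-1.0
```

The key design point is that identifier, timestamp, and license travel with every record from the moment of acquisition, rather than being bolted on at formal publication time.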
Supposing we now have some relevant data processed in such an open research environment, the next step would be to inform the world about it. Current practice is to write up an article (still called a "paper", and a PDF is not much different from one), and in many disciplines communication of the results before formal publication is kept to a minimum (usually conferences). This means that the results, when "published" (we are coming to the close of one cycle), are already outdated in fast-moving fields. It also means that the new methods or findings are reported in a container format that is ill suited to establishing relevant context, because it integrates very inefficiently with the remainder of existing human knowledge; wikis, for instance, achieve this much more readily. My third recommendation, hence, is to support collaborative open knowledge environments: collaborative efforts to collect, structure and update knowledge, and to render it conveniently accessible to the public for free. If these knowledge environments are reasonably complete and accurate (as well as appropriately licensed), gaps in them (and cases of duplication or ambiguity) could be identified more easily than in the current literature, and could thus serve as seeds for new research proposals, possibly even in an automated fashion. Again, such environments with version control would allow log information to be reused for evaluation purposes, in a way that is far more fine-grained and scalable (and open to newcomers, especially young scientists) than anything possible at the journal level, particularly when contributions to open knowledge environments can be mashed up (transparently, in public) with contributions to the other open environments.
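As a rough illustration of how version-control logs could be reused for fine-grained evaluation, the following sketch (entirely hypothetical; real wiki histories carry far richer metadata) counts the lines each contributor added across successive revisions of a page:

```python
import difflib
from collections import defaultdict


def contribution_counts(revisions):
    """Count lines added per author across a page's revision history.

    `revisions` is a list of (author, full_page_text) pairs, oldest first,
    as a wiki's version history might provide them.
    """
    counts = defaultdict(int)
    previous = ""
    for author, text in revisions:
        diff = difflib.ndiff(previous.splitlines(), text.splitlines())
        # Lines prefixed "+ " are new in this revision; credit its author.
        counts[author] += sum(1 for line in diff if line.startswith("+ "))
        previous = text
    return dict(counts)


history = [
    ("alice", "Method X works.\n"),
    ("bob",   "Method X works.\nBut only below 30 C.\n"),
]
print(contribution_counts(history))  # {'alice': 1, 'bob': 1}
```

Even this crude line count is already more fine-grained than a journal-level metric, and the same log data could be weighted, filtered, or mashed up with activity in the other open environments.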
What would you suggest if you had to single out three major avenues in which open institutions or values could be supported with a long-term perspective, starting soon?
The FriendFeed part of the discussion is embedded below: