Rankings

This page is for discussion of the contentious topic of rankings!

The 2014 CORE conference venue ranking exercise:

http://103.1.187.206/core/changes/

The 2013 CORE rankings update process can be found at:

http://core.edu.au/cms/images/downloads/conference/RankingChangeProcess.pdf

The ERA 2010 rankings can be found at:

http://www.arc.gov.au/era/era_2010/archive/era_journal_list.htm#2

http://www.arc.gov.au/era/era_2010/archive/era_journal_list.htm#1

The CORE 2008 (or is it 2009?) rankings can be found at:

http://core.edu.au/index.php/categories/conference%20rankings

http://core.edu.au/index.php/categories/journals

= Please add comments here =

Open Letter from Computer Science Fellows of Australian Academy of Science - 18 June 2014
We the undersigned are computer scientists working in Australia who are fellows of the Australian Academy of Science. We write to oppose the idea of the CORE conference ranking exercise http://core.edu.au/index.php/categories/conference%20rankings and recommend that CORE abandon it and that Australian computer scientists do not participate in it.

We have all individually signed the San Francisco Declaration on Research Assessment http://am.ascb.org/dora/. As the DORA clearly states, the use of indirect surrogates such as where a piece of work is published as a proxy for evaluating a person's (or a group's) work is a bad idea. The Academy of Science pays no heed to such things (we speak as people who have all been involved in the selection process for new fellows, rather than formally representing the academy). We believe that not only is such venue ranking worthless for the academy, but it is actively harmful more broadly. It suggests one can evaluate the quality of a researcher’s work without reading it, merely by noting where it was published. We also reject other reasons that are used to justify the conference ranking exercise, such as providing guidance to naïve researchers who can not tell the quality of conferences. It is true that some conferences serve the needs of the community and the individual less than others but we do not think the ranking exercise is the right way to deal with that.

We would be delighted if CORE institutionally signed the DORA and abandoned the conference ranking exercise.

Robert Williamson, Richard Brent, Brendan McKay, Rao Kotagiri, Hugh Durrant-Whyte, Richard Hartley

Response from CORE Executive - 27 June 2014
Dear Colleagues

A recent letter signed by six senior researchers has requested CORE to abandon its conference ranking exercise. The six argue that the conference ranking "is actively harmful" and that it "suggests that one can evaluate the quality of a researcher's work without reading it, merely by noting where it was published". In response, we make the following points:

(1) Any detailed evaluation of research work is, naturally, best undertaken by having discipline experts read that work and form an opinion of it. Indeed, that is the fundamental nature of the peer-review process, and is a process that stratifies publication venues. But it must also be accepted that there are contexts in which peer review of published work cannot be sought, because CVs will be assessed by non-experts, or because the volume of evaluation needing to be performed prohibits detailed evaluation. One important example that all academics must face is when promotion applications are being evaluated by multi-discipline faculty committees. It is all very well to argue for an ideal process in which individual papers are evaluated by each committee member, but pragmatic likelihoods in which those readers need high-level guidance must also be acknowledged. The rankings serve a role in this latter case.

(2) Providing guidance to junior researchers, including graduate research students, is a second benefit of the rankings. The six signatories "reject other reasons that are used to justify the conference ranking exercise, such as providing guidance to naive researchers who can not tell the quality of conferences", but give no background or logic for this position. But without guidance from their community, how will such researchers calibrate their own standards and expectations? The CORE rankings are not just about assessment of research, which is the basis of the San Francisco declaration; it is also about setting goals and expectations for future research. In this sense, all the rankings do is publicize in a single place and in a documented manner shared community values: Journal of the ACM *is* a more ambitious publications submission target than (say) Journal of Research and Practice in Information Technology.

(3) The 2013 changes to the CORE rankings process have made them more transparent and more scrutable; changes to rankings that are made in 2014 will be carefully documented, and evidence-based. Moreover, the CORE Conference Rankings Portal allows for commentary, opinion, and numeric reviews to be recorded; that is, the rankings are expected to become a much richer resource than merely an ordering of conferences. The new arrangements will help create the kind of multi-faceted resource that the six signatories are seeking.

(4) Many of the principles espoused in the Declaration are laudable. No-one should be evaluated solely based on their count of CORE A* publications, just as no-one's teaching should be evaluated solely on the basis of student questionnaires. Academic performance and productivity in both teaching and research is evidenced, in a variety of ways, and hence can be documented in a variety of ways. But it is also true that removing a resource that provides overall community evaluations of publication venues weakens the overall options for evaluating academic performance. Use of conferences as high repute, peer reviewed publication venues is still not well understood in many disciplines. Rankings provide some evidence that we as a community treat them seriously.

(5) Regardless of the seniority of the six signatories, we are a community of colleagues; and it should be noted that previous discussions at CORE meetings (including the Professors and Heads Meeting held in January this year) have resulted in a preponderance of opinion that supports the continuation of the rankings.

With our best regards,

Prof. John Grundy, CORE President; Prof. Alistair Moffat, CORE Vice-President; Prof. Lin Padgham, CORE Treasurer/Secretary; Prof. Tom Gedeon, CORE Immediate Past-President

David Suter
If one actually reads the DORA declaration, it does NOT recommend complete abandonment of metrics in assessing quality but talks more about a range of assessment criteria that can include metrics, so it seems to me that the appeal to DORA somewhat misrepresents what that declaration actually says. Yes, the central message is indeed that metrics are no substitute for reading the paper (presumably by those close enough to the area for this to make sense), and we’d all agree, I suspect. BUT selecting from a handful of candidates to gain membership of an elite society such as the academy is a far cry from doing a first cull of 200 applicants for a lectureship, and I’m afraid - warts and all - conference rankings and journal impact factors will necessarily be useful in the latter exercise.

In short - I do not support CORE abandoning the rankings as *used properly* they are a useful tool. I don’t buy the “they are open to misuse so don’t provide them” argument as any tool is open to misuse.

Bob W response (private but permission to release)
DORA does not say they should be “abandoned”, true. It says "General Recommendation 1. Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions." The only metric the declaration endorses is “article-level” (recommendation 7). This is pretty darn clear. And I thought our letter was too. Don’t assess work based on where it was published.

David S response (ditto)
Thanks Bob (and Brendan) - I concede the issue of DORA recommending not using the “where published” metric, rather somewhat advocating other “article based metrics” be developed and improved. Fair cop - apologies.

But on the point of good practice vs bad practice - there is a range of positions from the idealist position all the way through to the pragmatic position. I see the position being put as too far towards the former.

As with many hotly debated things, the tendency is to unnecessarily binarise the reality. If one were to put the position that because you’ve published in a highly rated avenue then it’s good work (or to assert conversely that because it’s in a poorly rated avenue it is trash) then we’d all agree that is untenable. But that’s not the position we are in or even likely to be in. If you take the more statistical viewpoint that, because it is published in a highly ranked journal or conference that has more stringent reviewing (the two are very highly correlated in most cases), it is - a priori, without other evidence to use - more likely to be better work than a random selection from poorly ranked avenues, then it is tenable. I would far rather have a declared and contested list than have people pretend it doesn’t matter, or use (subconsciously or otherwise) an undeclared and uncontested list.

I’m pretty sure I’ve even heard some of the signatories make remarks about avenues of publication that betray the reality - there simply is a set of journals and conferences worth publishing in and another (much larger) set not worth it (and a sizeable grey area as well). Making this reasonably explicit (and even more - **contestable** when the rankings are revised) not only helps the community (particularly younger researchers whose other metrics are much more laggy) but makes the processes of assessment somewhat more transparent (the latter is something naturally that DORA does highlight as a key issue). The statement in the letter to the effect “that there are ways to deal with journals and conferences that don’t serve the community well” sort of misses the point. Clearly whatever measures there are, they aren’t sufficient to actually reduce the number to a level where they essentially don’t matter - but worse, it concedes the very point that there is a largely agreed upon but (unless ranked in at least a coarse way) implicit ranking known to the inner sanctum but not declared.

There are clear cases where one has to make decisions (resource allocations - as small for example as conference funding, initial culls on hiring etc.) where a first cut of the chaff on conference ranking and journal ranking is a sensible, pragmatic, and reasonably fair way to do things. I don’t see we should necessarily lose this just because there is some misuse. Yes, you shouldn’t hire a prof (or even do the final hiring selection of any level), based on rankings of avenues of publication.

Look at the other notions of esteem commonly used (institutions graduated from, membership of societies, conference referee committees served on, conferences invited to speak at….): there is an implicit (but hopefully small and not dominant) advantage in graduating from the "better" schools, joining the right organisations, getting prizes from the right organisations, etc. It’s certainly no different to the pretty universal practice of valuing a reference given by a referee from an elite institution/group over a reference from someone “unknown”.

To pretend they aren’t there - or attempt to wish them away - doesn’t stop people using them (sometimes reasonably and sometimes not).

BTW (genuinely meant more as an interesting comment than a counter argument or reductio ad absurdum): it seems to me the logical conclusion of the position being put is that hiring should ban CVs, because much of the content of CVs is patently there to influence the decision based on “association” rather than achievement. “Send us your latest unpublished work* (only) with name and association redacted…..”. Blind/anonymous hiring like blind review.


 * unpublished because with modern search, there is no way to be anonymous nor to stop people looking up your history (where you’ve been hired, what journals you have published in, who you co-authored with…..)

Bob W further response
Several points have been raised by folks mailing to the list, to which I offer the following response.

How to deal with 100s of applications: This seemed to come up most. It’s dead easy. For jobs: You ask people when they apply for a job to write a couple of pages describing what they have done, why it was good, and what they plan to do. One can easily read 100s of these and very rapidly cull to a shortlist where you dig deeper. Re doing this for grant assessments, no, you do not need to read all 100 papers an applicant has written. It suffices to look at 1 or 2 of their best 10. I was very impressed with the change the DFG (German equivalent of the ARC) made a few years ago that only allows you to list 6 of your papers (in total) in your application. I have encouraged Aiden Byrne to consider this for the ARC. You can then randomly pick one of these, and if it’s terrible, reject. If not, you look deeper…

Easy to criticise, hard to do: See above. The key point is that for rejection, it is sufficient to look at one of the candidate’s best papers. If the best is no good, reject…

What to do instead: Related to the above, what might CORE do instead? Focus on what are the factors that help hiring good people. How to assess grant proposals. What makes good research. How to train people to evaluate research well. How to make ERA as useful as possible. There are lots of things one could do. Many of these are amenable to empirical research even! Our letter had one purpose: to try and stop something we thought was bad. We did not want to make it too long, so there were no positive suggestions. But absence of evidence is not evidence of absence!

Mythologies versus data: One claim was that there was “mythology about the conferences quality being irrelevant”. We never claimed that conference quality is irrelevant. We actually said some conferences are better than others. Of course they are. We claim that trying to maintain a certified list of which is which is not a good idea. One reason is that people then use it as an excuse to do less work in assessment which is intrinsically difficult. Specifically our point was that you should not use the venue as a proxy for evaluating work. Full stop. Regarding the “mythology” part, as a data-centric chap I was impressed by the analysis in this paper (referred to in this admirable blog post of David Colquhoun FRS that I would have cited if I had been aware of it before drafting our letter; he has elsewhere (in Nature for those who think it matters!) correctly pointed out that “Any selection or promotion committee that asks you for impact factors is probably a second-rate organization. A good place will want to know about the quality of what you have written, not where you published it — and will be aware that the two things are uncorrelated.” ). It has a lovely empirical demonstration that even the impact factor (that the conference ranking in a way seeks to emulate) does not actually correlate very well with citations of papers. There is no contradiction here; its a nice example of a statistical phenomenon. See the figure below

Gernot Heiser 25-Jun-2013
The below is a summary of what I sent to John directly, who suggested I post it here.

Core problem
My core [excuse the pun] argument is
 * 1) the present ranking list is flawed because the process is flawed
 * 2) using the (essentially) same flawed process isn't going to change the outcome significantly
 * 3) a flawed list is worse than none (in fact, it's dangerous).

The present process is flawed as it's based on pleading and there is a lack of well-defined criteria as well as a lack of a trusted authority that could arbitrate.

Lack of clarity of the list's purpose
Things are made worse by the widespread abuse of the list: A good list would be useful for (in John's words) "pre-publication venue selection" (PPVS). Students and ECRs in small groups especially, who aren't well integrated into the community of local leaders in a discipline, would benefit from the guidance given by a good list, essentially as a (poor) replacement for good mentors with a good understanding of the discipline. In other words: I don't need a ranking list to understand where to send my papers to maximise impact, but junior people without help might.

The problem is that the list was neither designed for this purpose, nor is it normally used that way. It was designed as part of ERA'10 exactly as a (again John's words) "post-publication quality" measure (PPQM). And when we talk about evaluating people for hiring and promotions, we're talking of using it in exactly that way.

This is bad for multiple reasons:
 * 1) We all know that venue ranking is a very poor indicator of research quality. While far from perfect, citation statistics are far better, and we should continue to argue for their use. The acknowledgment that outlet rankings as a PPQM are flawed was presumably the reason the ARC abandoned them in ERA'12. Yet, clueless heads and deans continue to use them that way.
 * 2) It maximises the incentive for gaming the list. If you publish in crappy venues, then in the PPQM game you have much to win by arguing them into the A class.

ACTION: As a very first step we need to clarify the purpose of the list, and that can only be that it is for PPVS.

Of course, no matter how many PPVS disclaimers we decorate the list with, clueless heads and deans will continue to abuse it, but at least we can point them to the definition.

Fixing the process
By clarifying the purpose as PPVS, we reduce the incentive to gaming. However, it won't stop people from trying. If we want the list to have any standing of respect internationally (and if we don't we should stop wasting our time right away) then we need a process that will lead to a better outcome.

I don't believe the present process can do this. My rough estimate is that Rank "A" is bloated by about a factor of two. It is also deficient in that it essentially only contains venues where Australians have published recently (or are trying to publish). Neither problem is likely to be fixed by the present process. And both undermine the credibility of the list.

I have just gone through the exercise of defining a new "publication target list" for my group in NICTA. (NICTA is obliged by the government to have such a list, and the contractual requirement is again a confusion of PPVS and PPQM, but I'm using it solely for PPVS. For the fields covered by my group, I see this list as what I think CORE A should be.)

The process we adopted (which was in essence proposed by Alan Fekete) aims to be highly transparent, reproducible and, as far as possible, objective. It is very simple:
 * 1) in each discipline identify a set of "discipline heroes". These are generally ACM or IEEE fellows, have >5,000 citations (most >> 10,000) and h-index > 30 (most >> 50). To ensure an up-to-date list, only select heroes who were actively publishing in the field in the last 5 years
 * 2) collect the set of venues (and frequencies) where each of the heroes has published in the last 10 years
 * 3) the set of venues that capture the vast majority of publications (80-90%) are that discipline's "top-tier venues", i.e. Rank A
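The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the script actually used at NICTA: the `Researcher` record, the strict requirement of fellow status (the text says "generally"), the 85% default coverage (picked from the 80-90% range) and all sample names and numbers are assumptions made up for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Researcher:
    """Hypothetical record; field names are illustrative only."""
    name: str
    is_fellow: bool          # ACM or IEEE fellow
    citations: int
    h_index: int
    last_active_year: int    # most recent year with a paper in the field
    venues: list = field(default_factory=list)  # one entry per paper, last 10 years


def is_hero(r, current_year, min_citations=5000, min_h_index=30, active_window=5):
    """Step 1: apply the thresholds quoted in the text (combination is assumed)."""
    return (r.is_fellow
            and r.citations > min_citations
            and r.h_index > min_h_index
            and current_year - r.last_active_year <= active_window)


def top_tier_venues(heroes, coverage=0.85):
    """Steps 2 and 3: count where heroes publish, then keep the most
    frequent venues until they cover the requested share of all papers."""
    counts = Counter()
    for hero in heroes:
        counts.update(hero.venues)
    total = sum(counts.values())
    selected, covered = [], 0
    for venue, n in counts.most_common():
        if covered >= coverage * total:
            break
        selected.append(venue)
        covered += n
    return selected


# Made-up sample data, purely to show the mechanics.
people = [
    Researcher("hero_1", True, 12000, 55, 2013,
               ["VenueA"] * 5 + ["VenueB"] * 3 + ["WorkshopX"]),
    Researcher("hero_2", True, 8000, 40, 2012,
               ["VenueA"] * 4 + ["VenueB"] * 4 + ["VenueC"] * 2 + ["WorkshopY"]),
    Researcher("not_hero", False, 900, 12, 2013, ["WorkshopZ"] * 10),
]
heroes = [p for p in people if is_hero(p, current_year=2013)]
print(top_tier_venues(heroes))
```

The coverage threshold is the main knob: lowering it shrinks the resulting list, which is one way to check how sensitive the "Rank A" set is to the 80-90% choice.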

This worked very well. For the disciplines I know well (systems and real-time), the results were pretty much what I expected, except that they clarified the status of the venues which I thought were borderline. For the other disciplines it delivered a set I feel comfortable with.

ACTION: I believe that if CORE is serious about having a credible ranking list, then we should adopt this approach to determine the A venues.

We can then discuss what defines a B venue.

Gernot

John Grundy 11-Jun-2013
From my May newsletter, a synopsis of the exec discussion:

The CORE executive has discussed this at length and determined (1) to continue with maintaining a list of ranked venues; (2) to develop a ranked venue update process; (3) to call for updates using the update process.

On the topic of (1), we determined there were really only three approaches open to us: (1) Abandon a CORE-maintained list of ranked outlets and remove it from the web site, ask people not to use it, etc.; (2) Start the process from scratch, i.e. throw the old list completely away and do it again with a new process; (3) Develop an update process and update the old lists with community feedback.

Conscious of the various issues raised by a number of people on (1) and (2), we determined on (3) based on a combination of feedback to my earlier survey; a number of feedback items on the ACDICT performance metrics as above, including quite a number requesting they be maintained and updated (and a number observing “stale” rankings e.g. ERA are still in use with all their attendant problems); and a perception of overall demand from the CORE community (OK, so we’re making that call purely from our own judgement).

We will focus on updates to the CORE conference list first, then journals later, viewing the former as more in need of revision.

Our concern is that with no maintained CORE venue ranking, various other mechanisms will / are being used anyway, as observed by a number of respondents to the ACDICT performance metrics paper, including the unmaintained ERA, old CORE, and non-CS oriented ABDC ranked journals (for 0806).

The USE of such a venue ranking is not proscribed of course.

Use to judge post-publication research/article quality is highly problematic – see:

http://arxiv.org/ftp/arxiv/papers/1301/1301.3748.pdf

for just one of many recent (I think) interesting analyses…

Use of venue quality rank for pre-publication venue selection may well be much more defensible (or maybe not, depending on your point of view).